[Date Index]
[Thread Index]
[Author Index]
Re: Complement replacement
*To*: mathgroup at smc.vnet.net
*Subject*: [mg58739] Re: Complement replacement
*From*: "Carl K. Woll" <carlw at u.washington.edu>
*Date*: Sat, 16 Jul 2005 01:03:46 -0400 (EDT)
*Organization*: University of Washington
*References*: <200507100912.FAA06500@smc.vnet.net> <datacb$mg2$1@smc.vnet.net>
*Sender*: owner-wri-mathgroup at wolfram.com
It occurred to me that SparseArray[list] needs to extract the nonzero
information from list in order to create its sparse array internal format.
We can use this mechanism to drop the zeros from a list:
list={1,0,3,0,4,2,3};
In[17]:=
SparseArray[list] /. SparseArray[_,_,_,x_]:>x[[3]]
Out[17]=
{1,3,4,2,3}
For large lists, we have
list=Table[Random[Integer],{10^6}];
In[19]:=
r1=DeleteCases[list,0];//Timing
r2=SparseArray[list]/.SparseArray[_,_,_,x_]->x[[3]];//Timing
r1===r2
Out[19]=
{0.265 Second,Null}
Out[20]=
{0.047 Second,Null}
Out[21]=
True
So, using SparseArrays is much quicker. Incorporating this mechanism for
deleting zeros we have
missing[list_] := Module[{r = Range[Length[list]]},
r[[list]] = 0;
SparseArray[r] /. SparseArray[_, _, _, x_] :> x[[3]]]
Let's compare this to Complement:
pp = Table[Random[Integer, {1, 2 10^6}], {2 10^6}];
In[26]:=
r1=missing[pp];//Timing
r2=Complement[Range[Length[pp]],pp];//Timing
r1===r2
Out[26]=
{0.235 Second,Null}
Out[27]=
{1.703 Second,Null}
Out[28]=
True
This new version of missing is about 3 times faster than my old one, and
almost an order of magnitude faster than using Complement for this size data
set.
Carl Woll
Wolfram Research
SparseArray[tst] /. SparseArray[_,_,_,x_]:>x[[3]]
"Carl K. Woll" <carl at woll2woll.com> wrote in message
news:datacb$mg2$1 at smc.vnet.net...
> ----- Original Message -----
> From: <konstantpi at mail15.com>
To: mathgroup at smc.vnet.net
> Subject: [mg58739] Complement replacement
>
>
>> hi
>> in the list:
>> pp=Table[Random[Integer, {1, 1000}], {i, 1000}];
>> how could i know which numbers from 1 to 1000 does not exist in the
>> pp List.
>> but without using:
>> Complement[Table[i,{i,1000}],pp]
>> regards
>>
>
> Here is a possibility that is faster than Complement on my machine for
> this
> particular problem:
>
> missing[list_] := Module[{len = Length[list], a},
> a = Range[len];
> a[[list]] = 0;
> DeleteCases[a, 0]]
>
> The above function scales as O(n), and Complement scales as O(n log n) due
> to sorting, so for large enough input my function ought to be faster. For
> example, with the data:
>
> n = 10^6;
> pp = Table[Random[Integer, {1, n}], {n}];
>
> We have the following comparison:
>
> In[26]:=
> c1=Complement[Range[Length[pp]],pp];//Timing
> c2=missing[pp];//Timing
> c1===c2
>
> Out[26]=
> {1.594 Second,Null}
>
> Out[27]=
> {0.625 Second,Null}
>
> Out[28]=
> True
>
> A further slight increase in speed ought to be possible if you compile
> DeleteCases[a,0] since a is a list of integers. For example, define
> deletezeros as:
>
> deletezeros = Compile[{{a, _Integer, 1}},
> Module[{b, len, j},
> len = Length[a];
> b = a;
> j = 0;
> Do[
> If[a[[i]] =!= 0, b[[++j]] = a[[i]]], {i, len}];
> Take[b, j]]]
>
> Then, replacing DeleteCases[a,0] with deletezeros[a] approximately doubles
> the speed.
>
> Carl Woll
> Wolfram Research
>
>
Prev by Date:
**Re: Re: Graphical modeling of extinct lifeforms with Mathematica**
Next by Date:
**Modeling and Array Problem**
Previous by thread:
**Re: Complement replacement**
Next by thread:
**Re: Complement replacement**
| |