MathGroup Archive 2006

[Date Index] [Thread Index] [Author Index]

Search the Archive

Re: Re: position lists

  • To: mathgroup at smc.vnet.net
  • Subject: [mg68887] Re: [mg68748] Re: position lists
  • From: Daniel Lichtblau <danl at wolfram.com>
  • Date: Tue, 22 Aug 2006 05:20:24 -0400 (EDT)
  • References: <ebujvu$6lu$1@smc.vnet.net> <200608180711.DAA01933@smc.vnet.net>
  • Sender: owner-wri-mathgroup at wolfram.com

rych wrote:
> Yes, thanks to everyone who replied, this indeed does it:
> In[469]:=
> a={1,1,2,2,1,4};
> Flatten@Position[a,#]&/@Range[4]
> 
> Out[470]=
> {{1,2,5},{3,4},{},{6}}
> 
> But it grows linear with the "number of bins" (=4). It searches for
> each bin among all the positions, even those that have been previously
> identified as belonging to differetn bins.
> 
> A more effective solution is to use linked-list arrays:
> In[473]:=
> PositionListsC=Compile[{{a,_Integer,1},{nBins,_Integer}},
>       Module[{heads =
> Table[0,{nBins}],linkedlist=Table[0,{Length@a}],bin},
>         Do[
>           bin = a[i];
>           linkedlist[i]=heads[bin];
>           heads[bin]=i,
>           {i,Length@a}];
>         (*To avoid the "Non-tensor object generated;proceeding with
> uncompiled evaluation"*)
>         Join[heads,linkedlist]]
>       ];
> (joined=PositionListsC[a,4] ;{Take[joined,4],Take[joined,-Length@a]})
> Out[474]=
> {{5,4,0,6},{0,1,0,3,2,0}}
> It gives two lists: heads -- the last point that fell in the bin and a
> linked list of all the other points in the same bin (terminated by 0).
> For example in the bin #1 we have 5->2->1. So we do get the same
> information, but in a different structure.
> 
> Let's see their performance
> In[478]:=
> nBins = 1000; b = Table[Random[Integer, {1, nBins}], {100000}];
> (joined = PositionListsC[b, nBins] ; {
>     Take[joined, nBins], Take[joined, -Length@b]};) // Timing
> Flatten@Position[b, #] & /@ Range[nBins]; // Timing
> Out[479]=
> {0.031 Second, Null}
> Out[480]=
> {15.329 Second, Null}
> 
> My questions was whether there's such a built-in list function. Or
> perhaps a package that could give lists of points sorted into bins.


I'm not sure what you actually did but the code you show above will not 
work as advertised. Array indexing requires double brackets, not single. 
Below is code that will give {{5,4,0,6},{0,1,0,3,2,0}} for your simple 
example.

PositionListsC = Compile[{{a,_Integer,1},{nBins,_Integer}},
   Module[
   {heads=Table[0,{nBins}], linkedlist=Table[0,{Length[a]}], bin},
   Do [
     bin = a[[i]];
     linkedlist[[i]] = heads[[bin]];
     heads[[bin]] = i
     , {i,Length[a]}
     ];
   Join[heads,linkedlist]]
   ];

Now the question is how to convert this to the same representation as is 
given by the slower code:

positionLists2[ll_List, n_Integer] :=
   Flatten@Position[ll,#]& /@ Range[n]]

Offhand I don't know, partly because I have not decrypted this form of 
result from PositionListsC.

[I will remark that I utterly despise this construct 
Flatten at Position[...]& ... because it is quite unobvious how the binding 
goes. Is it equivalent to

positionLists2b[ll_List, n_Integer] :=
   Flatten[Position[ll,#]]& /@ Range[n]

or to

positionLists2c[ll_List, n_Integer] :=
   Flatten[Position[ll,#]& /@ Range[n]]

??

It's the former, but really, who knew?]

Anyway, I advocate circumventing the whole thing using Reap[Sow[...]] as 
below.

positionLists3[ll_List, n_Integer] :=
   Map[Flatten,Last[Reap[MapIndexed[Sow[#2[[1]],#1]&, ll], Range[n]]]]

nBins = 1000;
b = Table[Random[Integer, {1, nBins}], {100000}];

Your reference timing:

In[55]:= Timing[(joined = PositionListsC[b, nBins] ; {
     Take[joined, nBins], Take[joined, -Length@b]};)]
Out[55]= {0.032002, Null}

The pedestrian method:

In[56]:= Timing[res2 = positionLists2[b,nBins];]
Out[56]= {11.8007, Null}

The Reap[Sow[]] method:

In[59]:= Timing[res3 = positionLists3[b,nBins];]
Out[59]= {0.452028, Null}

In[60]:= res2===res3
Out[60]= True

So it is 15x slower than your PositionListsC but requires no 
post-processing to get it into an explicit set of bins.


Daniel Lichtblau
Wolfram Research




  • Prev by Date: Re: benchmark...why don't you send it back?
  • Next by Date: Re: Re: List Help Needed
  • Previous by thread: Re: position lists
  • Next by thread: Re: position lists