MathGroup Archive 2010

Re: Need to Speed up Position[]

  • To: mathgroup at smc.vnet.net
  • Subject: [mg110685] Re: Need to Speed up Position[]
  • From: Peter Pein <petsie at dordos.net>
  • Date: Fri, 2 Jul 2010 02:56:18 -0400 (EDT)
  • References: <i0i1kr$hq6$1@smc.vnet.net>

On Thu, 1 Jul 2010 12:28:11 +0000 (UTC),
Garapata <warsaw95826 at mypacks.net> wrote:

> I have a large nested list, "myList"
> 
> It has 3 sublists with the following dimensions:
> 
>    Dimensions/@ myList
> 
>    {{19808, 5}, {7952, 5}, {7952, 5}}
> 
> The 5th position (i.e., column) in each of the sublists has
> SQLDateTime[]s
> (This may or may not affect what I need, but I thought everyone should
> know).
> 
>    myIntersection = Intersection @@ (myList[[All, All, 5]]);
> 
> gives me the SQLDateTime[]s common to all sublists.  I get 3954
> common elements.
> 
>    Length[myIntersection]
> 
>    3954
> 
> All of the above works great and runs very fast.
> 
> I then find the positions in myList where all the common
> SQLDateTime[]s occur and then use Extract to pull them out into a new
> list:
> 
> 	myPositions = Drop[(Position[data, #] & /@ myIntersection),
> None, None, -1];
> 
> 	myOutput = Extract[myList, #] & /@ myPositions;
> 
> I end up with just what I need, which in this case gives me 3954 rows
> of {9, 5} sublists. This occurs because myList[[1]] has 5 occurrences
> of each common date element and sublists myList[[2]] and myList[[3]]
> each have 2 occurrences of each common date element.
> 
> The Extract[] runs very fast.
> 
> My problem... the Position[] runs very, very slowly (over 90 seconds
> on a dual core iMac).
> 
> All the code together:
> 
>    myIntersection = Intersection @@ (myList[[All, All, 5]]);
>    myPositions = Drop[(Position[data, #] & /@ myIntersection), None,
> None, -1];
>    myOutput = Extract[myList, #] & /@ myPositions;
> 
> So, does anyone know a way to speed up:
> 
>    myPositions = Drop[(Position[data, #] & /@ myIntersection), None,
> None, -1]; ?
> 
> Or can anyone suggest another approach to doing this that could run
> faster.
> 
> Patterns?
> ParallelMap?
> Parallelize?
> Sorting?
> Changing SQLDateTimes to DateList[]s before calculating myPositions?
> 
> Not clear what to try.
> Please advise.
> 
> Thanks.
> 

Hi,

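If you are interested in myOutput only and do not need to keep
myPositions for later use, you can skip Position[] entirely: select the
matching rows directly with Cases, sort them by their last element, and
split them into groups of equal last element. Something like: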

In[1]:=
myList=RandomInteger[{1,5555},#]&/@{{19808,5},{7952,5},{7952,5}};

In[2]:= Length[myIntersection=Intersection@@myList[[All,All,5]]]
Out[2]= 3126
In[3]:=
Timing[Dimensions[
  myOutput = Split[
    Sort[
      Cases[myList, {___, Alternatives @@ myIntersection}, {2}],
      Last[#1] < Last[#2] &],
    Last[#1] === Last[#2] &]
]]
Out[3]= {11.28,{3126}}
In[4]:= myOutput[[1]]
Out[4]=
{{3830,4047,4200,3520,1},{4788,4153,2710,2938,1},{886,2560,5266,128,1},{143,218,3189,3672,1},{190,510,4701,212,1}}


Peter
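
A possible variation on the same idea, offered only as a sketch and not
benchmarked here: instead of matching every row against a long
Alternatives pattern, build a Dispatch lookup table for the common keys and
filter the rows with Pick. The names isCommon, rows, keep and grouped are
hypothetical and do not appear in the original posts. The test data is the
same random stand-in used in In[1] above, where the real data would hold
SQLDateTime[] values in column 5.

```mathematica
(* Random stand-in data with the same dimensions as in the post *)
myList = RandomInteger[{1, 5555}, #] & /@ {{19808, 5}, {7952, 5}, {7952, 5}};
myIntersection = Intersection @@ myList[[All, All, 5]];

(* Hypothetical lookup table: common keys map to True, everything else to False *)
isCommon = Dispatch[Append[Thread[myIntersection -> True], _ -> False]];

(* Flatten the three sublists into a single list of 5-element rows *)
rows = Join @@ myList;

(* Apply the rules to each key at level 1 only; a plain ReplaceAll would
   let the catch-all rule _ -> False match the whole list at once *)
keep = Replace[rows[[All, 5]], isCommon, {1}];

(* Keep the matching rows, sort them by key, and group rows with equal keys *)
grouped = GatherBy[SortBy[Pick[rows, keep], Last], Last];
```

Length[grouped] should then equal Length[myIntersection], with one group of
rows per common key. The order of rows inside each group may differ from
the Split/Sort version above.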


