Re: Need to Speed up Position[]
- To: mathgroup at smc.vnet.net
- Subject: [mg110685] Re: Need to Speed up Position[]
- From: Peter Pein <petsie at dordos.net>
- Date: Fri, 2 Jul 2010 02:56:18 -0400 (EDT)
- References: <i0i1kr$hq6$1@smc.vnet.net>
On Thu, 1 Jul 2010 12:28:11 +0000 (UTC), Garapata <warsaw95826 at mypacks.net> wrote:

> I have a large nested list, "myList"
>
> It has 3 sublists with the following dimensions:
>
> Dimensions/@ myList
>
> {{19808, 5}, {7952, 5}, {7952, 5}}
>
> The 5th position (i.e., column) in each of the sublists has
> SQLDateTime[]s
> (This may or may not affect what I need, but I thought everyone should
> know).
>
> myIntersection = Intersection @@ (myList[[All, All, 5]]);
>
> gives me the SQLDateTimes[]s common to all sublists. I get 3954
> common elements.
>
> Length[myIntersection]
>
> 3954
>
> All of the above works great and runs very fast.
>
> I then find the positions in myList where all the common
> SQLDateTimes[]s occur and then use Extract to pull them out into a new
> list
>
> myPositions = Drop[(Position[data, #] & /@ myIntersection),
> None, None, -1];
>
> myOutput = Extract[myList, #] & /@ myPositions;
>
> I end up with just what I need, which in this case gives me 3954 rows
> of {9, 5} sublists. This occurs because myList[[1]] has 5 occurrences
> of each common date element and sublists myList[[2]] and myList[[3]]
> each have 2 occurrences of each common date element.
>
> The Extract[] runs very fast.
>
> My problem... the Position[] runs very very slow (over 90 seconds
> on a dual core iMac).
>
> All the code together:
>
> myIntersection = Intersection @@ (myList[[All, All, 5]]);
> myPositions = Drop[(Position[data, #] & /@ myIntersection), None,
> None, -1];
> myOutput = Extract[myList, #] & /@ myPositions;
>
> So, does anyone know a way to speed up:
>
> myPositions = Drop[(Position[data, #] & /@ myIntersection), None,
> None, -1]; ?
>
> Or can anyone suggest another approach to doing this that could run
> faster.
>
> Patterns?
> ParallelMap?
> Parallelize?
> Sorting?
> Changing SQLDateTimes to DateList[]s before calculating myPositions?
>
> Not clear what to try.
> Please advise.
>
> Thanks.
>

Hi,

if you are interested in myOutput only and do not need to keep myPositions for later use, you can try something like:

In[1]:= myList=RandomInteger[{1,5555},#]&/@{{19808,5},{7952,5},{7952,5}};

In[2]:= Length[myIntersection=Intersection@@myList[[All,All,5]]]

Out[2]= 3126

In[3]:= Timing[Dimensions[
  myOutput=Split[
    Sort[Cases[myList,{___,Alternatives@@myIntersection},{2}],
      Last[#1]<Last[#2]&],
    Last[#1]===Last[#2]&]
  ]]

Out[3]= {11.28,{3126}}

In[4]:= myOutput[[1]]

Out[4]= {{3830,4047,4200,3520,1},{4788,4153,2710,2938,1},{886,2560,5266,128,1},{143,218,3189,3672,1},{190,510,4701,212,1}}

Peter
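
To make the Cases/Sort/Split idea above easier to follow, here is a minimal sketch on toy data. The toy rows and the name rows are made up for illustration, and SortBy/SplitBy are swapped in for the Sort/Split calls with pure functions, which assumes Mathematica 7 or later. It is not the code from the post, and no timing claim is made for it.

```mathematica
(* Toy data (hypothetical): three sublists of rows;
   the 5th column plays the role of the SQLDateTime key *)
myList = {
   {{1, 1, 1, 1, 10}, {2, 2, 2, 2, 20}, {3, 3, 3, 3, 10}},
   {{4, 4, 4, 4, 10}, {5, 5, 5, 5, 30}},
   {{6, 6, 6, 6, 10}, {7, 7, 7, 7, 20}}
   };

(* Keys that occur in the 5th column of every sublist; here {10} *)
myIntersection = Intersection @@ myList[[All, All, 5]];

(* Level-2 elements are the rows; keep those whose last entry is a common key *)
rows = Cases[myList, {___, Alternatives @@ myIntersection}, {2}];

(* Sort by the key, then split into runs of rows sharing the same key *)
myOutput = SplitBy[SortBy[rows, Last], Last]

(* {{{1, 1, 1, 1, 10}, {3, 3, 3, 3, 10}, {4, 4, 4, 4, 10}, {6, 6, 6, 6, 10}}} *)
```

Each element of myOutput collects all rows, from any sublist, that share one common key, which is the same grouping the Position/Extract version produces. Note that SortBy breaks ties by the canonical order of the rows, so the order of rows within a group may differ from the order obtained with Sort and an explicit comparison function.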