Re: Need to Speed up Position[]
- To: mathgroup at smc.vnet.net
- Subject: [mg110677] Re: Need to Speed up Position[]
- From: Zach Bjornson <bjornson at mit.edu>
- Date: Fri, 2 Jul 2010 02:54:48 -0400 (EDT)
Hello,

Without taking the time to fully understand your list structures, there
are two easy ways to generally speed up Position. The first is to limit
the search space, e.g. data[[All,1]] if all the things you are searching
for are only in the first column (also limit the levelspec if you can).
The second is to limit the number of hits (which it sounds like isn't an
option for you), e.g. Position[data, #, Infinity, 1] to only find the
first occurrence.

On its own, Position is not parallelized (so your two cores don't really
help). You can make it parallelized by putting Position into
ParallelTable, and having the iterator for ParallelTable refer to the
query that Position is searching for:

myPositions = ParallelTable[Position[data, i],{i,myIntersection}]

Hope that helps,
Zach

On 7/1/2010 8:28 AM, Garapata wrote:
> I have a large nested list, "myList"
>
> It has 3 sublists with the following dimensions:
>
> Dimensions/@ myList
>
> {{19808, 5}, {7952, 5}, {7952, 5}}
>
> The 5th position (i.e., column) in each of the sublists has
> SQLDateTime[]s
> (This may or may not affect what I need, but I thought everyone should
> know).
>
> myIntersection = Intersection @@ (myList[[All, All, 5]]);
>
> gives me the SQLDateTimes[]s common to all sublists. I get 3954
> common elements.
>
> Length[myIntersection]
>
> 3954
>
> All of the above works great and runs very fast.
>
> I then find the positions in myList where all the common
> SQLDateTimes[]s occur and then use Extract pull them out into a new
> list
>
> myPositions = Drop[(Position[data, #]& /@ myIntersection), None,
> None, -1];
>
> myOutput = Extract[myList, #]& /@ myPositions;
>
> I end up with just what I need, which in this case gives me 3954 rows
> of {9, 5} sublists. This occurs because myList[[1]] has 5 occurrences
> of each common date element and sublists myList[[2]] and myList[[3]]
> each have 2 occurrences of each common date element.
>
> The Extract[] runs very fast.
>
> My problem ... the Position[] runs very very slow (over 90 seconds on a
> dual core iMac).
>
> All the code together:
>
> myIntersection = Intersection @@ (myList[[All, All, 5]]);
> myPositions = Drop[(Position[data, #]& /@ myIntersection), None,
> None, -1];
> myOutput = Extract[myList, #]& /@ myPositions;
>
> So, does anyone know a way to speed up:
>
> myPositions = Drop[(Position[data, #]& /@ myIntersection), None,
> None, -1]; ?
>
> Or can anyone suggest another approach to doing this that could run
> faster.
>
> Patterns?
> ParallelMap?
> Parallelize?
> Sorting?
> Changing SQLDateTimes to DateList[]s before calculating myPositions?
>
> Not clear what to try.
> Please advise.
>
> Thanks.
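
To make the suggestions in the reply concrete (search only the key column, restrict the levelspec, optionally cap the number of hits, and optionally parallelize), here is a minimal sketch on toy data. The small myList, the string keys standing in for SQLDateTime[] values, and the names keys and parPositions are assumptions made for illustration; they are not from the original post, and the sketch has not been timed against the real data.

```mathematica
(* Toy stand-in for myList: three sublists of 5-column rows,
   with a string key in column 5 instead of SQLDateTime[] (assumption) *)
myList = {
   {{1, 2, 3, 4, "a"}, {5, 6, 7, 8, "b"}, {9, 10, 11, 12, "a"}},
   {{13, 14, 15, 16, "a"}, {17, 18, 19, 20, "c"}},
   {{21, 22, 23, 24, "b"}, {25, 26, 27, 28, "a"}}
   };

(* Limit the search space: extract the key column once *)
keys = myList[[All, All, 5]];
myIntersection = Intersection @@ keys;   (* {"a"} *)

(* Limit the levelspec: keys sit exactly at level 2 of keys, so the
   positions come back as {sublist, row} and no Drop[] is needed *)
myPositions = Position[keys, #, {2}] & /@ myIntersection;
(* {{{1, 1}, {1, 3}, {2, 1}, {3, 2}}} *)

(* Same Extract step as in the original post *)
myOutput = Extract[myList, #] & /@ myPositions;

(* Limit the number of hits: the fourth argument stops after n matches *)
Position[keys, "a", {2}, 1]   (* {{1, 1}} *)

(* Parallel variant: one Position call per query, spread over kernels.
   DistributeDefinitions makes keys known on the subkernels. *)
DistributeDefinitions[keys];
parPositions = ParallelTable[Position[keys, k, {2}], {k, myIntersection}];
```

Whether the parallel variant actually helps will depend on how much time goes into shipping keys to the subkernels compared with the search itself, so it is worth timing both forms with AbsoluteTiming on the real data.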