MathGroup Archive 2010

Re: Need to Speed up Position[]

  • To: mathgroup at smc.vnet.net
  • Subject: [mg110677] Re: Need to Speed up Position[]
  • From: Zach Bjornson <bjornson at mit.edu>
  • Date: Fri, 2 Jul 2010 02:54:48 -0400 (EDT)

Hello,

Without fully working through your list structures, there are two
general ways to speed up Position. The first is to limit the search
space, e.g. search data[[All,1]] if everything you are looking for is
in the first column (and restrict the levelspec if you can); the
positions returned are then relative to that column. The second is to
limit the number of hits (which it sounds like isn't an option for
you), e.g. Position[data, #, Infinity, 1] to find only the first
occurrence.
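As a rough sketch of both ideas, assuming (as in your post) that the
values you search for sit in column 5 of a sublist. The toy data, the
strings standing in for SQLDateTime[]s, and the names col, rows and
firstHit are all made up for illustration:

```mathematica
(* toy stand-in for one sublist: 5 columns, "date" in column 5 *)
data = {{1, 2, 3, 4, "d1"}, {5, 6, 7, 8, "d2"}, {9, 10, 11, 12, "d1"}};

(* limit the search space: only column 5, and only level 1 of it *)
col = data[[All, 5]];
rows = Position[col, "d1", {1}]
(* row indices relative to col: {{1}, {3}} *)

(* limit the number of hits: stop after the first match *)
firstHit = Position[col, "d1", {1}, 1]
(* {{1}} *)

(* the row indices can go straight to Extract to pull whole rows *)
Extract[data, rows]
(* {{1, 2, 3, 4, "d1"}, {9, 10, 11, 12, "d1"}} *)
```

Because the search runs on the column alone, the results are already
row indices, so there is no trailing column index to drop afterwards.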

On its own, Position is not parallelized (so your two cores don't really 
help). You can make it parallelized by putting Position into 
ParallelTable, and having the iterator for ParallelTable refer to the 
query that Position is searching for:

    myPositions = ParallelTable[Position[data, i], {i, myIntersection}]
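A minimal, self-contained sketch of that call on toy data (the small
nested list and the strings below are placeholders, not your real
data). The DistributeDefinitions call is an assumption on my part:
depending on your version the definition of data may be sent to the
subkernels automatically, in which case it is harmless but redundant.

```mathematica
(* toy stand-ins; in your case data would be the full nested list *)
data = {{{1, "a"}, {2, "b"}}, {{3, "b"}, {4, "c"}}};
myIntersection = {"b"};

(* make data known on every parallel subkernel *)
DistributeDefinitions[data];

(* one Position search per query, spread across the subkernels *)
myPositions = ParallelTable[Position[data, i], {i, myIntersection}]
(* {{{1, 2, 2}, {2, 1, 2}}} *)
```

Whether this beats the serial Map depends on how much time goes into
shipping data to the subkernels versus the searches themselves, so it
is worth timing both on your real list.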

Hope that helps,
Zach

On 7/1/2010 8:28 AM, Garapata wrote:
> I have a large nested list, "myList"
>
> It has 3 sublists with the following dimensions:
>
>     Dimensions/@ myList
>
>     {{19808, 5}, {7952, 5}, {7952, 5}}
>
> The 5th position (i.e., column) in each of the sublists has
> SQLDateTime[]s
> (This may or may not affect what I need, but I thought everyone should
> know).
>
>     myIntersection = Intersection @@ (myList[[All, All, 5]]);
>
> gives me the SQLDateTime[]s common to all sublists.  I get 3954
> common elements.
>
>     Length[myIntersection]
>
>     3954
>
> All of the above works great and runs very fast.
>
> I then find the positions in myList where all the common
> SQLDateTime[]s occur and then use Extract to pull them out into a new
> list:
>
>     myPositions = Drop[(Position[data, #]& /@ myIntersection), None, None, -1];
>
>     myOutput = Extract[myList, #]& /@ myPositions;
>
> I end up with just what I need, which in this case gives me 3954 rows
> of {9, 5} sublists. This occurs because myList[[1]] has 5 occurrences
> of each common date element and sublists myList[[2]] and myList[[3]]
> each have 2 occurrences of each common date element.
>
> The Extract[] runs very fast.
>
> My problem... the Position[] runs very, very slowly (over 90 seconds
> on a dual core iMac).
>
> All the code together:
>
>     myIntersection = Intersection @@ (myList[[All, All, 5]]);
>     myPositions = Drop[(Position[data, #]& /@ myIntersection), None, None, -1];
>     myOutput = Extract[myList, #]&  /@ myPositions;
>
> So, does anyone know a way to speed up the following?
>
>     myPositions = Drop[(Position[data, #]& /@ myIntersection), None, None, -1];
>
> Or can anyone suggest another approach to doing this that could run
> faster?
>
> Patterns?
> ParallelMap?
> Parallelize?
> Sorting?
> Changing SQLDateTimes to DateList[]s before calculating myPositions?
>
> Not clear what to try.
> Please advise.
>
> Thanks.
>
>    

