Re: Need to Speed up Position[]

• To: mathgroup at smc.vnet.net
• Subject: [mg110706] Re: Need to Speed up Position[]
• From: Leonid Shifrin <lshifr at gmail.com>
• Date: Sat, 3 Jul 2010 08:15:52 -0400 (EDT)

```Hi,

This is a follow-up to my previous post.

I did not fully understand your original goal before, and my previous post
was somewhat messy. Here is code that will probably do what you need.

The test list:

myList = RandomInteger[{1, 5555}, #] & /@ {{19808, 5}, {7952, 5}, {7952, 5}};

The code:

Clear[mergeSameDates];
mergeSameDates[lst_List] :=
  With[{pos = PositionsOfSame @@ lst[[All, All, 5]]},
    With[{trpos = Transpose[pos]},
      Flatten[
        Transpose@
          Table[lst[[i, #]] & /@ trpos[[i]], {i, 1, Length[lst]}],
        {{1}, {2, 3}}]]];

It uses the function PositionsOfSame, described in my previous post. You use
it as follows:

In[4]:= (res = mergeSameDates[myList]); // Timing

Out[4]= {0.831, Null}
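
(* PositionsOfSame is defined in the previous post and is not repeated here.
   What follows is only a hypothetical stand-in: the name
   positionsOfSameSketch and its implementation are assumptions, not the
   original code. It assumes that PositionsOfSame returns, for every element
   common to all of its arguments, one list of positions per argument, which
   is the shape mergeSameDates appears to expect. PositionIndex requires
   Mathematica 10 or later. *)

Clear[positionsOfSameSketch];
positionsOfSameSketch[lists__List] :=
  With[{common = Intersection[lists],
        lookups = PositionIndex /@ {lists}},
    (* for each common element, look up its positions in every list *)
    Table[#[c] & /@ lookups, {c, common}]];

(* Small check: 2 and 3 are the elements common to both lists. *)
positionsOfSameSketch[{1, 2, 2, 3}, {3, 2, 5}]
(* {{{2, 3}, {2}}, {{4}, {1}}} *)

(* The level specification {{1}, {2, 3}} used in mergeSameDates keeps level 1
   and merges levels 2 and 3, so the rows taken from all sublists end up in
   one group per common element. A toy illustration: *)
Flatten[{{{a, b}, {c}}, {{d}, {e, f}}}, {{1}, {2, 3}}]
(* {{a, b, c}, {d, e, f}} *)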

Regards,
Leonid

On Thu, Jul 1, 2010 at 5:28 AM, Garapata <warsaw95826 at mypacks.net> wrote:

> I have a large nested list, "myList"
>
> It has 3 sublists with the following dimensions:
>
>   Dimensions/@ myList
>
>   {{19808, 5}, {7952, 5}, {7952, 5}}
>
> The 5th position (i.e., column) in each of the sublists has
> SQLDateTime[]s
> (This may or may not affect what I need, but I thought everyone should
> know).
>
>   myIntersection = Intersection @@ (myList[[All, All, 5]]);
>
> gives me the SQLDateTime[]s common to all sublists.  I get 3954
> common elements.
>
>   Length[myIntersection]
>
>   3954
>
> All of the above works great and runs very fast.
>
> I then find the positions in myList where all the common
> SQLDateTime[]s occur and then use Extract to pull them out into a new
> list
>
>        myPositions = Drop[(Position[data, #] & /@ myIntersection), None,
> None, -1];
>
>        myOutput = Extract[myList, #] & /@ myPositions;
>
> I end up with just what I need, which in this case gives me 3954 rows
> of {9, 5} sublists. This occurs because myList[[1]] has 5 occurrences
> of each common date element and sublists myList[[2]] and myList[[3]]
> each have 2 occurrences of each common date element.
>
> The Extract[] runs very fast.
>
> My problem... the Position[] runs very very slow (over 90 seconds on a
> dual core iMac).
>
> All the code together:
>
>   myIntersection = Intersection @@ (myList[[All, All, 5]]);
>   myPositions = Drop[(Position[data, #] & /@ myIntersection), None,
> None, -1];
>   myOutput = Extract[myList, #] & /@ myPositions;
>
> So, does anyone know a way to speed up:
>
>   myPositions = Drop[(Position[data, #] & /@ myIntersection), None,
> None, -1]; ?
>
> Or can anyone suggest another approach to doing this that could run
> faster.
>
> Patterns?
> ParallelMap?
> Parallelize?
> Sorting?
> Changing SQLDateTimes to DateList[]s before calculating myPositions?
>
> Not clear what to try.