MathGroup Archive 2012


Re: Find Position of many elements in a large list.

  • To: mathgroup at smc.vnet.net
  • Subject: [mg127686] Re: Find Position of many elements in a large list.
  • From: Roland Franzius <roland.franzius at uos.de>
  • Date: Wed, 15 Aug 2012 03:32:45 -0400 (EDT)
  • Delivered-to: l-mathgroup@mail-archive0.wolfram.com
  • Delivered-to: l-mathgroup@wolfram.com
  • Delivered-to: mathgroup-newout@smc.vnet.net
  • Delivered-to: mathgroup-newsend@smc.vnet.net
  • References: <k0d4bu$m8f$1@smc.vnet.net>

On 14.08.2012 11:05, benp84 at gmail.com wrote:
> I have a sorted, 1-dimensional list X of 1,000,000 integers, and a sorted, 1-dimensional list Y of 10,000 integers.  Most, but not all, of the elements of Y are also elements of X.  I'd like to know the positions of the elements in X that are also in Y.  What's the fastest way to compute this?
>
> I have an algorithm in mind but it requires lots of custom code and I'm wondering if there's a clever way to do it with built-in functions.  Thanks.
>

You don't need any "code" at all.

Consider this example, executed on an i7 quad-core notebook:

data = RandomInteger[10000000, 500000];
pattern = RandomInteger[10000000, 10000];

patterninuse = Intersection[data, pattern];

Timing[pointers = Position[data, #] & /@ patterninuse;]
{17.471999999999998`, Null}
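The approach above scans the whole of data once for every pattern. As a point of comparison, here is a minimal sketch that indexes data a single time and then looks the patterns up. It assumes Mathematica 10 or later (PositionIndex and Lookup did not exist in 2012), uses smaller lists than above so that the check runs quickly, and the names index and pointersByIndex are made up for this illustration. No timing is claimed for it.

```mathematica
(* Smaller test data than in the example above, same construction. *)
data = RandomInteger[100000, 50000];
pattern = RandomInteger[100000, 1000];
patterninuse = Intersection[data, pattern];

(* Build an association value -> {positions} in one pass over data. *)
index = PositionIndex[data];

(* One lookup per pattern; each result is a flat list of positions. *)
pointersByIndex = Lookup[index, patterninuse];

(* Position returns {{p1}, {p2}, ...}, so flatten it before comparing. *)
pointersByIndex === (Flatten /@ (Position[data, #] & /@ patterninuse))
```

The last line compares the lookup result with the Position-based result for the same patterns.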

Now we parallelize by splitting the target list in two.

Four kernels get launched:

{rg1, rg2} = Partition[data, Length[data]/2];
t1 = AbsoluteTime[];
Parallelize[
  pt1 = Position[rg1, #] & /@ patterninuse;
  pt2 = Position[rg2, #] & /@ patterninuse;
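  (* shift the second-half positions by the length of the first half, *)
  (* then merge the two results pattern by pattern *)
  pt = MapThread[Join, {pt1, pt2 + Length[rg1]}];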
  ]
AbsoluteTime[] - t1

4.7736084`8.130391782549879
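Another way to spread the work, sketched here as an assumption rather than something measured on the machine above, is to leave data whole and distribute the patterns over the subkernels with ParallelMap. The name ptParallel is made up for this illustration, and DistributeDefinitions is called explicitly so that every subkernel has its own copy of data.

```mathematica
data = RandomInteger[10000000, 500000];
pattern = RandomInteger[10000000, 10000];
patterninuse = Intersection[data, pattern];

(* Make data known to every subkernel before mapping. *)
DistributeDefinitions[data];

(* AbsoluteTiming reports wall-clock time, which includes the work
   done in the subkernels; each kernel handles a share of the patterns. *)
AbsoluteTiming[
  ptParallel = ParallelMap[Position[data, #] &, patterninuse];
]
```

Because each pattern is still searched in the full list, ptParallel has the same shape as pointers and no position offsets are needed.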


Timing[] does not work with Parallelize[], hence the AbsoluteTime[]
calls above.  Splitting a list of 1,000,000 elements into four
partitions (together with a small typo, r3 instead of rg3) brought the
machine to a standstill.

Memory usage reached the maximum of 8 GB, and after some minutes even
the mouse froze and the screen went black, so there may be a problem
with parallel use of the graphics processor.

-- 

Roland Franzius



