Re: Find Position of many elements in a large list.
- To: mathgroup at smc.vnet.net
- Subject: [mg127686] Re: Find Position of many elements in a large list.
- From: Roland Franzius <roland.franzius at uos.de>
- Date: Wed, 15 Aug 2012 03:32:45 -0400 (EDT)
- References: <k0d4bu$m8f$1@smc.vnet.net>
On 14.08.2012 11:05, benp84 at gmail.com wrote:
> I have a sorted, 1-dimensional list X of 1,000,000 integers, and a sorted, 1-dimensional list Y of 10,000 integers. Most, but not all, of the elements of Y are also elements of X. I'd like to know the positions of the elements in X that are also in Y. What's the fastest way to compute this?
>
> I have an algorithm in mind but it requires lots of custom code and I'm wondering if there's a clever way to do it with built-in functions. Thanks.

You don't need any "code" at all. Consider this example, executed on an i7 quad-core notebook:

    data = RandomInteger[10000000, 500000];
    pattern = RandomInteger[10000000, 10000];
    patterninuse = Intersection[data, pattern];
    Timing[pointers = Position[data, #] & /@ patterninuse;]

    {17.471999999999998`, Null}

Now we parallelize by splitting the target list in two. Four kernels get launched:

    {rg1, rg2} = Partition[data, Length[data]/2];
    t1 = AbsoluteTime[];
    Parallelize[
      pt1 = Position[rg1, #] & /@ patterninuse;
      pt2 = Position[rg2, #] & /@ patterninuse;
      (* positions found in rg2 are offset by the length of rg1 *)
      pt = Join[pt1, pt2 + Length[rg1]];
    ]
    AbsoluteTime[] - t1

    4.7736084`8.130391782549879

Timing[] doesn't work with Parallelize[], hence the AbsoluteTime[] difference used here.

Splitting a list of 1,000,000 elements into four partitions (plus a small typo, r3 instead of rg3) sends the machine into nowhere. Memory use reaches the 8 GB maximum, and after some minutes even the mouse freezes and the screen goes black, so there may be a problem with parallel execution involving the graphics processor.

--
Roland Franzius
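Position[data, #] rescans the whole list once for every pattern value, which is where most of the time in the example above goes. One possible alternative is to build a position index in a single pass and then look each value up. This is only a minimal sketch, and it assumes Mathematica 10 or later, since PositionIndex and Lookup are not available in the 2012 releases. The variable names follow the example above; index and check are names introduced here for illustration.

```mathematica
(* Sketch only: PositionIndex and Lookup require Mathematica 10 or later. *)
data = RandomInteger[10000000, 500000];
pattern = RandomInteger[10000000, 10000];
patterninuse = Intersection[data, pattern];

(* One pass over data: association mapping each value to the list of its positions *)
index = PositionIndex[data];

(* One keyed lookup per pattern value instead of one full scan of data per value *)
pointers = Lookup[index, patterninuse];

(* Spot check: compare the first few entries with the Position-based result *)
n = Min[20, Length[patterninuse]];
check = Flatten[Position[data, #]] & /@ Take[patterninuse, n];
Take[pointers, n] === check
```

Note that the result shape differs slightly: Position returns {{i}, {j}} for each value, while the lookup returns the flat list {i, j}. No timing is claimed here; it would have to be measured on the machine in question.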