Mathematica 9 is now available
Services & Resources / Wolfram Forums / MathGroup Archive
-----

MathGroup Archive 2011

[Date Index] [Thread Index] [Author Index]

Search the Archive

Re: A fast way to compare two vectors

  • To: mathgroup at smc.vnet.net
  • Subject: [mg121689] Re: A fast way to compare two vectors
  • From: "Oleksandr Rasputinov" <oleksandr_rasputinov at hmamail.com>
  • Date: Mon, 26 Sep 2011 04:13:54 -0400 (EDT)
  • Delivered-to: l-mathgroup@mail-archive0.wolfram.com
  • References: <j5m44t$ple$1@smc.vnet.net> <j5mt5u$smm$1@smc.vnet.net>

On Sun, 25 Sep 2011 10:44:30 +0100, Oleksandr Rasputinov  
<oleksandr_rasputinov at hmamail.com> wrote:

> On Sun, 25 Sep 2011 03:37:17 +0100, Yasha Gindikin <gindikin at gmail.com>
> wrote:
>
>> I'm looking for a hyper-fast way to compare two vectors using
>> Mathematica built-in functions. The problem arises in the calculation of
>> the Hamiltonian matrix elements of a many-particle quantum system.
>>
>> Consider two rows of equal length n, a[[p1] and a[[p2]]. Their elements
>> are integers from 1 to M, put in an increasing order, without repeats.
>> E.g., a[[p1]]={1,2,3,6,7,9} and a[[p2]]={1,2,4,7,8,9}. I need to find
>> the number of different elements (4 in the example) and, if this number
>> is less or equal to 4, their positions ((3,4) and (3,5) in the example).
>> The simplest way looks like this:
>>
>> diff = Intersection[a[[p1]], a[[p2]]];
>> R = Dimensions[diff][[1]];
>> If[R<5,d1 = Complement[a[[p1]], diff];
>> d2 = Complement[a[[p2]], diff];
>> k1 = Position[a[[p1]], d1[[1]]]; k2 = Position[a[[p1]], d1[[2]]];
>> k3 = Position[a[[p2]], d2[[1]]]; k4 = Position[a[[p2]], d2[[2]]]];
>>
>> This is slow, since four search operations are used (and because the
>> ordering of the elements is not taken advantage of). Is there a built-in
>> function that quickly finds the differences between the two vectors?
>>
>> Thank you.
>> Regards,
>> Yasha.
>>
>
> If you need higher performance still, AppendTo is the prime candidate for
> replacement since it copies the entire accumulated list of differences on
> every call. However, I doubt that this will present much of a limitation
> in practice unless you have very long vectors with a large number of
> differences.
>

In fact, having just read Daniel Lichtblau's description of the  
undocumented internal functions for manipulating expression bags (see  
http://stackoverflow.com/questions/6691491/implementing-a-quadtree-in-mathematica),  
it is clear that bags are an ideal (and utterly trivial) substitute for  
AppendTo in this case:

hyperfastVectorCompareBag = Compile[{
   {vec1, _Integer, 1}, {vec2, _Integer, 1},
   {MAXDIFFS, _Integer, 0}
  },
  Block[{
    i1 = 1, i2 = 1,
    diffct = 0,
    diffs1 = Internal`Bag@Most[{0}], diffs2 = Internal`Bag@Most[{0}]
   },
   (* Run along the lists, recording differences as we go *)
   While[
    i1 <= Length[vec1] && i2 <= Length[vec2] && diffct < MAXDIFFS,
    Which[
     vec1[[i1]] == vec2[[i2]], i1++; i2++,
     vec1[[i1]] < vec2[[i2]], diffct++; Internal`StuffBag[diffs1, i1]; i1++,
     vec1[[i1]] > vec2[[i2]], diffct++; Internal`StuffBag[diffs2, i2]; i2++
    ]
   ];
   (* Fix up in case we ran off the end of one of the lists *)
   While[i1 <= Length[vec1] && diffct < MAXDIFFS,
    diffct++; Internal`StuffBag[diffs1, i1]; i1++
   ];
   While[i2 <= Length[vec2] && diffct < MAXDIFFS,
    diffct++; Internal`StuffBag[diffs2, i2]; i2++
   ];
   {Internal`BagPart[diffs1, All], Internal`BagPart[diffs2, All]}
  ]
];

This version is extremely fast. Using your test vectors,

vec1 = {1, 2, 3, 6, 7, 9};
vec2 = {1, 2, 4, 7, 8, 9};

but expanded a little (to give about 100,000 elements/400KB per vector)  
using

Do[
  vec1 = Join[vec1, vec1 + Max[vec1]];
  vec2 = Join[vec2, vec2 + Max[vec2]],
  {14}
];

we get:

Version                           Run time /s
vectorCompare                     92.36
hyperfastVectorCompare            1.375
hyperfastVectorCompareBag         0.031

More gains are available by translating to C ("CompilationTarget" -> "C")  
and compiling down into native code. Expanding the inputs a little more  
(to over 200 million elements/800MB per vector), we obtain:

Version                           Run time /s
hyperfastVectorCompareBag (MVM)   55.59
hyperfastVectorCompareBag (C)     8.578

which in my opinion is not too bad at all considering that this code is  
single-threaded and written wholly in the Mathematica language.




  • Prev by Date: Re: Calculus and InterpolatingFunction
  • Next by Date: Re: Integration error? Integrate[1/(x^3-1)]?
  • Previous by thread: Re: A fast way to compare two vectors
  • Next by thread: Re: A fast way to compare two vectors