MathGroup Archive 2011

[Date Index] [Thread Index] [Author Index]

Search the Archive

Re: A fast way to compare two vectors

  • To: mathgroup at smc.vnet.net
  • Subject: [mg121678] Re: A fast way to compare two vectors
  • From: "Oleksandr Rasputinov" <oleksandr_rasputinov at hmamail.com>
  • Date: Sun, 25 Sep 2011 05:42:35 -0400 (EDT)
  • Delivered-to: l-mathgroup@mail-archive0.wolfram.com
  • References: <j5m44t$ple$1@smc.vnet.net>

On Sun, 25 Sep 2011 03:37:17 +0100, Yasha Gindikin <gindikin at gmail.com>  
wrote:

> I'm looking for a hyper-fast way to compare two vectors using  
> Mathematica built-in functions. The problem arises in the calculation of  
> the Hamiltonian matrix elements of a many-particle quantum system.
>
> Consider two rows of equal length n, a[[p1] and a[[p2]]. Their elements  
> are integers from 1 to M, put in an increasing order, without repeats.  
> E.g., a[[p1]]={1,2,3,6,7,9} and a[[p2]]={1,2,4,7,8,9}. I need to find  
> the number of different elements (4 in the example) and, if this number  
> is less or equal to 4, their positions ((3,4) and (3,5) in the example).  
> The simplest way looks like this:
>
> diff = Intersection[a[[p1]], a[[p2]]];
> R = Dimensions[diff][[1]];
> If[R<5,d1 = Complement[a[[p1]], diff];
> d2 = Complement[a[[p2]], diff];
> k1 = Position[a[[p1]], d1[[1]]]; k2 = Position[a[[p1]], d1[[2]]];
> k3 = Position[a[[p2]], d2[[1]]]; k4 = Position[a[[p2]], d2[[2]]]];
>
> This is slow, since four search operations are used (and because the  
> ordering of the elements is not taken advantage of). Is there a built-in  
> function that quickly finds the differences between the two vectors?
>
> Thank you.
> Regards,
> Yasha.
>

No. But you can achieve high performance with a hand-coded implementation  
specific to the problem at hand.

Firstly, a few operations in your given code are redundant. Removing these  
and instituting a little good programming practice it becomes

vectorCompare = Function[
  {vec1, vec2, MAXDIFFS},
  Block[{
    d1 = Complement[vec1, vec2], d2 = Complement[vec2, vec1],
    k1, k2
   },
   If[Length[d1] + Length[d2] <= MAXDIFFS,
    k1 = Flatten@Position[vec1, Alternatives @@ d1];
    k2 = Flatten@Position[vec2, Alternatives @@ d2]
   ];
   {k1, k2}
  ]
];

which is called as vectorCompare[a[[p1]], a[[p2]], 4], giving {{3, 4}, {3,  
5}} as we expect.

But this is not where the big gains lie, since this problem can be  
attacked very efficiently in procedural code. Such code can even be  
compiled (from C, if necessary), in which case you will obtain the desired  
high performance:

hyperfastVectorCompare = Compile[{
   {vec1, _Integer, 1}, {vec2, _Integer, 1},
   {MAXDIFFS, _Integer, 0}
  },
  Block[{
    i1 = 1, i2 = 1,
    diffct = 0,
    diffs1 = Most[{0}], diffs2 = Most[{0}]
   },
   (* Run along the lists, recording differences as we go *)
   While[
    i1 <= Length[vec1] && i2 <= Length[vec2] && diffct < MAXDIFFS,
    Which[
     vec1[[i1]] == vec2[[i2]], i1++; i2++,
     vec1[[i1]] < vec2[[i2]], diffct++; AppendTo[diffs1, i1]; i1++,
     vec1[[i1]] > vec2[[i2]], diffct++; AppendTo[diffs2, i2]; i2++
    ]
   ];
   (* Fix up in case we ran off the end of one of the lists *)
   While[i1 <= Length[vec1] && diffct < MAXDIFFS,
    diffct++; AppendTo[diffs1, i1]; i1++
   ];
   While[i2 <= Length[vec2] && diffct < MAXDIFFS,
    diffct++; AppendTo[diffs2, i2]; i2++
   ];
   {diffs1, diffs2}
  ]
];

naturally, hyperfastVectorCompare[a[[p1]], a[[p2]], 4] also gives {{3, 4},  
{3, 5}} for your inputs, but traverses the vectors only once and takes  
advantage of the ordering of the elements. Note that if you choose  
MAXDIFFS to be odd and less than the actual number of differences the  
compiled code will throw an error due to {diffs1, diffs2} not being a  
full-rank tensor; this can be avoided by passing diffs1 and diffs2 out  
separately but I have not bothered with it here since requesting an odd  
number of differences seemed to me inherently ill-defined and as such a  
rather unlikely corner case. If you need the number of differences  
explicitly, you can either pass this out separately as well or otherwise  
just count the number of elements in the returned list of positions.

If you need higher performance still, AppendTo is the prime candidate for  
replacement since it copies the entire accumulated list of differences on  
every call. However, I doubt that this will present much of a limitation  
in practice unless you have very long vectors with a large number of  
differences.

Best,

O. R.




  • Prev by Date: Nonhomogeneous Wave Equation with NDSolve
  • Next by Date: Re: Elementwise Matrix Subtraction
  • Previous by thread: Re: A fast way to compare two vectors
  • Next by thread: Re: A fast way to compare two vectors