MathGroup Archive 2011

[Date Index] [Thread Index] [Author Index]

Search the Archive

Re: A fast way to compare two vectors

  • To: mathgroup at smc.vnet.net
  • Subject: [mg121710] Re: A fast way to compare two vectors
  • From: "Oleksandr Rasputinov" <oleksandr_rasputinov at hmamail.com>
  • Date: Mon, 26 Sep 2011 20:06:13 -0400 (EDT)
  • Delivered-to: l-mathgroup@mail-archive0.wolfram.com
  • References: <j5m44t$ple$1@smc.vnet.net> <j5mt5u$smm$1@smc.vnet.net>

On Mon, 26 Sep 2011 09:19:18 +0100, Yasha Gindikin <gindikin at gmail.com>  
wrote:

> Dear Oleksandr, thank you very much for your code. Actually, I was  
> trying to implement the same algorithm. My code was the following:
>
> Compare := Compile[{{a, _Integer, 1}, {b, _Integer, 1}},
>    {s1 = s2 = 0; k = n; d1 = d2 = Table[0, {k, 3}];
>     While[s1 < 3 && s2 < 3 && k - s1 > 0 && k - s2 > 0,
>      Which[a[[k - s1]] > b[[k - s2]], d1[[s1 + 1]] = a[[k - s1]]; s1++,
>        a[[k - s1]] < b[[k - s2]], d2[[s2 + 1]] = b[[k - s2]]; s2++,
>       a[[k - s1]] == b[[k - s2]], k--]]; Cm = Max[{s1, s2}];
>     Which[Cm > 2, False, s1 - s2 == 1, d2[[s1]] = b[[1]],
>      s2 - s1 == 1, d1[[s2]] = a[[1]], s1 - s2 == 2, d2[[1]] = b[[2]];
>      d2[[2]] = b[[1]], s2 - s1 == 2, d1[[1]] = a[[2]];
>      d1[[2]] = a[[1]]]; Cm}];
>
> But due to my very limited programming experience, the code turned out  
> to be not very efficient. It even works faster without the Compile  
> option, which puzzles me a lot. I will try to use your realization  
> instead. Thank you once again!
>
> Regards,
> Yasha.
>

Dear Yasha

The above example is slow for two reasons. The first is that SetDelayed  
(:=) is used instead of Set (=) in the definition. This means that the  
code is unnecessarily compiled afresh every time it is used. The second  
reason is that many of the values are not localised to the compiled code  
and such values are dealt with via Mathematica's interpreter. These  
problems can be addressed without changing the logic at all, simply by  
inserting a Block where local variable are defined, then passing the  
values of the localised variables out to the corresponding global  
variables at the end of the procedure:

compare = Compile[{
   {a, _Integer, 1}, {b, _Integer, 1}
  },
  Block[{
    s1 = 0, s2 = 0, k = Length[a],
    loc`Cm = 0, loc`d1 = Table[0, {3}], loc`d2 = Table[0, {3}]
   },
   While[s1 < 3 && s2 < 3 && k - s1 > 0 && k - s2 > 0,
    Which[
     a[[k - s1]] > b[[k - s2]], loc`d1[[s1 + 1]] = a[[k - s1]]; s1++,
     a[[k - s1]] < b[[k - s2]], loc`d2[[s2 + 1]] = b[[k - s2]]; s2++,
     a[[k - s1]] == b[[k - s2]], k--
    ]
   ];
   loc`Cm = Max[{s1, s2}];
   Which[
    loc`Cm > 2, False,
    s1 - s2 == 1, loc`d2[[s1]] = b[[1]],
    s2 - s1 == 1, loc`d1[[s2]] = a[[1]],
    s1 - s2 == 2, loc`d2[[1]] = b[[2]]; loc`d2[[2]] = b[[1]],
    s2 - s1 == 2, loc`d1[[1]] = a[[2]]; loc`d1[[2]] = a[[1]]
   ];
   d1 = loc`d1; d2 = loc`d2;
   Cm = loc`Cm
  ]
];

Now, the code compiles fully and is quite fast. Unfortunately, it does not  
appear to give correct results: compare[{1, 2, 3, 6, 7, 9}, {1, 2, 4, 7,  
8, 9}] returns Cm = 2 and sets d1 = {6, 3, 0} and d2 = {8, 4, 0}, whereas  
per your description I had understood that Cm = 4, d1 = {3, 4}, and d2 =  
{3, 5} were required. As I'm not sure whether the error lies in your  
procedure or in my understanding of your problem, I haven't attempted to  
change the logic of the above. If you need the function I gave previously  
to return values in global variables, you could use:

compareCompiled = Compile[{
   {vec1, _Integer, 1}, {vec2, _Integer, 1},
   {MAXDIFFS, _Integer, 0}
  },
  Block[{
    i1 = 1, i2 = 1,
    diffct = 0,
    db1 = Internal`Bag@Most[{0}], db2 = Internal`Bag@Most[{0}],
    d1 = Most[{0}], d2 = Most[{0}]
   },
   (* Run along the lists, recording differences as we go *)
   While[
    i1 <= Length[vec1] && i2 <= Length[vec2] && diffct < MAXDIFFS,
    Which[
     vec1[[i1]] == vec2[[i2]], i1++; i2++,
     vec1[[i1]] < vec2[[i2]], diffct++; Internal`StuffBag[db1, i1++],
     vec1[[i1]] > vec2[[i2]], diffct++; Internal`StuffBag[db2, i2++]
    ]
   ];
   (* Fix up in case we ran off the end of one of the lists *)
   While[i1 <= Length[vec1] && diffct < MAXDIFFS,
    diffct++; Internal`StuffBag[db1, i1++];
   ];
   While[i2 <= Length[vec2] && diffct < MAXDIFFS,
    diffct++; Internal`StuffBag[db2, i2++];
   ];
   (* Convert bags to vectors *)
   d1 = Internal`BagPart[db1, All];
   d2 = Internal`BagPart[db2, All];
   (* Return values *)
   compare`d1 = d1;
   compare`d2 = d2;
   diffct
  ]
];

With[{cc = compareCompiled},
  compare[vec1_, vec2_, MAXDIFFS_] :=
   Block[{tmp`Cm, compare`d1, compare`d2},
    If[(tmp`Cm = cc[vec1, vec2, MAXDIFFS + 1]) <= MAXDIFFS,
     d1 = compare`d1; d2 = compare`d2,
     d1 = d2 = {}
    ];
    Cm = tmp`Cm
   ]
];

so that compare[{1, 2, 3, 6, 7, 9}, {1, 2, 4, 7, 8, 9}, 4] returns Cm = 4  
and sets d1 = {3, 4} and d2 = {3, 5}, while compare[{1, 2, 3, 6, 7, 9},  
{1, 2, 4, 7, 8, 9}, 3] returns Cm = 4 and sets d1 = d2 = {}. One aspect of  
the problem I am not very clear about is whether you require the total  
number of differences even when it exceeds the maximum number for which  
you need the positions. The above simply stops accumulating differences  
when the specified number is exceeded, so the maximum value of Cm you will  
see is MAXDIFFS+1. Of course, this is easily changed if necessary.

I hope that these examples will be useful to you.

Best,

O. R.




  • Prev by Date: Re: Calculus and InterpolatingFunction
  • Next by Date: Re: Plot axis length and size ratio (TwoPlot revive)
  • Previous by thread: Re: A fast way to compare two vectors
  • Next by thread: Re: A fast way to compare two vectors