Re: Overlapping binning of differences of two lists

• To: mathgroup at smc.vnet.net
• Subject: [mg92729] Re: Overlapping binning of differences of two lists
• From: m.r at inbox.ru
• Date: Sat, 11 Oct 2008 06:47:04 -0400 (EDT)
• References: <gci2hv\$t8\$1@smc.vnet.net> <gckn7m\$nom\$1@smc.vnet.net>

```
Szabolcs Horvat wrote:
> Art wrote:
> > Given two sorted vectors a and b of different lengths, what is the
> > best way to count the number of elements in the set of all differences
> > between elements of a and b that fall in overlapping bins of [-bsize -
> > i, bsize - i) for i in Range[-n, n], where bsize >= 1.
> >
> > Below are 2 implementations I've tried which are two slow and memory
> > intensive. I haven't quite been able to do it using BinCounts,
> > Partition, and ListCorrelate.
> >
> > Was wondering if there is a faster way.
> >
> > (* Generate random a, b  *)
> > T = 500; bsize = 10; n = 20;
> > r := Rest@FoldList[Plus, 0, RandomReal[ExponentialDistribution[0.01],
> > {T}]]
> > a = r; b = r;
> >
> > bindiff1[a_, b_, bsize_, n_] :=
> >   With[{d = Flatten@Outer[Subtract, a, b]},
> >   	Table[Count[d, _?(-bsize <= # - i < bsize &)], {i, -n, n}]]
> >
> > bindiff2[a_, b_, bsize_, n_] :=
> >  Module[{os, i, j, s, tmp,
> >    d = Sort@Flatten@Outer[Subtract, a, b],
> >    c = ConstantArray[0, {2 n + 1}]},
> >   For[os = 0; j = 1; i = -n,  i <= n,  i++; j++,
> >    s = Flatten@Position[Drop[d, os], _?(# >= -bsize + i &), 1, 1];
> >    If[s == {}, Break[],
> >     os += s[[1]] - 1;
> >     tmp = Flatten@Position[Drop[d, os], _?(# > bsize + i &), 1, 1];
> >     c[[j]] = If[tmp == {}, Length[d] - os, First@tmp - 1]]];
> >   Return[c]]
> >
> > First@Timing@bindiff[a,b, bsize, n] is about 36 seconds.
> >
> > First@Timing@bindiff2[a, b, bsize, n] is about 3 seconds but still too
> > slow and d uses up too much memory.
> >
>
> The first thing that came to my mind was BinCounts and Partition.  What
> was the trouble you ran into when using them?
>
> bindiff3[a_, b_, bsize_, n_] :=
>   With[{diffs = Flatten@Outer[Subtract, a, b]},
>    Total /@
>     Partition[BinCounts[diffs, {-bsize - n, bsize + n, 1}], 2 bsize, 1]
>    ]
>
> In[7]:=
> Timing[r1 = bindiff1[a, b, bsize, n];]
> Timing[r2 = bindiff2[a, b, bsize, n];]
> Timing[r3 = bindiff3[a, b, bsize, n];]
>
> Out[7]= {43.7454, Null}
>
> Out[8]= {2.32665, Null}
>
> Out[9]= {0.479927, Null}
>
> In[10]:= r1 === r2 === r3
> Out[10]= True
>
> (I haven't measured memory use, but I am guessing that the critical
> operation is Outer[], so the memory requirements of the three functions
> are likely to be similar.)

This can be sped up some more:

bindiff4[a_, b_, bsize_, n_] :=
ListConvolve[ConstantArray[1, 2 bsize],
BinCounts[Subtract @@ Transpose@Tuples@{a, b},
{-bsize - n, bsize + n}]]

In[6]:= Timing[r3 = bindiff3[a, b, bsize, n];]

Out[6]= {0.312, Null}

In[7]:= Timing[r4 = bindiff4[a, b, bsize, n];]

Out[7]= {0.11, Null}

The advantages are that it doesn't unpack and that Subtract is called
only once rather than repeatedly for each pair. Using
Total[Partition[..., 2 bsize, 1], {2}] gives a comparable timing.

Maxim Rytin
m.r at inbox.ru

```

• Prev by Date: Re: Math Formulas
• Next by Date: Re: Unevaluated subtleties
• Previous by thread: Re: Overlapping binning of differences of two lists
• Next by thread: Re: Get Graphics Coordinates accuracy