Re: Re: Overlapping binning of differences of
- To: mathgroup at smc.vnet.net
- Subject: [mg92754] Re: [mg92717] Re: [mg92607] Overlapping binning of differences of
- From: danl at wolfram.com
- Date: Sun, 12 Oct 2008 04:33:27 -0400 (EDT)
- References: <200810081024.GAA00469@smc.vnet.net>
> Art wrote: >> Given two sorted vectors a and b of different lengths, what is the >> best way to count the number of elements in the set of all differences >> between elements of a and b that fall in overlapping bins of [-bsize - >> i, bsize - i) for i in Range[-n, n], where bsize >= 1. >> >> Below are 2 implementations I've tried which are two slow and memory >> intensive. I haven't quite been able to do it using BinCounts, >> Partition, and ListCorrelate. >> >> Was wondering if there is a faster way. >> >> (* Generate random a, b *) >> T = 500; bsize = 10; n = 20; >> r := Rest@FoldList[Plus, 0, RandomReal[ExponentialDistribution[0.01], >> {T}]] >> a = r; b = r; >> >> bindiff1[a_, b_, bsize_, n_] := >> With[{d = Flatten@Outer[Subtract, a, b]}, >> Table[Count[d, _?(-bsize <= # - i < bsize &)], {i, -n, n}]] >> >> bindiff2[a_, b_, bsize_, n_] := >> Module[{os, i, j, s, tmp, >> d = Sort@Flatten@Outer[Subtract, a, b], >> c = ConstantArray[0, {2 n + 1}]}, >> For[os = 0; j = 1; i = -n, i <= n, i++; j++, >> s = Flatten@Position[Drop[d, os], _?(# >= -bsize + i &), 1, 1]; >> If[s == {}, Break[], >> os += s[[1]] - 1; >> tmp = Flatten@Position[Drop[d, os], _?(# > bsize + i &), 1, 1]; >> c[[j]] = If[tmp == {}, Length[d] - os, First@tmp - 1]]]; >> Return[c]] >> >> First@Timing@bindiff[a,b, bsize, n] is about 36 seconds. >> >> First@Timing@bindiff2[a, b, bsize, n] is about 3 seconds but still too >> slow and d uses up too much memory. >> >> Thanks! > > Something similar to your bindiff2 can be sent through Compile. > > bindiff3=Compile[{{l1,_Real,1}, {l2,_Real,1}, {bsize,_Integer}, > {n,_Integer}}, > Module[{os, diffs, bins, j, k=-n, m}, > diffs = Sort[Flatten[Outer[Plus, l1, -l2]]]; > bins = ConstantArray[0, 2*n + 1]; > For[j=1, j<=Length[diffs], j++, > If [diffs[[j]]<-bsize+k,Continue[]]; > While[diffs[[j]]>bsize+k && k<=n, k++]; > m=k; > If[k>n, Return[bins]]; > While[-bsize+m<diffs[[j]]<bsize+m &&m<=n, bins[[m+n+1]]++; m++]; > ]; > bins > ]]; > > With this I get around an order of magnitude improvement in speed on > your example. > > Daniel Lichtblau > Wolfram Research Having read other responses I now realize that sorting the differences can be the main bottleneck, at least when there are not too many bins to consider. Here is code that avoids sorting. If each list is around 2000 elements then it is roughly twice as fast as the code above. bindiff4 = Compile[{{l1, _Real, 1}, {l2, _Real, 1}, {bsize, _Integer}, {n, _Integer}}, Module[{diffs, bins, j, m, dj, ld, mx = bsize + n}, bins = ConstantArray[0, 2*n + 1]; diffs = Flatten[Outer[Plus, l1, -l2]]; ld = Length[diffs]; For[j = 1, j <= ld, j++, dj = diffs[[j]]; If[Abs[dj] > mx, Continue[]]; m = -n; While[dj > bsize + m, m++]; While[-bsize + m < dj && m <= n, bins[[m + n + 1]]++; m++]; ]; bins]]; If there were many bins, it might be advantageous to use a binary search to determine the range of bins into which each difference falls. (Another response did something similar, but first sorted the differences and used binary search to find the range of differences falling into each bin.) Daniel Lichtblau Wolfram Research
- References:
- Overlapping binning of differences of two lists
- From: Art <grenander@gmail.com>
- Overlapping binning of differences of two lists