Re: Efficient search for bounding list elements
- To: mathgroup at smc.vnet.net
- Subject: [mg114202] Re: Efficient search for bounding list elements
- From: Yves Klett <yves.klett at googlemail.com>
- Date: Sat, 27 Nov 2010 03:35:16 -0500 (EST)
- References: <ico20m$ndb$1@smc.vnet.net>
Just a footnote: I tried the new CompilationTarget->"C" option for
Compile and was pleasantly surprised about a speedup factor of about 2
(not counting the compilation time for the function, which took a few
seconds on my machine)
Regards,
Yves
Am 26.11.2010 11:26, schrieb Daniel Lichtblau:
> ----- Original Message -----
>> From: "David Skulsky" <edskulsky at gmail.com>
>> To: mathgroup at smc.vnet.net
>> Sent: Thursday, November 25, 2010 4:58:49 AM
>> Subject: Efficient search for bounding list elements
>> I've been looking for a way to efficiently find the indices of the two
>> elements of a monotonically increasing list which bound a number. For
>> example, if
>>
>> a = Range[1000,5000,250]
>>
>> and
>>
>> x=1600
>>
>> then I'd like this function (e.g., searchFunction[x,a]) to return
>> {3,4}, which correspond to the 3rd and 4th elements of a, which are
>> 1500 and 1750, respectively.
>>
>> I can easily do this in a loop, but in my application a can be very
>> large (hundreds of thousands or millions of elements) and this
>> operation needs to repeated thousands of times, so efficiency is
>> critical.
>>
>> Any suggestions would be greatly appreciated!
>>
>> Thanks,
>>
>> David Skulsky
>
> If you are dealing with numbers of machine size (integers or reals), then a compiled binary search is fast. Even an uncompiled one seems reasonable.
>
> binarySearchC =
> Compile[{{ll, _Integer, 1}, {val, _Integer}},
> Module[{len = Length[ll], j = 1, lo = 1, hi, mid},
> If[len < 20,
> While[j < len && ll[[j]] < val, j++];
> {j, j + 1}
> ,
> hi = len;
> While[hi - lo > 1, mid = Round[(hi + lo)/2.];
> If[ll[[mid]] < val, lo = mid, hi = mid];];
> {lo, hi}]]];
>
> Test on a list of a million entries spaced 10 apart. We'll run 100,000 examples and time the lot.
>
> mylist = 10*Range[10^6];
> rand = RandomInteger[10^7, 10^5];
>
> In[36]:= Timing[indices = Map[binarySearchC[mylist, #] &, rand];]
> Out[36]= {1.201, Null}
>
> A version that avoids Compile took around 20 seconds on the same machine.
>
> If you intend to locate many inputs at once in the same ordered list, it is slightly faster to use a function that takes a list as a second argument.
>
> Daniel Lichtblau
> Wolfram Research
>
>
>