Re: Finding the closest number from a list

*To*: mathgroup at smc.vnet.net*Subject*: [mg39525] Re: [mg39510] Finding the closest number from a list*From*: Sseziwa Mukasa <mukasa at jeol.com>*Date*: Fri, 21 Feb 2003 04:08:10 -0500 (EST)*Sender*: owner-wri-mathgroup at wolfram.com

On Thursday, February 20, 2003, at 05:14 AM, Jacob Rome wrote: > Hi, > > I have a seemingly simple problem. I want to find the element in a > list that is closest to a number I specify. In my case, I have a list > of about 30,000 numbers (y[i]). I want to find the closest match for > each of about 10,000 other numbers (call them x[i]). Obviously, speed > is important. I've sorted the large list, and right now I'm going > through each y[i] from lowest to highest and testing it to see if x[i] > is less than that value. This takes about .1 seconds for each x[i]. > Are you doing a binary search? I tested searching for items from a list using a binary search routine as follows: binarysearch[item_, lst_] := Block[{left = 1, middle = Quotient[Length[lst], \ 2], right = Length[lst]}, If[ middle == 0, Return[1]]; While[True, If[lst[[middle]] == item, Return[middle], If[right - left ¡Â 1, If[lst[[right]] - item < item - lst[[left]], Return[right], Return[ left]], If[ item > lst[[middle]], left = middle, right = middle]]]; middle = Quotient[left + right, 2]]] and using map to search for 10000 values (ie binarysearch[#,lst]&/@items) and it takes about 6 seconds on a 1GHz G4 with a list of 30000 items. > I'm wondering if anyone has had a similar problem, and if there is a > better function built-in to Mathematica. Alternatetively, I could > build my own. I've just recently realized that I could also reduce the > search time considerably if I sort the x[i] list as well, and only > start my search from where I last left off. Any ideas on which > approach would be more efficient? Thanks. > > Since a binary search scales as log n, dropping items from the list to be searched may actually be slower due to overhead in manipulating the list. To get a factor of 2 improvement in the search you need to drop the list size to sqrt(n). This code takes about 11 seconds on my machine: Block[{mylist = lst, pos, dropped = 0}, Table[pos = binarysearch[items[[i]], mylist]; mylist = Drop[mylist, pos - 1]; dropped += pos - 1, {i, Length[items]}]] I tried a version using Mathematica's Select function which is far too slow to consider for your problem and a version in which I precomputed intervals in order to short circuit the search. That approach is negligibly faster than the straightforward search and 3 times more memory intensive. Regards Ssezi