MathGroup Archive 2003

[Date Index] [Thread Index] [Author Index]

Search the Archive

Re: Re: Finding the closest number from a list

  • To: mathgroup at smc.vnet.net
  • Subject: [mg39559] Re: [mg39517] Re: Finding the closest number from a list
  • From: Dr Bob <drbob at bigfoot.com>
  • Date: Sun, 23 Feb 2003 05:00:07 -0500 (EST)
  • References: <b32ab1$b3o$1@smc.vnet.net> <200302210907.EAA16923@smc.vnet.net> <oprkzf4rdbamtwdy@smtp.cox-internet.com>
  • Reply-to: drbob at bigfoot.com
  • Sender: owner-wri-mathgroup at wolfram.com

I had a typo in my solution (list in place of sorted in the final return 
expression, but list had been sorted outside the routine).  In case that 
affected timings, here it is again, with an unsorted 'list':

closest[list_?VectorQ, things_?VectorQ] := Module[{sorted = Sort[list], 
lint, first, last, n = Length[list]},
    first = Max[Min[things], First[sorted]] - 1;
    last = Min[Max[things], Last[sorted]] + 1;
    lint = Interpolation[Transpose@{
      Join[{first}, sorted, {last}], Join[{1}, Range[
            n], {n}]}, InterpolationOrder -> 1];
    sorted[[Round@lint@things]]
    ]

n = 30000;
list = Table[Random[], {n}];
things = Table[Random[], {10000}];

Timing[ClosestMatches[list,     things]; ]
{0.8119999999999998*Second,   Null}
Timing[closest[list,     things]; ]
{0.266*Second, Null}

Take[things, 5]
Take[ClosestMatches[list, things], 5]
Take[closest[list, things], 5]

{0.5807002127146147,   0.7231743740277918,   0.3065812161844857, 
0.31065747252742193,   0.4325752310216415}
{0.5806328433277591,   0.723217016039161,   0.3065416642883134, 
0.31064073272181786,   0.4326087043388147}
{0.5807654279226884,   0.7231714516295773,   0.30659880784207205, 
0.3106581666143228,   0.43257232669984563}

Bobby

On Sat, 22 Feb 2003 00:25:29 -0600, Dr Bob <drbob at bigfoot.com> wrote:

> Robert's solution:
>
> GetClosest[lower_, upper_, item_, slist_] := Module[{dl, du, pos}, pos = 
> If[upper â?¤ lower,
> upper, {dl, du} = Abs[item - slist[[{lower, upper}]]];
> If[du > dl, upper, lower]];
> slist[[pos]]]
> ClosestMatches[
> list_?VectorQ, things_?VectorQ] :=
> Module[{sorted, lint, first, last, n = Length[list], lower, upper}, 
> sorted = Sort[list];
> first = Max[Min[things], First[sorted]] - 1;
> last = Min[Max[things], Last[sorted]] + 1;
> lint = ListInterpolation[N[Join[{1}, Range[n], {n}]], {Join[{
> first}, sorted, {last}]}, InterpolationOrder -> 1];
> estimates = lint[things];
> lower = Floor[estimates];
> upper = Ceiling[estimates];
> MapThread[GetClosest[##, sorted] &, {lower, upper, things}]]
>
> My solution (inspired by Robert's):
>
> closest[list_?VectorQ, things_?VectorQ] := Module[{sorted = Sort[list], 
> lint, first, last, n = Length[list]},
> first = Max[Min[things], First[sorted]] - 1;
> last = Min[Max[things], Last[sorted]] + 1;
> lint = Interpolation[Transpose@{
> Join[{first}, sorted, {last}], Join[{1}, Range[
> n], {n}]}, InterpolationOrder -> 1];
> list[[Round@lint@things]]
> ]
>
> Test data:
>
> n = 30000;
> list = Table[Random[], {n}];
> things = Table[Random[], {10000}];
>
> Timings:
>
> Timing[ClosestMatches[list, things]; ]
> {0.9679999999999893*Second,   Null}
> Timing[closest[list, things];]
> {0.5319999999999965*Second,   Null}
>
> The answers aren't the same:
>
> Take[things, 5]
> Take[ClosestMatches[list, things], 5]
> Take[closest[list, things], 5]
>
> {0.6384367070141534, 0.6250093104126428, 0.12184236611421798, 
> 0.09511197717954557, 0.679769352921642}
> {0.6383160624966754, 0.624879790611537, 0.12186365194457346, 
> 0.09509389371548846, 0.679694223307205}
> {0.6384510214302745, 0.6250442930656402, 0.12184191844340701, 
> 0.09512514224893226, 0.6797705952963301}
>
> Rounding the Interpolation output gets the closer result, and faster.
>
> Bobby
>
> On Fri, 21 Feb 2003 04:07:41 -0500 (EST), Robert Knapp 
> <rknapp at wolfram.com> wrote:
>
>> Jacob Rome wrote:
>>> Hi,
>>>
>>> I have a seemingly simple problem. I want to find the element in a
>>> list that is closest to a number I specify. In my case, I have a list
>>> of about 30,000 numbers (y[i]). I want to find the closest match for
>>> each of about 10,000 other numbers (call them x[i]). Obviously, speed
>>> is important. I've sorted the large list, and right now I'm going
>>> through each y[i] from lowest to highest and testing it to see if x[i]
>>> is less than that value. This takes about .1 seconds for each x[i].
>>>
>>> I'm wondering if anyone has had a similar problem, and if there is a
>>> better function built-in to Mathematica. Alternatetively, I could
>>> build my own. I've just recently realized that I could also reduce the
>>> search time considerably if I sort the x[i] list as well, and only
>>> start my search from where I last left off.  Any ideas on which
>>> approach would be more efficient? Thanks.
>>>
>>
>> Here is an approach suggested by Stephen Wolfram which is very 
>> efficient:
>> The idea is to use a first order interpolation to find an estimate to 
>> the closest index, then use actual comparison to determine which element 
>> is actually closest.  The reason this approach is fast is because 
>> InterpolatingFunctions in Mathematica use a fast method to determine 
>> where an input lies in a data list.
>>
>> This function finds which element is actually closest given an upper and 
>> lower bound on the index into the sorted list:
>>
>> In[1]:=
>> GetClosest[lower_, upper_,  item_, slist_] := Module[{dl, du, pos},
>> pos = If[upper <= lower,
>> upper,
>> {dl, du} = Abs[item - slist[[{lower, upper}]]];
>> If[du > dl,
>> upper,
>> lower]
>> ];
>> slist[[pos]]
>> ]
>>
>> This function returns the list of closest matches in the List list given 
>> a List things of items you want to find the closest matches to.  I have 
>> not included it in the argument checking, but this will only work if 
>> both lists are real or integer -- for complex numbers a more complicated 
>> scheme would have to be concocted.
>>
>> In[2]:=
>> ClosestMatches[list_?VectorQ, things_?VectorQ] :=
>> Module[{sorted, lint, first, last, n = Length[list], lower, upper},
>> sorted = Sort[list];
>> first = Max[Min[things], First[sorted]] - 1;
>> last = Min[Max[things], Last[sorted]] + 1;
>> lint =
>> ListInterpolation[
>> N[Join[{1},Range[n],{n}]],{Join[{first},sorted,{last}]},
>> InterpolationOrder\[Rule]1];
>> estimates = lint[things];
>> lower = Floor[estimates];
>> upper = Ceiling[estimates];
>> MapThread[GetClosest[##, sorted]&, {lower, upper, things}]
>> ]
>>
>> This sets up a random list and a random set of things.
>>
>> In[3]:=
>> n = 30000;
>> list = Table[Random[],{n}];
>> things = Table[Random[],{10000}];
>>
>> This finds the matches
>>
>> In[6]:=
>> Timing[ClosestMatches[list, things];]
>>
>> Out[6]=
>> {2.51 Second,Null}
>>
>>
>>
>> The timing was done on a 750MhZ PentiumIII system.
>>
>>
>>
>>
>
>
>



-- 
majort at cox-internet.com
Bobby R. Treat



  • Prev by Date: RE: How to use vector analysis without using suffix notation
  • Next by Date: how do i insert a diagram in mathematica?
  • Previous by thread: Re: Re: Finding the closest number from a list
  • Next by thread: RE: Finding the closest number from a list