MathGroup Archive 2006

[Date Index] [Thread Index] [Author Index]

Search the Archive

Re: RE: Re: Re: distance function

  • To: mathgroup at smc.vnet.net
  • Subject: [mg70084] Re: [mg70014] RE: [mg68805] Re: [mg68772] Re: distance function
  • From: Carl Woll <carlw at wolfram.com>
  • Date: Mon, 2 Oct 2006 00:34:25 -0400 (EDT)
  • References: <200609300912.FAA13156@smc.vnet.net>

Coleman, Mark wrote:

> 
>Out of curiosity, I tested these two approaches on a number of data sets
>for which I make frequent use. For some reason Jens code is running
>slower! I've been testing it out some lists of reals of size
>n=500,1000,2500, and 5000. Is it possible that the time for the
>conditional compares is exceeding the computational time of redundant
>calculations? Could someone try this out?
>
>(note: I'm working on some code for identifying outliers in large data
>sets. The efficient calculation of L-1 and L-2 distance matrices are
>important.)
>
>Thanks
>
>Mark
>
>  
>

The package Statistics`ClusterAnalysis` has a very quick DistanceMatrix 
function:

data=Table[Random[],{1000},{3}];

<<Statistics`ClusterAnalysis`

Sqrt[DistanceMatrix[data]];//Timing
{0.187 Second,Null}

The default distance function for DistanceMatrix is 
SquaredEuclideanDistance, hence the use of Sqrt. It is possible to use 
the option DistanceFunction->EuclideanDistance to obviate the need for Sqrt:

DistanceMatrix[data, DistanceFunction->EuclideanDistance];

However, this option seems to be about 25% slower. This is probably 
because it is faster to take the square root of a matrix and rely on 
vector operations rather than taking the square root of each matrix element.

Carl Woll
Wolfram Research

>-----Original Message-----
>From: Murray Eisenberg [mailto:murray at math.umass.edu] 
To: mathgroup at smc.vnet.net
>Subject: [mg70084] [mg70014] [mg68805] Re: [mg68772] Re: distance function
>
>Yes, I KNOW that I'm computing the distances twice in my solution: 
>that's why I said it's an "extravagant" solution!
>
>Jens-Peer Kuska wrote:
>  
>
>>Hi Murray,
>>
>>at least you should compute the distances not twice because the matrix
>>    
>>
>
>  
>
>>is symmetric with zero diagonal ...
>>
>>d[{p_,p_}]:=0.0
>>d[{q_,p_}]/; OrderedQ[{q,p}]:=d[{q,p}]= Norm[p - q] 
>>d[{q_,p_}]:=d[{p,q}]
>>
>>Regards
>>   Jens
>>
>>
>>Murray Eisenberg wrote:
>>    
>>
>>>If you don't mind an "extravagant" solution -- one that is 
>>>conceptually simple and short but is probably inefficient due to 
>>>redundant calculations -- then this works, I believe:
>>>
>>>   d[{p_, q_}] := Norm[p - q]
>>>   allDistances[pts_] := Union[Flatten[Outer[d, pts, pts]]]
>>>
>>>
>>>
>>>dimmechan at yahoo.com wrote:
>>>      
>>>
>>>>In the book of Gaylord et al. (1996) there is one exercise which 
>>>>asks (see page 113)
>>>>
>>>>"Given a list of points in the plane, write a function that finds 
>>>>the set of all distances between the points."
>>>>
>>>>Although there is one solution, that solution makes use of the Table
>>>>        
>>>>
>
>  
>
>>>>and Length commands.
>>>>
>>>>Is it a way to define the same function using Higher-Order functions
>>>>        
>>>>
>
>  
>
>>>>like Outer, MapThread etc?
>>>>
>>>>Thanks in advance for any help.
>>>>
>>>>
>>>>        
>>>>
>>    
>>
>
>  
>


  • Prev by Date: confusion about sampled points in NIntegrate[...,Method->Oscillatory]
  • Next by Date: unexpected operator error - 5.2 on linux
  • Previous by thread: Re: RE: Re: Re: distance function
  • Next by thread: Re: distance function