MathGroup Archive 2010

Re: DeleteDuplicates is too slow?

  • To: mathgroup at smc.vnet.net
  • Subject: [mg107189] Re: [mg107150] DeleteDuplicates is too slow?
  • From: Leonid Shifrin <lshifr at gmail.com>
  • Date: Fri, 5 Feb 2010 03:21:25 -0500 (EST)
  • References: <201002041126.GAA29795@smc.vnet.net>

Hi,

you can use

Tally[data][[All, 1]]
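
For example, on a toy list that contains exact duplicates (this uses Tally's
default element-wise test):

Tally[{a, b, a, c, b}][[All, 1]]

(* Tally returns {{a, 2}, {b, 2}, {c, 1}}; taking the first column keeps one
copy of each distinct element *)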

This is perhaps not the fastest method, though. DeleteDuplicates seems slow
here because it uses the comparison function to compare each element with all
the others, which takes time quadratic in the size of the dataset. With the
default test on packed arrays of numbers, however, it is blazingly fast.
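
For concreteness, a minimal timing sketch (illustrative only; the exact
numbers depend on machine and version):

sameQ[_, _] = False;
sameQ[{x_, y_}, {x_, z_}] = True;

n = 5000;
data = RandomReal[1, {n, 2}];

(* custom test: elements are compared pairwise, roughly quadratic in n -- slow *)
Timing[DeleteDuplicates[data, sameQ];]

(* default test on the packed array returned by RandomReal: very fast *)
Timing[DeleteDuplicates[data];]

(* the Tally form with the default test is similarly fast *)
Timing[Tally[data][[All, 1]];]

Note that Tally also accepts a test as a second argument,
Tally[data, sameQ][[All, 1]], but a custom test presumably gives up the fast
path, just as it does for DeleteDuplicates.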

Regards,
Leonid



On Thu, Feb 4, 2010 at 2:26 PM, Zeringue, Clint M Civ USAF AFMC AFRL/RDLAF <Clint.Zeringue at kirtland.af.mil> wrote:

> Hello,
>
> Suppose you have the following.
>
> data = RandomReal[1, {n, 2}];
>
> sameQ[_, _] = False;
> sameQ[{x_, y_}, {x_, z_}] = True;
>
> Timing[DeleteDuplicates[data, sameQ]][[1]]
>
> If n is a large number, this takes an ungodly amount of time.
>
> Is there a more efficient way to delete the duplicate entries of data?
>
> i.e.,
>
> data = {{1., 2.}, {1., 3.}, {2., 3.}};
>
> would become:
> {{1., 2.}, {2., 3.}}
>
>
> Thanks,
>
>
> Clint Zeringue
>
>


