Re: DeleteDuplicates is too slow?
- To: mathgroup at smc.vnet.net
- Subject: [mg107189] Re: [mg107150] DeleteDuplicates is too slow?
- From: Leonid Shifrin <lshifr at gmail.com>
- Date: Fri, 5 Feb 2010 03:21:25 -0500 (EST)
- References: <201002041126.GAA29795@smc.vnet.net>
Hi,

you can use

Tally[data][[All, 1]]

although this is perhaps not the fastest way. DeleteDuplicates seems to be slow here because, when given an explicit comparison function, it compares each element with all the others, which takes time quadratic in the size of the dataset. It is blazingly fast on packed arrays of numbers, however.
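For illustration, here is a rough sketch (the size n and the artificially duplicated rows are only for demonstration, and the timings will of course depend on your machine). Note also that GatherBy, available in version 7 and later, groups by identical first coordinates, which matches the sameQ test from your example while avoiding the pairwise comparisons:

n = 100000;
data = RandomReal[1, {n, 2}];
data = Join[data, data]; (* force exact duplicates; the array stays packed *)

(* near-linear: Tally hashes the rows, and taking the first column
   keeps one copy of each distinct row *)
Tally[data][[All, 1]]; // Timing

(* by-first-coordinate duplicates: keep the first member of each group *)
GatherBy[data, First][[All, 1]]; // Timing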
Regards,
Leonid
On Thu, Feb 4, 2010 at 2:26 PM, Zeringue, Clint M Civ USAF AFMC AFRL/RDLAF <Clint.Zeringue at kirtland.af.mil> wrote:
> Hello,
>
> Suppose you have the following.
>
> data = RandomReal[1, {n, 2}];
>
> sameQ[_, _] = False;
> sameQ[{x_, y_}, {x_, z_}] = True;
>
> Timing[DeleteDuplicates[data, sameQ]][[1]]
>
> If n is a large number, this takes an ungodly amount of time.
>
> Is there a more efficient way to delete the duplicate entries of data?
>
> I.e.,
>
> data = {{1., 2.}, {1., 3.}, {2., 3.}};
>
> Would become:
> {{1., 2.}, {2., 3.}}
>
>
> Thanks,
>
>
> Clint Zeringue
>
>
- References:
- DeleteDuplicates is too slow?
- From: "Zeringue, Clint M Civ USAF AFMC AFRL/RDLAF" <Clint.Zeringue@kirtland.af.mil>