Re: DeleteDuplicates is too slow?
- To: mathgroup at smc.vnet.net
- Subject: [mg107212] Re: [mg107150] DeleteDuplicates is too slow?
- From: Tomas Garza <tgarza10 at msn.com>
- Date: Fri, 5 Feb 2010 03:25:36 -0500 (EST)
- References: <201002041126.GAA29795@smc.vnet.net>
Use Tally or, even better, GatherBy, to obtain very substantial reductions in time:

In[1]:= data=RandomInteger[{1,99},{100000,2}];

In[2]:= sameQ[_,_]=False;
sameQ[{x_,y_},{x_,z_}]=True;

In[4]:= Timing[t0=DeleteDuplicates[data,sameQ];]
Out[4]= {7.987,Null}

In[5]:= Timing[t1=#[[1]]&/@Tally[data,#1[[1]]==#2[[1]]&];][[1]]
Out[5]= 0.063

In[6]:= Timing[t2=#[[1]]&/@GatherBy[data,First];][[1]]
Out[6]= 0.016

In[7]:= t0===t1===t2
Out[7]= True

Tomas

> Date: Thu, 4 Feb 2010 06:26:02 -0500
> From: Clint.Zeringue at kirtland.af.mil
> Subject: [mg107150] DeleteDuplicates is too slow?
> To: mathgroup at smc.vnet.net
>
> Hello,
>
> Suppose you have the following.
>
> Data = RandomReal[1,{N,2}];
>
> sameQ[_,_]=False;
> sameQ[{x_,y_},{x_,z_}]=True;
>
> Timing[DeleteDuplicates[data,sameQ]][[1]];
>
> If N is a large number this takes an ungodly amount of time?
>
> Is there a more efficient way to delete the duplicate entries of Data ?
>
> ie.
>
> Data = {{1.,2.},{1.,3.},{2.,3.}};
>
> Would become:
> {{1.,2.},{ 2.,3.}};
>
>
> Thanks,
>
>
> Clint Zeringue
>
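A note on why the GatherBy version is so much faster, with a small sketch. DeleteDuplicates with a user-supplied test most likely has to compare elements pairwise, so its cost grows roughly quadratically with the length of the list, whereas GatherBy computes one key per element (here the first entry of each row) and groups on that key. The sketch below is not part of the original reply. It assumes Mathematica 10.0 or later for DeleteDuplicatesBy, which did not exist when this thread was written, and the names viaGatherBy and viaDeleteDuplicatesBy are purely illustrative.

```wolfram
(* Sketch only: key-based removal of rows that share a first element. *)
(* Same test data as in the reply above. *)
data = RandomInteger[{1, 99}, {100000, 2}];

(* GatherBy groups rows by their first element; taking First of each *)
(* group keeps the earliest row seen for each key. *)
viaGatherBy = First /@ GatherBy[data, First];

(* Assumption: DeleteDuplicatesBy needs Mathematica 10.0 or later. *)
(* It keeps the first row for each distinct value of First[row]. *)
viaDeleteDuplicatesBy = DeleteDuplicatesBy[data, First];

(* Compare the two results. *)
viaGatherBy === viaDeleteDuplicatesBy
```

Both forms compare keys for exact equality, so with machine reals such as the RandomReal data in the original question, two rows are treated as duplicates only when their first entries are identical, not merely close.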
- References:
- DeleteDuplicates is too slow?
- From: "Zeringue, Clint M Civ USAF AFMC AFRL/RDLAF" <Clint.Zeringue@kirtland.af.mil>