Re: Efficiency of repeatedly appending to a List
- To: mathgroup at smc.vnet.net
- Subject: [mg71148] Re: [mg71091] Efficiency of repeatedly appending to a List
- From: Sseziwa Mukasa <mukasa at jeol.com>
- Date: Thu, 9 Nov 2006 03:38:31 -0500 (EST)
- References: <200611081114.GAA22363@smc.vnet.net>
On Nov 8, 2006, at 6:14 AM, Andrew Moylan wrote:

> Hi all,
>
> Recently I asked about efficient ways to insert items into a sorted
> List such that the new list was also sorted [1]. I received suggestions
> to linearly search through the list and find the right place to insert
> the new item (search time = O(N), where N is the size of the list), and
> also suggestions to use a faster binary search to locate the right
> place to insert the new item (search time = O(Log[N])).

I'm one of those who naively suggested an O(N) search, and I should
know better. Also, I had somehow never heard of the Insert command; the
two together are probably more efficient than Appending and then
sorting, since the sort should be O(N Log[N]). I had also never
bothered to check whether a binary search is in the Combinatorica
package, which it is. So here is the code for a binary search and
insertion (a short usage sketch follows at the end of this message):

<< DiscreteMath`Combinatorica`

insertSorted[lst_, elem_] :=
  Insert[lst, elem, Ceiling[BinarySearch[lst, elem]]]

> This caused me to wonder about the efficiency of appending items to
> Lists in Mathematica in general. Is it an O(N) operation (i.e.,
> something akin to a realloc in C)? Or does Mathematica employ some sort
> of anticipatory memory allocation scheme to make adding new elements to
> lists usually faster than O(N)?

One way to test this is to Append to lists of various sizes and look
for discontinuities and a general trend. Because of restrictions on the
mailing list I can't attach a plot, but running the following code
plots the timing of Append for lists of various sizes:

Block[{data, lst, x},
  data = Table[
      {len,
        lst = Table[x, {len}];
        First[Timing[Append[lst, x]] /. Second -> 1]},
      {len, 1000, 30000}];
  ListPlot[data, PlotRange -> All, PlotJoined -> True,
    AxesLabel -> {"Length", "Time (s)"}]];

There is definitely an O(N) trend to the data. Interestingly, there is
also an increasing variance in the data as the list size increases.
There also seems to be a discontinuity right around a list length of
2000 when plotted on my machine (dual 2.7 GHz G5, Mac OS X). The
discontinuity seems to indicate that, at least for short lists, extra
memory is allocated, but in general appending to a list is an O(N)
operation. Changing the type of x moves the discontinuity: setting
x = 1 moves it to around 4000 elements and results in a much lower
slope. (A sketch of the usual Sow/Reap workaround for repeated Append
also follows at the end of the message.)

I suppose if one is really interested in this kind of investigation
there is room for further research, but given the times involved (less
than 0.5 ms) I'm not sure how useful this knowledge would be.
Mathematica is not well suited to large-scale data manipulation of this
type in my experience, and one should probably choose a different tool
if the majority of operations are manipulation of large data structures
in memory.

Regards,

Ssezi
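A minimal usage sketch of insertSorted, assuming Mathematica 5.x with the
DiscreteMath`Combinatorica` package; the random test data, the Fold-based
driver, and the variable names are illustrative choices, not code from the
thread. The package load and definition from above are repeated so the
sketch runs on its own:

<< DiscreteMath`Combinatorica`

insertSorted[lst_, elem_] :=
  Insert[lst, elem, Ceiling[BinarySearch[lst, elem]]]

(* illustrative data; Random[Integer, ...] keeps this version-5 friendly *)
vals = Table[Random[Integer, {1, 100}], {500}];

(* build the list by repeated sorted insertion, seeding with one element
   so BinarySearch is never called on an empty list *)
bySortedInsertion = Fold[insertSorted, {First[vals]}, Rest[vals]];

(* the incrementally built list should agree with sorting all at once *)
bySortedInsertion === Sort[vals]

Each insertion costs an O(Log[N]) search plus an O(N) Insert, so the copy
dominates; the point of the sketch is only to check that the sorted order
is maintained.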
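Since the underlying question is about the cost of building a list one
element at a time, here is a rough sketch of the usual workaround: collect
elements with Sow and Reap (available since version 5.0) instead of calling
Append inside a loop. The list length and the side-by-side timing are
illustrative assumptions, not measurements from the plot described above:

n = 20000;

(* repeated Append copies the list on every step, so the loop is
   O(N^2) work overall *)
appendTime = First[Timing[
      Block[{acc = {}}, Do[acc = Append[acc, i], {i, n}]; acc]
    ] /. Second -> 1];

(* Sow/Reap collects the elements and materializes the list only once *)
reapTime = First[Timing[
      First[Last[Reap[Do[Sow[i], {i, n}]]]]
    ] /. Second -> 1];

{appendTime, reapTime}

If appending really is O(N) per call, as the measurements above suggest,
the Append loop does quadratic work in total while the Sow/Reap version
stays roughly linear, so the first timing should grow much faster with n
than the second.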
- Follow-Ups:
- Re: Re: Efficiency of repeatedly appending to a List
- From: "Chris Chiasson" <chris@chiasson.name>
- References:
- Efficiency of repeatedly appending to a List
- From: "Andrew Moylan" <andrew.j.moylan@gmail.com>