Re: Efficiency of repeatedly appending to a List
- To: mathgroup at smc.vnet.net
- Subject: [mg71148] Re: [mg71091] Efficiency of repeatedly appending to a List
- From: Sseziwa Mukasa <mukasa at jeol.com>
- Date: Thu, 9 Nov 2006 03:38:31 -0500 (EST)
- References: <200611081114.GAA22363@smc.vnet.net>
On Nov 8, 2006, at 6:14 AM, Andrew Moylan wrote:
> Hi all,
>
> Recently I asked about efficient ways to insert items into a sorted
> List such that the new list was also sorted [1]. I received
> suggestions to linearly search through the list and find the right
> place to insert the new item (search time = O(N), where N is the
> size of the list), and also suggestions to use a faster binary
> search to locate the right place to insert the new item (search
> time = O(Log[N])).
I'm one of those who naively suggested an O(N) search, and I should
know better. Also, I had somehow never heard of the Insert command;
the two together are probably more efficient than Appending and then
sorting, since the sort alone is O(N Log[N]). I had also never
bothered to check whether a binary search is available in the
Combinatorica package, which it is. So here's the code for a binary
search and insertion:
<< DiscreteMath`Combinatorica`

(* insert elem into the already-sorted lst, keeping it sorted *)
insertSorted[lst_, elem_] :=
  Insert[lst, elem, Ceiling[BinarySearch[lst, elem]]]
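For example, a quick sanity check (if I have the Combinatorica
BinarySearch semantics right; the particular values are just
illustrative):

insertSorted[{1, 3, 5, 7}, 4]
(* {1, 3, 4, 5, 7} *)

(* build up a sorted list by repeated insertion *)
Fold[insertSorted, {0}, Table[Random[], {10}]]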
> This caused me to wonder about the efficiency of appending items to
> Lists in Mathematica in general. Is it an O(N) operation (i.e.,
> something akin to a realloc in C)? Or does Mathematica employ some
> sort of anticipatory memory allocation scheme to make adding new
> elements to lists usually faster than O(N)?
One way to test this is to Append to lists of various sizes and look
for discontinuities and a general trend. Because of restrictions on
the mailing list I can't attach a plot, but running the following
code plots the timing of Append for lists of various sizes:
Block[{data, lst, x},
  data = Table[
    {len, lst = Table[x, {len}];
      First[Timing[Append[lst, x]] /. Second -> 1]},
    {len, 1000, 30000}];
  ListPlot[data, PlotRange -> All, PlotJoined -> True,
    AxesLabel -> {"Length", "Time (s)"}]];
There is definitely an O(N) trend to the data; interestingly, there
is also an increasing variance in the timings as the list size
increases. There also seems to be a discontinuity right around a list
length of 2000 when plotted on my machine (dual 2.7 GHz G5, Mac OS
X). The discontinuity seems to indicate that, at least for short
lists, extra memory is allocated, but in general appending to a list
is an O(N) operation. Changing the type of x moves the discontinuity:
setting x=1 moves it to around 4000 elements and results in a much
lower slope. I suppose if one is really interested in this kind of
investigation there is room for further research, but given the times
involved (less than 0.5 ms) I'm not sure how useful this knowledge
would be. In my experience Mathematica is not well suited to large
scale data manipulation of this type, and one should probably choose
a different tool if the majority of operations are manipulations of
large data structures in memory.
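As an aside, and only a rough sketch: if each Append is O(N), then
building up a list of N elements by repeatedly appending is O(N^2)
overall, so for bulk construction Table (or Reap and Sow) should
scale much better. For example (the element count below is
arbitrary):

n = 20000;
(* repeated appending: each AppendTo copies the list *)
Timing[lst = {}; Do[AppendTo[lst, i], {i, n}]; Length[lst]]
(* building the whole list in one pass instead *)
Timing[Length[Table[i, {i, n}]]]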
Regards,
Ssezi
- Follow-Ups:
- Re: Re: Efficiency of repeatedly appending to a List
- From: "Chris Chiasson" <chris@chiasson.name>
- References:
- Efficiency of repeatedly appending to a List
- From: "Andrew Moylan" <andrew.j.moylan@gmail.com>