       Re: how to be as efficient as Mean[list] - efficient summing of function applied to list

• To: mathgroup at smc.vnet.net
• Subject: [mg28547] Re: how to be as efficient as Mean[list] - efficient summing of function applied to list
• From: Tom Burton <tburton at cts.com>
• Date: Wed, 25 Apr 2001 19:21:55 -0400 (EDT)
• References: <9c5o3v\$iv1@smc.vnet.net>
• Sender: owner-wri-mathgroup at wolfram.com

```Hello Matt,

Paste the Notebook expression at the end of this note into Mathematica to see some of my ideas about this.

Regards,

Tom

On 25 Apr 2001 01:42:23 -0400, in comp.soft-sys.math.mathematica you wrote:

>I have a one dimensional list of 28,900 data points.  I want to apply a
>function, called f[x_], to each element of this list and sum the resulting
>list to obtain one number.  For the following example, f[x_] := -Log[x].  My
>naive idea was to do something like this:
>
>Length[data]
>28900
>
>Apply[Plus,Map[f[#1]&,data]]//Timing
>{32.136 Second,-69627.8}
>
>However, this seemed to take to long when compared to other built-in
>Mathematic functions that iterate over a list and perform a sum.  For
>example, the Mean function on the same data takes two orders of magnitude
>less time than my function:
>
>Mean[data]//Timing
>{0.13 Second,-0.0296526}
>
>Is the "Apply[Plus[...]]" method the right method to use for what I want to
>do?  Or, is it just that the time it takes my function f[x_] to perform one
>calculation, when amortized over the whole 28,900 data points, just adds up
>to a lot of time (32 seconds in my example).
>
>Regards, Matt

Notebook[{

Cell[CellGroupData[{
Cell[BoxData[
\(\$Version\)], "Input"],

Cell[BoxData[
\(TraditionalForm\`"4.1 for Microsoft Windows (November 2, 2000)"\
\)], "Output"]
}, Open  ]],

Cell[CellGroupData[{

Cell[BoxData[
\(\$OperatingSystem\)], "Input"],

Cell[BoxData[
}, Open  ]],

Cell["\<\
Consider a list of 300,000 real elements. (I have increased the \
number 10-fold from your example to obtain more reliable timing \
numbers.)\
\>", "Text"],

Cell[BoxData[
\(\(data = Table[Random[], {300000}];\)\)], "Input"],

Cell["\<\
The time to compute the sum of these elements is (on my PC 400 MHz)\
\>", "Text"],

Cell[CellGroupData[{

Cell[BoxData[
\(Timing[Plus @@ data]\)], "Input"],

Cell[BoxData[
150213.20944801893`}\)], "Output"]
}, Open  ]],

Cell[TextData[{
"This is far less than the 32 seconds you found, so I would have to \
guess that most of the time spent in the operation ",
StyleBox["f",
FontSlant->"Italic"],
". But, you are correct, the Mean function is quite a bit faster:"
}], "Text"],

Cell[BoxData[
\(<< Statistics`DescriptiveStatistics`\)], "Input"],

Cell[CellGroupData[{

Cell[BoxData[
\(Timing[Mean[data]]\)], "Input"],

Cell[BoxData[
0.5007106981600735`}\)], "Output"]
}, Open  ]],

Cell["\<\
The mean function, being a package, is written in high-level code, so \
I am a bit surprised at its speed. Let's look inside the Mean \
function to see how it does it.\
\>", "Text"],

Cell[CellGroupData[{

Cell[BoxData[
\(Attributes[Mean]\)], "Input"],

Cell[BoxData[
}, Open  ]],

Cell[BoxData[{
\(\(Unprotect[Mean];\)\), "\n",

Cell[CellGroupData[{

Cell[BoxData[
\(?? Mean\)], "Input"],

Cell[BoxData[
RowBox[{"\<\"Mean[list] gives the mean of the entries in \
list.\"\>", " ",
ButtonBox[
StyleBox["More\[Ellipsis]",
"SR"],
ButtonData:>"Statistics`DescriptiveStatistics`",
Active->True,
CellTags->"Info3197183060-5340671"],

Cell[BoxData[
InterpretationBox[GridBox[{
{GridBox[{
{\(Mean[
Statistics`DescriptiveStatistics`Private`list_] \
:= Tr[Statistics`DescriptiveStatistics`Private`list]\/Length[\
Statistics`DescriptiveStatistics`Private`list] /;
VectorQ[
Statistics`DescriptiveStatistics`Private`\
list] && Length[Statistics`DescriptiveStatistics`Private`list] > 0\)}
},
GridBaseline->{Baseline, {1, 1}},
ColumnWidths->0.999,
ColumnAlignments->{Left}]}
},
GridBaseline->{Baseline, {1, 1}},
ColumnAlignments->{Left}],
Definition[ "Mean"],
Editable->False]], "Print",
CellTags->"Info3197183060-5340671"]
}, Open  ]],

Cell["\<\
The Mean function uses the built-in function Tr (trace) to compute \
the sum.\
\>", "Text"],

Cell[CellGroupData[{

Cell[BoxData[
\(\(?Tr\)\)], "Input"],

Cell[BoxData[
RowBox[{
"\"\<Tr[list] finds the trace of the matrix or tensor list. \
Tr[list, f] finds a generalized trace, combining terms with f instead \
of Plus. Tr[list, f, n] goes down to level n in list.\>\"", " ",
ButtonBox[
StyleBox["More\[Ellipsis]",
"SR"],
ButtonData:>"Tr",
Active->True,
GeneratedCell->False,
CellAutoOverwrite->False,
CellTags->"Info3197182622-5672632"]
}, Open  ]],

Cell["Of course, it's the sum you want:", "Text"],

Cell[CellGroupData[{

Cell[BoxData[
\(Timing[Tr[data]]\)], "Input"],

Cell[BoxData[
150213.20944802207`}\)], "Output"]
}, Open  ]],

Cell[TextData[{
"So why is Plus@@list so slow? One possible reason is that ",
StyleBox["Mathematica",
FontSlant->"Italic"],
" is trying to evaluate all elements of the list before summing. We \
can try to prevent that:"
}], "Text"],

Cell[CellGroupData[{

Cell[BoxData[
\(Timing[\(data2 = HoldComplete @@ data;\)]\)], "Input"],

Cell[BoxData[
Null}\)], "Output"]
}, Open  ]],

Cell[CellGroupData[{

Cell[BoxData[
\(Timing[Plus @@ data2]\)], "Input"],

Cell[BoxData[
150213.20944801893`}\)], "Output"]
}, Open  ]],

Cell["\<\
HoldAllComplete cuts down the time a bit, but Plus@@ still much \
slower. Let's examine what happens with Trace:\
\>", "Text"],

Cell[CellGroupData[{

Cell[BoxData[
\(Trace[Plus @@ HoldAllComplete[1, 2, 3]]\)], "Input"],

Cell[BoxData[
FormBox[
RowBox[{"{",
RowBox[{
TagBox[\(Plus @@ HoldAllComplete[1, 2, 3]\),
HoldForm], ",",
TagBox[\(1 + 2 + 3\),
HoldForm], ",",
TagBox["6",
}, Open  ]],

Cell["\<\
Perhaps the time is taken forming the text expression \"1+2+3\". I \
suppose that the built-in function Tr dispenses with this step.\
\>", "Text"]
},
FrontEndVersion->"4.1 for Microsoft Windows",
ScreenRectangle->{{0, 1280}, {0, 979}},
WindowSize->{829, 740},
WindowMargins->{{0, Automatic}, {Automatic, 0}}
]

Tom

```

• Prev by Date: Getting stylized text with a palette button
• Next by Date: Q: How can I force ListPlot to show all data in graphics?
• Previous by thread: Re: how to be as efficient as Mean[list] - efficient summing of function applied to list
• Next by thread: Re: Re: how to be as efficient as Mean[list] - efficient summing of function applied to list