MathGroup Archive 2002

[Date Index] [Thread Index] [Author Index]

Search the Archive

Re: Some speedups

  • To: mathgroup at smc.vnet.net
  • Subject: [mg37319] Re: [mg37292] Some speedups
  • From: Andrzej Kozlowski <andrzej at platon.c.u-tokyo.ac.jp>
  • Date: Wed, 23 Oct 2002 02:56:56 -0400 (EDT)
  • Sender: owner-wri-mathgroup at wolfram.com

Allan's posting made me realize a problem that affects all users of 
Mathematica and  Apple's Mail.app for MacOS X. If you use Mail.app and 
carry out Allan's instruction below you will not get the expected 
result. The reason is that Mail.app inserts an extra character into the 
text when pasting into a Carbon application (like the Mathematica Front 
End). This ruins Mathematica's ability to interpret the text as a 
Mathematica expression. One way around this problem is to paste the 
expression first into TextEdit and then copy it form TextEdit and paste 
into Mathematica.

I think this is a good place make a plea to Wolfram Research to 
seriously consider a Cocoa FrontEnd for MacOS X Mathematica. It is in 
fact somewhat surprising that it has not been done yet, given the 
advantages of Cocoa and the fact that it is essentially NextStep, with 
which WRI has a longer experience than most software developers.

Andrzej Kozlowski
Yokohama, Japan
http://www.mimuw.edu.pl/~akoz/
http://platon.c.u-tokyo.ac.jp/andrzej/


On Tuesday, October 22, 2002, at 05:47 PM, Allan Hayes wrote:

> The notebook code below contains ways of speeding up some uses of  
> Table,
> Sort and Split and avoiding building up inefficienlty large 
> intermediate
> expressions.
> I put them together while looking into the various solutions in the 
> thread
> "grouping and averaging {x,y} pairs of data" -- they helped give what 
> seems
> to be the fastest uncompiled solution yet.
>
> Allan
>
> ---------------------
> Allan Hayes
> Mathematica Training and Consulting
> Leicester UK
> www.haystack.demon.co.uk
> hay at haystack.demon.co.uk
> Voice: +44 (0)116 271 4198
> Fax: +44 (0)870 164 0565
>
>
> To make a notebook from the following, copy between the two lines 
> +++++++,
> past into a notebook, click "Yes" in the panel that pops up.
>
> +++++++
>
> Notebook[{
>
> Cell[CellGroupData[{
> Cell[TextData[{
>   "Speedups \n",
>   StyleBox["Table, Sort, Split",
>     FontSize->18]
> }], "Title"],
>
> Cell["Allan Hayes October 2002", "Subsubtitle"],
>
> Cell[BoxData[
>     \(Off[General::"\<spell1\>", General::"\<spell\>"]\)], "Input",
>   InitializationCell->True],
>
> Cell[CellGroupData[{
>
> Cell["General Techniques", "Section"],
>
> Cell[CellGroupData[{
>
> Cell["Table", "Subsection"],
>
> Cell[TextData[{
>   "The following illustrates a generally applicable point: that \
> inserting large expressions directly using a function or ",
>   StyleBox["With",
>     FontFamily->"Courier"],
>   " can lead to inefficiently large expressions. This can be avoided \
> by storing in a local variable."
> }], "Text"],
>
> Cell[BoxData[{
>     \(\(ranges = \ Table[{1, 200}, {10000}];\)\), "\n",
>     \(\(data\  = \ Range[100000];\)\)}], "Input"],
>
> Cell["This is quite slow.", "Text"],
>
> Cell[BoxData[
>     \(takesX[data_, \
>         ranges_] := \[IndentingNewLine]\(Take[data, #] &\) /@
>         ranges\)], "Input"],
>
> Cell[CellGroupData[{
>
> Cell[BoxData[
>     \(\(takesX[data, \ ranges];\) // Timing\)], "Input"],
>
> Cell[BoxData[
>     \({3.789999999999992`\ Second, Null}\)], "Output"]
> }, Closed]],
>
> Cell[TextData[{
>   "The use of the local variable ",
>   StyleBox["dd",
>     FontFamily->"Courier"],
>   " below gives much quicker code"
> }], "Text"],
>
> Cell[BoxData[
>     \(takes[data_, \
>         ranges_] := \[IndentingNewLine]Module[{dd =
>             data}, \(Take[dd, #] &\) /@ ranges]\)], "Input",
>   InitializationCell->True,
>   FontColor->RGBColor[0, 0, 1]],
>
> Cell[CellGroupData[{
>
> Cell[BoxData[
>     \(\(takes[data, \ ranges];\) // Timing\)], "Input"],
>
> Cell[BoxData[
>     \({1.4899999999999807`\ Second, Null}\)], "Output"]
> }, Closed]],
>
> Cell[TextData[{
>   "We might have expected that the direct insertion of ",
>   StyleBox["data",
>     FontFamily->"Courier"],
>   " into ",
>   StyleBox["Take[data,#]&",
>     FontFamily->"Courier New"],
>   " would be quicker, but it looks as if the very big expression that \
> is built up when using this technique slows things down.\nIn the \
> second form ",
>   StyleBox["dd",
>     FontFamily->"Courier"],
>   " is evaluated once as each expression\n\t\t",
>   StyleBox["Take[dd,#]&[{1,200}]",
>     FontFamily->"Courier New"],
>   "\nis evaluated."
> }], "Text"],
>
> Cell[TextData[{
>   "As an extra check, use ",
>   StyleBox["With",
>     FontFamily->"Courier"],
>   " to input directly:"
> }], "Text"],
>
> Cell[BoxData[
>     \(takesY[data_, \
>         ranges_] := \[IndentingNewLine]With[{dd =
>             data}, \(Take[dd, #] &\) /@ ranges]\)], "Input"],
>
> Cell[CellGroupData[{
>
> Cell[BoxData[
>     \(\(takesY[data, \ ranges];\) // Timing\)], "Input"],
>
> Cell[BoxData[
>     \({3.240000000000002`\ Second, Null}\)], "Output"]
> }, Closed]]
> }, Closed]],
>
> Cell[CellGroupData[{
>
> Cell["Sort", "Subsection"],
>
> Cell[BoxData[{
>     \(\(k = 2;\)\), "\[IndentingNewLine]",
>     \(\(data = \((SeedRandom[0];
>           Table[Random[
>               Integer, {1, 100}], {100000}, {k}])\);\)\)}], "Input"],
>
> Cell["Let's sort the data:", "Text"],
>
> Cell[CellGroupData[{
>
> Cell[BoxData[
>     \(\(Sort[data];\) // Timing\)], "Input"],
>
> Cell[BoxData[
>     \({1.9299999999999997`\ Second, Null}\)], "Output"]
> }, Closed]],
>
> Cell["\<\
> Sorting by the first coordinate might be expected to be quicker, but \
> the following  slows the calculation down markedly!\
> \>", "Text"],
>
> Cell[CellGroupData[{
>
> Cell[BoxData[
>     \(\(Sort[data, #1[\([1]\)] \[LessEqual] #2[\([1]\)] &];\) //
>       Timing\)], "Input"],
>
> Cell[BoxData[
>     \({150.16`\ Second, Null}\)], "Output"]
> }, Closed]],
>
> Cell[TextData[{
>   "But using ",
>   StyleBox["Ordering",
>     FontFamily->"Courier"],
>   ", as below will usually speed things up (the advantage increases \
> with the size of k)."
> }], "Text"],
>
> Cell[BoxData[
>     \(sortn[dat_,
>         n_] := \[IndentingNewLine]dat[\([Ordering[
>             dat[\([All, n]\)]]]\)]\)], "Input",
>   InitializationCell->True,
>   FontColor->RGBColor[0, 0, 1]],
>
> Cell[CellGroupData[{
>
> Cell[BoxData[
>     \(\(sortn[data, 1];\) // Timing\)], "Input"],
>
> Cell[BoxData[
>     \({1.4799999999999898`\ Second, Null}\)], "Output"]
> }, Closed]]
> }, Closed]],
>
> Cell[CellGroupData[{
>
> Cell["Split", "Subsection"],
>
> Cell[BoxData[{
>     \(\(data = \((SeedRandom[0];
>           Table[Random[
>               Integer, {1, 100}], {100000}, {2}])\);\)\), "\n",
>     \(\(sd = \ Sort[data];\)\)}], "Input"],
>
> Cell[TextData[{
>   "Split  ",
>   StyleBox["data",
>     FontFamily->"Courier"],
>   "  by the first coordinate"
> }], "Text"],
>
> Cell[CellGroupData[{
>
> Cell[BoxData[
>     \(\(Split[sd, \ #1[\([1]\)] \[Equal] #2[\([1]\)] &];\) //
>       Timing\)], "Input"],
>
> Cell[BoxData[
>     \({10.549999999999999`\ Second, Null}\)], "Output"]
> }, Closed]],
>
> Cell["The following (following Hartmut Wolf) is quicker ", "Text"],
>
> Cell[BoxData[
>     \(splitnX[dat_,
>         n_] := \[IndentingNewLine]Module[{s, p,
>           tks}, \[IndentingNewLine]s = Split[dat[\([All, n]\)]];
>         p = FoldList[\ #1 + #2 &,
>             0, \ \ Length /@ s]; \[IndentingNewLine]\(Take[
>               dat, #] &\) /@
>           Transpose[{Drop[p, \(-1\)] + 1,
>               Rest[p]}]\[IndentingNewLine]]\)], "Input"],
>
> Cell[CellGroupData[{
>
> Cell[BoxData[
>     \(\(splitnX[sd, 1];\) // Timing\)], "Input"],
>
> Cell[BoxData[
>     \({4.939999999999998`\ Second, Null}\)], "Output"]
> }, Closed]],
>
> Cell[TextData[{
>   "And using the trick we found above when looking at",
>   StyleBox[" Table ",
>     FontFamily->"Courier New"],
>   "it gets still quicker"
> }], "Text"],
>
> Cell[BoxData[
>     \(splitn[dat_,
>         n_] := \[IndentingNewLine]Module[{s, p, tks,
>           dd}, \[IndentingNewLine]ss = \(s =
>             Split[dat[\([All, n]\)]]\);
>         p = FoldList[\ #1 + #2 &,
>             0, \ \ Length /@ s]; \[IndentingNewLine]dd = \
>           dat; \[IndentingNewLine]\(Take[dd, #] &\) /@
>           Transpose[{Drop[p, \(-1\)] + 1,
>               Rest[p]}]\[IndentingNewLine]]\)], "Input",
>   InitializationCell->True,
>   FontColor->RGBColor[0, 0, 1]],
>
> Cell[CellGroupData[{
>
> Cell[BoxData[
>     \(\(splitn[sd, 1];\) // Timing\)], "Input"],
>
> Cell[BoxData[
>     \({2.09`\ Second, Null}\)], "Output"]
> }, Closed]]
> }, Closed]]
> }, Closed]],
>
> Cell[CellGroupData[{
>
> Cell["Application to Binned Averages", "Section"],
>
> Cell["\<\
> Daniel Lichtblau's elegant compiled solution is quickest. The \
> uncompiled version, 2.,  comes fairly close, but will probably fall \
> further behind with very big inputs.\
> \>", "Text"],
>
> Cell[TextData[{
>   "To make the timings more comparable, each section below starts \
> with ",
>   StyleBox["Quit",
>     FontFamily->"Courier New"],
>   ". This must be evaluated separately before the rest of the code. \
> You will be asked if you want to evaluate the initialization cells \
> when continuing with rest of the code - click \"Yes\". "
> }], "Text"],
>
> Cell[CellGroupData[{
>
> Cell["1. Hartmut Wolf", "Subsection",
>   CellDingbat->None,
>   CellMargins->{{10, Inherited}, {Inherited, Inherited}}],
>
> Cell[BoxData[
>     \(Quit\)], "Input"],
>
> Cell[BoxData[
>     \(\(data = \((SeedRandom[0];
>           Table[Random[
>               Integer, {1, 100}], {100000}, {2}])\);\)\)], "Input"],
>
> Cell[BoxData[
>     \(\(\(binnedAverage1[data_, max_] :=
>       Module[{v, i, ix, ixx, ixxx}, {i, v} =
>           With[{rr = Range[max]},
>             Transpose[
>               Sort[Join[data,
>                   Transpose[{rr,
>                       rr - rr}]]]]]; \[IndentingNewLine]ix =
>           Split[i]; \[IndentingNewLine]ixx =
>           FoldList[Plus[#1, Length[#2]] &, 0,
>             ix]; \[IndentingNewLine]ixxx =
>           Transpose[
>             Transpose[Partition[ixx, 2, 1]] + {1,
>                 0}]; \[IndentingNewLine]Transpose[{First /@
>               ix, \(\(\((Plus @@ #)\)/Max[Length[#] - 1, 1] &\)[
>                   Take[v, #]] &\) /@ ixxx}]]\)\(\n\)
>     \)\)], "Input"],
>
> Cell[CellGroupData[{
>
> Cell[BoxData[
>     \(Table[\(\(\((r1 = binnedAverage1[data, 100])\);\) // Timing\) //
>         First, {5}]\)], "Input"],
>
> Cell[BoxData[
>     \({3.790000000000001`\ Second, 4.609999999999999`\ Second,
>       4.17`\ Second, 5.`\ Second,
>       4.010000000000002`\ Second}\)], "Output"]
> }, Closed]]
> }, Closed]],
>
> Cell[CellGroupData[{
>
> Cell["2. Allan Hayes", "Subsection",
>   CellDingbat->None,
>   CellMargins->{{10, Inherited}, {Inherited, Inherited}}],
>
> Cell[TextData[{
>   "Note that ",
>   StyleBox["Tr[lst]",
>     FontFamily->"Courier"],
>   StyleBox[" is used ",
>     FontFamily->"Times New Roman"],
>   StyleBox["ins",
>     FontFamily->"Times New Roman"],
>   "tead o",
>   StyleBox["f ",
>     FontFamily->"Times New Roman"],
>   StyleBox["Plus@@lst",
>     FontFamily->"Courier"],
>   StyleBox[" ",
>     FontFamily->"Times New Roman"],
>   StyleBox["to sum",
>     FontFamily->"Times New Roman"],
>   StyleBox[" ",
>     FontFamily->"Times New Roman"],
>   StyleBox["lst",
>     FontFamily->"Courier"],
>   "."
> }], "Text"],
>
> Cell[BoxData[
>     \(Quit\)], "Input"],
>
> Cell[BoxData[
>     \(\(data = \((SeedRandom[0];
>           Table[Random[
>               Integer, {1, 100}], {100000}, {2}])\);\)\)], "Input"],
>
> Cell[BoxData[
>     \(binnedAverage2[data_, \
>         max_:  0] := \[IndentingNewLine]Module[{spl, avs, inds, dd,
>           zs}, \[IndentingNewLine]spl =
>           splitn[sortn[data, 1],
>             1]; \[IndentingNewLine]avs = \(\((Tr[#1[\([All, 2]\)]]/
>                   Length[#1])\) &\) /@
>             spl; \[IndentingNewLine]inds = \
>           spl[\([All, 1, 1]\)]; \[IndentingNewLine]dd\  = \
>           Range[Max[Last[inds], max]]; \[IndentingNewLine]zs = \
>           dd - dd; \[IndentingNewLine]zs[\([inds]\)] = \
>           avs; \[IndentingNewLine]Transpose[{dd,
>             zs}]\[IndentingNewLine]]\)], "Input"],
>
> Cell[CellGroupData[{
>
> Cell[BoxData[
>     \(Table[\(binnedAverage2[data] // Timing\) //
>         First, {5}]\)], "Input"],
>
> Cell[BoxData[
>     \({2.8100000000000005`\ Second, 3.0700000000000003`\ Second,
>       2.799999999999999`\ Second, 2.799999999999999`\ Second,
>       2.75`\ Second}\)], "Output"]
> }, Closed]]
> }, Closed]],
>
> Cell[CellGroupData[{
>
> Cell["3. Daniel Lichtblau", "Subsection",
>   CellDingbat->None,
>   CellMargins->{{10, Inherited}, {Inherited, Inherited}}],
>
> Cell[BoxData[
>     \(Quit\)], "Input"],
>
> Cell[BoxData[
>     \(\(data = \((SeedRandom[0];
>           Table[Random[
>               Integer, {1, 100}], {100000}, {2}])\);\)\)], "Input"],
>
> Cell[BoxData[
>     StyleBox[\(averageByBin3[
>           data : {{_, _} .. }]\  := \ \[IndentingNewLine]Module[\n\
> {res = Transpose[data]}, \n
>           res\  = \ averagebyBinC[res[\([1]\)], res[\([2]\)]]; \n
>           Transpose[{Range[Length[res]], res}]\n]\),
>       FormatType->StandardForm]], "Input"],
>
> Cell[BoxData[
>     StyleBox[\(\(\n\)\(\(averagebyBinC\  = \
>           Compile[{{xvals, _Integer, 1}, {yvals, _Real, 1}}, \n
>             Module[{len = Max[xvals], \ binsizes, \ averages,
>                 len2 = Length[xvals], \ indx}, \n
>               binsizes\  = \ Table[0, {len}]; \n
>               averages\  = \ Table[0. , {len}]; \n
>               Do\ [\nindx\  = \
>                   xvals[\([j]\)]; \n\(binsizes[\([indx]\)]++\); \n
>                 averages[\([indx]\)]\  += \ yvals[\([j]\)], \n{j,
>                   len2}\n]; \n
>               binsizes\  = \ Map[If\ [# == 0, 1, #] &, \ binsizes]; \n
>               averages\ /\ binsizes\n]\n];\)\(\n\)
>       \)\),
>       FormatType->StandardForm]], "Input"],
>
> Cell[CellGroupData[{
>
> Cell[BoxData[
>     \(Table[\(\(averageByBin3[data];\) // Timing\) //
>         First, {5}]\)], "Input"],
>
> Cell[BoxData[
>     \({2.3099999999999996`\ Second, 2.1400000000000006`\ Second,
>       2.3599999999999994`\ Second, 2.3100000000000005`\ Second,
>       2.3599999999999994`\ Second}\)], "Output"]
> }, Closed]]
> }, Closed]]
> }, Closed]]
> }, Open  ]]
> },
> FrontEndVersion->"4.2 for Microsoft Windows",
> ScreenRectangle->{{0, 1024}, {0, 723}},
> AutoGeneratedPackage->None,
> WindowToolbars->"RulerBar",
> WindowSize->{459, 310},
> WindowMargins->{{250, Automatic}, {Automatic, -20}},
> ShowSelection->True
> ]
>
> +++++++
>
>
>
>
>
>



  • Prev by Date: Re: Raising Contour Plot Graphics to 3D
  • Next by Date: Re: Raising Contour Plot Graphics to 3D - II
  • Previous by thread: Problems during the start of mathematica 4.0
  • Next by thread: how to get a real fortran file