Re: Strange Compile Results
- To: mathgroup at smc.vnet.net
- Subject: [mg6857] Re: [mg6829] Strange Compile Results
- From: David Withoff <withoff>
- Date: Thu, 24 Apr 1997 02:44:35 -0400 (EDT)
- Sender: owner-wri-mathgroup at wolfram.com
> I am trying to compile a function that operates on a long data list (all
> real values) to find the next maximum, given a certain starting point (it is a
> function that is called many times by another function). To speed it up, I
> have been attempting to compile it, and have had some strange results. Here
> is the function:
>
> Findnextmax = Compile[{{list, _Real, 1}, {startpoint, _Integer}},
>   Module[{i=0, total=0, start=0, end=0},
>     {total = Length[list]-1;
>     If[startpoint >= total, Return[startpoint]];
>     i = startpoint;
>     While[(i < (total) && (Part[list, i] >= Part[list, i+1])), i += 1];
>     While[(i < (total) && (Part[list,i] < Part[list, i+1])), i += 1];
>     If[Part[list,i]==Part[list,i+1],
>       start = i;
>       While[(i < total) && (Part[list,i] == Part[list,i+1]), i+=1];
>       end = i;
>       i = Round[(start + end)/2]]; i],
>   {{Round[_], _Integer}}]
>
> (The function basically does this: go downhill if possible, then uphill
> until the next maximum is reached. If it is flat on top, return the midpoint
> of the flat section.)
>
> Now, given a long linear array of real values, I compare timings on 100
> calls of this function versus 100 calls of the same function uncompiled
> (using Timing[Do[Findnextmax[data, 14], {100}]]). I get:
>
> Compiled:   1.09s
> Uncompiled: 0.37s
>
> This was surprising. So I played with commenting out various parts of the
> function to see which was responsible for the slowdown. Surprisingly, a big
> factor was the very last assignment statement (which originally used Floor
> instead of Round). I played with it. The only thing that worked to speed it
> up was changing the '=' to ':=', which almost doubles the speed of the
> compiled function!!! The new timing for the compiled function: 0.68s.
> Stranger yet, if I change a different assignment, say "total :=
> Length[list]-1", from '=' to ':=', I get a big slowdown again! So it appears
> to be inconsistent. There seems to be something going on with assignments
> and compilation that I don't fully understand here.
>
> The other thing to be noticed is that even with the speedup provided by the
> "correct" use of '=' and ':=', the compiled function is still almost 2x
> slower in 100 calls. To see whether this was due to parameter passing
> overhead, I moved the 100 loops inside the function. Now if I compare the
> compiled to the uncompiled function, I get:
>
> Compiled:   0.01s
> Uncompiled: 0.15s
>
> So, the remaining speed difference was due to something in function
> call/parameter passing. The only guess I had is there is some overhead
> involved in converting this list into the appropriate form for use by
> compiled functions.
>
> To test this, I compiled the external loop which calls the function, i.e.
>
> test = Compile[{{mydata, _Real, 1}, {loops, _Integer}},
>   Do[Findnextmax[mydata,14], {loops}], {{Findnextmax[_], _Integer}}]
>
> Results using this compiled function to call compiled vs. uncompiled
> versions of Findnextmax using 100 loops (e.g. Timing[a[data, 100];]):
>
> Compiled:   0.73s
> Uncompiled: 0.38s
>
> So compiling the calling function to first get the list into appropriate
> form before iterating the call has virtually no speedup effect.
>
> After all of this I can't think of anything else to make it run faster,
> except putting it inline and compiling the whole calling function. The
> problem is, the calling code is already large and messy, so that really
> isn't a viable option.
>
> Can anyone explain the following two strange effects I have observed?
> 1) 2x overhead calling a compiled function versus uncompiled
> 2) location dependence of compiled function speed on use of '=' and ':=' in
>    assignment
>
> I just recently subscribed to the group and haven't received confirmation
> yet, so please cc: me at giddings at whitewater.chem.wisc.edu.
>
> Thanks
> ---
> Michael Giddings
> giddings at whitewater.chem.wisc.edu
> giddings at barbarian.com
> (608) 258-1699 or (608) 692-2851
> http://smithlab.chem.wisc.edu/PersonalPages/giddings/giddings.html
> http://www.barbarian.com

Explanations for unexpected timing differences between compiled and
uncompiled evaluation fall into one of two categories:

1) The CompiledFunction interpreter is using uncompiled evaluation;
2) The compiled and uncompiled functions aren't really equivalent.

If the CompiledFunction interpreter is using uncompiled evaluation, then it
would have been faster to call uncompiled evaluation in the first place, and
to avoid the time-consuming detour through the CompiledFunction interpreter.
That much is obvious. The hard part is to find out that that is what is
happening, and then to figure out why it is happening, and what to do
about it.

The easiest way that I know of to find out that a CompiledFunction
expression contains calls to uncompiled evaluation is to look at the list of
pseudocode instructions in the CompiledFunction expression. Pseudocode
instructions 31 and 32 are calls to uncompiled evaluation. If you are like
me, and have a hard time remembering these sorts of details, just remember
to look at the internal structure of the CompiledFunction expression. The
calls to uncompiled evaluation are usually pretty conspicuous.

Here is what I found when I tried your example. The list of pseudocode
instructions is the third element in the CompiledFunction expression.

In[3]:= Findnextmax[[3]]

Out[3]= {{1, 2}, {6, 1, 1, 3, 0}, {3, 2, 0}, {14, 0, 1}, {14, 0, 2},
>    {14, 0, 3}, {14, 0, 4}, {107, 0, 5}, {14, -1, 6}, {33, 5, 6, 5},
>    {19, 5, 2}, {99, 2, 0, 0}, {90, 0, 3}, {8, 0}, {91, 1}, {19, 0, 1},
>    {97, 1, 2, 1}, {114, 0, 1, 0}, {14, 1, 5}, {33, 1, 5, 5},
>    {114, 0, 5, 1}, {100, 1, 0, 2}, {101, 1, 2, 3}, {90, 3, 5}, {14, 1, 6},
>    {33, 1, 6, 6}, {19, 6, 1}, {91, -11}, {97, 1, 2, 4}, {114, 0, 1, 2},
>    {14, 1, 6}, {33, 1, 6, 6}, {114, 0, 6, 3}, {98, 2, 3, 5},
>    {101, 4, 5, 6}, {90, 6, 5}, {14, 1, 7}, {33, 1, 7, 7}, {19, 7, 1},
>    {91, -11}, {114, 0, 1, 4}, {14, 1, 7}, {33, 1, 7, 7}, {114, 0, 7, 5},
>    {94, 4, 5, 0, 7}, {31, Function[{list, startpoint},
>      If[list[[i]] == list[[i + 1]],
>       start = i; While[i < total && list[[i]] == list[[i + 1]], i += 1];
>       end = i; i = Round[(start + end)/2]]], {end, 2, 0, 4, Module},
>     {start, 2, 0, 3, Module}, {total, 2, 0, 2, Module},
>     {i, 2, 0, 1, Module}, 6, 0, 17}, {8, 1}}

All that Function[{list, startpoint}, etc., etc., etc. stuff in the middle
of an otherwise innocuous collection of integers is the call to uncompiled
evaluation.
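As an aside, if you would rather not scan the instruction list by eye, a
quick programmatic check along the following lines should work. This is just
a sketch added here for illustration, and the name callsUncompiledQ is made
up; it simply tests whether the pseudocode (the third element of the
CompiledFunction expression) contains an embedded Function expression, which
is how the calls to uncompiled evaluation show up in listings like the one
above.

(* Hypothetical helper, not part of the original discussion: True if the
   pseudocode of a CompiledFunction contains an embedded Function
   expression, i.e. a call back to uncompiled evaluation. *)
callsUncompiledQ[cf_CompiledFunction] := !FreeQ[cf[[3]], _Function]

callsUncompiledQ[Findnextmax]
(* should give True for the definition above, and False once the calls to
   uncompiled evaluation have been eliminated *)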
The next step is to figure out why the CompiledFunction expression contains
a call to uncompiled evaluation, and to figure out what to do about it. In
general, there are dozens of possible explanations. The explanation that
applies in this particular example is that this is basically a deficiency of
the Compile function. Since the Compile function has no way of knowing the
outcome of list[[i]] == list[[i + 1]], it has no way of knowing whether the
return value of the enclosing If expression will be an integer or Null.

The deficiency is that the Compile function isn't smart enough to figure out
that it doesn't need to know the return value of the If expression. Your
program doesn't use that return value. In any case, when the Compile
function doesn't know what to do with an expression, it calls uncompiled
evaluation. That is what happens in your example.

If you use a delayed assignment in i := Round[(start + end)/2], then the
return value of the If expression will always be Null, since the return
value of the delayed assignment is Null. This will allow the Compile
function to compile the If expression, but it still won't know what to do
with the delayed assignment. You can get an even better effect by putting a
semicolon after the immediate assignment, which also causes the return value
of the If expression to be Null.

Findnextmax = Compile[{{list, _Real, 1}, {startpoint, _Integer}},
  Module[{i=0, total=0, start=0, end=0},
    total = Length[list]-1;
    If[startpoint >= total, Return[startpoint]];
    i = startpoint;
    While[(i < (total) && (Part[list, i] >= Part[list, i+1])), i += 1];
    While[(i < (total) && (Part[list,i] < Part[list, i+1])), i += 1];
    If[Part[list,i]==Part[list,i+1],
      start = i;
      While[(i < total) && (Part[list,i] == Part[list,i+1]), i+=1];
      end = i;
      i = Round[(start + end)/2];]; i],
  {{Round[_], _Integer}}]

After this change, I get

In[13]:= Findnextmax[[3]]

Out[13]= {{1, 2}, {6, 1, 1, 3, 0}, {3, 2, 0}, {14, 0, 1}, {14, 0, 2},
>    {14, 0, 3}, {14, 0, 4}, {107, 0, 5}, {14, -1, 6}, {33, 5, 6, 5},
>    {19, 5, 2}, {99, 2, 0, 0}, {90, 0, 3}, {8, 0}, {91, 1}, {19, 0, 1},
>    {97, 1, 2, 1}, {114, 0, 1, 0}, {14, 1, 5}, {33, 1, 5, 5},
>    {114, 0, 5, 1}, {100, 1, 0, 2}, {101, 1, 2, 3}, {90, 3, 5}, {14, 1, 6},
>    {33, 1, 6, 6}, {19, 6, 1}, {91, -11}, {97, 1, 2, 4}, {114, 0, 1, 2},
>    {14, 1, 6}, {33, 1, 6, 6}, {114, 0, 6, 3}, {98, 2, 3, 5},
>    {101, 4, 5, 6}, {90, 6, 5}, {14, 1, 7}, {33, 1, 7, 7}, {19, 7, 1},
>    {91, -11}, {114, 0, 1, 4}, {14, 1, 7}, {33, 1, 7, 7}, {114, 0, 7, 5},
>    {94, 4, 5, 0, 7}, {90, 7, 26}, {19, 1, 3}, {97, 1, 2, 8},
>    {114, 0, 1, 6}, {14, 1, 8}, {33, 1, 8, 8}, {114, 0, 8, 7},
>    {94, 6, 7, 0, 9}, {101, 8, 9, 10}, {90, 10, 5}, {14, 1, 9},
>    {33, 1, 9, 9}, {19, 9, 1}, {91, -11}, {19, 1, 4}, {15, 0.5, 8},
>    {33, 3, 4, 9}, {14, 2, 10}, {24, 10, 9}, {45, 9, 10}, {24, 9, 11},
>    {38, 11, 10, 9}, {34, 8, 9, 8}, {83, 8, 9}, {19, 9, 1}, {91, 1}, {8, 1}}

This is what you want to see: a nice, clean collection of pseudocode
instructions, with no calls to uncompiled evaluation.
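The same issue can be reproduced in miniature, apart from Findnextmax. The
following is just an added illustration (the names g1 and g2 are arbitrary,
and the exact behavior may depend on the version of Compile): a one-branch
If can return either an integer or Null, which is the kind of ambiguity that
pushes Compile into uncompiled evaluation, while terminating the branch with
a semicolon makes the If always return Null, so the whole body can be
compiled. Comparing the third element of each CompiledFunction, as above,
should show the difference.

(* One-branch If: returns x (an integer) when x > 0, otherwise Null *)
g1 = Compile[{{x, _Integer}}, Module[{i = 0}, If[x > 0, i = x]; i]]

(* Trailing semicolon: the If now returns Null in both cases *)
g2 = Compile[{{x, _Integer}}, Module[{i = 0}, If[x > 0, i = x;]; i]]

g1[[3]]   (* likely to contain an embedded Function[...] call *)
g2[[3]]   (* should be a plain list of numeric instructions *)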
Returning again to more general issues, the overhead that you observed
associated with sending arguments to a CompiledFunction expression is
something that I would put in my second category of explanations for
unexpected speed differences between compiled and uncompiled evaluation:
the compiled and uncompiled functions aren't really equivalent.

The CompiledFunction interpreter does indeed convert the input expression
into an internal form (an array of numbers). This will speed up subsequent
operations, but the conversion takes some time. Here is an example of a
program that takes an array as an argument, but that doesn't do anything
with the array, and instead just returns zero.

In[14]:= Clear[f, fc]

In[15]:= f[m_] = 0

Out[15]= 0

In[16]:= fc = Compile[{{m, _Real, 1}}, 0]

Out[16]= CompiledFunction[{m}, 0, -CompiledCode-]

In[17]:= data = Table[1., {1000000}];

In[18]:= Timing[f[data]]

Out[18]= {0. Second, 0}

In[19]:= Timing[fc[data]]

Out[19]= {4.09344 Second, 0}

As you can see, compiled evaluation takes a lot longer in this example than
uncompiled evaluation, for the simple reason that it takes a certain amount
of time to check that the argument is an array of real numbers, and to
allocate space for 1000000 real numbers. This isn't really a fair
comparison, since these programs aren't really equivalent. The definition
f[m_] = 0 doesn't even check that the argument is a list of real numbers. A
fairer comparison is with a definition that includes a pattern which checks
that the argument is a list of real numbers. With this change, the time for
compiled evaluation and the time for uncompiled evaluation are comparable,
since both operations spend most of their time checking that the argument is
a list of real numbers.

In[20]:= fp[m:{___Real}] = 0

Out[20]= 0

In[21]:= Timing[fp[data]]

Out[21]= {4.57812 Second, 0}

Finally, regarding your experiment with

test = Compile[{{mydata, _Real, 1}, {loops, _Integer}},
  Do[Findnextmax[mydata,14], {loops}], {{Findnextmax[_], _Integer}}]

this may not be doing quite what you wanted it to do. For the outer Compile,
the function Findnextmax is just another unknown function, which will be
handled by calling uncompiled evaluation. Your test function will convert
mydata into an internal array of real numbers, but it will either convert it
back into an expression or use the original expression (I can't remember
which) for the purpose of calling Findnextmax in an uncompiled evaluation.
The evaluator will subsequently realize that Findnextmax is a
CompiledFunction expression, and will call the CompiledFunction interpreter,
but there is no reduction in the time spent doing conversions.

These and other notes about the Compile function will soon be included in
the technical support section of the Wolfram Research web site
(http://www.wolfram.com/support), and will probably eventually be available
as a technical report, or in other documentation.

Dave Withoff
Wolfram Research