Re: Block vs. Module
- To: mathgroup at smc.vnet.net
- Subject: [mg15632] Re: [mg15590] Block vs. Module
- From: David Withoff <withoff>
- Date: Sat, 30 Jan 1999 04:28:32 -0500 (EST)
- Sender: owner-wri-mathgroup at wolfram.com
> Recently, while profiling some rather large Mathematica 3.0 programs, my
> colleagues and I discovered that simply exchanging all Module
> statements in a package against Block statements reduced the total
> computation time considerably. In one particular case with a large
> number of function calls and local variables the execution time was
> reduced by almost 50%.
>
> The Mathematica documentation describes Block as a structure that allows
> for temporarily assigning local values to global names, as opposed to
> Module which defines new local symbols (which obviously takes a lot
> more time). Thus, rather than Module, Block allows to make use of some
> side effects on global symbols, but this should not happen in
> well-structured programs. Due to the extreme speed differences we
> encountered we are now asking ourselves if there is actually any good
> reason against generally using Block instead of Module in Mathematica
> packages. Are there any internal implementation issues that need to be
> taken in account (such as heap management or stack size limitations)?
>
> Best regards,
>
> Eckhard Hennig

Block works by setting aside the values of local variables so that those variables can be used freely during evaluation of the body of the Block, and then restoring the original values of the local variables when the Block finishes. The overhead of variable localization in Block is therefore proportional to the number of variables.

Module works by generating new symbols and inserting those symbols wherever they appear in the body of the Module. The overhead of variable localization in Module is therefore proportional to the size of the body of the Module as well as to the number of variables.
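
The two mechanisms described above can be seen directly in a short session. The following is only a minimal sketch: the helper showX and the result names (blockResult, afterBlock, moduleResult, freshSymbol, blockTime, moduleTime) are hypothetical names chosen for illustration, and the timing loop is a rough probe whose numbers will vary with version and machine.

```mathematica
(* Assumption: a fresh session in which x has no other definitions. *)
x = 1;
showX[] := x   (* hypothetical helper that reads the global symbol x *)

(* Block: the global x itself gets a temporary value and is restored afterwards. *)
blockResult = Block[{x = 2}, showX[]];    (* showX sees the temporary value 2 *)
afterBlock = x;                           (* the original value 1 is back *)

(* Module: x in the body is replaced by a newly generated symbol,
   so code outside the body still sees the global x. *)
moduleResult = Module[{x = 2}, showX[]];  (* showX still sees the global value 1 *)
freshSymbol = Module[{x}, x];             (* a new symbol with a name of the form x$nnn *)

(* Rough timing probe for a small, fast body; no particular ratio is implied. *)
blockTime = First[Timing[Do[Block[{a = 1, b = 2}, a + b], {10^5}]]];
moduleTime = First[Timing[Do[Module[{a = 1, b = 2}, a + b], {10^5}]]];
```

Comparing blockTime with moduleTime for bodies of different sizes is one way to check how much of the total time in a given program is localization overhead rather than evaluation of the body.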

If the body of the Block or Module is large, and the body evaluates quickly so that the overhead of variable localization is a significant component of the total time, then Block can be significantly faster than Module, at least in interpreted functions (uncompiled evaluation). The analysis is different in compiled code, where Module would in principle be faster, but that is a separate topic.

Assuming that program correctness is more important than speed (that is, getting the wrong answer quickly is not an advantage), then the choice between Module and Block depends on the program. Block and Module offer different types of variable localization, and you would presumably choose the one that is correct for your application. If it doesn't matter, then it makes sense to choose Block, at least in uncompiled code, since Block will usually be faster.

As you pointed out in your message, global variables can lead to problems in programs based on Block. For example, a definition such as

    In[1]:= f[p_] := p x

that multiplies the argument by a global variable x, might give unexpected behavior if it happens to be called from within a Block that uses x as a local variable:

    In[2]:= x = 2.7; f[5]

    Out[2]= 13.5

    In[3]:= Block[{x = 99.5}, f[5]]

    Out[3]= 497.5

That conflict won't happen if Module is used:

    In[4]:= Module[{x = 99.5}, f[5]]

    Out[4]= 13.5

The other type of difficulty with global variables and Block occurs for functions that accept symbolic values for function arguments.
Here, for example, is a function that behaves differently depending on the name of the argument:

    In[5]:= MultiplyByTwo[p_] := Block[{n = 2}, n p]

    In[6]:= MultiplyByTwo[z]

    Out[6]= 2 z

    In[7]:= MultiplyByTwo[n]

    Out[7]= 4

Again, that conflict doesn't come up if Block is replaced by Module:

    In[8]:= MultiplyByTwo[p_] := Module[{n = 2}, n p]

    In[9]:= MultiplyByTwo[n]

    Out[9]= 2 n

Probably most programmers would argue that the first example is not, as you put it, a well-structured program, so you probably don't need to worry about that. The second example is of deeper concern, but if your packages don't use symbolic values for function arguments, or if your packages use contexts (such as through a package structure involving BeginPackage and Begin) so that the Block variables are in a private context, then this type of conflict is unlikely to come up except internal to the package, which you can watch out for, or if a malicious user decides to re-use the private package context.

There are not any exotic internal issues (such as "heap management or stack size limitations") that merit consideration here.

Both Block and Module are useful in different situations. It is certainly not uncommon to prefer Block over Module purely for the speed reason that you mentioned, but these functions are not terribly complicated, and the choice in most cases should be based primarily on an understanding of the type of variable localization that is appropriate for the application.

Dave Withoff
Wolfram Research