Re: Re: Compound Expressions
- To: mathgroup at smc.vnet.net
- Subject: [mg29225] Re: [mg29216] Re: Compound Expressions
- From: David Withoff <withoff at wolfram.com>
- Date: Tue, 5 Jun 2001 04:21:54 -0400 (EDT)
- Sender: owner-wri-mathgroup at wolfram.com
> Further notes on compound expressions like
>
> In[147]:= a = 3; Remove["Global`*"]; a = 3; a
>
> In[149]:= a = 3; Remove["Global`*"]; Print[a]; a = 3; Print[a]
>
> and how they behave, or should behave, including a few comments from:
> David Withoff <withoff at wolfram.com>.
>
> It's certainly helpful -- may even be essential -- for users to have a
> mental image of how a software program is going to process the inputs
> that the user feeds to it.
>
> My mental image is obviously colored (some might say "irreparably
> damaged") by experience with languages like FORTRAN, or even worse BASIC.
>
> In any case, the most common mental image for any language is certainly
> that individual expressions, or statements, will normally be executed
> *sequentially*, in the order in which they are encountered.

FORTRAN and BASIC work just like Mathematica in their handling of the basic issues illustrated by these examples. They read the input to figure out what they are being asked to do, and then they do it. This universal truth applies to all computer languages, and in fact to all communication.

> The simplest mental image for interpreting a cell containing
>
> CompoundExpression[expr1; expr2; expr3]
>
> will then be that the kernel will respond to this by doing the same
> things it would do if it encountered expr1, expr2, expr3 in three
> successive but independent cells.
>
> In other words, CompoundExpression[] is interpreted primarily as a way
> of condensing or formatting expr1, expr2, expr3 into one cell -- not as
> something that's going to change the *meaning*, or the *functioning* of
> expr1, expr2, or expr3.

That "mental image" is wrong in any computer programming language. Programming constructs like compound expressions have meaning independent of formatting.

As a simple example from Mathematica, consider

  CompoundExpression[Print[1],Goto[x],Print[2],Label[x],Print[3]]

in which the meaning of Goto[x] and Label[x] is obviously changed by the fact that they occur in a compound expression. More generally, the purpose of these constructs is not to change the meaning of the arguments, but to control the way those arguments are executed in the program. For example, Do[x=1,{1000}] doesn't change the meaning of x=1, but it certainly changes the way that this statement is executed in the program.

> David has written in an earlier posting:
>
> > . . . the computer obviously has to encounter and recognize "Remove"
> > before it can know that it is being asked to remove something.
> > Similarly, "a" is not and cannot be removed immediately when it is
> > encountered, since at least a few such encounters have to take place
> > before the computer can know what it is being asked to do.
>
> I guess I just don't understand what the last sentence here is saying.

That statement is simply the observation that it is necessary to read an instruction before it is possible to do what the instruction is asking.

> He then adds:
>
> > The subsequent "a=3" does not recreate "a". The "a" was already
> > created when it was encountered when the computer looked at the input
> > to figure out what it was being asked to do.
>
> But doesn't the Remove[] expression remove "a" (if it exists), and of
> course all other symbols in the Global context, as soon as it's
> encountered and executed? This certainly seems to be confirmed by the
> result of the first "Print[a]" statement in Line[149] above.
>
> The immediately following "a=3" statement then recreates "a" and gives
> it a value, as is confirmed by the output of the second "Print[a]"
> statement.
>
> (Note the discussion of Remove[] on p. 1181 of The Mathematica Book:
>
> "Once you have removed a symbol, you will never be able to refer to it
> again, unless you recreate it.")

The output of Print[a] in that example is Removed["a"], indicating that the "a" in Print[a] is referring to the removed symbol, not creating a new one. The subsequent assignment also operates on the removed symbol, rather than creating a new symbol.

The comment in the book points out that you can't refer to a symbol after it has been removed. In these examples, however, the input involving Remove already includes references to the symbol. Evaluation of Remove[a] does not and cannot go back and undo references that were set up before the symbol was removed.

> A second question is, does CompoundExpression[] *have* to function in
> this more complex way? -- or is this the consequence of certain design
> decisions that could have been made otherwise? Are there situations
> where simply making e1;e2;e3 in a single cell function differently from
> e1, e2, e3 in three successive cells would lead to inherently
> unacceptable consequences? (all this assuming that e1, e2, e3 are
> expressions that would be syntactically acceptable written as three
> separate cells)

Perhaps you can answer this question for yourself by considering how FORTRAN (or any other computer language) would handle this issue. If a FORTRAN compiler encountered the equivalent of

  a = 3; Remove["Global`*"]; a = 3; a

it would first go through and parse the input, which involves identifying symbols and setting up a table of references between symbols and memory locations (a symbol table). With this information the compiler produces an executable program. In that executable program, every "a" in the program file corresponds to a reference to a particular memory location.
Anything that the executable does will have to come to terms with the fact that, in the output of a normal FORTRAN compiler, the first "a" in the program refers to the same memory location as the last "a" in the program.

That is also how Mathematica works, and essentially every other programming language. The first "a" in the program refers to the same memory location as the last "a" in the program.

So a partial answer to your question is that, if a programming language could be designed to work some other way, then it could not work the way FORTRAN is normally used, by first generating a symbol table and a compiled executable. The FORTRAN compiler would have to translate a=3 into a bit of code to do different things depending on what had happened earlier in the program, either putting 3 in the memory location already associated with "a", or dynamically allocating a new memory location and putting 3 in that location, and making an addition to a dynamic symbol table so that subsequent references to "a" will know where to find it.

This change will make this one example work as you presumably had expected. Similar changes will be needed for every other aspect of the language. You might consider, for example, what you would expect, and what changes would be needed to get what you want, from statements such as

  e = 2 x; r = x -> 3; Remove[x]; e /. r

or

  y=True;Label[x];Remove[x];If[y=!y,Goto[x]]

or even from statements like

  Do[If[x == 3,Remove[x], Print[x]], {x, 5}]

that don't involve compound expressions.

With the way that Mathematica and all other programming languages handle these types of inputs, everything is very simple. The x in each of these inputs refers to the same x throughout evaluation of that input. Any alternative handling would be more complicated.
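
The symbol-table argument above can be sketched with a small toy model. It is written in Python rather than Mathematica, and it is only an illustration under stated assumptions: the three-word statement syntax ("set", "remove", "print") and the names Symbol, parse and run are invented for this sketch, and nothing here describes how the Mathematica kernel is actually implemented. The one idea it models is that the whole input is read first, binding every "a" to the same entry, and only then executed.

```python
class Symbol:
    """One symbol-table entry: a name, an optional value, a 'removed' flag."""

    def __init__(self, name):
        self.name = name
        self.value = None
        self.removed = False

    def __repr__(self):
        return f'Removed["{self.name}"]' if self.removed else self.name


def parse(source, table):
    """Read the whole input first, binding every name to a Symbol object."""
    program = []
    for statement in source.split(";"):
        words = statement.split()
        if not words:
            continue
        op, name = words[0], words[1]
        if name not in table:
            table[name] = Symbol(name)  # created while reading, not while running
        arg = int(words[2]) if op == "set" else None
        program.append((op, table[name], arg))
    return program


def run(program, table):
    """Execute the parsed statements in order and collect printed output."""
    printed = []
    for op, sym, arg in program:
        if op == "set":
            sym.value = arg
        elif op == "remove":
            sym.value = None
            sym.removed = True
            table.pop(sym.name, None)  # later inputs can no longer find the name
        elif op == "print":
            printed.append(repr(sym) if sym.value is None else str(sym.value))
    return printed


table = {}
program = parse("set a 3; remove a; print a; set a 3; print a", table)
print(run(program, table))                  # ['Removed["a"]', '3']
print(run(parse("print a", table), table))  # ['a']
```

In this model the second "set a 3" stores its value in the entry that was already removed, because that reference was fixed when the input was read. Only a separately parsed later input gets a fresh "a".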

> Finally, note that someone else has pointed out that
>
> In[149]:= a = 3; Remove["Global`*"]; Print[a]; a = 3; Print[a]; a
>
> In[150]:= a
>
> and
>
> In[149]:= a = 3;
> Remove["Global`*"];
> Print[a];
> a = 3;
> Print[a];
> a
>
> In[150]:= a
>
> function quite differently, which suggests that more than meets the eye
> may be going on here.
>
> It also motivates one to ask, where is the ordinary user sternly warned
> that this kind of unexpected behavior might occur? And, does it make
> any difference whether the user actively inserts the line breaks, or
> they are inserted by the line-wrapping action of the Mathematica Front
> End?

This is an unrelated issue. When an input is split across several lines the computer has to guess whether those lines should be considered as separate inputs or as lines in a single multi-line input. It is not possible to determine this unambiguously from the input alone, and there is indeed some non-trivial processing involved in handling ambiguous multi-line inputs like this, regardless of whether or not those inputs involve compound expressions.

This does not, however, have anything to do with the programming language issues raised by the initial example. This is only an interface issue having to do with handling of ambiguous multi-line inputs.

Dave Withoff
Wolfram Research