[Date Index] [Thread Index] [Author Index]
prograMing: trees and indexes, 2/2

To: mathgroup@smc.vnet.net
Subject: [mg11376] prograMing: trees and indexes, 2/2
From: "Xah" <xah@best.com>
Date: Sat, 7 Mar 1998 02:06:45 -0500
Organization: Venus & Xah Love Factory
...continued from previous message.

----- ----- -----

Lisp vs. Mathematica expression:

Lisp expression system and Mathematica expression system are isomorphic,
but expression systems are not isomorphic to trees. There is a
one-to-one mapping between the set of all expressions and the set of
all trees, but there is NOT a one-to-one mapping between strings in an
expression and nodes of its corresponding tree. For example, a tree is
a set but an expression is not a set. However, there is a one-to-one
mapping between the atoms in an expression to leaves of its
corresponding tree. For example, consider the Mathematica expression
1[][2,3[4],5[6[]]]. The atoms correpsonds to the leaves index set
{{1},{0,0},{2,0},{2,1},{3,0},{3,1,0}}. This can be seen using the
following construct

 expr=1[][2,3[4],5[6[]]];
 (#->Part[expr,Sequence@@#])&/@Position[expr,_,{-1}]//Sort

returns

{{1} -> 2, {0, 0} -> 1, {2, 0} -> 3, {2, 1} -> 4, 
  {3, 0} -> 5, {3, 1, 0} -> 6}

In summary, an expressions (lisp or Mathematica) corresponds to tree
shapes, and the atoms of an expresion corresponds to the leaves of its
corresponding tree.

In lisp form, atoms' indexes are visually apparant. For example, in ((1)
2 (3 4) (5 (6))), 1 must be {0,0}, 2 is {1}, 3 and 4 are {2,0} and
{2,1}, 5 is {3,1}, and 6 is {3,1,0}. In the Mathematica counterpart
1[][2,3[4],5[6[]]], it is less obvious what is the index of 1. In
general, if the first child of any node has children (i.e. if we have
non-atomic heads), then it's difficult to "see" its structure. Here is
a comparison table:

 lisp    Mathematica
-------------------------
 0       0          atom

 (0)     0[]        appending (sibling birth)
 (0 0)   0[0]
 (0 0 0) 0[0 0 0]
 
 ((0))   0[][]      nesting (new generation birth)
 (((0))) 0[][][]

In the lisp form, any pair of matching parenthesis (...) encloses a
subexpression. (subexpression is a continuous partial string that is
itself an expression). In the Mathematica counterpart, the form of
subexpression is atom[...][...]...[...]. For example, f[][] is a
subexpression corresponding to lisp's ((f)).

The structure of lisp expression is more visually apparent. However,
Mathematica expression form has its advantages too. For example, sin(x)
is more familiar then (sin x). Similarly of f(f(x)) vs. (f (f x)). But
I believe there is a more important advantage which we'll discuss
below.

----- ----- -----

Heads->True/False Intepretations

It is desirable to have an expression system such that atoms corresponds
to all nodes of a tree, not just the leaves. This is inherently
"impossible", but we can "fake it" by intepreting our expression
differently.

Notice that all nonleaves has at least one child: its first child. (e.g.
{2,0} is the first child of {2}.) We can pretend that the first child
(and all its offspring) of every nonleave is actually the nonleave, and
the first child itself does not exist.

Example:

The complete index set for the expression
List[List[f[1,1]],List[f[2,1]]] is

{{0},{1},{2},{1,0},{1,1},{2,0},{2,1},{1,1,0},{1,1,1},
 {1,1,2},{2,1,0},{2,1,1},{2,1,2}}

In standard intepretation, the correspondences between leaves and atoms
are:

{{0} -> List,
{1, 0} -> List,
{2, 0} -> List,
{1, 1, 0} -> f,
{1, 1, 1} -> 1,
{1, 1, 2} -> 1,
{2, 1, 0} -> f,
{2, 1, 1} -> 2,
{2, 1, 2} -> 1}

The nonleaves {{},{1},{2},{1,1},{2,1}} have no corresponding atoms.

In a "Mathematica intepretation", we have a mapping between all nodes
and atoms:

{{} -> List,
{1} -> List,
{2} -> List,
{1, 1} -> f,
{1, 1, 1} -> 1,
{1, 1, 2} -> 1,
{2, 1} -> f,
{2, 1, 1} -> 2,
{2, 1, 2} -> 1}

The following nodes are now considered "non-existant":
{{0},{1,0},{2,0},{1,1,0},{2,1,0}}, because their atoms now "belong" to
their parents.

This new intepretation is actually more natural to Mathematica
programers. When we see f[1,2...], we think of f[] as a container, and
things are stored between its brackets. We don't think of f as the
first element of a container even though it IS, as clearly shown in
lisp form (f 1 2 ...). With the "Mathematica intepretation", things are
dandy because now we have data at all nodes, not just the leaves. This
is extremely useful. The price we pay is that when the heads are
nested, things quickly become confusing. Compare:

Mathematica form:
1[2][3[4]][5[6][7[8]]]

Lisp form:
(((1 2)(3 4))((5 6)(7 8)))

Complete index set:
{{0},{1},{0,0},{0,1},{1,0},{1,1},{0,0,0},{0,0,1},{0,1,0},{0,1,1},
 {1,0,0},{1,0,1},{1,1,0},{1,1,1}}

MinimumIndexSet:
{{0,0,1},{0,1,1},{1,0,1},{1,1,1}}

In the above, the expression's structure is clearly shown by lisp form.
However, the structure using the "Mathematica intepretation" is better
shown by Mathematica form: the new structure is equivalent to f[ g[
7[8] ] ], where f is 1[2][3[4]] and g is 5[6]. The dimensions of the
expression in stardard (lisp) intepretation is {2,2,2}, but in
"Mathematica intepretation" it is {1,1,1}, with each 1 correpsonding to
g, 7, and 8 respectively. (Note: This is not what Dimensions returns.
More on that later.)

The two different intepretation of expression is exactly what the
Heads->True/False option means in functions that accepts a level spec.
When Heads->True, it tell Mathematica to inteprete the expression
normally, so that atoms corresponds to leaves only. Heads->False tells
Mathematica to regards Heads as parts of the container, not part of the
expression.

Here is an analogy: We are familiar with the concept of folders and
files. Folders are containers of files. It is extremely convenient for
Folders to have names. In the "lisp intepretation", folders have no
names. A folder is a pure basket. You MAY think of the name of a folder
as the first thing in its basket. In the "Mathematica intepretation", a
folder has a name, and its name is the first item it contains.
Consequently, each folder has one less item. Furthermore, the first
item may happen to be a folder, thus the "name" of a folder may be
another folder.

Remember that the lisp and Mathematica expression are isomorphic. You
can use either one for whatever intepretation. It is in the Mathematica
form that makes the Heads->True/False bi-intepretation apparant, and
functions can have a Heads option for flexibility. This is what I
believe to be an advantage. The disadvantage is that the expression
form no longer clearly indicates its structure if heads are non-atomic,
and the Heads->True/False is quite confusing. Consequently, I believe
it is why we don't see much use of multiple nested heads.

With the above understanding, one may observe some idiosyncrasies of
Mathematica. Here are some examples.

Depth of an expression is the max number of digits in the expression's
complete index set plus 1. Plus 1 because the root {} counts as one
level. Now Depth@f[][][] returns 2, but the index of f is {0,0,0}. Of
course, if you read the Appendix, it explains that Depth[expr] is
effectively Depth[expr,Heads->False]. I wish there is a Heads option to
Depth.

Similarly, the Dimensions function is effectiely
Dimensions[expr,Heads->False]. For example, expr=0[0][0[0],0[0]], its
lisp form is ((0 0)(0 0)(0 0)), clearly of dimensions {3,2}. But
Dimensions[expr] does not return {3,2}. Since now Heads are ignored, in
order for Dimensions to be useful, it added one requirement: Dimensions
requires all heads to be the same. Observe:

 Dimensions@f[f[0,0]] --> {1,2}
 Dimensions@f[g[0,0]] --> {1}

This is unintuitive because we normally would not think of same
structure as having different dimensions. More on Dimension in later
posts.

Similarly, the Array function does not grow branches on Heads... and
there are others... In general, I think that Mathematica discourages
nested heads. I wish the Heads option will be added to more functions
in future version to give user the flexibility.

This message is getting way long. We'll come back next week. For now,
perhaps some would enjoy writing a LispToMathematica expression
(string) converter and vice versa. Solutions will be given next week if
more primitive subjects did't turn up.

 Xah, xah@best.com
 http://www.best.com/~xah/SpecialPlaneCurves_dir/specialPlaneCurves.html
 "morality abets evil"
Prev by Date: Re: inputstreams
Next by Date: Simplifying algebraic expr: howto?
Prev by thread: Re: [Q] how to use simple notation to combine many graphics
Next by thread: Simplifying algebraic expr: howto?