prograMing: trees and indexes, 1/2
- To: mathgroup@smc.vnet.net
- Subject: [mg11356] prograMing: trees and indexes, 1/2
- From: "Xah" <xah@best.com>
- Date: Sat, 7 Mar 1998 02:06:30 -0500
- Organization: Venus & Xah Love Factory
This is the 5th weekly posting of a series of expositions on Mathematica expressions, Trees, and Indexes. This "article" is long (17k) and is send as 2 posts. ----- ----- ----- Introduction: It's fairly easy to see how Mathematica expression are trees, and how each atom has an index. This is so until the heads of expressions are expression themselves. For example, the structure of the following expression is difficult to understand: 0[0,0][][0[0,0][],0[0,0][]] It is however a legal expression, and such forms could be exploited in applications (one simple example is the Derivative function). In this post, I hope to give a clear exposition of the isomorphism between Mathematica expressions, trees, and index sets. Once we understand the isomorphism of expressions and index set with atoms, we can explain the behavior of any structural manipulating functions in terms of how the function effects the indexes and atoms. The structural manipulating functions I had in mind are those covered in The Mathematica Book 2.1 to 2.2. We'll just call them tree functions for abbreviaton. We will write functions that convert expression to index set and vice versa. To simulate tree functions (e.g. Level, Through, Flatten), we can first convert their main arguments to index set, manipulate the index set, then convert it back into an expression. This approach exposes functions' behaviors, and it shows how they can be implemented in any language. (Our focus is not on implementation, but exploration on useful structural transformations of trees, as exemplified in Mathematica's Outer, Distribute, Transpose...etc.) ----- ----- ----- Trees and Index set: What is a tree? (I don't have a formal definition handy but we don't need one here.) Basically: A tree is an abstract idea of a structure. Starting with a node, called root. A node may have other nodes connected to it, called its children. Nodes that have no children are called leaves. If a node is not a leave, then it is a nonleave, or called a branching node. (terminology used here may differ from convention) In summary, a tree is a set of nodes, each has a parent/child relation with some other nodes, and by definition there is one node called root, which is the only node without parent. We can lable the nodes systematically. The root node will be labeled {} (empty braces). If it has two children, then they are labeled {1} and {2}. The children of {1} will be labeled {1,1}, {1,2} and so on. Similarly, children of {2} are {2,1},{2,2}...etc. In general, we add a digit in every new generation, counting children starting at 1. In this way, each node has a name, called its index. The number of digits in an index indicates its "generation ordinal" or Level. The root node is the generation 0, or Level 0, because its index does not contain any digits. Now notice that we can also use alphabets instead of numbers, or, we can start counting with 0 instead of 1. We will stick to the indexing system that starts counting with 0. This reason for this choice will be apparant later. A set of indexes defines a tree. The set that contain all indexes of a given tree minus the root index {} will be called the _complete index set_ of that tree. _Leaves index set_ is the set of indexes of all leaves of the tree. Similarly for _Nonleaves index set_ (minus the root {}). The union of leaves and nonleaves is the complete index set. Their intersection is empty. Examples: Here's the complete index set of a tree: {{0},{1},{2},{3},{0,0},{2,0},{2,1},{3,0},{3,1},{3,1,0}} its LeavesIndexSet is {{1},{0,0},{2,0},{2,1},{3,0},{3,1,0}} its NonleavesIndexSet is {{0},{2},{3},{3,1}} The MinimumIndexSet is {{0,0},{2,1},{3,1,0}} (MinimumIndexSet is the a subset of complete index set: its elements are indexes that can not be inferred from others. Discussed in a previous posting) A given list of indexes may not be a complete index set, but a complete index set can always be inferred. For example, given {{3},{1},{2,2}}, the {3} implies {2},{1},{0}, and {2,2} implies {2,1},{2,0},{2}. So the complete index set is {{0},{1},{2},{3},{2,0},{2,1},{2,2}}. A list of index implies a complete index set, which in turn defines a tree. Conversely, any tree corresponds to a unique complete index set. Now we have the isomorphism between trees and (complete) index sets. ----- ----- ----- Lisp Expresions: Now let's look at how a tree shape can be defined as a string (we'll call such string _expression_). The simplest way is using nested parenthesis. We'll use 0 as the atomic object. Starting with root we have 0, which correpsonds to the index set {{}}. First, notice that there are two types of tree growth: (1) birth of a new generation, e.g. {{0},{1}}-->{{0},{1},{1,0}}. (2) birth of a sibling, e.g. {{0},{1}}-->{{0},{1},{2}}. To grow a new generation, we add a pair of parenthesis. So, we have 0 --> (0), corresponding to {{}}-->{{},{0}}. To grow a sibling, we append a 0. We have (0)-->(0 0), corresponding to {{},{0}}-->{{},{0},{1}}. By this growing process, all trees have a corresponding expression, and vice versa. If you are a purest, you can use a pair of empty parenthesis as your atom, so the whole string consists of only parenthesis. Example: Here's a growth history of the tree {{0},{1},{2},{3},{0,0},{2,0},{2,1},{3,0},{3,1},{3,1,0}}, represented by both expressons and index sets. { 0, (0), (0 0), (0 0 0), (0 0 0 0), ((0) 0 0 0), ((0) 0 (0) 0), ((0) 0 (0 0) 0), ((0) 0 (0 0) (0)), ((0) 0 (0 0) (0 0)), ((0) 0 (0 0) (0 (0))) } { {}, {{0}}, {{0},{1}}, {{0},{1},{2}}, {{0},{1},{2},{3}}, {{0,0},{0},{1},{2},{3}}, {{0,0},{0},{1},{2,0},{2},{3}}, {{0,0},{0},{1},{2,0},{2,1},{2},{3}}, {{0,0},{0},{1},{2,0},{2,1},{2},{3,0},{3}}, {{0,0},{0},{1},{2,0},{2,1},{2},{3,0},{3,1},{3}}, {{0,0},{0},{1},{2,0},{2,1},{2},{3,0},{3,1,0},{3,1},{3}} } (the empty braces are omitted in index sets. This is so because otherwise an index set ALWAYS contains the {}.) We'll call the above the Lisp expression system, and it is not the only way to represent a tree shape with strings. In the following system, we'll use brackets instead of parenthesis for visual distinction from the lisp system, and we'll call it the Mathematica expression system. ----- ----- ----- Mathematica Expresions: In the Mathematica expression system: (1) As before, we start with an atom 0. (2) To grow a new generation, we add a pair of brackets after its parent. So, we have 0 --> 0[], corresponding to {{}}-->{{},{0}}. (3) To grow a sibling, we append a 0 inside the []. We have 0[]-->0[0], corresponding to {{},{0}}-->{{},{0},{1}}. Here is the growth history of the above tree in Mathematica expression: { 0, 0[], 0[0], 0[0,0], 0[0,0,0], 0[][0,0,0], 0[][0,0[],0], 0[][0,0[0],0], 0[][0,0[0],0[]], 0[][0,0[0],0[0]], 0[][0,0[0],0[0[]]] } Commas are used instead of spaces for separaters. continued in the next message... Xah, xah@best.com http://www.best.com/~xah/Wallpaper_dir/c0_WallPaper.html Mountain View, CA, USA