MathGroup Archive: December 2005 [00263]

[Date Index] [Thread Index] [Author Index]
Re: Re: Re: Types in Mathematica
To: mathgroup at smc.vnet.net
Subject: [mg62943] Re: Re: Re: Types in Mathematica
From: "Steven T. Hatton" <hattons at globalsymmetry.com>
Date: Fri, 9 Dec 2005 05:10:24 -0500 (EST)
References: <dn22fb$kum$1@smc.vnet.net> <200512070411.XAA23826@smc.vnet.net> <F505B3D5-9BCD-49F6-A894-D997D33F5C6D@jeol.com> <200512071343.34678.hattons@globalsymmetry.com> <dn8gf9$bt9$1@smc.vnet.net>
Sender: owner-wri-mathgroup at wolfram.com
Sseziwa Mukasa wrote:

> 
> On Dec 7, 2005, at 1:43 PM, Steven T. Hatton wrote:
> 
>> On Wednesday 07 December 2005 10:54, Sseziwa Mukasa wrote:

>>> I meant the value of evaluating the expression.
>>
>> In[1]:= Head[Head[{1,2,3}]]
>>
>> Out[1]= Symbol
> 
> I'm not sure what that's supposed to prove we know that of Head
> [{1,2,3}][[0]] is Symbol but Symbol[List] is not a list.

Well, that makes two of us.  I don't really understand what your example was
intended to demonstrate.  Perhaps you were trying to show that Mathematica
does not behave according to naive expectations in some (many) cases. 
Though I accept the proposition as a forgone conclusion, your example did
not work for me because the result conformed to my expectation.

>> Hmmm.... That's a very interesting statement.  In reality, at that
>> level of
>> detail the description of a Mathematica expression is distinct from
>> the Lisp
>> model.  Mathematica expressions are described as being arrays of
>> pointers to
>> other Expressions.
> 
> I believe the point about being arrays of pointers is related to
> implementation efficiency concerns, such as being able to take parts
> of lists in linear time eg:

Indeed.  This is exactly why iteration is typically faster in C and C++ than
is recursion.  More to the point, it is faster to iterate a pointer value
than it is to dereference a pointer when traversing a linked list.  Fact of
the matter is, when Lisp was originally conceived, such considerations were
still years away from being relevant.  This serves to highlight the dangers
of naively transferring "best practices" from languages such as C++ to
Mathematica.

An aside on Lisp.  I believe that Mathematica expressions are most similar
to Lisp a-lists in terms of how Mathematica compares to Lisp.
>
>>> The fact that we display
>>> and write 1 instead of Integer[1]  is syntactic sugar as far as I'm
>>> concerned because at that point we are typically more interested in
>>> the interpreting Integer[1] as a value which is the result of some
>>> computation which is really what one is interested in.
>>
>>
>> In[2]:= Integer[1]===1
>>
>> Out[2]= False
> 
> OK calling it syntactic sugar is a bit extreme, it doesn't change the
> validity of the statement.  You cannot create the argument that
> Integer[_] uses to represent an integer, but we know that Integer[_]
> represents an atomic expression so there's no point in wondering
> about what _ is without a head because we cannot create it.  Integer
> [1] is evaluated as Integer[Integer[(something)]] so of course Integer
> [1]/=1.

I suspect what we have here is an illdefined "rough edge" of Mathematica's
language "specification".  After having spent some time discussing issues
related to the C++ Standard ISO/IEC 14882:2003, I realize how difficult it
is to specify the details of a programming language, even when the language
was formally specified at every step of its evolution.  It may be virtually
impossible to give an expost facto specification for the Mathematica
programming language.

I believe your intent it to show that all statements in Mathematica are
internally represented in a nested tree structure containing nothing but
arrays of pointers with the first pointer pointing to the head of an
expression and zero or more pointers pointing to the arguments, which are
themselves similarly constructed. I.e., Mathematica is one big glorified
Lukasiewicz (Polish) symbolic calculator.

I agree.  Before I read revelant portions of The Mathematica Book carefully,
I was inclined to call that internal representation "the Mathematica
programming language".  It appears that such a limited view is not the
intent of Dr. Wolfram.  One might even draw a parallel to C++ and assembler
where the generalized input form of Mathematica corresponds to C++ source
code, and internal form corresponds to assembler instructions.

>>> Especially if you consider the heads of atoms as defining their type,
>>> which is why I think that is a useful paradigm for reasoning about
>>> types in Mathematica.
>>
>> I would say that the distinction has more to do with their fundamental
>> characteristics than with the name or symbol attached to them.  In
>> particular, the kinds of data they store.
>>
>> Head[Symbol]
>> SymbolName[Symbol]
>>
>> Head["string"]
>> Characters["string"]
>>
>> Head[12345]
>> IntegerDigits[12345]
>>
>> Head[10/3//N]
>> RealDigits[10/3//N]
>>
>> Head[25/17]
>> Numerator[25/17]
>> Denominator[25/17]
>>
>> Head[E^I GoldenRatio//N]
>> Re[E^I GoldenRatio//N]
>> Im[E^I GoldenRatio//N]
> 
> I'm not sure what you are expecting in these cases so I'm not really
> sure what the point of this exercise is, at any rate

The point was that Atoms in C++ represent the fundamental units of data
storage.  Each atom is a 'quantum' of data, to use the German manner of
speaking.  The head of an atom is a means of identifying its type, but the
"type" really has to do what "what's in it".  In general, the use of Head
to determin the type of an expression will not provide us this kind of
exact correspondence.  That is to say, knowing that an expression has the
head List does not tell us the nature of what is in the list.  Knowing that
an expression has the head Complex, for example, tells us that we can
extract exactly two meaningful values from it, and how to extract these
values (assuming we know the Mathematica rules for Complex).

> In[48]:=
> Head[1]
> 1[[0]]
> 1[[1]]
> 
> Out[48]=
> Integer
> 
> Out[49]=
> Integer
> 
>  From In[48]:=
> "Part specification 1[[ 1 ]] is longer than depth of object.
> 
> Out[50]=
> 1[[1]]
> 
>   we begin to understand what makes an expression atomic, it is an
> object of Depth 0.  But objects of Depth 0 still have a head, even
> though the arguments are inaccessible.
> 
>> I don't believe I suggested that A.9.2 documents exactly what
>> everything is.
>> I've been rather clear that I do not find the documentation complete.
> 
> You have, but I haven't seen an example of behavior that cannot be
> explained by A.9.2.  I will admit that the time complexity of
> algorithms is not what you'd expect from A.9.2 but the behavior with
> respect to types is.

There are many features which are not well communicated in The Mathematica
Book.  Some of this is probably due to the closed nature of the product,
and some is due to the difficulty in explaining so many subtle interacting
features, and their consequences.
 
>>> Some
>>> expressions are atomic, and if you want to assign a "type" to the
>>> atoms other than expression the best candidate I can see is the head.
>>
>> That's what Wolfram says in the appendix. "These [atomic] objects
>> have heads
>> which are symbols that can be thought of as 'tagging' their types. The
>> objects contain 'raw data', which can usually be accessed only by
>> functions
>> specific to the particular type of object. You can extract the head
>> of the
>> object using Head, but you cannot directly extract any of its other
>> parts."
> 
> So what's the difficulty?

Integer[1]===1 for starters.  But I really quoted the statement to affirm my
original position that Mathematica does have a rudementry type system.

>> C++, however does not have
>> a statement type.  It has statements, but "statement" is not a type.
> 
> I suppose it depends on your perspective, in order to compile or
> execute C++ code one has to be able to reason about the code and in
> order to reason about the code structures like for(;;){...} and void
> op() have to be represented by something and I'd call that something
> which represents them their type. 

Perhaps that is valid in a metalinguistic realm, but with C++ there is a
well defined meaning for the term "type" when applied to constructs of the
language.  The distinction between what is meant by "statement" and
"expression" in the ISO/IEC 14482:2003 _Programming languages - C++_ is
akin to the distinction between what is meant by "sentence" and "imperative
sentence".  

> Except for the compiler/ 
> interpreter though there is nothing within C++ which deals with such
> structures directly, but C++ also has typedef _ structures (and int,
> float, etc.)  which can be used for explicit manipulation of types of
> objects during execution (or evaluation) of a program. 

Not really.  These only come into play at compile time.  The only runtime
type-related mechanisms in C++ are virtual functions and RTTI, and these
only apply to a subset of user defined types.

> The 
> difference between C++, Java etc. and Mathematica (and to a certain
> extent ML, Haskell etc.) is that at evaluation type the structures
> that correspond to for(;;){...}, void op() etc. can be manipulated by
> the program itself.

Can I actually modify an expression as it is evaluated?  Let's consider a
For[] expression.  Can you give an example of modifying a For[] expression
as it it evaluated?

> I'm aware that clever hackers can write self modifying code in C++
> with injudicious use of pointers, but those programs typically rely
> on a well defined external environment of linkers, machine
> architecture etc. to work.

I do not consider that to be self-modifying "code".  The runtime image may
be self-modifying, but that is not the same as saying the source code is
modified as the program executes.  To do what Mathematica does in C++ we
would need to create a program that could read its own source code,
manipulate it, invoke a compiler to compile the result, and load the
resulting object code as part of the current execution image.  It is
certainly conceivable, and to some extent it is done with Java today.

>> "You can't inspect the structure of the function in C++ because it's a
>> function definition and there is no facility for programmatically
>> inspecting the contents of a definition in C++.  In Mathematica the
>> contents are stored as an expression in the DownValues which can be
>> modified as any other expression in Mathematica sans restrictions."
>>
>> How does that compare to rtti or having a virtual function table
>> and using
>> virtual funcitons to implement polymorphism?
> 
> There is no need for rtti in Mathematica, again because the type of
> everything is known because there is only one type of thing in
> Mathematica.

I do not agree with the assertion that there is only one type of thing in
Mathematica.  To say everything is an expression fails to address the
distinct nature of atomic objects with distinct heads.  It is meaningful
and clear to consider atomic objects to have distinct types.

In C++ I can do just about anything I want with RTTI.  The mechanism was
intentionally left openended.  But let's look at a more compelling analog. 
What about Java's facility for introspection?  I Java I can access
extensive details about the structure of an object at runtime.  I will
grant that I cannot (easily) modify the definition of an existing class in
Java.  But I have to ask, how often do we actually modify the original
expression in Mathematica?  Is it not far more common to produce a modified
copy of the original expression?

> A C++ program typically (if we're going to be pedantic 
> I'll allow for clever hacks) cannot rewrite its virtual function
> table at run time, during Mathematica evaluation I can rewrite the
> body of a function though.

Can you provide an example?

>>   You most certainly can get some
>> internal information from an object that way.
> 
> I didn't say you couldn't get any information, I said you can't get
> at the implementation.

It would be far more accurate to say you cannot get at the original
definition.  That is quite true, be cause the definition is part of the
sourcecode, and does not exist in the runtime image.  But that brings us
back to the fact that C++ is compiled and thus the sourcecode is not, in
general, available to the execution image at runtime. 
 
>>> (actually is that a requirement?  http://root.cern.ch/root/
>>> Cint.html)
>>
>> That would not pass muster among 99% of C++ programmers.
> 
> I actually posed that question to myself as I was writing then found
> Cint through Google.  I didn't even bother to evaluate it, I was just
> interested in the question of whether being compiled was a necessary
> part of the C or C++ specification.

All that is required is that the program behave "as if" it were run on the
"abstract machine" specified by the ISO/IEC 14482:2003.

>>> there are no facilities for
>>> programmatically altering statements.
>>
>> There certainly are at compile-time.  As a matter of fact that is
>> the entire
>> purpose and function of templates.  The are meaningless unless they
>> are
>> transformed into compilable specializations.
> 
> C++ templates don't modify statements, they are used to generate new
> statements at compile time. 

C++ templates are used to produce specialization which are modifications of
the original template.  The exact form of these specializations can be
computed at compile time.  This is called template metaprogramming.

> I cannot take a pointer to an if{a}else 
> {b} block and change the value of b.

Allowing for the incorrectness of the notion of taking a pointer to an if
statement, you most *certainly* *can* change the value of b at compile time
in your example.  Template metaprogramming is fairly similar to Mathematica
programming.

> I'm not arguing that is a good idea in general, merely that it isn't
> possible.

But it *is* possible.  It is the whole essence of generic programming.

>>> On the other hand, I've noticed a trend in OO graphics programs for
>>> defining 3 and 4D vectors and matrices as classes and then
>>> overloading +,-,* etc. which leads to inefficient code like
>>>
>>> t=a*b;
>>> d=c*t
>>>
>>> instead of
>>>
>>> d=a*b*c
>>
>> There is no reason you cannot use the latter expression if you have
>> overloaded
>> * with something reasonable.
> 
> And what is something reasonable?  The naive vector3D operator*(a
> vector3D, b vecto3D) {return Vector3D(a.x*b.x,a.y*b.y,a.z*v2.z)}
> requires the creation of an intermediate object in compiling a*b*c to
> hold the result of either a*b or b*c.  Mathematica gets around the
> problem using the Listable attribute, C++ gets around it with
> template metaprogramming

Or compiler optimizations, or composite function objects, valarrays, etc.  I
will grant that I like the way Mathematica handles such situations, and
have as an objective finding ways to formalize such techniques in C++.

> (and given your interest in ML perhaps you'd 
> appreciate this http://people.cs.uchicago.edu/~jacobm/pubs/
> templates.html) but as far as I'm aware Java can do neither and ends
> up being inefficient.

I'd better stick with Mathematica for now.  I took a very sharp turn by
dropping C++ after 18 months of careful study and returning to Mathematica. 
I obviously am interested in how programming languages work, and how
Mathematica works as a programming language in the purely academic sense. 
But, believe it or not, my primary motivation for spending so much time and
effort on this topic is because I was tired of trying to use Mathematica,
and not getting the results I expected.

This discussion has forced me to examine how I perceive Mathematica, and how
accurate those perceptions are.
-- 
The Mathematica Wiki: http://www.mathematica-users.org/
Math for Comp Sci http://www.ifi.unizh.ch/math/bmwcs/master.html
Math for the WWW: http://www.w3.org/Math/
References:
- Re: Re: Types in Mathematica
  - From: "Steven T. Hatton" <hattons@globalsymmetry.com>
Prev by Date: Re: Types in Mathematica, a practical example
Next by Date: Re: Types in Mathematica
Previous by thread: Re: Re: Re: Types in Mathematica
Next by thread: Types in Mathematica