MathGroup Archive 2005

[Date Index] [Thread Index] [Author Index]

Search the Archive

Re: Re: Types in Mathematica thread

Jon Harrop wrote:
> Daniel Lichtblau wrote:
>>Jon Harrop wrote:
>>>f[{}] := 0
>>>f[{h_, t___}] := 1 + f[{t}]
>>This is exactly is it should be. The pattern is reforming a List of
>>length one less at every step of the match.
> In the current implementation, yes. That makes the current implementation
> have asymptotically more complex pattern matching than is theoretically
> necessary for an infinite number of patterns, many of which are useful.

Okay, yes, this is in fact implementation specific. More below.

>>>However, this may get fixed in the future.
>>My colleagues often pleasantly surprise me, but I rather doubt the
>>complexity of this can be improved.
> You just need to lazily evaluate the matched sublists ("t" in this case).
> This can be done by implementing a new kind of list that is internally a
> reference to a subarray of an existing list. Extracting subarrays (as in my
> example) is then O(1) instead of O(n). The only tricky part is making sure
> that semantics are unchanged, e.g. referring to all child sublists and
> forcing them to be copied explicitly if the parent is mutated.
> I've had a play around with this and I think there is a lot of scope for a
> radical departure in the internal representation of Mathematica's lists,
> using immutable balanced binary trees instead of mutable arrays. This is
> exactly the kind of improvement that could be made much more easily if
> Mathematica were coded in a functional language. It would be interesting to
> play around with a toy implementation, to get a better idea of which
> optimisations were most beneficial on real Mathematica programs.
> [...]

What you propose has its interest so I'll note what I see as some of the 
weak points.

First let's cover what it entails. One would need parents to keep track 
of all children (we'll regard them as a set of some unspecified 
implementation). Every child would need to know its parent. That way if 
an expression is altered the evaluator would know to copy parent or 
children. Also if a child changes it gets removed (after it is first 
copied) from the parent's child set. This could be an issue for 
efficiency in terms of how that set is to be implemented.

So what are the potential weak points? I do not think my list is close 
to comprehensive but I'll name some that came to mind.

(1) I doubt it can be used at all on packed arrays or anything that has 
storage layout requirements regarding contiguity for purposes of calling 
on BLAS or other libraries.

(2) It would complicate expression management/evaluation code. While you 
seem to claim there is good benefit, I'm not so sure.

(3) Memory management might become more complicated and less efficient. 
I don't think we want to adopt a "lazy" memory release strategy. So if a 
parent is freed in a loop that creates children, there could be a need 
to make copies of those children sublists. I think this could have an 
effect on the example you propose above where we no longer use the 
parent list in each iteration (so references drop to zero and it would 
be freed).

(4) From appendix A.9.2 of The Mathematica Book, second remark: "Each 
expression contains a special form of hash code which is used in both 
pattern matching and evaluation." Either this would need to be 
abandoned, or else it would need to be recomputed for children. Since 
this hash is computed function of head and elements it would be O(n) to 
compute it anyway, hence symptotic complexity of your example would 
still be O(n) albeit with smaller multiplier. Short of jettisoning the 
hash related evaluation shortcuts I see no way around this.

Apart from these (and I think the last at least is quite serious), I'm 
not sure there would be no other ramifications to evaluation that might 
lead to complexity issues or outright bugs. I do not see anything but 
maybe I'm not looking ahrd enough.

>>>The fact that Mathematica distinguishes between up- and down-values is
>>>really incidental (a historical artefact). So I think you're
>>>over-analysing it.
>>I don't see this as artifact. It is important that UpValues behave
>>differently in various aspects from DownValues. Maybe I'm
>>misunderstanding what you mean by an incidental distinction.
> It's been a while since I looked but I can see no objective reason to expose
> the distinction. I think the distinction between up- and down-values could
> be removed were it not for backward compatibility.

Others have addressed this but I'll repeat it. UpValues are important 
for efficiency. For example they are used heavily in Mathematica's 
implementation of Series[special functions]. This can give nice speed 
improvement to pattern matching.

As for backward compatibility, in addition to the efficiency hit were 
everything to be done by DownValues, there might also be rewrite rule 
ordering issues. With the present scheme one knoew that a 
BetaRegularized UpValue on Series[BetaRegularized[...]] will fire before 
any DonwValue on Series.

Daniel Lichtblau
Wolfram Research

  • Prev by Date: PDF/Illustrator Oddity
  • Next by Date: Re: Fedora 3 segmentation fault
  • Previous by thread: Re: Types in Mathematica thread
  • Next by thread: Re: Types in Mathematica thread