Re: Re: Types in Mathematica thread
- To: mathgroup at smc.vnet.net
- Subject: [mg63049] Re: [mg62997] Re: Types in Mathematica thread
- From: Daniel Lichtblau <danl at wolfram.com>
- Date: Sun, 11 Dec 2005 22:25:22 -0500 (EST)
- References: <dmp9na$hi2$1@smc.vnet.net> <200512060504.AAA02828@smc.vnet.net> <dn5ohr$nle$1@smc.vnet.net> <200512101103.GAA29373@smc.vnet.net>
- Sender: owner-wri-mathgroup at wolfram.com
Jon Harrop wrote: > Daniel Lichtblau wrote: > >>Jon Harrop wrote: >> >>>f[{}] := 0 >>>f[{h_, t___}] := 1 + f[{t}] >> >>This is exactly is it should be. The pattern is reforming a List of >>length one less at every step of the match. > > > In the current implementation, yes. That makes the current implementation > have asymptotically more complex pattern matching than is theoretically > necessary for an infinite number of patterns, many of which are useful. Okay, yes, this is in fact implementation specific. More below. >>>However, this may get fixed in the future. >> >>My colleagues often pleasantly surprise me, but I rather doubt the >>complexity of this can be improved. > > > You just need to lazily evaluate the matched sublists ("t" in this case). > This can be done by implementing a new kind of list that is internally a > reference to a subarray of an existing list. Extracting subarrays (as in my > example) is then O(1) instead of O(n). The only tricky part is making sure > that semantics are unchanged, e.g. referring to all child sublists and > forcing them to be copied explicitly if the parent is mutated. > > I've had a play around with this and I think there is a lot of scope for a > radical departure in the internal representation of Mathematica's lists, > using immutable balanced binary trees instead of mutable arrays. This is > exactly the kind of improvement that could be made much more easily if > Mathematica were coded in a functional language. It would be interesting to > play around with a toy implementation, to get a better idea of which > optimisations were most beneficial on real Mathematica programs. > [...] What you propose has its interest so I'll note what I see as some of the weak points. First let's cover what it entails. One would need parents to keep track of all children (we'll regard them as a set of some unspecified implementation). Every child would need to know its parent. That way if an expression is altered the evaluator would know to copy parent or children. Also if a child changes it gets removed (after it is first copied) from the parent's child set. This could be an issue for efficiency in terms of how that set is to be implemented. So what are the potential weak points? I do not think my list is close to comprehensive but I'll name some that came to mind. (1) I doubt it can be used at all on packed arrays or anything that has storage layout requirements regarding contiguity for purposes of calling on BLAS or other libraries. (2) It would complicate expression management/evaluation code. While you seem to claim there is good benefit, I'm not so sure. (3) Memory management might become more complicated and less efficient. I don't think we want to adopt a "lazy" memory release strategy. So if a parent is freed in a loop that creates children, there could be a need to make copies of those children sublists. I think this could have an effect on the example you propose above where we no longer use the parent list in each iteration (so references drop to zero and it would be freed). (4) From appendix A.9.2 of The Mathematica Book, second remark: "Each expression contains a special form of hash code which is used in both pattern matching and evaluation." Either this would need to be abandoned, or else it would need to be recomputed for children. Since this hash is computed function of head and elements it would be O(n) to compute it anyway, hence symptotic complexity of your example would still be O(n) albeit with smaller multiplier. Short of jettisoning the hash related evaluation shortcuts I see no way around this. Apart from these (and I think the last at least is quite serious), I'm not sure there would be no other ramifications to evaluation that might lead to complexity issues or outright bugs. I do not see anything but maybe I'm not looking ahrd enough. >>>The fact that Mathematica distinguishes between up- and down-values is >>>really incidental (a historical artefact). So I think you're >>>over-analysing it. >> >>I don't see this as artifact. It is important that UpValues behave >>differently in various aspects from DownValues. Maybe I'm >>misunderstanding what you mean by an incidental distinction. > > > It's been a while since I looked but I can see no objective reason to expose > the distinction. I think the distinction between up- and down-values could > be removed were it not for backward compatibility. Others have addressed this but I'll repeat it. UpValues are important for efficiency. For example they are used heavily in Mathematica's implementation of Series[special functions]. This can give nice speed improvement to pattern matching. As for backward compatibility, in addition to the efficiency hit were everything to be done by DownValues, there might also be rewrite rule ordering issues. With the present scheme one knoew that a BetaRegularized UpValue on Series[BetaRegularized[...]] will fire before any DonwValue on Series. Daniel Lichtblau Wolfram Research
- References:
- Re: Types in Mathematica thread
- From: Jon Harrop <usenet@jdh30.plus.com>
- Re: Types in Mathematica thread
- From: Jon Harrop <usenet@jdh30.plus.com>
- Re: Types in Mathematica thread