Re: Algorithm Analysis Course: Should I use Mathematica for projects?

*To*: mathgroup at smc.vnet.net*Subject*: [mg127285] Re: Algorithm Analysis Course: Should I use Mathematica for projects?*From*: Richard Fateman <fateman at eecs.berkeley.edu>*Date*: Sat, 14 Jul 2012 01:28:28 -0400 (EDT)*Delivered-to*: l-mathgroup@mail-archive0.wolfram.com*Delivered-to*: mathgroup-newout@smc.vnet.net*Delivered-to*: mathgroup-newsend@smc.vnet.net*References*: <jtkuhh$2a$1@smc.vnet.net> <20120712085845.B341E6859@smc.vnet.net> <jtoglu$qk8$1@smc.vnet.net> <50006641.8080505@cs.berkeley.edu> <4C9B8992-4120-4D27-A049-93F3632F2182@gmail.com>

On 7/13/2012 12:47 PM, Andrzej Kozlowski wrote: > On 13 Jul 2012, at 20:17, Richard Fateman wrote: > >> On 7/12/2012 11:53 PM, Andrzej Kozlowski wrote: >>> On 12 Jul 2012, at 10:58, Richard Fateman wrote: >>> >>>> It would be better to use a programming language in which Lists are >>>> lists, not arrays and accessing elements in an array of length N is an >>>> O(1) operation, not O(N). >>> I have no idea what was meant by "accessing elements" in the quote above but "retrieving elements" >>> from Mathematica's lists ("arrays") is in fact constant time (i.e. O(1)). >> No, the cost for retrieving an element from a list/array depends on the length of the list. It does not depend on the index of the element. >> >> It is interesting a person who ordinarily displays evidence of substantial knowledge of Mathematica should be unaware of this situation. Here is the reason, so far as I can tell. >> >> If you mention a name like A, in A[[i]]=A[[i]]+1, Mathematica re-evaluates all elements in the list/array A. >> >> So changing an element has a cost that is proportional to the number of >> elements in A. >> >> >> Proof. >> >> M = Table[mm[i], {i, 1, 5}] >> mm[i_]:=Print[i] >> >> M[[2]]++ >> >> prints 1 2 3 4 5 >> >> Also confounding the situation is that, I think, something funny happens if the list/array is larger than some threshold >> beyond which Mathematica does not operate the same way. >> >> >> >>> In fact, this is the main reason why Mathematica uses this particular data structure. >> In almost any other programming language, that's the way it would operate. >> >>> The price of this is that one cannot efficiently extend Mathematica's lists, >>> but there are well discussed ways of avoiding this problem. >> If you want to build lists out of other structures, there are certainly well-known ways of doing this. For example, the language LISP >> makes lists of any length out of fixed-length (2-element) cells. >> >> RJF > Well, I don't think this proves anything of the kind you believe it does. As you often seem to do you stop your experiments too early so the conclusions you reach prove much less than you claim they do. > > It is indeed true (and, in fact, pretty obvious) that when Mathematica evaluates M[[i]] (i.e. Part[M,i]) it must evaluate M, since the attributes of Part are: > > Attributes[Part] > > {NHoldRest,Protected,ReadProtected} > > in other words, they don't include HoldFirst or HoldAll. We can see it simply here: > > l = Range[10]; > > Trace[l[[2]]] > > {{l,{1,2,3,4,5,6,7,8,9,10}},{1,2,3,4,5,6,7,8,9,10}[[2]],2} > > So l was indeed evaluated, in some sense anyway. It was completely evaluated, not "in some sense". > But then try some practical timings: > > l1 = Table[1, {10^8}]; // Timing > > {2.23099,Null} > > Took quite a while, right? But: > > l1[[10^4]] // Timing > > {0.00002,1} > > Pretty fast, wouldn't you say? What I would say is that 1. this is evidence that the Mathematica system has determined that the value of l1 depends on nothing that has changed, e.g. not the definition of any functions or the values of any variables) and therefore it need not be evaluated (again). This strategy is described in my review of Mathematica published in J. Symbolic Computing, and somewhere on my web site. 2. In fact, 20 microsecond is kind of slow for a memory reference. 3. But I am surprised there is such a high resolution timer being used. To show that this is a special case, let us alter just one of the 10^8 entries and show that it makes access to ANOTHER entry 10000X slower. Try this: l1[[50]]=mm[50]; mm[i_]:= total++ l1[10^4]]//Timing takes over 2 seconds. To be specific, *A single array reference takes over 2 seconds!* 5 orders of magnitude slower than your timing. > It does not look to me that Mathematica really reevaluated the list, certainly not in the same sense that it did the first time? I find the Mathematica "Trace" facility difficult to fathom, so I base my reasoning on other evidence. > Now, look at this: > > l[[2]] // Timing > > {0.000015,1} A similar trick with a small array/list like this doesn't affect the cost much, since only 10 elements need to be re-evaluated. From my own computer, it reports 0. time. > > Remember, l has 10 elements while l1 has 10^8. I would say, the evidence points to O(1) retrieval, at least for all practical purposes. It seems to me that the evidence is that except for certain special circumstances (well, not so special in, say, Fortran, where arrays can only hold constant fixed-length numbers or bit strings), but special circumstances for Mathematica, the cost of an array reference depends on the size of the array (as well as the complexity of elements which have nothing to do with the array element being referenced). While this may seem like a peculiar problem in isolation, it is a consequence of Mathematica's evaluation strategy. It is a feature, so to speak. It makes the use of Mathematica's runtime statistics a subject for some kind of advanced data-structures / trade-secret class, not an intro to algorithms for non-computer-science majors. RJF

**Follow-Ups**:**Re: Algorithm Analysis Course: Should I use Mathematica for projects?***From:*Andrzej Kozlowski <akozlowski@gmail.com>

**References**:**Re: Algorithm Analysis Course: Should I use Mathematica for projects?***From:*Richard Fateman <fateman@cs.berkeley.edu>