Re: Compile function and AppendTo for lists (vrs. 8.0.4)

*To*: mathgroup at smc.vnet.net*Subject*: [mg124699] Re: Compile function and AppendTo for lists (vrs. 8.0.4)*From*: "Oleksandr Rasputinov" <oleksandr_rasputinov at hmamail.com>*Date*: Wed, 1 Feb 2012 03:49:06 -0500 (EST)*Delivered-to*: l-mathgroup@mail-archive0.wolfram.com*References*: <jg8gg5$139$1@smc.vnet.net>

Dear Fred On Tue, 31 Jan 2012 10:41:09 -0000, Fred Simons <f.h.simons at tue.nl> wrote: > Dear Olek and Patrick, > > With a lot of interest I studied your contributions to the thread on > compilation of AppendTo. I have to admit that Bag's are completely new > to me, and that I am not familiar with modern programming languages > either, so it took some time before I understood the (very good!) > explanation of Daniel Lichtblau. My idea is that the result of a command > like a = Bag[...] is that the bag is stored somewhere as a raw > expression, whatever that may be, and that the value of a is only a > pointer to that expression. Please correct me if I am wrong. > > An irrelevant remark: my experience is that Do is normally slightly > faster than For, so I used the function > > AppendTo[$ContextPath, "Internal`"]; > > appendBag2a = Compile[{{k, _Integer, 0}}, Module[{p = Bag[Most[{0}]]}, > Do[StuffBag[p, Bag[{1, i}]], {i, 1, k}]; > Table[BagPart[BagPart[p, i], All], {i, 1, k}]]]; > > That works fine and fast, It's important to note that because you've created a Bag containing the tensor {1, i}, the creation and subsequent operations on this Bag are not done in compiled code but rather via MainEvaluate calls, while the Do loop remains compiled. You can verify this by looking at the InputForm of appendBag2a or running it through CompilePrint. As such, one should not expect spectacular performance from this implementation as the overhead of calling out of the Mathematica virtual machine is significant; certainly it will become the limiting factor for performance if done on every iteration of a Do loop as here. > apart from the fact that on my computer > Mathematica runs out of memory for k=10^6. I surmise that the reason is > that at the end the variable p has about 10^6 pointers to different raw > Bag expressions. So I tried to do it without all these pointers: > > uncompiled3[k_Integer] := > Module[{p = Bag[]}, Do[StuffBag[p, {1, i}], {i, 1, k}]; > BagPart[p, All]] > > I did not expect this function to be faster than appendBag2: > > In[8]:= appendBag2a[2 10^5]; // Timing > uncompiled3[2 10^5]; // Timing > Out[8]= {0.234,Null} > Out[9]= {0.125,Null} > > Of course the next step is trying to compile this function. I did not > manage to do that. The problem seems to be the initialization of the > locale variable p. I tried compilation with the definitions p=Bag[], > p=Bag[{}], p=Bag[Rest[{0}]], p=Bag[Rest[{{0,0}}]], or e.g. > p=Bag[{{0,0}}] and taking the rest at the end. In all cases a numerical > error was encounterd and the uncompiled function was used for evaluation. > The reason for this is that Bags can't contain tensors, including Lists, in compiled code. For un-compiled code, though, it's fine. Incdentally, I disagree with Patrick's statement that the type of Bag you're declaring is incorrect: looking at the compiled bytecode shows that a Bag of Bags of Integers is not of a different type than a Bag of Integers. As I wrote in a StackExchange posting here <http://mathematica.stackexchange.com/questions/845/internalbag-inside-compile>, Bags in compiled code can contain only scalars of the type they are declared as containing, scalars of types castable to that type, or other Bags (preferably of the same type, otherwise undesired results may be obtained). As such, the operations you perform with this code are simply not representable within the VM and so completely compiling this code (i.e. without MainEvaluate calls) is impossible. > But the following implementation works fine: > > compiled4= Compile[{{k,_Integer}}, > Module[{p=Bag[Rest[{0}]]},Do[StuffBag[p,1];StuffBag[p,i],{i,1,k}]; > Partition[BagPart[p,All],2]]]; > As a point of note, you can also use the third argument of StuffBag to write this code a little more concisely and with one call to (the internal kernel function) StuffBag rather than two StuffBagScalar calls: compiled4a = Compile[{{k, _Integer}}, Module[{p = Internal`Bag[Rest[{0}]]}, Do[Internal`StuffBag[p, {1, i}, 1];, {i, 1, k}]; Partition[Internal`BagPart[p, All], 2] ] ]; (The third argument of StuffBag is analogous to the second argument of Flatten, reducing the rank of a tensor by flattening out levels down to the one specified. In this case, we want to flatten out all of the levels and produce a bag containing a sequence of scalars rather than a tensor, and for a rank-1 tensor, an argument of 1 is sufficient.) > In[16]:= uncompiled3[2 10^5]; // Timing > compiled4[2 10^5]; // Timing > Out[16]= {0.109,Null} > Out[17]= {0.016,Null} > > Moreover, even for k = 10^7, this function does not crash Mathematica: > > In[19]:= compiled4[10^7]; // Timing > Out[19]= {0.687,Null} > I confess I don't understand what it is about Bags that makes some constructs consume an extremely large amounts of memory, so this is certainly a subject in need of further investigation. However, as regards timings, compiled4a is slightly faster than compiled4 due to combining two StuffBagScalar calls into one StuffBag call: k = 10^7 compiled4 -> 0.672s compiled4a -> 0.578s Best, O. R. > Kind regards, > > Fred Simons > Eindhoven University of Technology > > > Op 27-1-2012 12:12, Oleksandr Rasputinov schreef: >> Patrick >> >> I like your "Bag-within-a-Bag" approach to avoid requiring support of >> tensor bags. Indeed on further investigation it seems that these are not >> >> supported. >> >> Just to note that your example puts p in a Real register rather than an >> Integer one, so the results are not of the expected type. Also, it is >> not >> necessary to create and stuff Bags separately; one can create >> pre-stuffed >> bags using Internal`Bag[contents]. Thus, one can simplify your example >> to >> the following: >> >> appendBag2 = Compile[{{k, _Integer, 0}}, >> Module[{p = Bag@Most[{0}] (* put into integer register *), i = 1}, >> For[i = 1, i<= k, ++i, StuffBag[p, Bag[{1, i}]]]; >> >> Table[BagPart[BagPart[p, i], All], {i, 1, k}] >> ] >> ]; >> >> This does not gain (or lose) any performance over your code, but it is >> shorter to write and uses less memory (because integers take up half the >> >> space that reals do). The latter consideration might be important given >> that this function uses a tremendous amount of memory for long lists >> (using reals with k = 10^6 requires about 9.6GB, even though the result >> is >> only 16MB in size). >> >> Two miscellaneous notes about Bags that I am fairly sure have not been >> mentioned anywhere else so far: first, Internal`BagLength is not >> supported >> in compiled code; second, Internal`BagPart can take a third argument, >> which is a function (instead of List) to wrap around the sequence of >> parts, and this does work in compiled code provided that the function is >> >> either Plus or Times. >> >> Best, >> >> O. R. >> >> On Thu, 26 Jan 2012 08:30:01 -0000, Patrick Scheibe >> <pscheibe at trm.uni-leipzig.de> wrote: >> >>> Hi, >>> >>> using Oleks Internal`Bag suggestion, think of a list as a bag of >>> numbers >>> and you can use this as replacement for the lists in your compile. >>> Using >>> Oleks append-to version as comparison >>> >>> appendTo = >>> Compile[{{k, _Integer, 0}}, >>> Module[{mat = {{0, 0}}, i = 0}, >>> For[i = 1, i<= k, i++, AppendTo[mat, {1, i}]]; >>> Rest[mat]]] >>> >>> and here is the Internal`Bag implementation >>> >>> AppendTo[$ContextPath, "Internal`"]; >>> appendBag = Compile[{{k, _Integer}}, >>> Module[{p = Bag[], i = 1, tmpBag}, >>> For[i = 1, i<= k, ++i, >>> tmpBag = Bag[]; >>> StuffBag[tmpBag, 1]; >>> StuffBag[tmpBag, i]; >>> StuffBag[p, tmpBag]; >>> ]; >>> Table[BagPart[BagPart[p, i], All], {i, k}] >>> ] >>> ]; >>> >>> Speedtest shows >>> >>> In[55]:= First /@ {AbsoluteTiming[appendTo[10^5]], >>> AbsoluteTiming[appendBag[10^5]]} >>> >>> Out[55]= {14.559237, 0.149063} >>> >>> that the bag implementation is about 100 times faster. Unfortunately >>> there doesn't exist much documantation about this. One place to look is >>> here http://stackoverflow.com/a/6795762/1078614 where Daniel gives some >>> insights into the usage. >>> >>> Hope this helps. >>> >>> Cheers >>> Patrick >>> >>> >>> On Tue, 2012-01-24 at 05:06 -0500, kris wrote: >>>> Hi >>>> >>>> I have some trouble with my code and would like ask for your help. In >>>> general it is about appending data in form of a list to an existing >>>> list in a compiled function. As far as I understand Mathematica is not >>>> supporting what I am asking for. In order to understand the problem in >>>> more detail I present some "toy" code below. >>>> >>>> test=Compile[{{k,_Integer}}, >>>> Module[{mat={}}, >>>> For[i=1,i<=k,i++, >>>> AppendTo[mat,{1,i}]; >>>> ]; >>>> mat >>>> ] >>>> ]; >>>> >>>> test[2] >>>> (*which produces an error*) >>>> >>>> Appending data in form of numbers for example works just fine but not >>>> with lists. Can anybody explain why Mathematica does not support >>>> appending lists to lists for compiled function? As an alternative I >>>> tried Reap and Sow, which also produces an error. >>>> >>>> However, what seems funny is the following code: >>>> >>>> mat={}; >>>> test=Compile[{{k,_Integer}}, >>>> For[i=1,i<=k,i++, >>>> AppendTo[mat,{1,i}]; >>>> ]; >>>> ]; >>>> >>>> test[2];mat >>>> >>>> The above code produces the result that I was looking for in a >>>> cumbersome way. I would like to prefer compile code which produces the >>>> same result without calling the list mat again. >>>> >>>> Thanks for help I do appreciate it. >>>> Cheers, >>>> Kris >> >