RE: Re: Newbie Question

*To*: mathgroup at smc.vnet.net
*Subject*: [mg32776] RE: [mg32728] Re: [mg32686] Newbie Question
*From*: "Wolf, Hartmut" <Hartmut.Wolf at t-systems.com>
*Date*: Sat, 9 Feb 2002 05:11:50 -0500 (EST)
*Sender*: owner-wri-mathgroup at wolfram.com

> -----Original Message-----
> From: Sseziwa Mukasa [mailto:mukasa at jeol.com]
> Sent: Thursday, February 07, 2002 11:10 AM
> To: mathgroup at smc.vnet.net
> Subject: [mg32776] [mg32728] Re: [mg32686] Newbie Question
>
>
> "Brunsman, Kenneth J" wrote:
>
> > I can parse out the column I need using Take [data [[All, Column #]]] --- no
> > big deal. So far, so good.
>
> I'm not sure what you're using Take for, data[[All,Column #]] should be
> sufficient.
>
> > Now here's the problem --- How do I get % Differences between any of 10,000
> > items in that list? I need to take % Differences between adjacent pairs,
> > i.e. n and n-1, as well as items n and n-m where m can range from 1 to
> > 10,000.
>
> Do you want all the differences n-m (m = 1..n-1) at once or just for a
> particular m? For an individual m I got good results with
>
>     f[x_,m_Integer]:=Block[{a=Drop[x,m]},(a-Drop[x,-m])/a]
>
> On a PowerMac with an 800MHz G4 processor it takes 0.02 seconds to evaluate
> f[x,1] where x is a list of length 10000. Table[f[x,i],{i,Length[x]-1}] will
> of course give you all the differences for m = 1..n-1.
>
> > Further, I need to make sure that this code runs fast because I'm about to
> > run this on a data set of 10^8 data points (financial times series). I
> > could do this standing on my head in Fortran, but I'm bound and determined
> > to learn Mathematica if it takes the rest of my unnatural life.
>
> The key difference between Fortran programming and Mathematica in my opinion is
> that writing explicit loops in Mathematica is generally inefficient, so is
> indexing elements of lists and arrays. Take advantage of the fact that many
> operators will automatically apply themselves to every element in a list. Also
> avoid making copies as much as possible by using lambda functions and mapping.
> Finally, variables are best used to eliminate common subexpressions as in the
> Block statement in the function above.
>
> Incidentally though, keeping a list of 10^8 elements in memory is probably not
> a good idea, the list should probably be broken up and processed in smaller
> pieces to avoid spending all your time swapping memory to disk.
>
> > This is my first time attempting list processing and all I'm doing is
> > screwing up royally.
>
> Functional programming and pattern matching are not generally practiced outside
> of academic programming exercises and they are a very different paradigm from
> traditional imperative programming (object oriented techniques being a
> different kettle of fish entirely). Practice makes perfect of course. There
> are many excellent books that not only teach efficient programming style in
> Mathematica but generally do so while applying the techniques to domain specific
> problems. I don't do any financial series analysis myself but a quick search
> on Amazon.com for "mathematica finance" turns up 6 titles on using Mathematica
> for economic and financial modeling.
>
> Regards,
>
> Sseziwa Mukasa

Sseziwa,

this is the right way to deal with the problem, I think. Here is just a small
modification of this idea, and another one following a suggestion from
Andrzej Kozlowski:

In[1]:= ll = Table[Random[Real, {0, 100}], {300}];

your suggestion:

In[2]:= f[x_, m_Integer] := Block[{a = Drop[x, m]}, (a - Drop[x, -m])/a]

In[3]:= r1 = Table[f[ll, i], {i, Length[ll] - 1}];

my variant:

In[4]:= r2 = (#1 - #2)/#1 & @@@
          Drop[NestList[{Drop[First[#], 1], Drop[Last[#], -1]} &,
            {ll, ll}, Length[ll] - 1], 1];

In[5]:= r1 == r2
Out[5]= True

The idea was: dropping the first or last element of a list is faster than
dropping half of it.
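
To make concrete what both variants compute, here is a minimal worked example. The three-element list `x` is made up purely for illustration and is not from the thread; exact integers are used so the results come out as exact fractions.

```mathematica
(* f as defined in the thread: relative difference between each element
   and the element m positions before it *)
f[x_, m_Integer] := Block[{a = Drop[x, m]}, (a - Drop[x, -m])/a]

(* hypothetical data, chosen only for illustration *)
x = {100, 110, 121};

f[x, 1]   (* {(110 - 100)/110, (121 - 110)/121} = {1/11, 1/11} *)
f[x, 2]   (* {(121 - 100)/121} = {21/121} *)
```

Note that the differences are taken relative to the later element (x[[n]]), not the earlier one, so multiply by 100 if a percentage is wanted.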
Another idea (trying to use efficient list operations):

In[6]:= g[x_, y_] := (y - x)/y; g[_] := Sequence[];

In[7]:= r3 = Drop[ListCorrelate[ll, ll, {1, 1}, {}, g, List], 1];

In[8]:= r3 == r2
Out[8]= True

Timing, however, rules out this variant; compare the other two:

In[9]:= ll = Table[Random[Real, {0, 100}], {4000}];

In[16]:= (r = Table[f[ll, i], {i, Length[ll] - 1}]); // Timing
Out[16]= {20.57 Second, Null}

In[17]:= Remove[r];

In[18]:= (r = (#1 - #2)/#1 & @@@
           NestList[{Drop[First[#], 1], Drop[Last[#], -1]} &,
             {Drop[ll, 1], Drop[ll, -1]}, Length[ll] - 2]); // Timing
Out[18]= {13.359 Second, Null}

In[19]:= Remove[r];

Your variant, however, is more economical with memory, and I was not able to
run the test with a list of 10000 (400 MHz P II notebook, 192 MB real memory).
I estimate that 10000 will take about 3 minutes. This would mean 10^8 data
points will take at least several days, perhaps too much to make any
predictions.

Anyway, I agree with you that this is not the right task for Mathematica:
it produces a vast quantity of numbers with little information content, from
which you will finally have to read off something. What? How?

--
Hartmut
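
One possible way to act on the memory concern, sketched here only as an assumption about what might be wanted, is to reduce each lag to a few summary numbers as soon as it is computed rather than keeping the full table of differences. The helper name `lagSummary` and the choice of statistics (mean and largest absolute value) are hypothetical and not from the thread.

```mathematica
(* f as defined in the thread *)
f[x_, m_Integer] := Block[{a = Drop[x, m]}, (a - Drop[x, -m])/a]

(* test data; the lower bound 1 (instead of 0) just avoids dividing
   by values very close to zero *)
ll = Table[Random[Real, {1, 100}], {300}];

(* hypothetical helper: for lag m return {m, mean, largest absolute value}
   of the relative differences, so the full list d can be discarded *)
lagSummary[x_, m_Integer] :=
  Block[{d = f[x, m]}, {m, (Plus @@ d)/Length[d], Max[Abs[d]]}]

summary = Table[lagSummary[ll, m], {m, 1, Length[ll] - 1}];

Dimensions[summary]   (* one row of three numbers per lag *)
```

This keeps only three numbers per lag in memory at the end; the running time is still dominated by evaluating f for every lag, so it does not change the timing estimates above.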

**Follow-Ups**:
**Re: RE: Re: Newbie Question**
*From:* Sseziwa Mukasa <mukasa@jeol.com>