Re;Re; Speeding Up Indexing and Joining

*To*: mathgroup at smc.vnet.net*Subject*: [mg52478] Re;[mg52453] Re; [mg52414]Speeding Up Indexing and Joining*From*: "Benedetto Bongiorno" <bbongiorno at attglobal.net>*Date*: Sun, 28 Nov 2004 01:07:27 -0500 (EST)*Sender*: owner-wri-mathgroup at wolfram.com

Fellow MathGroup, Here is a simple example. Again on large mattices, it takes many many hours Thank you In[31]:= a={{123,"bob",46,"fo"},{133,"harry",45,"fo"},{ 165,"pete",44,"fo"},{2569,"moe",56,"fo"},{6589,"ben",69,"fo"}}; In[32]:= b={{133,1,46,"go",6},{165,88,45,"mo",7},{ 166,53,44,"do",9},{25,82,56,"ho",9},{6589,77,69,"xo",11},{6570,77,69," xo",11},{6571,77,69,"xo",11},{6572,77,69,"xo",11}}; ROUTINE Create Index from Intersection of the first columns of matrix a and = matrix b firstCol=1; lastCol =1; aa = Transpose[Take[Transpose[a],{firstCol, lastCol}]]; bb = Transpose[Take[Transpose[b],{firstCol, lastCol}]]; idx=Intersection[aa,bb]; In[40]:= idx=Intersection[aa,bb] Out[40]= {{133},{165},{6589}} Select Records (Rows) from both matrix a and matrix b that equal the = idx. The idx consists of numerics Locate Position Of Each Record In each matrix using idx n=Length[idx]+1; ans01={}; For[i=1,i<n,i++, step1 = Position[aa,idx[[i]]]; AppendTo[ans01,step1]] n=Length[idx]+1; ans02={}; For[i=1,i<n,i++, step1 = Position[bb,idx[[i]]]; AppendTo[ans02,step1]] Extract a Records ans01 =Extract[a,ans01]; ans02 =Extract[b,ans02]; Sort each matrix by Column 1 using a defined function ans01=matsort[ans01,1]; ans02=matsort[ans02,1]; In[69]:= final=RowJoin[ans02,ans01] Out[69]= = {{133,1,46,go,6,133,harry,45,fo},{165,88,45,mo,7,165,pete,44,fo},{ 6589,77,69,xo,11,6589,ben,69,fo}} -----Original Message----- From: Tomas Garza [ <mailto:tgarza01 at prodigy.net.mx> To: mathgroup at smc.vnet.net mailto:tgarza01 at prodigy.net.mx] Subject: [mg52478] [mg52453] Re: [mg52414] Speeding UP Indexing and Joining ofDifferentSizeRectangular Matrixes Perhaps something could be done. Please explain your problem in more detail. Never mind your program (what is loc01?). What do you mean by joining rows? Give a small example with small matrices (say 5 x 2 or something like that). What are your present run times? Tomas Garza Mexico City ----- Original Message ----- From: "Benedetto Bongiorno" <bbongiorno at attglobal.net> To: mathgroup at smc.vnet.net Subject: [mg52478] [mg52453] [mg52414] Speeding UP Indexing and Joining of DifferentSizeRectangular Matrixes > > Fellow MathGroup, > > I have been using Mathematica for financial analysis purposes and have > been developing note book programs for about 5 years. > My skills at this are self taught with help from Wolfram training and > support. > The largest challenge has been the speed in the analysis of large data > sets. > The following is an example of a routine that takes many hours. > PLEASE HELP AND SHOW HOW I CAN IMPROVE THE ROUTINE TO MAKE THE RUN TIME > SHORTER. > > Equipment HP XP 3.24 processor 2 Gigs > Mathematica 5.01 > Data set a= 257470 by 40, Mixed numeric and string fields, but each > field > (column) is either or numeric or string > Data set b= 258705 by 5, All fields are numeric > > Objective: RowJoin the rows from each data set that have the same ID > field in their corresponding column one. > > Thank you and Happy Holidays > > ROUTINE > Create Index By Invoice ID > > firstCol=loc01[[1]]; > > lastCol =loc01[[1]]; > > aa = Transpose[Take[Transpose[a],{firstCol, lastCol}]]; > > Length[aa] > > 257470 > > firstCol=loc04[[1]]; > > lastCol =loc04[[1]]; > > bb = Transpose[Take[Transpose[b],{firstCol, lastCol}]]; > > Length[bb] > > 258705 > > idx=Intersection[aa,bb]; > > Length[idx] > > 257249 > > n=Length[idx]+1 > > 257250 > > Locate Position Of Each Record In aTable > > ans01={}; > > For[i=1,i<n,i++, > > step1 = Position[aa,idx[[i]]]; > > AppendTo[ans01,step1]] > > ans01=Flatten[ans01,1]; > > Locate Position Of Each Record In bTable > > ans02={}; > > For[i=1,i<n,i++, > > step1 = Position[bb,idx[[i]]]; > > AppendTo[ans02,step1]] > > ans02=Flatten[ans02,1]; > > Extract a Records by Index > > ans01 =Extract[currentBalance,ans01]; > > Dimensions[ans01] > > Flatten If Not A Matrix > > If[MatrixQ[ans01],ans01=ans01,ans01=Flatten[ans01,1]]; > > Dimensions[ans01] > > Extract b Records by Index > > ans02 =Extract[interestBalance,ans02]; > > Dimensions[ans02] > > Flatten If Not A Matrix > > If[MatrixQ[ans02],ans02=ans02,ans02=Flatten[ans02,1]]; > > Dimensions[ans02] > > ans01=matsort[ans01,loc01[[1]]]; > > ans02=matsort[ans02,loc04[[1]]]; > > noteTerms=RowJoin[ans02,ans01]; > > Dimensions[noteTerms] > > >

**Follow-Ups**:**Re: Re;Re; Speeding Up Indexing andJoining***From:*Tomas Garza <tgarza01@prodigy.net.mx>