MathGroup Archive 2004

Re: Speeding UP Indexing and Joining ofDifferentSizeRectangular Matrixes

  • To: mathgroup at smc.vnet.net
  • Subject: [mg52471] Re: Speeding UP Indexing and Joining ofDifferentSizeRectangular Matrixes
  • From: David Bailey <dave at Remove_Thisdbailey.co.uk>
  • Date: Sun, 28 Nov 2004 01:07:05 -0500 (EST)
  • References: <co999c$h67$1@smc.vnet.net>
  • Sender: owner-wri-mathgroup at wolfram.com

Tomas Garza wrote:
> Perhaps something could be done. Please explain your problem in more detail. 
> Never mind your program (what is loc01?). What do you mean by joining rows? 
> Give a small example with small matrices (say 5 x 2 or something like that). 
> What are your present run times?
> Tomas Garza
> Mexico City
> ----- Original Message ----- 
> From: "Benedetto Bongiorno" <bbongiorno at attglobal.net>
> To: mathgroup at smc.vnet.net
> Subject: [mg52471]  Speeding UP Indexing and Joining of 
> DifferentSizeRectangular Matrixes
> 
> 
> 
>>Fellow MathGroup,
>>
>>I have been using Mathematica for financial analysis and have been
>>developing notebook programs for about 5 years.
>>My skills are self-taught, with help from Wolfram training and support.
>>The largest challenge has been the speed of analysis on large data sets.
>>The following is an example of a routine that takes many hours.
>>PLEASE HELP AND SHOW HOW I CAN IMPROVE THE ROUTINE TO MAKE THE RUN TIME
>>SHORTER.
>>
>>Equipment HP XP 3.24 processor 2 Gigs
>>Mathematica 5.01
>>Data set a= 257470 by 40, Mixed numeric and string fields, but each field
>>(column) is either numeric or string
>>Data set b= 258705 by 5, All fields are numeric
>>
>>Objective:  RowJoin the rows from each data set that have the same ID field
>>in their corresponding column one.
>>
>>Thank you and Happy Holidays
>>
>>ROUTINE
>>Create Index By Invoice ID
>>
>>firstCol=loc01[[1]];
>>
>>lastCol =loc01[[1]];
>>
>>aa = Transpose[Take[Transpose[a],{firstCol, lastCol}]];
>>
>>Length[aa]
>>
>>257470
>>
>>firstCol=loc04[[1]];
>>
>>lastCol =loc04[[1]];
>>
>>bb = Transpose[Take[Transpose[b],{firstCol, lastCol}]];
>>
>>Length[bb]
>>
>>258705
>>
>>idx=Intersection[aa,bb];
>>
>>Length[idx]
>>
>>257249
>>
>>n=Length[idx]+1
>>
>>257250
>>
>>Locate Position Of Each Record In aTable
>>
>>ans01={};
>>
>>For[i=1,i<n,i++,
>>
>>step1 = Position[aa,idx[[i]]];
>>
>>AppendTo[ans01,step1]]
>>
>>ans01=Flatten[ans01,1];
>>
>>Locate Position Of Each Record In bTable
>>
>>ans02={};
>>
>>For[i=1,i<n,i++,
>>
>>step1 = Position[bb,idx[[i]]];
>>
>>AppendTo[ans02,step1]]
>>
>>ans02=Flatten[ans02,1];
>>
>>Extract a Records by Index
>>
>>ans01 =Extract[currentBalance,ans01];
>>
>>Dimensions[ans01]
>>
>>Flatten If Not A Matrix
>>
>>If[MatrixQ[ans01],ans01=ans01,ans01=Flatten[ans01,1]];
>>
>>Dimensions[ans01]
>>
>>Extract b Records by Index
>>
>>ans02 =Extract[interestBalance,ans02];
>>
>>Dimensions[ans02]
>>
>>Flatten If Not A Matrix
>>
>>If[MatrixQ[ans02],ans02=ans02,ans02=Flatten[ans02,1]];
>>
>>Dimensions[ans02]
>>
>>ans01=matsort[ans01,loc01[[1]]];
>>
>>ans02=matsort[ans02,loc04[[1]]];
>>
>>noteTerms=RowJoin[ans02,ans01];
>>
>>Dimensions[noteTerms]
>>
>>
>>
> 
> 
> 
Just a quick observation. Mixing floating point numbers with strings in 
the same array is never a good idea in problems where performance 
matters, because it prevents the system from creating packed arrays, which 
can make a big difference.
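
A minimal sketch of the point, not taken from the original routine: the names numeric, mixed and numericPart and the tiny generated data are assumptions made up for illustration. It only shows that a matrix holding a string column cannot be packed, while the numeric columns split off from it can.

```mathematica
(* Hypothetical all-real matrix, 1000 rows by 5 columns, packed explicitly *)
numeric = Developer`ToPackedArray[Table[N[i + j], {i, 1000}, {j, 5}]];

(* Hypothetical mixed matrix: a string ID column prepended to each row *)
mixed = Map[Prepend[#, "id"] &, numeric];

(* Packed arrays must hold a single machine type, so the string column blocks packing *)
Developer`PackedArrayQ[numeric]   (* True *)
Developer`PackedArrayQ[mixed]     (* False *)

(* Splitting off the numeric columns lets them be packed again *)
numericPart = Developer`ToPackedArray[Map[Rest, mixed]];
Developer`PackedArrayQ[numericPart]   (* True *)

(* Compare storage; the unpacked mixed matrix is typically much larger *)
{ByteCount[numericPart], ByteCount[mixed]}
```

Under these assumptions, one option is to keep the string columns of data set a in a separate list and do the numeric work on the packed part only.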

David Bailey

