Services & Resources / Wolfram Forums
-----
 /
MathGroup Archive
2004
*January
*February
*March
*April
*May
*June
*July
*August
*September
*October
*November
*December
*Archive Index
*Ask about this page
*Print this page
*Give us feedback
*Sign up for the Wolfram Insider

MathGroup Archive 2004

[Date Index] [Thread Index] [Author Index]

Search the Archive

Re;Re; Speeding Up Indexing and Joining

  • To: mathgroup at smc.vnet.net
  • Subject: [mg52478] Re;[mg52453] Re; [mg52414]Speeding Up Indexing and Joining
  • From: "Benedetto Bongiorno" <bbongiorno at attglobal.net>
  • Date: Sun, 28 Nov 2004 01:07:27 -0500 (EST)
  • Sender: owner-wri-mathgroup at wolfram.com

Fellow MathGroup,

Here is a simple example.

Again on large mattices, it takes many many hours

Thank you



In[31]:=

a={{123,"bob",46,"fo"},{133,"harry",45,"fo"},{

165,"pete",44,"fo"},{2569,"moe",56,"fo"},{6589,"ben",69,"fo"}};

In[32]:=

b={{133,1,46,"go",6},{165,88,45,"mo",7},{

166,53,44,"do",9},{25,82,56,"ho",9},{6589,77,69,"xo",11},{6570,77,69,"

xo",11},{6571,77,69,"xo",11},{6572,77,69,"xo",11}};

ROUTINE

Create Index from Intersection of the first columns of matrix a and =
matrix b

firstCol=1;

lastCol =1;

aa = Transpose[Take[Transpose[a],{firstCol, lastCol}]];

bb = Transpose[Take[Transpose[b],{firstCol, lastCol}]];
idx=Intersection[aa,bb];

In[40]:=

idx=Intersection[aa,bb]

Out[40]=

{{133},{165},{6589}}

Select Records (Rows) from both matrix a and matrix b that equal the =
idx.
The idx consists of numerics

Locate Position Of Each Record In each matrix using idx

n=Length[idx]+1;

ans01={};

For[i=1,i<n,i++,

step1 = Position[aa,idx[[i]]];

AppendTo[ans01,step1]]

n=Length[idx]+1;

ans02={};

For[i=1,i<n,i++,

step1 = Position[bb,idx[[i]]];

AppendTo[ans02,step1]]

Extract a Records

ans01 =Extract[a,ans01];

ans02 =Extract[b,ans02];



Sort each matrix by Column 1 using a defined function

ans01=matsort[ans01,1]; ans02=matsort[ans02,1];

In[69]:=

final=RowJoin[ans02,ans01]

Out[69]= =
{{133,1,46,go,6,133,harry,45,fo},{165,88,45,mo,7,165,pete,44,fo},{

6589,77,69,xo,11,6589,ben,69,fo}}

-----Original Message-----

From: Tomas Garza [ <mailto:tgarza01 at prodigy.net.mx>
To: mathgroup at smc.vnet.net
mailto:tgarza01 at prodigy.net.mx]



Subject: [mg52478] [mg52453] Re: [mg52414] Speeding UP Indexing and Joining
ofDifferentSizeRectangular Matrixes



Perhaps something could be done. Please explain your problem in more detail.


Never mind your program (what is loc01?). What do you mean by joining rows?

Give a small example with small matrices (say 5 x 2 or something like that).


What are your present run times?

Tomas Garza

Mexico City

----- Original Message -----

From: "Benedetto Bongiorno" <bbongiorno at attglobal.net>
To: mathgroup at smc.vnet.net


Subject: [mg52478] [mg52453] [mg52414] Speeding UP Indexing and Joining of

DifferentSizeRectangular Matrixes



>

> Fellow MathGroup,

>

> I have been using Mathematica for financial analysis purposes and have

> been developing note book programs for about 5 years.

> My skills at this are self taught with help from Wolfram training and

> support.

> The largest challenge has been the speed in the analysis of large data 

> sets.

> The following is an example of a routine that takes many hours.

> PLEASE HELP AND SHOW HOW I CAN IMPROVE THE ROUTINE TO MAKE THE RUN 
TIME

> SHORTER.

>

> Equipment HP XP 3.24 processor 2 Gigs

> Mathematica 5.01

> Data set a= 257470 by 40, Mixed numeric and string fields, but each

> field

> (column) is either or numeric or string

> Data set b= 258705 by 5, All fields are numeric

>

> Objective: RowJoin the rows from each data set that have the same ID

> field in their corresponding column one.

>

> Thank you and Happy Holidays

>

> ROUTINE

> Create Index By Invoice ID

>

> firstCol=loc01[[1]];

>

> lastCol =loc01[[1]];

>

> aa = Transpose[Take[Transpose[a],{firstCol, lastCol}]];

>

> Length[aa]

>

> 257470

>

> firstCol=loc04[[1]];

>

> lastCol =loc04[[1]];

>

> bb = Transpose[Take[Transpose[b],{firstCol, lastCol}]];

>

> Length[bb]

>

> 258705

>

> idx=Intersection[aa,bb];

>

> Length[idx]

>

> 257249

>

> n=Length[idx]+1

>

> 257250

>

> Locate Position Of Each Record In aTable

>

> ans01={};

>

> For[i=1,i<n,i++,

>

> step1 = Position[aa,idx[[i]]];

>

> AppendTo[ans01,step1]]

>

> ans01=Flatten[ans01,1];

>

> Locate Position Of Each Record In bTable

>

> ans02={};

>

> For[i=1,i<n,i++,

>

> step1 = Position[bb,idx[[i]]];

>

> AppendTo[ans02,step1]]

>

> ans02=Flatten[ans02,1];

>

> Extract a Records by Index

>

> ans01 =Extract[currentBalance,ans01];

>

> Dimensions[ans01]

>

> Flatten If Not A Matrix

>

> If[MatrixQ[ans01],ans01=ans01,ans01=Flatten[ans01,1]];

>

> Dimensions[ans01]

>

> Extract b Records by Index

>

> ans02 =Extract[interestBalance,ans02];

>

> Dimensions[ans02]

>

> Flatten If Not A Matrix

>

> If[MatrixQ[ans02],ans02=ans02,ans02=Flatten[ans02,1]];

>

> Dimensions[ans02]

>

> ans01=matsort[ans01,loc01[[1]]];

>

> ans02=matsort[ans02,loc04[[1]]];

>

> noteTerms=RowJoin[ans02,ans01];

>

> Dimensions[noteTerms]

>

>

>







  • Prev by Date: Re: Re: Re: Piecewise symbol in 5.1
  • Next by Date: Help with dynamics
  • Previous by thread: Re: a bug in many variables integration ?
  • Next by thread: Re: Re;Re; Speeding Up Indexing andJoining