Re: []Speeding Up Indexing and Joining
- To: mathgroup at smc.vnet.net
- Subject: [mg52490] Re: []Speeding Up Indexing and Joining
- From: Bill Rowe <readnewsciv at earthlink.net>
- Date: Mon, 29 Nov 2004 01:22:39 -0500 (EST)
- Sender: owner-wri-mathgroup at wolfram.com
On 11/28/04 at 1:07 AM, bbongiorno at attglobal.net (Benedetto
Bongiorno) wrote:
>Here is a simple example.
>Again, on large matrices it takes many, many hours.
>a={{123,"bob",46,"fo"},{133,"harry",45,"fo"},
>{165,"pete",44,"fo"},{2569,"moe",56,"fo"},{6589,"ben",69,"fo"}};
>b={{133,1,46,"go",6},{165,88,45,"mo",7},
>{166,53,44,"do",9},{25,82,56,"ho",9},{6589,77,69,"xo",11},
>{6570,77,69,"xo",11},{6571,77,69,"xo",11},{6572,77,69,"xo",11}};
>ROUTINE
>Create Index from Intersection of the first columns of matrix a and
>matrix b
>firstCol=1;
>lastCol =1;
>aa = Transpose[Take[Transpose[a],{firstCol, lastCol}]];
>bb = Transpose[Take[Transpose[b],{firstCol, lastCol}]];
>idx=Intersection[aa,bb];
More efficient is
aa = a[[All,1]];
bb = b[[All,1]];
idx = Intersection[aa,bb];
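One thing to watch when swapping this in: Part with All returns a flat list of keys, while the Transpose/Take version returns a list of one-element lists, so idx comes out with a different shape. A small sketch using your sample a, with the first column of b typed in by hand from your example:

```mathematica
a = {{123, "bob", 46, "fo"}, {133, "harry", 45, "fo"},
     {165, "pete", 44, "fo"}, {2569, "moe", 56, "fo"},
     {6589, "ben", 69, "fo"}};

(* flat list of keys *)
aa = a[[All, 1]]
(* {123, 133, 165, 2569, 6589} *)

(* the original construction gives a 5 x 1 matrix instead *)
aaNested = Transpose[Take[Transpose[a], {1, 1}]]
(* {{123}, {133}, {165}, {2569}, {6589}} *)

(* first column of b, copied from the example data *)
bb = {133, 165, 166, 25, 6589, 6570, 6571, 6572};

(* Intersection returns the common keys in sorted order *)
idx = Intersection[aa, bb]
(* {133, 165, 6589} *)
```

With the flat form, idx can be compared directly against First of each row.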
>Select Records (Rows) from both matrix a and matrix b that equal
>the idx. The idx consists of numerics
>Locate Position Of Each Record In each matrix using idx
>n=Length[idx]+1;
>ans01={};
>For[i=1,i<n,i++,
>step1 = Position[aa,idx[[i]]];
>AppendTo[ans01,step1]]
>n=Length[idx]+1;
>ans02={};
>For[i=1,i<n,i++,
>step1 = Position[bb,idx[[i]]];
>AppendTo[ans02,step1]]
>Extract Records
>ans01 =Extract[a,ans01];
>ans02 =Extract[b,ans02];
More efficient is:
ans01 = Select[a, MemberQ[idx, First@#]&];
ans02 = Select[b, MemberQ[idx, First@#]&];
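If idx itself gets long, MemberQ still scans it once per row. A possible variation, not something I have timed on your data, is to store the keys as definitions of a helper function so each test is a single lookup. The name keep is made up for this sketch, and the sample data is copied from your example:

```mathematica
a = {{123, "bob", 46, "fo"}, {133, "harry", 45, "fo"},
     {165, "pete", 44, "fo"}, {2569, "moe", 56, "fo"},
     {6589, "ben", 69, "fo"}};
b = {{133, 1, 46, "go", 6}, {165, 88, 45, "mo", 7},
     {166, 53, 44, "do", 9}, {25, 82, 56, "ho", 9},
     {6589, 77, 69, "xo", 11}, {6570, 77, 69, "xo", 11},
     {6571, 77, 69, "xo", 11}, {6572, 77, 69, "xo", 11}};
idx = Intersection[a[[All, 1]], b[[All, 1]]];

(* keep is a hypothetical helper: True for keys in idx, False otherwise *)
Clear[keep];
Scan[(keep[#] = True) &, idx];
keep[_] = False;

(* keep only rows whose first element is one of the common keys *)
ans01 = Select[a, keep[First[#]] &]
(* {{133, "harry", 45, "fo"}, {165, "pete", 44, "fo"},
    {6589, "ben", 69, "fo"}} *)

ans02 = Select[b, keep[First[#]] &]
(* {{133, 1, 46, "go", 6}, {165, 88, 45, "mo", 7},
    {6589, 77, 69, "xo", 11}} *)
```

The specific definitions keep[133] and so on are tried before the general keep[_], so the catch-all only fires for keys that are not in idx.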
>Sort each matrix by Column 1 using a defined function
>ans01=matsort[ans01,1]; ans02=matsort[ans02,1];
You didn't say how you coded matsort. If matsort is coded using For, it can be sped up significantly by defining it as:
matsort[x_,n_]:=x[[Ordering[x[[All,n]]]]]
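To see what this definition does: Ordering gives the permutation that sorts column n, and Part then reorders the whole rows by that permutation. A quick check on a made-up three-row matrix m (not from your data):

```mathematica
matsort[x_, n_] := x[[Ordering[x[[All, n]]]]]

(* m is a small illustrative matrix *)
m = {{3, "a"}, {1, "c"}, {2, "b"}};

Ordering[m[[All, 1]]]
(* {2, 3, 1} *)

sortedByCol1 = matsort[m, 1]
(* {{1, "c"}, {2, "b"}, {3, "a"}} *)

sortedByCol2 = matsort[m, 2]
(* {{3, "a"}, {2, "b"}, {1, "c"}} *)
```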
And to show this yields the same solution:
In[68]:=
RowJoin[matsort[ans02, 1], matsort[ans01, 1]]
Out[68]=
{{133, 1, 46, "go", 6, 133, "harry", 45, "fo"},
{165, 88, 45, "mo", 7, 165, "pete", 44, "fo"},
{6589, 77, 69, "xo", 11, 6589, "ben", 69, "fo"}}
Although I probably would have done
RowJoin[matsort[ans02, 1], Rest/@matsort[ans01, 1]]
to eliminate the redundant column.
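If RowJoin is not available, the same side-by-side join can be sketched with built-ins only. This assumes RowJoin simply joins corresponding rows, as the output above suggests, and that each key occurs exactly once in each matrix so the sorted rows line up. The names sortedA and sortedB are just for this sketch, with values typed in from the selected and sorted rows:

```mathematica
(* selected rows, already sorted by the first column *)
sortedA = {{133, "harry", 45, "fo"}, {165, "pete", 44, "fo"},
           {6589, "ben", 69, "fo"}};
sortedB = {{133, 1, 46, "go", 6}, {165, 88, 45, "mo", 7},
           {6589, 77, 69, "xo", 11}};

(* join row i of sortedB with row i of sortedA, dropping the repeated key *)
final = MapThread[Join, {sortedB, Rest /@ sortedA}]
(* {{133, 1, 46, "go", 6, "harry", 45, "fo"},
    {165, 88, 45, "mo", 7, "pete", 44, "fo"},
    {6589, 77, 69, "xo", 11, "ben", 69, "fo"}} *)
```

If a key can repeat in either matrix, the rows will not pair up one to one and this approach would need more care.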
>final=RowJoin[ans02,ans01]
>Out[69]=
>{{133,1,46,go,6,133,harry,45,fo},{165,88,45,mo,7,165,pete,44,fo},{
>6589,77,69,xo,11,6589,ben,69,fo}}
--
To reply via email subtract one hundred and four