MathGroup Archive: December 2002 [00106]

[Date Index] [Thread Index] [Author Index]

Re: Efficient Sorting Algorithm

To: mathgroup at smc.vnet.net
Subject: [mg38167] Re: [mg38140] Efficient Sorting Algorithm
From: Sseziwa Mukasa <mukasa at jeol.com>
Date: Wed, 4 Dec 2002 03:24:43 -0500 (EST)
Sender: owner-wri-mathgroup at wolfram.com

On Tuesday, December 3, 2002, at 04:35 AM, Brian Higgins wrote:

> Hi,
>
> I have two lists (in general of different length) that have the
> following structure
>
> s1 = Table[{FromCharacterCode[Table[Random[Integer, {65, 69}], {4}]],
>        Random[Integer, {1, 2000}]}, {100}];
> s2 = Table[{FromCharacterCode[Table[Random[Integer, {65, 69}], {4}]],
>        Random[Integer, {1, 2000}]}, {200}];
>
> I am then interested in obtaining  all the strings in s1 that match
> with those in s2. At the present time I use the following agorithm to
> find the matches
>
> myList = Flatten[Outer[List, s1, s2, 1], 1];Select[myList,
> (First[First[#]] == First[Last[#]]) &]
>
> This works fine, but when s1 and s2 are large  ( e.g. 3000 or more
> elements) then Outer seems inefficient.  My question: does anyone have
> a suggestion that would be more efficient than my kludge approach.
> Note I have tried Intersection, which finds all the matches, i.e.
>
> myList2 = Intersection[s1,s2, SameTest -> (#1[[1]] == #2[[1]] &)];
>
> But I have not been successful in selecting all the matched pairs
> using myList2
>

What you really want is a list of the common strings from both lists 
and then a list of the integers associated with that string from each 
list as well.  Does it matter if the structure returned is of the same 
form as myList?  If not then the following is pretty efficient:

Join[Cases[s1, {#, _}], Cases[s2, {#, _}]] & /@
       Intersection[Union[Transpose[s1][[1]]], Union[Transpose[s2][[1]]]]

A comparison on lists of length 3000 gives the following timings

In[33]=
Timing[Select[
   Flatten[Outer[
       List, s1, s2, 1], 1], (First[First[#]] == First[Last[#]]) &];]
Timing[Join[Cases[s1, {#, _}], Cases[s2, {#, _}]] & /@
Intersection[Union[Transpose[s1][[1]]], Union[Transpose[s2][[1]]]];]

Out[33]=
{140.71 Second,Null}

Out[34]=
{4.27 Second,Null}

It can be modified to return results in the same form as myList but 
with different ordering as follows:

(Sequence @@ Outer[
         Sequence, #[[1]], #[[2]], 1]) & /@ ({Cases[s1, {#, _}],
            Cases[s2, {#, _}]} & /@ 
Intersection[Union[Transpose[s1][[1]]],
               Union[Transpose[s2][[1]]]])

Regards,

Ssezi

Prev by Date: RE: Efficient Sorting Algorithm

Next by Date: RE: Copying/Exporting graphics to other applications

Previous by thread: RE: Efficient Sorting Algorithm

Next by thread: Re: Efficient Sorting Algorithm