MathGroup Archive 2008

Re: find and count partially identical sublist

  • To: mathgroup at smc.vnet.net
  • Subject: [mg85758] Re: find and count partially identical sublist
  • From: "Steve Luttrell" <steve at _removemefirst_luttrell.org.uk>
  • Date: Thu, 21 Feb 2008 17:59:24 -0500 (EST)
  • References: <fphd98$8mq$1@smc.vnet.net>

Define the list to be processed.

In[1]:= 
data={{"B","A",0,1},{"A","B",6,1},{"B","A",4,1},{"B","A",4,1},{"A","B",1,1},{"B","A",5,1},{"B","A",2,1},{"A","B",10,1}}
Out[1]= 
{{B,A,0,1},{A,B,6,1},{B,A,4,1},{B,A,4,1},{A,B,1,1},{B,A,5,1},{B,A,2,1},{A,B,10,1}}

Split the list into runs of successive sublists whose first two elements 
are identical (the criterion stated in the question).

In[2]:= data2=Split[data,Take[#1,2]==Take[#2,2]&]
Out[2]= 
{{{B,A,0,1}},{{A,B,6,1}},{{B,A,4,1},{B,A,4,1}},{{A,B,1,1}},{{B,A,5,1},{B,A,2,1}},{{A,B,10,1}}}
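
Note that Split compares only adjacent elements, so equal items that are 
separated by a different item end up in separate runs. A minimal 
illustration on a toy list (not part of the data above):

Split[{1,1,2,1,1}]
(* gives {{1,1},{2},{1,1}}: the two groups of 1s are not merged *)

The test above uses ==. The === (SameQ) comparison from the original code 
should serve equally well here, and it always evaluates to True or False.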

Determine the lengths of the identical runs.

In[3]:= length2=Map[Length,data2]
Out[3]= {1,1,2,1,2,1}

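Divide the fourth element of each sublist by the length of its run. The 
pattern {x1_,x2_,x3_,y_}?VectorQ matches a list of length 4 none of whose 
elements is itself a list, so it matches the individual rows but never an 
enclosing run that happens to contain four rows.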

In[4]:= 
data3=MapThread[#1/.{x1_,x2_,x3_,y_}?VectorQ->{x1,x2,x3,y/#2}&,{data2,length2}]
Out[4]= 
{{{B,A,0,1}},{{A,B,6,1}},{{B,A,4,1/2},{B,A,4,1/2}},{{A,B,1,1}},{{B,A,5,1/2},{B,A,2,1/2}},{{A,B,10,1}}}
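
The result is still grouped by run. To get back the flat list asked for in 
the question, one level of nesting can be removed:

Flatten[data3,1]
(* gives {{B,A,0,1},{A,B,6,1},{B,A,4,1/2},{B,A,4,1/2},{A,B,1,1},
          {B,A,5,1/2},{B,A,2,1/2},{A,B,10,1}} *)

The four-element pattern will not match rows of a different width (the 
real array is said to have about 20 columns). A possible sketch for that 
case, assuming the value to be scaled still sits in column 4; scaleRun is 
a helper name made up for this illustration:

(* divide column 4 of every row in a run by the length of that run *)
scaleRun[run_]:=Map[ReplacePart[#,4->#[[4]]/Length[run]]&,run]
Flatten[Map[scaleRun,data2],1]

On the example data this should give the same flat list as above.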

Stephen Luttrell
West Malvern, UK

<markus.roellig at googlemail.com> wrote in message 
news:fphd98$8mq$1 at smc.vnet.net...
> Hello group,
>
> I am trying to find and count sublists that are partially identical to
> each other and then modify parts of this sublist with the
> multiplicity. It's easier to understand if I give an example.
>
> Say I have an array (strings and numbers mixed) like:
>
> {{"B", "A", 0, 1}, {"A", "B", 6, 1}, {"B", "A", 4, 1}, {"B", "A", 4,
>  1}, {"A", "B", 1, 1}, {"B", "A", 5, 1}, {"B", "A", 2, 1}, {"A", "B",
> 10, 1}}
>
> I need to find successive sublists which have the same first two
> elements (here sublists {3,4} and {6,7}). Depending on
> how many repetitions occur I want to divide the 4th element of each
> sublist by the number of repetitions. In the example the result would
> be:
>
> {{"B", "A", 0, 1}, {"A", "B", 6, 1}, {"B", "A", 4, 1/2}, {"B", "A", 4,
>   1/2}, {"A", "B", 1, 1}, {"B", "A", 5, 1/2}, {"B", "A", 2, 1/
>  2}, {"A", "B", 10, 1}}
>
> The code I came up with is:
>
>
> tst = Table[{RandomChoice[{"A", "B"}], RandomChoice[{"A", "B"}],
>    RandomInteger[{0, 10}], 1}, {i, 1, 30}];
> tstSplt = Split[tst, #1[[1 ;; 2]] === #2[[1 ;; 2]] &] // MatrixForm
> tab = Table[tstSplt[[1, i]] // Length, {i, 1, Length[tstSplt[[1]]]}]
> rpl = MapThread[#1[[All, 4]]/#2 &, {tstSplt[[1, All]], tab}] //
>  Flatten
> tst[[All, 4]] = tst[[#, 4]] & @@@ rpl;
> tst
>
>
> This works, but I am somewhat concerned with run speed (my actual
> array is much larger, roughly 50000x20). And I have the feeling that I
> am wasting too much memory.
>
>
> One additional comment: The above code only finds successive
> duplicates. How would I have to modify it to find all occurrences ?
>
> Best regards
>
>
> Markus Roellig
>
> I.Physikalisches Institut der
> Universität zu Köln
> Zülpicher Strasse 77
> D-50937 Köln
> Tel.:  +49-221-470-3547
> Fax :  +49-221-470-5162
> 
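
On the closing question about finding all occurrences and not only 
successive ones: one possible approach, offered as a sketch and not as 
the method used in the reply above, is to count every leading pair with 
Tally and divide by that count. It assumes data as defined at the top of 
the reply; pairs, counts, rules and all are names chosen here for 
illustration.

(* the first two elements of every row *)
pairs=data[[All,{1,2}]];

(* how often each distinct pair occurs anywhere in the list *)
counts=Tally[pairs]
(* gives {{{B,A},5},{{A,B},3}} *)

(* turn the tallies into rules pair->count *)
rules=Rule@@@counts;

(* divide column 4 of each row by the count of its pair *)
all=MapThread[ReplacePart[#1,4->#1[[4]]/#2]&,{data,Replace[pairs,rules,{1}]}]
(* gives {{B,A,0,1/5},{A,B,6,1/3},{B,A,4,1/5},{B,A,4,1/5},{A,B,1,1/3},
          {B,A,5,1/5},{B,A,2,1/5},{A,B,10,1/3}} *)

If there are many distinct pairs, wrapping the rules in Dispatch may speed 
up the lookup, though that has not been measured here.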


