MathGroup Archive 2008

find and count partially identical sublist

  • To: mathgroup at smc.vnet.net
  • Subject: [mg85740] find and count partially identical sublist
  • From: markus.roellig at googlemail.com
  • Date: Wed, 20 Feb 2008 07:04:35 -0500 (EST)

Hello group,

I am trying to find and count sublists that are partially identical to
each other, and then modify part of each such sublist according to its
multiplicity. It's easier to understand if I give an example.

Say I have an array (strings and numbers mixed) like:

{{"B", "A", 0, 1}, {"A", "B", 6, 1}, {"B", "A", 4, 1}, {"B", "A", 4, 1},
 {"A", "B", 1, 1}, {"B", "A", 5, 1}, {"B", "A", 2, 1}, {"A", "B", 10, 1}}

I need to find successive sublists which have the same first two
elements (here the sublists at positions {3,4} and {6,7}). Depending on
how many repetitions occur, I want to divide the 4th element of each
sublist by the number of repetitions. In the example the result would
be:

{{"B", "A", 0, 1}, {"A", "B", 6, 1}, {"B", "A", 4, 1/2}, {"B", "A", 4, 1/2},
 {"A", "B", 1, 1}, {"B", "A", 5, 1/2}, {"B", "A", 2, 1/2}, {"A", "B", 10, 1}}
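
To make the intended operation concrete, here is a minimal sketch applied to the sample array above. The names data, runs, scale and result are only illustrative and do not appear in my actual code; the sketch assumes that only runs of successive sublists count as repetitions.

```mathematica
(* Sample array from above; all names here are illustrative assumptions *)
data = {{"B", "A", 0, 1}, {"A", "B", 6, 1}, {"B", "A", 4, 1}, {"B", "A", 4, 1},
   {"A", "B", 1, 1}, {"B", "A", 5, 1}, {"B", "A", 2, 1}, {"A", "B", 10, 1}};

(* Group successive rows whose first two elements agree *)
runs = Split[data, Take[#1, 2] === Take[#2, 2] &];

(* Divide the 4th element of every row in a run by the length of that run *)
scale[run_] := Module[{r = run}, r[[All, 4]] = r[[All, 4]]/Length[run]; r];

(* Flatten the runs back into one list of rows *)
result = Join @@ (scale /@ runs)
```

For this input the runs have lengths 1, 1, 2, 1, 2, 1, so result reproduces the array shown above with 1/2 in rows 3, 4, 6 and 7.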

The code I came up with is:


tst = Table[{RandomChoice[{"A", "B"}], RandomChoice[{"A", "B"}],
    RandomInteger[{0, 10}], 1}, {30}];
(* group successive rows whose first two elements match *)
tstSplt = Split[tst, #1[[1 ;; 2]] === #2[[1 ;; 2]] &];
(* length of each run *)
tab = Length /@ tstSplt;
(* divide the 4th column of each run by the run length *)
rpl = Flatten[MapThread[#1[[All, 4]]/#2 &, {tstSplt, tab}]];
tst[[All, 4]] = rpl;
tst


This works, but I am somewhat concerned with run speed (my actual
array is much larger, roughly 50000x20). And I have the feeling that I
am wasting too much memory.


One additional comment: The above code only finds successive
duplicates. How would I have to modify it to find all occurrences?
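
To illustrate what "all occurrences" would mean, here is one possible sketch, not a definitive solution. It assumes that every row should be divided by the total number of rows sharing its first two elements, wherever they appear in the array. The names data, keys, counts and resultAll are made up for this illustration.

```mathematica
(* Sample array from above; all names here are illustrative assumptions *)
data = {{"B", "A", 0, 1}, {"A", "B", 6, 1}, {"B", "A", 4, 1}, {"B", "A", 4, 1},
   {"A", "B", 1, 1}, {"B", "A", 5, 1}, {"B", "A", 2, 1}, {"A", "B", 10, 1}};

(* First two elements of every row serve as the grouping key *)
keys = data[[All, {1, 2}]];

(* Count every distinct key over the whole array and build key -> count rules *)
counts = Dispatch[Rule @@@ Tally[keys]];

(* Replace each key by its count and divide the 4th column by it *)
resultAll = data;
resultAll[[All, 4]] = data[[All, 4]]/Replace[keys, counts, {1}];
resultAll
```

In the sample array {"B", "A"} occurs five times and {"A", "B"} three times, so under this interpretation the 4th elements would become 1/5 and 1/3 respectively.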

Best regards


Markus Roellig

I.Physikalisches Institut der
Universität zu Köln
Zülpicher Strasse 77
D-50937 Köln
Tel.:  +49-221-470-3547
Fax :  +49-221-470-5162

