Re: Extract values and multilpicities from list
- To: mathgroup at smc.vnet.net
- Subject: [mg64902] Re: [mg64849] Extract values and multilpicities from list
- From: "Carl K. Woll" <carl at woll2woll.com>
- Date: Tue, 7 Mar 2006 06:11:52 -0500 (EST)
- References: <200603050819.DAA09793@smc.vnet.net> <440C37AE.3050207@wolfram.com>
- Sender: owner-wri-mathgroup at wolfram.com
Carl K. Woll wrote:
> Dr. Wolfgang Hintze wrote:
>
>> Given a list of integers which may repeat, e.g.
>>
>> lstIn = {2,3,4,4,2,1,1,5,4}
>>
>> provide a list of different values and their respective multiplicities,
>> in the example,
>>
>> LstOut= {{1,2},{2,2},{3,1},{4,3},{5,1}}
>>
>> Who finds the shortest function doing this task in general?
>>
>> Thanks.
>>
>> Best regards,
>> Wolfgang
>
>
> For the special case where the integers are positive and not too large,
> then the following approach is probably close to the fastest.
>
> runs[d_] :=
> SparseArray[Transpose[{d, Range[Length[d]]}] -> Table[1, {Length[d]}]] /.
> SparseArray[_, _, _, {_, {p_, __}, _}] :>
> Transpose[{Range[Length[p] - 1], Rest[p] - Most[p]}]
>
> The simplest solution is probably:
>
> runs2[d_] := {First[#], Length[#]} & /@ Split[Sort[d]]
>
> The output of runs will include integers with 0 length runs, so we need
> to eliminate them when comparing results:
>
> In[22]:=
> r1=runs[data];//Timing
> r2=runs2[data];//Timing
> DeleteCases[r1,{_,0}]===r2
>
> Out[22]=
> {0.485 Second,Null}
>
> Out[23]=
> {0.781 Second,Null}
>
> Out[24]=
> True
>
> Carl Woll
> Wolfram Research
>
I forgot to mention the possibility of using Compile. Here is a version
with Compile that is a bit faster, and takes much less memory:
runs3[d_] := With[{max = Max[d]},
Transpose[{
Range[max],
Compile[{{dat, _Integer, 1}},
Module[{bins = Table[0, {max}], i},
Do[bins[[dat[[i]]]] += 1, {i, Length[dat]}];
bins
]][d]
}]
];
Here are timings on a data set with 10^7 elements:
In[42]:=
r1=runs3[data];//Timing
r2=runs[data];//Timing
r1===r2
Out[42]=
{3.218 Second,Null}
Out[43]=
{4.891 Second,Null}
Out[44]=
True
Carl Woll
Wolfram Research
- References:
- Extract values and multilpicities from list
- From: "Dr. Wolfgang Hintze" <weh@snafu.de>
- Extract values and multilpicities from list