Tally timing puzzle

• To: mathgroup at smc.vnet.net
• Subject: [mg80522] Tally timing puzzle
• From: Mark Fisher <particlefilter at gmail.com>
• Date: Fri, 24 Aug 2007 01:59:57 -0400 (EDT)

```Hi all,

I don't understand why the difference in timings using Tally is so
large:

data1 = RandomInteger[{1, 100}, 2*10^6];
data2 = Partition[data1, 1];

Timing[Tally[data1];][[1]] --> 0.031
Timing[Tally[data2];][[1]] --> 9.75

That's a factor of 300 slow down. Sorting the data takes the same
amount of time in both cases:

Timing[data1sorted = Sort[data1];][[1]] --> 0.75
Timing[data2sorted = Sort[data2];][[1]] --> 0.75

But the effect of sorting on the timings is different:

Timing[Tally[data1sorted];][[1]] --> 0.031
Timing[Tally[data2sorted];][[1]] --> 3.297

The total time of sorting and tallying data2 is much less than just
tallying data2.

For comparison, consider

Timing[{First[#], Length[#]} & /@ Split[data1sorted];][[1]] --> 0.25
Timing[{First[#], Length[#]} & /@ Split[data2sorted];][[1]] --> 0.719

Any ideas as to why Tally is so slow on data2?

--Mark

```

• Prev by Date: Re: Re: Unicode character property (Open, Close, etc)
• Next by Date: FindFit and complex function?