MathGroup Archive 2007

[Date Index] [Thread Index] [Author Index]

Search the Archive

Tally timing puzzle

  • To: mathgroup at smc.vnet.net
  • Subject: [mg80522] Tally timing puzzle
  • From: Mark Fisher <particlefilter at gmail.com>
  • Date: Fri, 24 Aug 2007 01:59:57 -0400 (EDT)

Hi all,

I don't understand why the difference in timings using Tally is so
large:

data1 = RandomInteger[{1, 100}, 2*10^6];
data2 = Partition[data1, 1];

Timing[Tally[data1];][[1]] --> 0.031
Timing[Tally[data2];][[1]] --> 9.75

That's a factor of 300 slow down. Sorting the data takes the same
amount of time in both cases:

Timing[data1sorted = Sort[data1];][[1]] --> 0.75
Timing[data2sorted = Sort[data2];][[1]] --> 0.75

But the effect of sorting on the timings is different:

Timing[Tally[data1sorted];][[1]] --> 0.031
Timing[Tally[data2sorted];][[1]] --> 3.297

The total time of sorting and tallying data2 is much less than just
tallying data2.

For comparison, consider

Timing[{First[#], Length[#]} & /@ Split[data1sorted];][[1]] --> 0.25
Timing[{First[#], Length[#]} & /@ Split[data2sorted];][[1]] --> 0.719

Any ideas as to why Tally is so slow on data2?

--Mark



  • Prev by Date: Re: Re: Unicode character property (Open, Close, etc)
  • Next by Date: FindFit and complex function?
  • Previous by thread: Normal@GeometricTransformation
  • Next by thread: FindFit and complex function?