MathGroup Archive: August 2007 [00054]

Re: Unbearably slow plotting (v6)

To: mathgroup at smc.vnet.net
Subject: [mg79651] Re: Unbearably slow plotting (v6)
From: Bill Rowe <readnewsciv at sbcglobal.net>
Date: Wed, 1 Aug 2007 04:59:37 -0400 (EDT)

On 7/30/07 at 6:44 AM, thomas.muench at gmail.com (thomas) wrote:

>On Jul 29, 6:13 am, Bill Rowe <readnews... at sbcglobal.net> wrote:

>>With 1E5 and 1E6 points, this results in a plot that is
>>indistinguishable from a filled rectangle. That seems to be of very
>>little use. So, while I might be a bit impatient waiting for
>>Mathematica on my machine to plot 1E6 points, I don't see why I
>>would want to do that in the first place. What I want from ListPlot
>>is something to give me an idea of trends in my data. Given real
>>limits on display resolution and size, plotting 1E6 points
>>typically will not provide a useful plot regardless of how fast it
>>plots. So why do this?

>It is of course true that it doesn't make any sense to plot a
>million random numbers. It is easier to just plot a filled
>rectangle. But, given my application, these types of plots are quite
>common. Imagine I acquire data at a rate of 10 kHz, for 10 seconds
>(just as an example, this is actually on the short side), that's
>100,000 points. Repeat the experiment 9 times, and plot the 9 traces
>together with their average, and you have a million points to plot.
>More realistic, however, are even longer traces and more
>repetitions.

Currently, I am doing testing where a data set consists of
several runs of data (10-20) sampled at 100-200 kHz for 1 second.
And yes, this quickly adds up to a million or more data points
to plot.
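
For reference, the point counts work out as follows. This is only a back-of-the-envelope sketch; the helper name points is made up for illustration and is not a built-in function:

```mathematica
(* Points per data set = runs * sample rate (Hz) * duration (s).
   "points" is a hypothetical helper, not a built-in function. *)
points[runs_, rate_, seconds_] := runs*rate*seconds

points[10, 100*10^3, 1]   (* low end of my range: 1000000 *)
points[20, 200*10^3, 1]   (* high end of my range: 4000000 *)
points[10, 10*10^3, 10]   (* quoted example, 9 traces plus average: 1000000 *)
```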

>I am aware that one could do some clever down-sampling to reduce the
>number of points to plot. I've done that, and the plot looks
>indistinguishable from the full plot. Given the screen resolution,
>one needs to print about 1000 points per trace in order for it to
>look good. If you have 100 or so traces, however, you again get into
>the regime of slow plotting (100,000 points).

>What I want is exactly what you mention in your post: I want to get
>an idea about the trend in my data by plotting it real quickly,
>without having to process the data. Once I know what's going on, I
>can go on to do all the applicable data analysis steps.

This is easily done in version 6. Simply using the new Span
construct inside Part lets you downsample with essentially
no processing on your part. That is:

ListPlot[data[[;; ;; 100]]]

will plot every 100th point. This is considerably simpler and
easier than what would have had to be done in version 5.2 to
downsample data.
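
A slightly fuller sketch of the same idea follows. The test data, the step size of 100, and the variable names are all made up for illustration and do not come from the original post:

```mathematica
(* Hypothetical test data: 10^6 samples of a noisy sine wave *)
n = 10^6;
data = Sin[2 Pi Range[n]/10.^5] + RandomReal[{-0.1, 0.1}, n];

step = 100;
down = data[[;; ;; step]];   (* elements 1, 101, 201, ... *)
Length[down]                 (* 10000 *)

(* DataRange restores the original sample positions on the x axis *)
ListPlot[down, DataRange -> {1, 1 + step (Length[down] - 1)}]

(* Several traces stored as rows of a matrix: downsample each row *)
traces = Table[data + RandomReal[{-0.1, 0.1}, n], {3}];
ListPlot[traces[[All, ;; ;; step]]]
```

Note that Span simply keeps every 100th sample without any averaging or filtering, so features narrower than the step (short spikes, for example) can be missed. For a first look at trends that is usually acceptable.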

And as far as making a great many similar plots goes, my
experience is that the time Mathematica needs to read data in
from the hard drive and convert it to internal form tends to be
the bottleneck, not the time spent in ListPlot.

>What bothers me most about this is that plotting speed is so much
>worse than in 5.2, especially considering the language that is used
>in the documentation describing the capabilities of the new
>Mathematica:

>Quote (from guide/NewIn60DataVisualization): "Building on
>Mathematica's strengths in large-scale data handling, numerical
>optimization, and geometric computation, Version 6.0 brings a new
>level of automation to data visualization - with major new original
>algorithms for graph layout, immediate surface reconstruction,
>automated labeling, seamless handling of unstructured data,
>geometrically driven interpolation, automatic date plotting - as
>well as major innovations in automated aesthetics."

>Note the words "large-scale" and "a new level of automation". If I
>need to add a (manual) down-sampling step to plot my large scale
>data in a reasonable amount of time, I would not call that "a new
>level of automation".

What little testing I've done to compare version 5.2 and version
6 on my machine indicates version 6 is faster than version 5.2
in terms of plotting data. But I admit, I only tried on the
order of 10,000 points. And the testing I did in response to
your post on my machine indicates version 6 starts taking
significantly more time at around 100,000 points or more.
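
A rough way to repeat this kind of test is sketched below. The helper name plotTime and the list of sizes are my own choices for illustration, and the numbers will vary from machine to machine:

```mathematica
(* Kernel time in seconds to build the plot for n uniform random points.
   The trailing semicolon suppresses display, so front-end rendering
   time is not included. "plotTime" is a hypothetical helper. *)
plotTime[n_Integer] := First[AbsoluteTiming[ListPlot[RandomReal[1, n]];]]

sizes = {10^3, 10^4, 10^5};
results = Table[{n, plotTime[n]}, {n, sizes}]
```

Since this only measures the kernel side, the delay actually seen on screen may be longer than what is reported here.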

Whether 10,000 points is defined as a large scale data set or
not will obviously vary with the user. But my experience with
both 5.2 and 6 indicates 6 is definitely better for plotting
large data sets, particularly given the new Span construct that
allows me to easily downsample data.
--
To reply via email subtract one hundred and four
