Large data sets?
- To: mathgroup at christensen.cybernetics.net
- Subject: [mg784] Large data sets?
- From: My Account <me at leidecker.gsfc.nasa.gov>
- Date: Wed, 19 Apr 1995 00:28:51 -0400
Several writers have recently mentioned "large data sets" but do not always say what "large" means.

My buddy down the hall is exploring data acquired during his tests of ball bearing assemblies. Each data set is a few megs of (t, r, c, i, a) where t = time, r = electrical resistance, c = electrical capacitance, i = motor drive current, a = shaft angle. Mma on his NeXTStation is fast enough for his impatient purposes, and the flexibility Mma provides makes him prefer it to other environments for trying out his various ideas about the processes these data sets capture. He has developed one "standard scheme" that runs in about 30 minutes. He says he will, Real Soon Now, recode this in C or something; in the meantime, he starts this worksheet running and does something else, like write the weekly report.

I have been using Mma on my NeXT Cube for some gonzo manipulations of images acquired from a CCD camera. Each image is 512 x 480 bytes = 1/4 Mbytes. Mma imports these OK, extracts 50 x 50 subimages (= 2500 bytes) and displays them, processes them in ways that are sometimes "off the beaten path", and displays the processed subimages as well as appropriate statistics. It is the ease with which Mma supports non-standard processing, and the convenience of being able to do "everything" within the same environment, that attracts me to Mma for these purposes. But of course, when I finally decide what processing I want to carry out on a full image, I switch to compiled C programs. Mma takes 1 to 10 seconds to process and display a 50 x 50 image; usually, this is much less time than it takes me to compose a scheme for investigating the image, and I can usually express this scheme in Mma faster than in other environments.

Some of my pals work with 5 k by 5 k images in twenty+ spectral bands, and these *are* too big for timely processing using Mma on our NeXT machines. And my bearing-buddy's full data sets are about a gigabyte each: these are also too big for Mma on his NeXT. These situations require a program that extracts subsets for Mma processing.

What does "large" mean to you?

Henning Leidecker
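
For concreteness, here is a rough sketch of the kind of subset-extraction program mentioned above. It is only an illustration under stated assumptions, not the program actually in use: it assumes a headerless, row-major file of 8-bit pixels with 512 pixels per row and 480 rows, and the names extract_subimage, to_mma_list, and write_test_image are hypothetical. It seeks directly to each wanted row, so only the 2500 bytes of a 50 x 50 window are read.

```python
import os
import tempfile

WIDTH, HEIGHT = 512, 480  # assumed layout: 512 pixels per row, 480 rows, 1 byte each


def extract_subimage(path, x0, y0, w=50, h=50, width=WIDTH, height=HEIGHT):
    """Return the w x h window whose top-left corner is (x0, y0), as a list of rows.

    Assumes a headerless, row-major file of 8-bit pixels (hypothetical format).
    Only the requested bytes are read; the full image is never loaded.
    """
    if not (0 <= x0 and x0 + w <= width and 0 <= y0 and y0 + h <= height):
        raise ValueError("window falls outside the image")
    rows = []
    with open(path, "rb") as f:
        for y in range(y0, y0 + h):
            f.seek(y * width + x0)  # byte offset of the first wanted pixel in row y
            chunk = f.read(w)
            if len(chunk) != w:
                raise IOError("file is shorter than the assumed image size")
            rows.append(list(chunk))
    return rows


def to_mma_list(rows):
    """Format rows as nested-list text in Mathematica syntax, e.g. {{1, 2}, {3, 4}}."""
    return "{" + ", ".join("{" + ", ".join(map(str, r)) + "}" for r in rows) + "}"


def write_test_image(path, width=WIDTH, height=HEIGHT):
    """Write a synthetic image for testing: pixel value = (x + y) mod 256."""
    with open(path, "wb") as f:
        for y in range(height):
            f.write(bytes((x + y) % 256 for x in range(width)))


if __name__ == "__main__":
    fd, image_path = tempfile.mkstemp(suffix=".raw")
    os.close(fd)
    try:
        write_test_image(image_path)
        sub = extract_subimage(image_path, x0=100, y0=200)
        print(len(sub), "rows of", len(sub[0]), "pixels")
        print(to_mma_list(sub)[:60], "...")
    finally:
        os.remove(image_path)
```

The text produced by to_mma_list is an ordinary nested list, so writing it to a file and reading it back in Mma with Get (<<) should work; the same seek-and-read idea carries over to multi-column records such as (t, r, c, i, a) if the record size is fixed.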