Re: Setting Negatives to Zero
- To: mathgroup at smc.vnet.net
- Subject: [mg82892] Re: Setting Negatives to Zero
- From: Jean-Marc Gulliet <jeanmarc.gulliet at gmail.com>
- Date: Fri, 2 Nov 2007 03:34:46 -0500 (EST)
- Organization: The Open University, Milton Keynes, UK
- References: <fg6qha$dj0$1@smc.vnet.net> <fg9nla$lc5$1@smc.vnet.net>
Jean-Marc Gulliet wrote:
> Kevin J. McCann wrote:
>
>> I have a very large data set (64000 x 583) in which negative values
>> indicate "no data", unfortunately these negatives are not all the same.
>> I would like to efficiently set all these negatives to zero. I know that
>> I will likely be embarrassed when I see how to do it, but I can't seem
>> to remember or figure it out. I should emphasize that because of the
>> size of the data set, this needs to be done efficiently. Another
>> programming language does it as follows:
>>
>> x(x < 0) = 0;
>
> Here are a couple of solutions. They work fine, but in terms of
> efficiency they are about 70 times *slower* than the vectorization you
> used with the other product.
>
> First, we create a small set of data to show the principle.
>
> data = RandomReal[{-10, 100}, {6, 4}]
>
> {{90.6031, 16.644, 15.2568, 88.4432}, {95.3404, -0.391179, 22.6264,
> 41.0332}, {18.7866, 90.8717, 48.073, 59.3251}, {24.2224, 21.1771,
> 91.7082, 50.719}, {96.9408, 27.4581, 56.9265, 2.22925}, {31.6366,
> 0.266302, 68.7124, 7.80917}}
>
> Then we use a replacement rule,
>
> data /. x_ /; x < 0 -> 0.
>
> {{90.6031, 16.644, 15.2568, 88.4432}, {95.3404, 0., 22.6264,
> 41.0332}, {18.7866, 90.8717, 48.073, 59.3251}, {24.2224, 21.1771,
> 91.7082, 50.719}, {96.9408, 27.4581, 56.9265, 2.22925}, {31.6366,
> 0.266302, 68.7124, 7.80917}}
>
> We can also do it with *Cases*,
>
> Cases[data, x_ /; x < 0 -> 0., {-1}]
>
> {0.}
>
> Now we test both methods on a matrix of doubles of the size you
> specified, and check the time spent in seconds.
>
> data = RandomReal[{-10, 100}, {64000, 583}];
> Timing[data /. x_ /; x < 0 -> 0.;][[1]]
> Timing[Cases[data, x_ /; x < 0 -> 0., {-1}];][[1]]
>
> 62.046
>
> 49.797
>
> In comparison, a similar replacement on a similar matrix done with the
> other product takes less than a second.
>
> >> x = -10 + (100 - (-10)).*rand(64000,583);
> >> tic; x(x < 0) = 0; toc
> Elapsed time is 0.867847 seconds.
> >> whos x
>   Name      Size             Bytes        Class     Attributes
>
>   x         64000x583        298496000    double
>
> I am confident that we can improve the performance for Mathematica, but
> I draw a blank right now (though I suspect something is going on with
> the packed array technology used by Mathematica).

Thanks to Carl Woll's clipping method, Mathematica is now faster than the
other product.

In[2]:= x =.
        Timing[x = RandomReal[{-10, 100}, {64000, 583}];][[1]]
        Timing[x = Clip[x, {0, \[Infinity]}];][[1]]

Out[3]= 2.031

Out[4]= 0.656

>> tic; x = -10 + (100 - (-10)).*rand(64000,583); toc
Elapsed time is 2.225936 seconds.
>> tic; x(x < 0) = 0; toc
Elapsed time is 0.897022 seconds.

(Tested on Windows with the latest version of both products as of today.)

--
Jean-Marc
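
P.S. For anyone who wants to convince themselves that the clipping approach gives the same result as the replacement rule, here is a minimal sketch on a tiny hand-made matrix. The names small, byRule, and byClip are illustrative only and do not come from the thread, and the sketch assumes the input holds ordinary machine reals.

```mathematica
(* Tiny illustrative matrix; the values are made up for this sketch *)
small = {{90.6, -0.39, 22.6}, {-3.2, 48.1, 0.27}};

(* Pattern-based replacement: every negative element becomes 0. *)
byRule = small /. x_ /; x < 0 -> 0.;

(* Clipping: values below 0 are raised to 0, there is no upper bound *)
byClip = Clip[small, {0, Infinity}];

(* Numerical comparison of the two results *)
byRule == byClip
```

The two results should compare equal. Clip works on the whole array at once rather than testing a pattern against each element, which is presumably why it is so much faster on the 64000 x 583 matrix.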