MathGroup Archive 2007

Re: Setting Negatives to Zero

  • To: mathgroup at smc.vnet.net
  • Subject: [mg82892] Re: Setting Negatives to Zero
  • From: Jean-Marc Gulliet <jeanmarc.gulliet at gmail.com>
  • Date: Fri, 2 Nov 2007 03:34:46 -0500 (EST)
  • Organization: The Open University, Milton Keynes, UK
  • References: <fg6qha$dj0$1@smc.vnet.net> <fg9nla$lc5$1@smc.vnet.net>

Jean-Marc Gulliet wrote:
> Kevin J. McCann wrote:
> 
>> I have a very large data set (64000 x 583) in which negative values 
>> indicate "no data"; unfortunately, these negatives are not all the same. 
>> I would like to efficiently set all these negatives to zero. I know that 
>> I will likely be embarrassed when I see how to do it, but I can't seem 
>> to remember or figure it out. I should emphasize that because of the 
>> size of the data set, this needs to be done efficiently. Another 
>> programming language does it as follows:
>>
>> 		x(x < 0) = 0;
> 
> Here are a couple of solutions. They work fine, but in terms of 
> efficiency they are about 70 times *slower* than the vectorization you 
> used with the other product.
> 
> First, we create a small set of data to show the principle.
> 
> data = RandomReal[{-10, 100}, {6, 4}]
> 
> {{90.6031, 16.644, 15.2568, 88.4432}, {95.3404, -0.391179, 22.6264,
>    41.0332}, {18.7866, 90.8717, 48.073, 59.3251}, {24.2224, 21.1771,
>    91.7082, 50.719}, {96.9408, 27.4581, 56.9265, 2.22925}, {31.6366,
>    0.266302, 68.7124, 7.80917}}
> 
> Then we use a replacement rule,
> 
> data /. x_ /; x < 0 -> 0.
> 
> {{90.6031, 16.644, 15.2568, 88.4432}, {95.3404, 0., 22.6264,
>    41.0332}, {18.7866, 90.8717, 48.073, 59.3251}, {24.2224, 21.1771,
>    91.7082, 50.719}, {96.9408, 27.4581, 56.9265, 2.22925}, {31.6366,
>    0.266302, 68.7124, 7.80917}}
> 
> We can also use *Cases*, although it returns only the replaced matches 
> rather than the full matrix,
> 
> Cases[data, x_ /; x < 0 -> 0., {-1}]
> 
> {0.}
> 
> Now we test both methods on a matrix of doubles of the size you 
> specified, and check the time spent in seconds.
> 
> data = RandomReal[{-10, 100}, {64000, 583}];
> Timing[data /. x_ /; x < 0 -> 0.;][[1]]
> Timing[Cases[data, x_ /; x < 0 -> 0., {-1}];][[1]]
> 
> 62.046
> 
> 49.797
> 
> In comparison, a similar replacement on a similar matrix done with the 
> other product takes less than a second.
> 
>   >> x = -10 + (100 - (-10)).*rand(64000,583);
>   >> tic; x(x < 0) = 0; toc
>   Elapsed time is 0.867847 seconds.
>   >> whos x
>     Name          Size                 Bytes  Class     Attributes
> 
>     x         64000x583            298496000  double
> 
> I am confident that we can improve the performance in Mathematica, but 
> I am drawing a blank right now (though I suspect something is going on 
> with the packed array technology used by Mathematica).
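
The packed-array suspicion in the quoted message can be checked directly. The following is only a minimal sketch, not part of the original thread: the names small, replaced, and repacked are illustrative, and the matrix is deliberately much smaller than the real data set so that the check runs quickly.

```mathematica
(* Illustrative names; a smaller matrix than the 64000 x 583 original *)
small = RandomReal[{-10, 100}, {1000, 100}];

(* Is the freshly generated matrix stored as a packed array? *)
Developer`PackedArrayQ[small]

(* Apply the rule-based replacement from the message above *)
replaced = small /. v_ /; v < 0 -> 0.;

(* Is the result still packed after the pattern-based replacement? *)
Developer`PackedArrayQ[replaced]

(* Developer`ToPackedArray attempts to repack a rectangular array of machine numbers *)
repacked = Developer`ToPackedArray[replaced];
Developer`PackedArrayQ[repacked]
```

If the second test gives False while the first gives True, that would be consistent with the pattern matcher working on an unpacked copy of the data, which is one plausible reason for the slow timings.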

Thanks to Carl Woll's clipping method, Mathematica is now faster than 
the other product.

  In[2]:= x =.
  Timing[x = RandomReal[{-10, 100}, {64000, 583}];][[1]]
  Timing[x = Clip[x, {0, \[Infinity]}];][[1]]

  Out[3]= 2.031

  Out[4]= 0.656

  >> tic; x = -10 + (100 - (-10)).*rand(64000,583); toc
  Elapsed time is 2.225936 seconds.
  >> tic; x(x < 0) = 0; toc
  Elapsed time is 0.897022 seconds.

(Tested on Windows with the latest version of both products as of today.)
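
Before relying on the faster method, it may be worth confirming on a small sample that Clip gives the same matrix as the replacement rule. This is a hedged sketch only; sample, viaRule, and viaClip are illustrative names that do not appear elsewhere in the thread.

```mathematica
(* Small test matrix with the same value range as in the examples above *)
sample = RandomReal[{-10, 100}, {6, 4}];

(* Rule-based replacement: every negative entry becomes 0. *)
viaRule = sample /. v_ /; v < 0 -> 0.;

(* Clipping: values below 0 are raised to 0, with no upper bound *)
viaClip = Clip[sample, {0, Infinity}];

(* Compare the two results numerically *)
viaRule == viaClip

(* Confirm that no negative entries remain *)
Min[viaClip] >= 0
```

Both comparisons are expected to evaluate to True if the two approaches agree.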
-- 
Jean-Marc

