Re: Re: Unexpected behaviour of HoldRest
- To: mathgroup at smc.vnet.net
- Subject: [mg45848] Re: [mg45821] Re: Unexpected behaviour of HoldRest
- From: Daniel Lichtblau <danl at wolfram.com>
- Date: Tue, 27 Jan 2004 04:50:44 -0500 (EST)
- References: <200401210954.EAA07748@smc.vnet.net> <buo2ml$h76$1@smc.vnet.net> <200401260653.BAA29804@smc.vnet.net>
- Sender: owner-wri-mathgroup at wolfram.com
> Thank you, Andrej and Hartmut, you confirm my suspicions that Sequence
> is tricky and dangerous to use. I do however still have problems that
> are not resolved by the alternatives as I understand them at present.
>
> I rather regret now focusing on HoldRest as the issue here. There is a
> rather more general problem with selection from lists. The optimum list
> selection strategy has been addressed most extensively in a thread on
> DeleteCases (started March 2001) and the basic options were well aired
> then, with the main options being:
>
> Select
> DeleteCases (or Cases)
> Position
>
> The primary discrimination between these was in terms of the Timing, and
> here Select still seems to be marginally preferable to DeleteCases/Cases
> and distinctly faster than Position. My present problem is however with
> the internal memory allocation, such as reported by MaxMemoryUsed[], and
> in this case I thought Sequence would help.
>
> The problem comes when there is a large amount of data to be processed,
> with only selected parts returned. The specific case I have found is
> when selecting immediately on calculation, so using a similar example to
> that used previously for DeleteCases:
>
> CASE 1 - Sequence[]
>
> In[1]:=
> Unprotect[If];
> SetAttributes[If,SequenceHold];
> Protect[If];
> Timing[SeedRandom[123456];
> ranpick1=Table[x=Random[];If[x>0.01,Sequence[],x],{1000000}];
> MaxMemoryUsed[]]
>
> Out[4]=
> {31.876 Second,6343872}
>
> CASE 2 - Hold[]
>
> In[1]:=
> Timing[SeedRandom[123456];
> ranpick2=Table[x=Random[];ReleaseHold[If[x>0.01,Hold[],x]],{1000000}];
> MaxMemoryUsed[]]
>
> Out[1]=
> {47.288 Second,35045032}
>
> CASE 3 - Select
>
> In[1]:=
> Timing[SeedRandom[123456];
> ranpick3=Select[Table[Random[],{1000000}],#<=0.01&];
> MaxMemoryUsed[]]
>
> Out[1]=
> {18.486 Second,35112256}
>
> CASE 4 - Cases
>
> In[1]:=
> Timing[SeedRandom[123456];
> ranpick4=Cases[Table[Random[],{1000000}],_?(#<=0.01&)];
> MaxMemoryUsed[]]
>
> Out[1]=
> {24.465 Second,35115264}
>
> All of the above of course return identical results, and as unpacked
> arrays. Timing results are for a 266MHz Pentium II running Windows NT.
>
> In Cases 1 and 2, Sequence[] shows clear Timing and MaxMemoryUsed[]
> advantages over Hold[]. For raw Timing, Select and Cases are quicker
> than both, but also have very high MaxMemoryUsed[] results.
>
> However the use of Sequence[] here does require that If is given the
> Attribute SequenceHold so, given the limited application of the
> technique and the risks involved, I have now regretfully abandoned this
> approach and I am also reviewing the use of Sequence elsewhere.
> Actually (to be honest) I was hoping for equivalent improvements in
> MaxMemoryUsed[] when operating on big PackedArrays of stored data, but
> here the results are far worse when using Sequence[] than when using
> Select:
>
> In[1]:=
> Unprotect[If];
> SetAttributes[If,SequenceHold];
> Protect[If];
> Timing[SeedRandom[123456];
> randata=Table[Random[],{1000000}];
> ranpick1a=If[#>0.01,Sequence[],#]& /@ randata;
> MaxMemoryUsed[]]
>
> Out[4]=
> {26.868 Second,63143384}
>
> Basically I do need a new computer with LOTS more memory...
>
> John Tanner.
> [...]
You might save on memory and speed using Compile. For example:
In[3]:= Timing[SeedRandom[123456];
ranpick3C = (
func = Compile[{},Select[Table[Random[],{10^6}],(#<=0.01&)]];
func[]);
MaxMemoryUsed[]]
Out[3]= {1.48 Second, 14137968}
In[4]:= Developer`PackedArrayQ[ranpick3C]
Out[4]= True
This is on a 1.4 GHz machine. Without Compile, both the run time and the
memory consumption are worse by a factor of a bit over 2:
In[6]:= Timing[SeedRandom[123456];
ranpick3=Select[Table[Random[],{10^6}],#<=0.01&];
MaxMemoryUsed[]]
Out[6]= {3.18 Second, 35216480}
In[7]:= Developer`PackedArrayQ[ranpick3]
Out[7]= False
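To see why packing matters for the memory figures, here is a minimal
sketch (not part of the timings above) that stores the same reals once in
packed and once in unpacked form and compares their sizes. The names
packed and unpacked are only for illustration, and I am assuming
Developer`FromPackedArray is available alongside Developer`ToPackedArray.

```mathematica
(* Illustration only: the same 1000 machine reals, packed and unpacked. *)
packed = Developer`ToPackedArray[Table[N[i], {i, 1000}]];
unpacked = Developer`FromPackedArray[packed];

(* Which of the two is stored as a packed array? *)
{Developer`PackedArrayQ[packed], Developer`PackedArrayQ[unpacked]}

(* Storage in bytes; the packed form should be the smaller of the two. *)
{ByteCount[packed], ByteCount[unpacked]}

(* The contents are the same either way. *)
packed == unpacked
```

The exact byte counts depend on version and platform, so none are quoted
here.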
If you are likely only to keep a small fraction of the values generated,
then you can save on memory at the expense of using an unpacked object
in intermediate computations. Use of Sow/Reap (which was introduced
after 2001) will allow this. It's also faster than the methods above for
this example.
In[1]:= Timing[SeedRandom[123456];
ranpick5C = Developer`ToPackedArray[First[Last[Reap[func = Compile[{},
Module[{x},Do[x=Random[]; If[x<=.01,Sow[x]], {10^6}]]];
func[]]]]];
MaxMemoryUsed[]]
Out[1]= {0.79 Second, 2159760}
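One caveat on the Sow/Reap form: First[Last[Reap[...]]] assumes that at
least one value was sown. If nothing passes the test, Reap returns
{value, {}} and First has nothing to take. Below is a rough, uncompiled
sketch that guards against this. The helper name sowSelect and its
arguments are hypothetical, chosen just for this illustration.

```mathematica
(* sowSelect is a hypothetical helper, not taken from the code above. *)
sowSelect[n_Integer, threshold_Real] :=
  Module[{x, sown},
    (* Reap gives {value, {{sown items}}}, or {value, {}} if nothing was sown *)
    sown = Last[Reap[Do[x = Random[]; If[x <= threshold, Sow[x]], {n}]]];
    (* Guard the empty case before taking First *)
    If[sown === {}, {}, Developer`ToPackedArray[First[sown]]]]

SeedRandom[123456];
small = sowSelect[10^5, 0.01];  (* roughly 1% of the draws are kept *)
none = sowSelect[10^5, -1.];    (* no draw can pass, so this gives {} *)
```

Only the kept values are ever stored, which is the point of the approach:
the full list of n draws is never built.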
Daniel Lichtblau
Wolfram Research