RE: help on bootstrap sample

• To: mathgroup at smc.vnet.net
• Subject: [mg37357] RE: [mg37338] help on bootstrap sample
• From: "Wolf, Hartmut" <Hartmut.Wolf at t-systems.com>
• Date: Fri, 25 Oct 2002 02:46:42 -0400 (EDT)
• Sender: owner-wri-mathgroup at wolfram.com

```
>-----Original Message-----
>From: Søren Merser [mailto:merser at image.dk]
>Sent: Thursday, October 24, 2002 8:55 AM
>To: mathgroup at smc.vnet.net
>Subject: [mg37357] [mg37338] help on bootstrap sample
>
>
>Hi Mathematica freaks
>I've made two small routines for sampling (bootstrap statistics).
>They are working, but at least 'SampelNoReplace' is rather slow.
>I was wondering if any of you know of a faster way to do this.
>Regards soren
>
>SampelReplace[data_List:{0, 1}, n_:1] :=
>  Module[{}, data[[Table[Random[Integer, {1, Length@data} ], {n}]]]]
>
>SampelNoReplace[data_List, n_] := Module[{idx, len, res, d, hi, i},
>    d = data;
>    res = {};
>    len = hi = Length@d;
>
>    For[i = 1, i <= n && i <= len, i++,
>      idx = Random[Integer, {1, hi--} ];
>      AppendTo[res, d[[idx]]];
>      d = Drop[d, {idx}];
>      ];
>    res
>    ]
>

Søren,

try

<< DiscreteMath`Combinatorica`

SampelNoReplace[data_List, n_] :=
  With[{len = Length[data]},
    data[[Take[RandomPermutation[len], Min[len, n]]]]]

Most probably this is good enough. If n is *very* small compared to the
length of data (and data is *very*, *very* large), then other strategies are
conceivable, e.g.

SampelNoReplace[data_List, n_] /; n <= Length[data] :=
  data[[Take[
    NestWhile[
      UnorderedUnion[
        Join[Table[Random[Integer, {1, Length[data]}], {2 n}], #]] &,
      {},
      Length[#] < n &],
    n]]]

where UnorderedUnion is a stroke of genius from Carl Woll, searchable in the
archive:

UnorderedUnion[li_List] :=
Block[{i, Sequence}, i[n_] := (i[n] = Sequence[]; n); i /@ li]

(You might like to tune the oversampling beyond n (a factor of 2 here, which
could be made much smaller) so that NestWhile most probably does not need a
second iteration.)
Another, somewhat related idea would be:

SampelNoReplace[data_List, n_] :=
  data[[Module[{hit, i, len = Length[data]},
    hit[i_] := False;
    Table[
      (While[hit[i = Random[Integer, {1, len}]]];
       hit[i] = True; i),
      {Min[n, len]}]]]]

This one, however, is no good if the data are very large and n comes near
to the length; in that case the first proposal is appropriate. For varying
requirements, make conditional definitions.

--
Hartmut Wolf

```
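
For readers on Mathematica 6 or later, both kinds of sampling discussed above are available as built-in functions, so the hand-written versions are mainly of interest for older releases. The following is a minimal sketch under that version assumption; the names `sampleReplace` and `sampleNoReplace` are illustrative and do not come from the thread.

```
(* Assumes Mathematica 6 or later, where RandomChoice and RandomSample exist. *)
(* sampleReplace and sampleNoReplace are hypothetical names for illustration. *)

(* n draws with replacement *)
sampleReplace[data_List, n_Integer?NonNegative] :=
  RandomChoice[data, n]

(* up to n draws without replacement, capped at Length[data]
   in the same way as the definitions in the post *)
sampleNoReplace[data_List, n_Integer?NonNegative] :=
  RandomSample[data, Min[n, Length[data]]]

sampleReplace[Range[10], 5]     (* five values, repeats possible *)
sampleNoReplace[Range[10], 5]   (* five distinct values *)
```

The `Min` cap is there because `RandomSample` refuses to draw more elements than the list contains, whereas the definitions in the post simply return a shorter result.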
