RE: help on bootstrap sample
- To: mathgroup at smc.vnet.net
- Subject: [mg37357] RE: [mg37338] help on bootstrap sample
- From: "Wolf, Hartmut" <Hartmut.Wolf at t-systems.com>
- Date: Fri, 25 Oct 2002 02:46:42 -0400 (EDT)
- Sender: owner-wri-mathgroup at wolfram.com
>-----Original Message-----
>From: Søren Merser [mailto:merser at image.dk]
>Sent: Thursday, October 24, 2002 8:55 AM
>To: mathgroup at smc.vnet.net
>Subject: [mg37357] [mg37338] help on bootstrap sample
>
>
>Hi Mathematica freaks
>I've made two small routines for sampling (bootstrap statistics)
>They are working, but at least 'SampleNoReplace' is rather slow
>I was wondering if any of you know of a faster way to do this
>Regards soren
>
>SampelReplace[data_List:{0, 1}, n_:1] :=
>  Module[{}, data[[Table[Random[Integer, {1, Length@data} ], {n}]]]]
>
>SampelNoReplace[data_List, n_] := Module[{idx, len, res, d, hi, i},
>    d = data;
>    res = {};
>    len = hi = Length@d;
>
>    For[i = 1, i <= n && i <= len, i++,
>      idx = Random[Integer, {1, hi--} ];
>      AppendTo[res, d[[idx]]];
>      d = Drop[d, {idx}];
>      ];
>    res
>    ]
>

Søren,

try

    << DiscreteMath`Combinatorica`

    SampelNoReplace[data_List, n_] :=
      With[{len = Length[data]},
        data[[Take[RandomPermutation[len], Min[len, n]]]]]

Most probably this is good enough. If n is *very* small compared to the
length of data (and data is *very*, *very* large), then other strategies
are conceivable, e.g.

    SampelNoReplace[data_List, n_] /; n <= Length[data] :=
      data[[Take[
        NestWhile[
          UnorderedUnion[
            Join[Table[Random[Integer, {1, Length[data]}], {2 n}], #]] &,
          {}, Length[#] < n &],
        n]]]

where UnorderedUnion is a stroke of genius from Carl Woll, searchable in
the archive:

    UnorderedUnion[li_List] :=
      Block[{i, Sequence},
        i[n_] := (i[n] = Sequence[]; n);
        i /@ li]

(You might like to tune the excess of probing over n (a factor of 2 here,
which might be made much smaller) so that NestWhile most probably won't
loop.)
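As a small illustration of the UnorderedUnion helper quoted above, the
following sketch applies it to a short list. The input list is made up for
the example. The helper returns each element the first time it is seen and
Sequence[] on every later sighting; because Sequence is blocked, those empty
sequences only vanish when the Block exits, which leaves the distinct
elements in their original order.

```mathematica
(* Definition repeated from the post so the example is self-contained. *)
UnorderedUnion[li_List] :=
  Block[{i, Sequence},
    i[n_] := (i[n] = Sequence[]; n);  (* first call returns n, later calls Sequence[] *)
    i /@ li]

UnorderedUnion[{3, 1, 3, 2, 1}]   (* {3, 1, 2}: duplicates dropped, order kept *)
Union[{3, 1, 3, 2, 1}]            (* {1, 2, 3}: built-in Union also sorts *)
```

Keeping the order matters here: sorting the candidate indices, as Union would,
and then taking the first n would favor small indices.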

Another, somewhat related idea would be:

    SampelNoReplace[data_List, n_] :=
      data[[Module[{hit, i, len = Length[data]},
        hit[i_] := False;
        Table[(While[hit[i = Random[Integer, {1, len}]]]; hit[i] = True; i),
          {Min[n, len]}]]]]

This one, however, is no good if the data are very large and n comes near
to the length, but then the first proposal is appropriate. For varying
requirements, make conditional definitions.

--
Hartmut Wolf
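
The closing remark about conditional definitions could be sketched roughly
as follows. This is only an illustration under stated assumptions: the name
sampleNoReplace and the cutoff of one tenth of the data length are
hypothetical and untuned, and the shuffle uses Ordering on random reals in
place of RandomPermutation so that no package needs to be loaded.

```mathematica
(* Hypothetical sketch: the name sampleNoReplace and the 1/10 cutoff are
   illustrative assumptions, not taken from the original post. *)

(* n small relative to the data: draw indices one by one, rejecting repeats. *)
sampleNoReplace[data_List, n_Integer] /; 0 <= n <= Length[data]/10 :=
  data[[Module[{hit, i, len = Length[data]},
    hit[_] := False;
    Table[(While[hit[i = Random[Integer, {1, len}]]]; hit[i] = True; i),
      {n}]]]]

(* Otherwise: shuffle all indices and keep the first Min[n, len].
   Ordering of random reals stands in for RandomPermutation here. *)
sampleNoReplace[data_List, n_Integer] /; n >= 0 :=
  With[{len = Length[data]},
    data[[Take[Ordering[Table[Random[], {len}]], Min[len, n]]]]]
```

With these definitions, sampleNoReplace[Range[1000], 5] would go through the
rejection branch and sampleNoReplace[Range[1000], 900] through the shuffle.
The two rules share a pattern and differ only in their conditions, so they
are tried in the order entered, and the second acts as the fallback.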