Re: Element Extraction
- To: mathgroup at smc.vnet.net
- Subject: [mg19375] Re: [mg19240] Element Extraction
- From: "Allan Hayes" <hay at haystack.demon.co.uk>
- Date: Sat, 21 Aug 1999 00:04:44 -0400
- References: <199908110606.CAA01017@smc.vnet.net.> <37B47049.33AB607@fermi.phys.washington.edu> <37B54909.973D6F5E@hixnet.co.za> <37B53F8E.B16E04A2@fermi.phys.washington.edu> <001c01bee71f$5b311ba0$9b16989e@machine1> <7pipkh$39t$2@dragonfly.wolfram.com>
- Sender: owner-wri-mathgroup at wolfram.com
Carl, Re: > Length/@(Inner[SameQ,lis,tar,List]/.True->Sequence[]) > will be faster than any other attempt to count the number of False entries ... A slight improvement can be gained by using DeleteCases instead of /.True->Sequence[] (timings below) . But it seems to me that some lessons to be learned from these postings are: 1) operate on whole structures; 2) keep functions and tests simple particularly when they have to be used repeatedly. 3) look for suitable built in fuctions For example my original hamming[1][lis_, tar_, dist_] := Cases[lis, x_ /; Count[MapThread[SameQ, {target, x}], False] <= dist] failed on 2), with the complex test Count[MapThread[SameQ, {target, x}], False] <= dist -------------------------------- TIMINGS testlist = Table[Random[Integer, {0, 2}], {10000}, {10}]; target = Table[Random[Integer, {0, 1}], {10}]; dist = 2; TableForm[ Table[{DeleteCases[testlist, 0, {2}] // Timing // First, testlist /. 0 -> Sequence[] // Timing // First}/Second, {10}], TableHeadings -> {None, { "DeleteCases", "True->Sequence[]"}} ] DeleteCases True->Sequence[] 0.77 0.94 0.77 1.04 0.71 0.93 0.76 0.93 0.71 0.93 0.76 0.93 0.77 0.99 0.71 0.93 0.77 0.94 0.77 0.99 hamming[1.2][lis_, tar_, dist_] := lis[[Flatten[ Position[Length /@ DeleteCases[Inner[SameQ, lis, tar, List],True, {2}], Alternatives @@ Range[0, dist]]]]] hamming[6.1][lis_, tar_, dist_] := lis[[Flatten[ Position[Length /@ (Inner[SameQ, lis, tar, List] /. True -> Sequence[]), Alternatives @@ Range[0, dist]]]]] TableForm[Table[ {hamming[1.2][testlist, target, dist] // Timing // First, hamming[6.1][testlist, target, dist]; // Timing // First }/Second, {10}], TableHeadings -> {None, {hamming[1.2], hamming[6.1]}}] hamming[1.2] hamming[6.1] 2.52 2.75 2.53 2.74 2.64 2.75 2.52 2.8 2.53 2.75 2.52 2.8 2.59 2.74 2.53 2.8 2.53 2.8 2.58 3.95 Allan --------------------- Allan Hayes Mathematica Training and Consulting Leicester UK www.haystack.demon.co.uk hay at haystack.demon.co.uk Voice: +44 (0)116 271 4198 Fax: +44 (0)870 164 0565 Carl K.Woll <carlw at fermi.phys.washington.edu> wrote in message news:7pipkh$39t$2 at dragonfly.wolfram.com... > Allan, > > Very nice! Using Inner, I rewrote my original hamming[6]: > > hamming[6.1][lis_,tar_,dist_]:= > lis[[Flatten[ > Position[Length/@(Inner[SameQ,lis,tar,List]/.True->Sequence[]), > Alternatives@@Range[0,dist]]]]] > > which on my test is still about 30% faster than hamming[1.1]. I think > > Length/@(Inner[SameQ,lis,tar,List]/.True->Sequence[]) > > will be faster than any other attempt to count the number of False entries using > > f/@(Inner[SameQ,lis,tar,List] > > (or almost equivalently Inner[SameQ,lis,tar,f]). The hardest thing here is doing > something to the False entries (counting) or to the True entries (replacing). > Counting False entries of each row must be done as many times as the number of > rows, but replacing True can be done once for the whole Table. This is why the > first method is approximately twice as fast. > > Carl Woll > Physics Dept > U of Washington > > > Allan Hayes wrote: > > > Carl: > > Another approach, using Inner: > > > > hamming[1.1][lis_, tar_, dist_] := > > lis[[ > > Flatten[Position[Inner[SameQ, lis, tar > > , (Count[{##}, False] <= dist) &], True]]]] > > > > This is considerbly faster on your test than my original hamming[1]: > > > > hamming[1][lis_, tar_, dist_] := > > Cases[lis, x_ /; Count[MapThread[SameQ, {tar, x}], False] <= dist] > > > > testlist = Table[Random[Integer, {0, 2}], {10000}, {10}]; > > target = Table[Random[Integer, {0, 2}], {10}]; > > dist = 2; > > > > hamming[1.1][testlist, target, dist]; // Timing // First > > > > 3.73 Second > > > > hamming[1][testlist, target, dist]; // Timing // First > > > > 6.37 Second > > > > Allan > > --------------------- > > Allan Hayes > > Mathematica Training and Consulting > > Leicester UK > > www.haystack.demon.co.uk > > hay at haystack.demon.co.uk > > Voice: +44 (0)116 271 4198 > > Fax: +44 (0)870 164 0565 > > > > ----- Original Message ----- > > From: Carl K.Woll <carlw at fermi.phys.washington.edu> To: mathgroup at smc.vnet.net > > To: <kewjoi at hixnet.co.za> > > Cc: Allan Hayes <hay at haystack.demon.co.uk>; David Park <djmp at earthlink.net>; > > Andrzej Kozlowski <andrzej at tuins.ac.jp>; Bob Hanlon <BobHanlon at aol.com>; > > <mathgroup at smc.vnet.net> > > Sent: 14 August 1999 11:06 > > Subject: [mg19375] Re: [mg19240] Element Extraction > > > > Below Kew complains about the lack of generality of the fastest solution to > > his problem. I should have given some words of > > explanation about how the fastest solution worked. > > > > Basically, for my hamming[5] function, I took Allan Hayes solution, and > > tried to speed it up. There were two things that I > > did. > > > > 1. He used MapThread to thread SameQ over his two vectors. I temporarily > > gave SameQ the attribute Listable to achieve the > > same thing. This gave a 20-30% improvement in speed. I modified my previous > > hamming[5] function to do just this single > > change. > > > > 2. Rather than counting the number of false entries and seeing if it was > > less than some prescribed number, given the > > shortness of the vector I instead hardcoded the close enough possibilities > > (alternatively, I could have hardcoded the far > > enough away possibilities). This provided another 100% improvement in speed > > at the sacrifice of generality. > > > > For the hamming[6] function, I tried to attack the problem from a different > > angle. Since Position and Part are such > > extremely quick functions, I massaged the whole testlist function instead of > > each element individually. I also used the two > > things I noted above. > > > > If we want a general hamming function, then approach 2 above is obviously > > not the way to go. However, for the approach taken > > in hamming[6], I was able to come up with a similar method, which was still > > general enough. In this approach I used the ever > > useful Sequence[] function. At any rate, here is a revised version of my > > previous notebook with the above changes using your > > new syntax hamming[list,target,distance] (I also had to change David Park's > > solution a little bit more than the others to > > make it work, and in the process speeded it up a bit). > > > > In[1]:= > > (* Hayes *) > > hamming[1][lis_,tar_,dist_]:= > > Cases[lis,x_/;Count[MapThread[SameQ,{target,x}],False]<=dist] > > > > (* Park revised *) > > hamming[2][lis_,tar_,dist_]:=Select[lis,Count[#-tar,0]>Length[tar]-dist-1&] > > > > (* Kozlowski *) > > distance[a_, b_] := If[a === b, 0, 1] > > SetAttributes[distance, Listable] > > > > hamming[3][lis_,tar_,dist_]:= > > Select[lis, Apply[Plus, distance[tar, #]] <= dist &] > > > > (* Hanlon *) > > hammingDistance[x_?VectorQ, y_?VectorQ] /; Length[x] == Length[y] := > > Plus @@ MapThread[If[#1 == #2, 0, 1] &, {x, y}]; > > > > hamming[4][lis_,tar_,dist_]:=Select[lis, hammingDistance[tar, #] < dist+1&] > > > > (* Woll #1 *)hamming[5][lis_,tar_,dist_]:=Module[{ans}, > > SetAttributes[SameQ,{Listable}]; > > ans=Cases[lis,x_/;Count[SameQ[x,tar],False]<=dist]; > > ClearAttributes[SameQ,{Listable}]; > > ans] > > > > (* Woll #2 *) > > hamming[6][lis_,tar_,dist_]:=Module[{ans,t,len=Length[lis]}, > > SetAttributes[SameQ,{Listable}]; > > t=SameQ[testlist,Table[tar,{len}]]; > > ans=lis[[ > > Flatten[Position[Length/@(t/.True->Sequence[]), > > Alternatives@@Range[0,dist]]]]]; > > ClearAttributes[SameQ,{Listable}]; > > ans] > > In[27]:= > > target = Table[Random[Integer,{0,2}],{10}]; > > In[28]:= > > testlist = > > Table[Random[Integer, {0, 2}], {10000},{10}]; > > In[29]:= > > Do[Print[Timing[r[i]=hamming[i][testlist,target,2];]],{i,1,6}] > > > > r[1]==r[2]==r[3]==r[4]==r[5]==r[6] > > {3.734 Second,Null} > > {4.547 Second,Null} > > {9.281 Second,Null} > > {9.109 Second,Null} > > {3.063 Second,Null} > > {2.187 Second,Null} > > Out[30]= > > True > > > > Carl Woll > > Dept of Physics > > U of Washington > > > > Kew Joinery wrote: > > > > > Hello, > > > I'd like to thank to all replies again. > > > Only to add few words: > > > To define universal Humming function it should be of the form : > > > Hamming[someList , target , distance] > > > Obviously all solutions from the different authors (excluding the > > "fastest" solution > > > of Mr.K.Woll )could be easy adapted to this form (they are at the moment > > of the form > > > hamming[someList ,target] . > > > What about the fastest solution? At the moment it detects only lists with > > elements of > > > length=3 and distance <=1. Consider example when testing > > > > > someList={{0,1,2,3,4,5,6,7,8,9},{0,0,1,2,3,4,5,6,7,8},{0,0,0,1,2,3,4,5,6,7}, > > {0,0,0,0,1,2,3,4,5,6}.{0,0,0,0,0,0,0,0,0,0}} > > > > > > According to Mr.Woll I should define all 10! "success[.]=0 " in advance to > > detect > > > someList in Hamming Distance <=9 for example.I thing this solution is > > inpractical . > > > Regards, > > > Eugene > > > > > > Carl K.Woll wrote: > > > > > > > Hi all, > > > > > > > > I thought I would give a comparison of the various solutions provided, > > along with > > > > a couple of new ones by yours truly. > > > > > > > > (* Hayes *) > > > > hamming[1][lis_,tar_]:= > > > > Cases[lis,x_/;Count[MapThread[SameQ,{target,x}],False]<=1] > > > > > > > > (* Park *) > > > > hammtest[p : {_, _, _}, t : {_, _, _}] := Count[Abs[t - p], 0] > 1; > > > > > > > > hamming[2][lis_,tar_]:=Select[lis,hammtest[#,tar]&] > > > > > > > > (* Kozlowski *) > > > > dist[a_, b_] := If[a === b, 0, 1] > > > > SetAttributes[dist, Listable] > > > > > > > > hamming[3][lis_,tar_]:=Select[lis, Apply[Plus, dist[tar, #]] <= 1 &] > > > > > > > > (* Hanlon *) > > > > hammingDistance[x_?VectorQ, y_?VectorQ] /; Length[x] == Length[y] := > > > > Plus @@ MapThread[If[#1 == #2, 0, 1] &, {x, y}]; > > > > > > > > hamming[4][lis_,tar_]:=Select[lis, hammingDistance[tar, #] < 2 &] > > > > > > > > (* Woll #1 *) > > > > Clear[closeEnough]; > > > > closeEnough[{True,True,True}]=True; > > > > closeEnough[{True,True,False}]=True; > > > > closeEnough[{True,False,True}]=True; > > > > closeEnough[{False,True,True}]=True; > > > > > > > > hamming[5][lis_,tar_]:=Module[{ans}, > > > > SetAttributes[SameQ,{Listable}]; > > > > ans=Select[lis,closeEnough[SameQ[#,tar]]&]; > > > > ClearAttributes[SameQ,{Listable}]; > > > > ans] > > > > > > > > (* Woll #2 *) > > > > Clear[success] > > > > success[True,True,True]=0; > > > > success[True,True,False]=0; > > > > success[True,False,True]=0; > > > > success[False,True,True]=0; > > > > > > > > hamming[6][lis_,tar_]:=Module[{ans,t,len=Length[lis]}, > > > > SetAttributes[SameQ,{Listable}]; > > > > t=SameQ[testlist,Table[tar,{len}]]; > > > > ans=lis[[Flatten[Position[Apply[success,t,1],0]]]]; > > > > ClearAttributes[SameQ,{Listable}]; > > > > ans] > > > > In[22]:= > > > > target = {0, 0, 1}; > > > > In[263]:= > > > > testlist = > > > > Table[{Random[Integer, {0, 2}], Random[Integer, {0, 2}], > > > > Random[Integer, {0, 2}]}, {10000}]; > > > > In[381]:= > > > > Do[Print[Timing[r[i]=hamming[i][testlist,target];]],{i,1,6}] > > > > > > > > r[1]==r[2]==r[3]==r[4]==r[5]==r[6] > > > > {2.703 Second,Null} > > > > {3.937 Second,Null} > > > > {3.969 Second,Null} > > > > {5.234 Second,Null} > > > > {1.125 Second,Null} > > > > {0.969 Second,Null} > > > > Out[382]= > > > > True > > > > > > > > Carl Woll > > > > Dept of Physics > > > > U of Washington > > > > > > > > Kew Joinery wrote: > > > > > > > > > Hello everyone, > > > > > I have a problem extracting list of list. > > > > > Given some list, consisting of elements (not all of them distinct) > > with > > > > > the same length and some target list of the same length, for example: > > > > > In[26]:= > > > > > > > someList={{0,0,0},{0,0,1},{0,0,2},{0,1,0},{0,1,1},{0,1,2},{0,2,0},{0,2,1},{0 > > , > > > > > > > > > > > > 2,2},{1,0,0},{1,0,1},{1,0,2},{1,1,0},{1,1,1},{1,1,2},{1,2,0},{1,2,1},{1, > > > > > > > > > > > > 2,2},{2,0,0},{2,0,1},{2,0,2},{2,1,0},{2,1,1},{2,1,2},{2,2,0},{2,2,1},{2, > > > > > > > > > > 2,2}}; > > > > > > > > > > In[27]:= > > > > > target={0,0,1}; > > > > > > > > > > Hamming distance between two bit strings means the number of bit > > > > > positions in which they differ. For example consecutive elements of > > the > > > > > Gray code list have Hamming distance = 1. > > > > > I need to extract these cases(elements of someList) which differ from > > > > > target in one coordinate (Hamming distance = 1) or less (the > > > > > element===target itself//Hamming distance=0), so the answer should be: > > > > > > > > > > In[33]:= > > > > > answer={{0,0,0},{0,0,1},{0,0,2},{0,1,1},{0,2,1},{1,0,1},{2,0,1}}; > > > > > > > > > > How could I achieve the extraction efficiently? > > > > > Thank you in advance for any suggestions. > > > > > Eugene > > > >
- References:
- Element Extraction
- From: kewjoi@hixnet.co.za (Kew Joinery)
- Element Extraction