Re: Element Extraction

*To*: mathgroup at smc.vnet.net*Subject*: [mg19331] Re: [mg19240] Element Extraction*From*: "Carl K.Woll" <carlw at fermi.phys.washington.edu>*Date*: Mon, 16 Aug 1999 02:14:59 -0400*Organization*: Department of Physics*References*: <199908110606.CAA01017@smc.vnet.net.> <37B47049.33AB607@fermi.phys.washington.edu> <37B54909.973D6F5E@hixnet.co.za> <37B53F8E.B16E04A2@fermi.phys.washington.edu> <001c01bee71f$5b311ba0$9b16989e@machine1>*Sender*: owner-wri-mathgroup at wolfram.com

Allan, Very nice! Using Inner, I rewrote my original hamming[6]: hamming[6.1][lis_,tar_,dist_]:= lis[[Flatten[ Position[Length/@(Inner[SameQ,lis,tar,List]/.True->Sequence[]), Alternatives@@Range[0,dist]]]]] which on my test is still about 30% faster than hamming[1.1]. I think Length/@(Inner[SameQ,lis,tar,List]/.True->Sequence[]) will be faster than any other attempt to count the number of False entries using f/@(Inner[SameQ,lis,tar,List] (or almost equivalently Inner[SameQ,lis,tar,f]). The hardest thing here is doing something to the False entries (counting) or to the True entries (replacing). Counting False entries of each row must be done as many times as the number of rows, but replacing True can be done once for the whole Table. This is why the first method is approximately twice as fast. Carl Woll Physics Dept U of Washington Allan Hayes wrote: > Carl: > Another approach, using Inner: > > hamming[1.1][lis_, tar_, dist_] := > lis[[ > Flatten[Position[Inner[SameQ, lis, tar > , (Count[{##}, False] <= dist) &], True]]]] > > This is considerbly faster on your test than my original hamming[1]: > > hamming[1][lis_, tar_, dist_] := > Cases[lis, x_ /; Count[MapThread[SameQ, {tar, x}], False] <= dist] > > testlist = Table[Random[Integer, {0, 2}], {10000}, {10}]; > target = Table[Random[Integer, {0, 2}], {10}]; > dist = 2; > > hamming[1.1][testlist, target, dist]; // Timing // First > > 3.73 Second > > hamming[1][testlist, target, dist]; // Timing // First > > 6.37 Second > > Allan > --------------------- > Allan Hayes > Mathematica Training and Consulting > Leicester UK > www.haystack.demon.co.uk > hay at haystack.demon.co.uk > Voice: +44 (0)116 271 4198 > Fax: +44 (0)870 164 0565 > > ----- Original Message ----- > From: Carl K.Woll <carlw at fermi.phys.washington.edu> To: mathgroup at smc.vnet.net > To: <kewjoi at hixnet.co.za> > Cc: Allan Hayes <hay at haystack.demon.co.uk>; David Park <djmp at earthlink.net>; > Andrzej Kozlowski <andrzej at tuins.ac.jp>; Bob Hanlon <BobHanlon at aol.com>; > <mathgroup at smc.vnet.net> > Sent: 14 August 1999 11:06 > Subject: [mg19331] Re: [mg19240] Element Extraction > > Below Kew complains about the lack of generality of the fastest solution to > his problem. I should have given some words of > explanation about how the fastest solution worked. > > Basically, for my hamming[5] function, I took Allan Hayes solution, and > tried to speed it up. There were two things that I > did. > > 1. He used MapThread to thread SameQ over his two vectors. I temporarily > gave SameQ the attribute Listable to achieve the > same thing. This gave a 20-30% improvement in speed. I modified my previous > hamming[5] function to do just this single > change. > > 2. Rather than counting the number of false entries and seeing if it was > less than some prescribed number, given the > shortness of the vector I instead hardcoded the close enough possibilities > (alternatively, I could have hardcoded the far > enough away possibilities). This provided another 100% improvement in speed > at the sacrifice of generality. > > For the hamming[6] function, I tried to attack the problem from a different > angle. Since Position and Part are such > extremely quick functions, I massaged the whole testlist function instead of > each element individually. I also used the two > things I noted above. > > If we want a general hamming function, then approach 2 above is obviously > not the way to go. However, for the approach taken > in hamming[6], I was able to come up with a similar method, which was still > general enough. In this approach I used the ever > useful Sequence[] function. At any rate, here is a revised version of my > previous notebook with the above changes using your > new syntax hamming[list,target,distance] (I also had to change David Park's > solution a little bit more than the others to > make it work, and in the process speeded it up a bit). > > In[1]:= > (* Hayes *) > hamming[1][lis_,tar_,dist_]:= > Cases[lis,x_/;Count[MapThread[SameQ,{target,x}],False]<=dist] > > (* Park revised *) > hamming[2][lis_,tar_,dist_]:=Select[lis,Count[#-tar,0]>Length[tar]-dist-1&] > > (* Kozlowski *) > distance[a_, b_] := If[a === b, 0, 1] > SetAttributes[distance, Listable] > > hamming[3][lis_,tar_,dist_]:= > Select[lis, Apply[Plus, distance[tar, #]] <= dist &] > > (* Hanlon *) > hammingDistance[x_?VectorQ, y_?VectorQ] /; Length[x] == Length[y] := > Plus @@ MapThread[If[#1 == #2, 0, 1] &, {x, y}]; > > hamming[4][lis_,tar_,dist_]:=Select[lis, hammingDistance[tar, #] < dist+1&] > > (* Woll #1 *)hamming[5][lis_,tar_,dist_]:=Module[{ans}, > SetAttributes[SameQ,{Listable}]; > ans=Cases[lis,x_/;Count[SameQ[x,tar],False]<=dist]; > ClearAttributes[SameQ,{Listable}]; > ans] > > (* Woll #2 *) > hamming[6][lis_,tar_,dist_]:=Module[{ans,t,len=Length[lis]}, > SetAttributes[SameQ,{Listable}]; > t=SameQ[testlist,Table[tar,{len}]]; > ans=lis[[ > Flatten[Position[Length/@(t/.True->Sequence[]), > Alternatives@@Range[0,dist]]]]]; > ClearAttributes[SameQ,{Listable}]; > ans] > In[27]:= > target = Table[Random[Integer,{0,2}],{10}]; > In[28]:= > testlist = > Table[Random[Integer, {0, 2}], {10000},{10}]; > In[29]:= > Do[Print[Timing[r[i]=hamming[i][testlist,target,2];]],{i,1,6}] > > r[1]==r[2]==r[3]==r[4]==r[5]==r[6] > {3.734 Second,Null} > {4.547 Second,Null} > {9.281 Second,Null} > {9.109 Second,Null} > {3.063 Second,Null} > {2.187 Second,Null} > Out[30]= > True > > Carl Woll > Dept of Physics > U of Washington > > Kew Joinery wrote: > > > Hello, > > I'd like to thank to all replies again. > > Only to add few words: > > To define universal Humming function it should be of the form : > > Hamming[someList , target , distance] > > Obviously all solutions from the different authors (excluding the > "fastest" solution > > of Mr.K.Woll )could be easy adapted to this form (they are at the moment > of the form > > hamming[someList ,target] . > > What about the fastest solution? At the moment it detects only lists with > elements of > > length=3 and distance <=1. Consider example when testing > > > someList={{0,1,2,3,4,5,6,7,8,9},{0,0,1,2,3,4,5,6,7,8},{0,0,0,1,2,3,4,5,6,7}, > {0,0,0,0,1,2,3,4,5,6}.{0,0,0,0,0,0,0,0,0,0}} > > > > According to Mr.Woll I should define all 10! "success[.]=0 " in advance to > detect > > someList in Hamming Distance <=9 for example.I thing this solution is > inpractical . > > Regards, > > Eugene > > > > Carl K.Woll wrote: > > > > > Hi all, > > > > > > I thought I would give a comparison of the various solutions provided, > along with > > > a couple of new ones by yours truly. > > > > > > (* Hayes *) > > > hamming[1][lis_,tar_]:= > > > Cases[lis,x_/;Count[MapThread[SameQ,{target,x}],False]<=1] > > > > > > (* Park *) > > > hammtest[p : {_, _, _}, t : {_, _, _}] := Count[Abs[t - p], 0] > 1; > > > > > > hamming[2][lis_,tar_]:=Select[lis,hammtest[#,tar]&] > > > > > > (* Kozlowski *) > > > dist[a_, b_] := If[a === b, 0, 1] > > > SetAttributes[dist, Listable] > > > > > > hamming[3][lis_,tar_]:=Select[lis, Apply[Plus, dist[tar, #]] <= 1 &] > > > > > > (* Hanlon *) > > > hammingDistance[x_?VectorQ, y_?VectorQ] /; Length[x] == Length[y] := > > > Plus @@ MapThread[If[#1 == #2, 0, 1] &, {x, y}]; > > > > > > hamming[4][lis_,tar_]:=Select[lis, hammingDistance[tar, #] < 2 &] > > > > > > (* Woll #1 *) > > > Clear[closeEnough]; > > > closeEnough[{True,True,True}]=True; > > > closeEnough[{True,True,False}]=True; > > > closeEnough[{True,False,True}]=True; > > > closeEnough[{False,True,True}]=True; > > > > > > hamming[5][lis_,tar_]:=Module[{ans}, > > > SetAttributes[SameQ,{Listable}]; > > > ans=Select[lis,closeEnough[SameQ[#,tar]]&]; > > > ClearAttributes[SameQ,{Listable}]; > > > ans] > > > > > > (* Woll #2 *) > > > Clear[success] > > > success[True,True,True]=0; > > > success[True,True,False]=0; > > > success[True,False,True]=0; > > > success[False,True,True]=0; > > > > > > hamming[6][lis_,tar_]:=Module[{ans,t,len=Length[lis]}, > > > SetAttributes[SameQ,{Listable}]; > > > t=SameQ[testlist,Table[tar,{len}]]; > > > ans=lis[[Flatten[Position[Apply[success,t,1],0]]]]; > > > ClearAttributes[SameQ,{Listable}]; > > > ans] > > > In[22]:= > > > target = {0, 0, 1}; > > > In[263]:= > > > testlist = > > > Table[{Random[Integer, {0, 2}], Random[Integer, {0, 2}], > > > Random[Integer, {0, 2}]}, {10000}]; > > > In[381]:= > > > Do[Print[Timing[r[i]=hamming[i][testlist,target];]],{i,1,6}] > > > > > > r[1]==r[2]==r[3]==r[4]==r[5]==r[6] > > > {2.703 Second,Null} > > > {3.937 Second,Null} > > > {3.969 Second,Null} > > > {5.234 Second,Null} > > > {1.125 Second,Null} > > > {0.969 Second,Null} > > > Out[382]= > > > True > > > > > > Carl Woll > > > Dept of Physics > > > U of Washington > > > > > > Kew Joinery wrote: > > > > > > > Hello everyone, > > > > I have a problem extracting list of list. > > > > Given some list, consisting of elements (not all of them distinct) > with > > > > the same length and some target list of the same length, for example: > > > > In[26]:= > > > > > someList={{0,0,0},{0,0,1},{0,0,2},{0,1,0},{0,1,1},{0,1,2},{0,2,0},{0,2,1},{0 > , > > > > > > > > > 2,2},{1,0,0},{1,0,1},{1,0,2},{1,1,0},{1,1,1},{1,1,2},{1,2,0},{1,2,1},{1, > > > > > > > > > 2,2},{2,0,0},{2,0,1},{2,0,2},{2,1,0},{2,1,1},{2,1,2},{2,2,0},{2,2,1},{2, > > > > > > > > 2,2}}; > > > > > > > > In[27]:= > > > > target={0,0,1}; > > > > > > > > Hamming distance between two bit strings means the number of bit > > > > positions in which they differ. For example consecutive elements of > the > > > > Gray code list have Hamming distance = 1. > > > > I need to extract these cases(elements of someList) which differ from > > > > target in one coordinate (Hamming distance = 1) or less (the > > > > element===target itself//Hamming distance=0), so the answer should be: > > > > > > > > In[33]:= > > > > answer={{0,0,0},{0,0,1},{0,0,2},{0,1,1},{0,2,1},{1,0,1},{2,0,1}}; > > > > > > > > How could I achieve the extraction efficiently? > > > > Thank you in advance for any suggestions. > > > > Eugene

**References**:**Element Extraction***From:*kewjoi@hixnet.co.za (Kew Joinery)