Parser
- To: mathgroup at yoda.physics.unc.edu
- Subject: Parser
- From: cfw2 at po.cwru.edu (Charles F. Wells)
- Date: Tue, 29 Jun 93 10:34:27 -0400
Some interest has been shown in parsers in Mathematica. The following notebook contains a command AcceptQ that takes a context free grammar and a string as arguments and determines whether the string is generated by the grammar. As a side effect it prints successively a list of strings that derive to the given string in n steps. It stops and accepts when it finds the start symbol in that list. If it comes up with the empty list, it rejects. The algorithm is a backward exhaustive search, so is exponential in the length of the input and the size of the grammar. It would be difficult to adapt it to give a derivation tree. A forward breadth first search would be easier to adapt for that purpose. Cut here: ----------------------------------------------------------------- (*^ ::[ frontEndVersion = "Microsoft Windows Mathematica Notebook Front End Version 2.1"; microsoftWindowsStandardFontEncoding; fontset = title, "MS Serif", 18, L0, center, nohscroll, R32768; fontset = subtitle, "MS Serif", 14, L0, center, nohscroll, R32768; fontset = subsubtitle, "MS Serif", 12, L0, center, nohscroll; fontset = section, "MS Serif", 14, L0, B65280, grayBox; fontset = subsection, "MS Serif", 12, L0, B65280, blackBox; fontset = subsubsection, "MS Serif", 12, L0, B65280, whiteBox; fontset = text, "MS Serif", 12, L0, B65280; fontset = smalltext, "MS Sans Serif", 10, L0, B65280; fontset = input, "Courier New", 10, L0, nowordwrap; fontset = output, "Courier New", 10, L0, nowordwrap; fontset = message, "Courier New", 10, L0, nowordwrap, R65280; fontset = print, "Courier New", 10, L0, nowordwrap, G32768, B32768; fontset = info, "Courier New", 10, L0, nowordwrap, R32768, B32768; fontset = postscript, "Courier New", 8, L0, nowordwrap; fontset = name, "Helv", 10, L0, nohscroll, italic, B65280; fontset = header, "Helv", 18, L0, nohscroll, bold; fontset = footer, "Helv", 18, L0, center, nohscroll, bold; fontset = help, "Helv", 10, L0, nohscroll; fontset = clipboard, "Helv", 12, L0, nohscroll; fontset = completions, "Helv", 12, L0, nowordwrap, nohscroll; fontset = network, "Courier New", 10, L0, nowordwrap, nohscroll; fontset = graphlabel, "Courier New", 10, L0, nowordwrap, nohscroll; fontset = special1, "Helv", 12, L0, nowordwrap, nohscroll; fontset = special2, "Helv", 12, L0, center, nowordwrap, nohscroll; fontset = special3, "Helv", 12, L0, right, nowordwrap, nohscroll; fontset = special4, "Helv", 12, L0, nowordwrap, nohscroll; fontset = special5, "Helv", 12, L0, nowordwrap, nohscroll; fontset = Left Header, "Helv", 12, L0, nowordwrap, nohscroll; fontset = Left Footer, "Helv", 12, L0, nowordwrap, nohscroll;] :[font = title; inactive; nohscroll; center; ] Context-Free Grammars :[font = subtitle; inactive; nohscroll; center; ] Preliminary and incomplete version 27 January 1993 :[font = section; inactive; startGroup; Cclosed; ] Implementation :[font = input; nowordwrap; ] FindPlaces[grammar_List,s_String] := StringPosition[s,Part[#,2]]& /@ grammar (*FindPlaces[grammar,string] produces a list of pairs {m,n} where the right side of a production in grammar occurs as a substring of string from position m to position n. The lists of pairs are grouped by production: the kth list of pairs correspond to the occurrences of the right side of production k of grammar.*) :[font = input; nowordwrap; ] BDL[placeslist_List,r_String,s_String] := If[Length[placeslist]>0, StringInsert[StringDrop[s,#],r,Part[#,1]]& /@ placeslist, {} ] (*placeslist should consist of a list of pairs (m,n) of positive integers. BDL[placeslist,r,s] produces a list of strings, each of which is the result of replacing characters m through n of s with the string r for one of the pairs (m,n) of placeslist.*) :[font = input; nowordwrap; ] BackDerive[grammar_List,s_String] := Module[{places = StringPosition[s,Part[#,2]]& /@ grammar}, (BDL[FindPlaces[grammar,s][[#]],grammar[[#]][[1]],s]& /@ Table[k,{k,Length[grammar]}])] :[font = input; nowordwrap; ] BackDerive[grammar_List,x_List] := BackDerive[grammar,#]& /@ x :[font = input; nowordwrap; ] SimpleBackDerive[grammar_List,x_List] := Union[Flatten[BackDerive[grammar,x]]] :[font = input; nowordwrap; ] EmptyQ[x_List] := Length[x]==0 :[font = input; endGroup; nowordwrap; ] AcceptQ[grammar_List,s_String,maxit_Integer:30] := Module[{tries=List[s],k=0}, (While[(!MemberQ[tries,"S"] && !EmptyQ[tries] && k++<maxit), (tries=SimpleBackDerive[grammar,tries]; Print[k," ",tries]) ]; MemberQ[tries,"S"]) ] :[font = section; inactive; startGroup; Cclosed; ] Examples :[font = text; inactive; ] Execute the following definitions, giving you a grammar g1 and some strings. Then execute the commands. :[font = input; nowordwrap; ] g1 := {{"S","SS"}, {"S","aSb"}, {"S","bSa"}, {"S","cC"}, {"C","cC"}, {"C","c"}} :[font = input; nowordwrap; ] st = "abSabbaSbSacCab" :[font = input; nowordwrap; ] st2="aSbbcCa" :[font = input; nowordwrap; ] FindPlaces[g1,"aSSS"] (*Example*) :[font = input; nowordwrap; ] BDL[{{1,2},{2,5}},"xy","abcde"] :[font = input; nowordwrap; ] SimpleBackDerive[g1,{"ab","acCb","aSbb","SS"}] :[font = input; nowordwrap; ] AcceptQ[g1,st] :[font = input; endGroup; nowordwrap; ] AcceptQ[g1,st2] :[font = section; inactive; startGroup; Cclosed; ] Acknowledgments :[font = text; inactive; endGroup; ] This work was supported by the Consolidated Natural Gas Corporation :[font = section; inactive; startGroup; Cclosed; ] Author & Copyright :[font = text; inactive; ] Charles Wells Department of Mathematics Case Western Reserve University University Circle Cleveland, OH 44106-7058 :[font = text; inactive; ] Phone 216 368 2893 or 216 774 1926 :[font = text; inactive; endGroup; ] Email cfw2 at po.cwru.edu ^*) -------------------------------- cut here -- Charles Wells Department of Mathematics Case Western Reserve University