Re: Counting Symbols
- To: mathgroup at smc.vnet.net
- Subject: [mg70897] Re: Counting Symbols
- From: bghiggins at ucdavis.edu
- Date: Wed, 1 Nov 2006 03:55:07 -0500 (EST)
- References: <ei4lrs$djv$1@smc.vnet.net>
RM,

Have you thought about using J/Link in Mathematica? I am assuming you have a way to capture keystrokes in a document and that these keystrokes have a Unicode equivalent. If that is the case, then try this:

    Needs["JLink`"]
    InstallJava[]

Suppose the file(s) are located in a given directory:

    filepath = "/ToMydirectory/testdata.txt";

First, we need to create a byte array object for our data. Normally we make this byte array much larger than the size we actually require. The byte array will serve as a buffer:

    buf = JavaNew["[B", 1000]

Then we create a FileInputStream object, which opens a connection to the data file:

    fis = JavaNew["java.io.FileInputStream", filepath]

Now let us read the data into the buffer. The return value is the number of bytes actually read (not the size of the buffer):

    fis[read[buf]]

    527

To examine the contents of the buffer we need to convert it back to a Mathematica expression using the JavaObjectToExpression function. Take keeps only the 527 bytes that were read:

    bufdata = Take[JavaObjectToExpression[buf], 527]

    {83, 111, 32, 115, 104, 101, 32, 119, 101, 110, 116, 32, 105, 110, 116, 111,
    32, 116, 104, 101, 32, 103, 97, 114, 100, 101, 110, 32, 116, 111, 32, 99,
    117, 116, 32, 97, 32, 99, 97, 98, 98, 97, 103, 101, 45, 108, 101, 97, 102,
    44, 32, 116, 111, 10, 109, 97, 107, 101, 32, 97, 110, 32, 97, 112, 112, 108,
    101, 45, 112, 105, 101, 59, 32, 97, 110, 100, 32, 97, 116, 32, 116, 104, 101,
    32, 115, 97, 109, 101, 32, 116, 105, 109, 101, 32, 97, 32, 103, 114, 101, 97,
    116, 10, 115, 104, 101, 45, 98, 101, 97, 114, 44, 32, 99, 111, 109, 105, 110,
    103, 32, 117, 112, 32, 116, 104, 101, 32, 115, 116, 114, 101, 101, 116, 44,
    32, 112, 111, 112, 115, 32, 105, 116, 115, 32, 104, 101, 97, 100, 32, 105,
    110, 116, 111, 32, 116, 104, 101, 10, 115, 104, 111, 112, 46, 32, 39, 87,
    104, 97, 116, 33, 32, 110, 111, 32, 115, 111, 97, 112, 63, 39, 32, 83, 111,
    32, 104, 101, 32, 100, 105, 101, 100, 44, 32, 97, 110, 100, 32, 115, 104,
    101, 32, 118, 101, 114, 121, 10, 105, 109, 112, 114, 117, 100, 101, 110, 116,
    108, 121, 32,
    109, 97, 114, 114, 105, 101, 100, 32, 116, 104, 101, 32, 98,
    97, 114, 98, 101, 114, 59, 32, 97, 110, 100, 32, 116, 104, 101, 114, 101, 32,
    119, 101, 114, 101, 10, 112, 114, 101, 115, 101, 110, 116, 32, 116, 104, 101,
    32, 80, 105, 99, 110, 105, 110, 110, 105, 101, 115, 44, 32, 97, 110, 100, 32,
    116, 104, 101, 32, 74, 111, 98, 108, 105, 108, 108, 105, 101, 115, 44, 32,
    97, 110, 100, 32, 116, 104, 101, 10, 71, 97, 114, 121, 97, 108, 105, 101,
    115, 44, 32, 97, 110, 100, 32, 116, 104, 101, 32, 103, 114, 97, 110, 100, 32,
    80, 97, 110, 106, 97, 110, 100, 114, 117, 109, 32, 104, 105, 109, 115, 101,
    108, 102, 44, 32, 119, 105, 116, 104, 32, 116, 104, 101, 10, 108, 105, 116,
    116, 108, 101, 32, 114, 111, 117, 110, 100, 32, 98, 117, 116, 116, 111, 110,
    32, 97, 116, 32, 116, 111, 112, 44, 32, 97, 110, 100, 32, 116, 104, 101, 121,
    32, 97, 108, 108, 32, 102, 101, 108, 108, 32, 116, 111, 32, 112, 108, 97,
    121, 105, 110, 103, 10, 116, 104, 101, 32, 103, 97, 109, 101, 32, 111, 102,
    32, 99, 97, 116, 99, 104, 32, 97, 115, 32, 99, 97, 116, 99, 104, 32, 99, 97,
    110, 44, 32, 116, 105, 108, 108, 32, 116, 104, 101, 32, 103, 117, 110, 32,
    112, 111, 119, 100, 101, 114, 32, 114, 97, 110, 10, 111, 117, 116, 32, 97,
    116, 32, 116, 104, 101, 32, 104, 101, 101, 108, 115, 32, 111, 102, 32, 116,
    104, 101, 105, 114, 32, 98, 111, 111, 116, 115, 46, 10, 10, 83, 97, 109, 117,
    101, 108, 32, 70, 111, 111, 116, 101, 32, 49, 55, 50, 48, 45, 49, 55, 55, 55}

If we convert these bytes back to characters, we recover the text that was read from the file:

    FromCharacterCode[bufdata]

    So she went into the garden to cut a cabbage-leaf, to
    make an apple-pie; and at the same time a great
    she-bear, coming up the street, pops its head into the
    shop. 'What! no soap?'
    So he died, and she very
    imprudently married the barber; and there were
    present the Picninnies, and the Joblillies, and the
    Garyalies, and the grand Panjandrum himself, with the
    little round button at top, and they all fell to playing
    the game of catch as catch can, till the gun powder ran
    out at the heels of their boots.

    Samuel Foote 1720-1777

Once you know which character codes you want to look for in the list bufdata, you can use a variety of Mathematica functions to do the task. I think this approach might be a lot cleaner than pulling the document into Mathematica as a notebook object.

Brian

R M wrote:
> I have a whole bunch of documents from which I would like to figure out what keystrokes/symbols are used the most. What commands should I use to figure this out? I have (unsuccessfully) tried to use Flatten and Short. My approach is to pull the document into Mathematica as a notebook object, then use Mathematica's list manipulation commands to figure out what symbols are used most frequently.
>
> This whole problem involves my recent purchase of a Logic Controls programmable keyboard. So far I have programmed keys to perform [esc]sumt[esc], [downarrow]alt+5, cntrl+space, cntrl+9, etc. I am trying to figure out which symbols are used the most so that I can program the keyboard most efficiently.
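
A possible sketch of the counting step the reply leaves open, assuming a list of character codes such as bufdata is already available. The helper name charCounts and the sample string are hypothetical, not part of the original post, and Tally requires Mathematica 6 or later (it did not exist when this thread was written). The idea is to tally each code, sort by count in descending order, and convert the codes back to characters:

```mathematica
(* Hypothetical helper, not from the original post: returns a list of
   {character, count} pairs, most frequent first.
   Assumes Mathematica 6 or later, where Tally is available. *)
charCounts[codes_List] :=
  Map[{FromCharacterCode[First[#]], Last[#]} &,
    Sort[Tally[codes], Last[#1] > Last[#2] &]]

(* Illustrative input only; in the thread's setting this would be bufdata. *)
sample = ToCharacterCode["What! no soap?"];
charCounts[sample]
```

Calling charCounts[bufdata] on the list read through J/Link would give the same kind of table for the whole file, and the first few entries are then the candidates for the programmable keys.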