Re: Count Occurrence of words in a long text
- To: mathgroup at smc.vnet.net
- Subject: [mg118964] Re: Count Occurrence of words in a long text
- From: Tomas Garza <tgarza10 at msn.com>
- Date: Wed, 18 May 2011 07:15:01 -0400 (EDT)
Here's a rough hint. First, get rid of special characters like ",", ".", "?", etc. (you can do it with some string operation) and convert uppercase letters to lowercase. Then use StringSplit and Tally:

In[22]:= a = StringSplit[
  "hello first of all I am pretty new to Mathematica so excuse me if
   this has a simple answer what I need is to be able to count the
   occurrence of each word of a text and count the times each word
   appears on it I know how to do this on other languages but I am
   trying to achieve it with mathematica can someone hint me the way
   to go thanks"]

Out[22]= {"hello", "first", "of", "all", "I", "am", "pretty", "new",
  "to", "Mathematica", "so", "excuse", "me", "if", "this", "has", "a",
  "simple", "answer", "what", "I", "need", "is", "to", "be", "able",
  "to", "count", "the", "occurrence", "of", "each", "word", "of", "a",
  "text", "and", "count", "the", "times", "each", "word", "appears",
  "on", "it", "I", "know", "how", "to", "do", "this", "on", "other",
  "languages", "but", "I", "am", "trying", "to", "achieve", "it",
  "with", "mathematica", "can", "someone", "hint", "me", "the", "way",
  "to", "go", "thanks"}

In[24]:= Tally[a]

Out[24]= {{"hello", 1}, {"first", 1}, {"of", 3}, {"all", 1}, {"I", 4},
  {"am", 2}, {"pretty", 1}, {"new", 1}, {"to", 6}, {"Mathematica", 1},
  {"so", 1}, {"excuse", 1}, {"me", 2}, {"if", 1}, {"this", 2},
  {"has", 1}, {"a", 2}, {"simple", 1}, {"answer", 1}, {"what", 1},
  {"need", 1}, {"is", 1}, {"be", 1}, {"able", 1}, {"count", 2},
  {"the", 3}, {"occurrence", 1}, {"each", 2}, {"word", 2}, {"text", 1},
  {"and", 1}, {"times", 1}, {"appears", 1}, {"on", 2}, {"it", 2},
  {"know", 1}, {"how", 1}, {"do", 1}, {"other", 1}, {"languages", 1},
  {"but", 1}, {"trying", 1}, {"achieve", 1}, {"with", 1},
  {"mathematica", 1}, {"can", 1}, {"someone", 1}, {"hint", 1},
  {"way", 1}, {"go", 1}, {"thanks", 1}}

-Tomas

> Date: Tue, 17 May 2011 07:47:31 -0400
> From: yako at 11y11.com
> Subject: [mg118949] Count Ouccrence of words in a long text
> To: mathgroup at smc.vnet.net
>
> Hello,
>
> First of all I am pretty new to Mathematica, so excuse me if this has
> a simple answer.
>
> What I need is to be able to count the occurrence of each word of a
> text and count the times each word appears on it. I know how to do
> this on other languages but I am trying to achieve it with
> mathematica.
>
> Can someone hint me the way to go?
>
> Thanks!
>
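The cleanup step mentioned in the hint at the top of this message (strip punctuation, lowercase, then StringSplit and Tally) could be sketched roughly as follows. This is only one possible way to do it, not a definitive recipe: the sample sentence and the names text, clean, counts and sorted are made up for illustration, and deleting punctuation outright is an assumption that will merge things like "don't" into "dont".

```mathematica
(* Hypothetical sample text; substitute your own string. *)
text = "Hello, world! Hello again. Is the world round? The world is.";

(* Lowercase first so "The" and "the" count as the same word,
   then delete punctuation characters (assumption: no word
   relies on an apostrophe or hyphen). *)
clean = StringReplace[ToLowerCase[text], PunctuationCharacter -> ""];

(* StringSplit with one argument splits on whitespace;
   Tally returns {word, count} pairs in order of first appearance. *)
counts = Tally[StringSplit[clean]];

(* Optional: most frequent words first. *)
sorted = SortBy[counts, -Last[#] &]
```

If the text lives in a file, something like Import["file.txt", "Text"] should give a string that can be used in place of text above (the file name here is only a placeholder).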