Re: Count Occurrence of words in a long text
- To: mathgroup at smc.vnet.net
- Subject: [mg119004] Re: Count Occurrence of words in a long text
- From: Matthias Bode <lvsaba at hotmail.com>
- Date: Thu, 19 May 2011 07:43:41 -0400 (EDT)
Hola:

Second step: How could I take apart the words ("alice" should become "a,l,i,c,e") to get a tally of the letters in a text? (Interesting when comparing languages.)

Best regards,
MATTHIAS BODE.

> Here's one approach, which I've encapsulated in a Module for convenience:
>
>   wordCounts[txt_] :=
>     Module[{words,unique,counts},
>       words=StringCases[ToLowerCase[txt],WordCharacter..];
>       unique=Union[words];
>       counts=Count[words,#]&/@unique;
>       Reverse@SortBy[Transpose[{unique,counts}],Last]
>     ]
>
>   (* example *)
>   story = ExampleData[{"Text", "AliceInWonderland"}];
>   wordCounts[story]
>
>   {{"the", 632}, {"and", 338}, {"a", 278}, {"to", 252}, {"she", 242}, {"of", 199},...
>
> If you want a nice table printout, just use TableForm:
>
>   wordCounts[story] // TableForm
>
> There's at least one anomaly: the "s" at the end of possessives is split
> off as a separate word.
>
> On 5/17/2011 7:47 AM, Yako wrote:
> > Hello,
> >
> > First of all I am pretty new to Mathematica, so excuse me if this has
> > a simple answer.
> >
> > What I need is to be able to count the occurrence of each word of a
> > text and count the times each word appears on it. I know how to do
> > this on other languages but I am trying to achieve it with
> > mathematica.
> >
> > Can someone hint me the way to go?
> >
> > Thanks!
> >
>
> --
> Murray Eisenberg                     murray at math.umass.edu
> Mathematics & Statistics Dept.
> Lederle Graduate Research Tower      phone 413 549-1020 (H)
> University of Massachusetts                413 545-2859 (W)
> 710 North Pleasant Street            fax   413 545-1801
> Amherst, MA  01003-9305
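
On the letter-tally question at the top of this message, one possible approach is sketched below. It is only a sketch, not a reply from the thread: the name letterCounts is hypothetical, and it assumes that only letters (no digits, spaces, or punctuation) should be counted and that case should be ignored. It relies on the built-in functions Characters, StringCases, LetterCharacter, and Tally.

```mathematica
(* Splitting a single word into its letters *)
Characters["alice"]
(* {"a", "l", "i", "c", "e"} *)

(* letterCounts is a hypothetical helper name, not from the thread.
   It lowercases the text, extracts every letter as a one-character
   string, tallies them, and sorts by descending count. *)
letterCounts[txt_String] :=
  Reverse@SortBy[
    Tally[StringCases[ToLowerCase[txt], LetterCharacter]],
    Last]

(* example, using the same sample text as the quoted reply *)
story = ExampleData[{"Text", "AliceInWonderland"}];
letterCounts[story] // TableForm
```

Tally counts in a single pass, so the separate Union and Count steps used for the word list are not needed here. To compare languages, dividing each count by the total number of letters would give relative frequencies that do not depend on text length.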