Re: Using Mathematica for text mining
- To: mathgroup at smc.vnet.net
- Subject: [mg116852] Re: Using Mathematica for text mining
- From: Gregory Klopper <chartmagician at gmail.com>
- Date: Wed, 2 Mar 2011 04:35:47 -0500 (EST)
- References: <ijdob6$fap$1@smc.vnet.net>
I think you should also look at the StringCases function, especially the Neat Examples section of its documentation.

On Feb 15, 6:33 am, Cameron Christiansen <c... at byu.edu> wrote:
> Thank you for the responses. It was helpful. I had given up on it, but you
> show that it is possible. Thanks.
>
> On Fri, Feb 11, 2011 at 2:18 AM, Bill Rowe <readn... at sbcglobal.net> wrote:
>> On 2/10/11 at 5:20 AM, c... at byu.edu (Cameron Christiansen) wrote:
>>
>> >Thank you for the response. It looks like that works well to cluster
>> >words in a single document together, however I'd like to cluster
>> >entire documents together based on the words they contain. Is that
>> >possible?
>>
>> Yes, it is possible. To do this you need to define a distance
>> function that provides a measure of how different one file is
>> from another. For example,
>>
>> FindClusters[filenameList,
>>   DistanceFunction -> (Abs[
>>      Length@FindList[#1, "keyword"] -
>>      Length@FindList[#2, "keyword"]] &)]
>>
>> would group file names according to the number of occurrences of
>> "keyword" in each file.
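
To make the StringCases suggestion concrete, here is a minimal sketch of clustering whole documents by the words they contain. It is only one possible approach, not something taken from the thread: the sample strings in docs, the names words, vocab and vectors, the choice of two clusters, and the use of CosineDistance are all assumptions made for illustration. Note also that FindList returns the lines of a file that contain the search text, so the quoted example effectively compares counts of matching lines.

```mathematica
(* Hypothetical in-memory documents; real text could be read with Import[file, "Text"] *)
docs = {
   "the cat sat on the mat",
   "a cat and another cat",
   "stock prices fell sharply",
   "prices of stock rose"
   };

(* Split each document into lowercase words using StringCases *)
words = StringCases[ToLowerCase[#], WordCharacter ..] & /@ docs;

(* Vocabulary shared by all documents *)
vocab = Union @@ words;

(* One numeric vector per document: the count of each vocabulary word *)
vectors = N[Table[Count[w, v], {w, words}, {v, vocab}]];

(* Cluster on the count vectors, but return the document texts; 2 clusters is an assumed choice *)
FindClusters[vectors -> docs, 2, DistanceFunction -> CosineDistance]
```

Using count vectors rather than a single keyword lets every word contribute to the distance, and the rule form vectors -> docs makes FindClusters return the documents themselves instead of the vectors.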