MathGroup Archive 2011

Re: Using Mathematica for text mining

  • To: mathgroup at smc.vnet.net
  • Subject: [mg116852] Re: Using Mathematica for text mining
  • From: Gregory Klopper <chartmagician at gmail.com>
  • Date: Wed, 2 Mar 2011 04:35:47 -0500 (EST)
  • References: <ijdob6$fap$1@smc.vnet.net>

I think you should also look at the StringCases function, especially the
Neat Examples section of its documentation.
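
As a rough illustration of how StringCases might help with text mining, here is a minimal sketch that pulls the words out of a string and tallies them. The sample text and the names text and words are assumptions made up for this example; they do not come from the thread or from the documentation's Neat Examples.

```mathematica
(* Hypothetical sample text; in practice it might come from Import[file, "Text"] *)
text = "the cat sat on the mat; the dog sat too";

(* Extract words as maximal runs of letters, ignoring case *)
words = StringCases[ToLowerCase[text], LetterCharacter ..];

(* Count each distinct word and list the most frequent first *)
SortBy[Tally[words], -Last[#] &]
```

The pattern LetterCharacter .. is only one possible definition of a word; a different pattern or a regular expression may suit real documents better.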


On Feb 15, 6:33 am, Cameron Christiansen <c... at byu.edu> wrote:
> Thank you for the responses. It was helpful. I had given up on it, but you
> show that it is possible. Thanks.
>
>
>
> > On Fri, Feb 11, 2011 at 2:18 AM, Bill Rowe <readn... at sbcglobal.net> wrote:
>
> >> On 2/10/11 at 5:20 AM, c... at byu.edu (Cameron Christiansen) wrote:
>
> >> >Thank you for the response. It looks like that works well to cluster
> >> >words in a single document together, however I'd like to cluster
> >> >entire documents together based on the words they contain. Is that
> >> >possible?
>
> >> Yes, it is possible. To do this you need to define a distance
> >> function that provides a measure of how different one file is
> >> from another. For example,
>
> >> FindClusters[filenameList,
> >>  DistanceFunction -> (Abs[
> >>      Length@FindList[#1, "keyword"] -
> >>       Length@FindList[#2, "keyword"]] &)]
>
> >> would group file names according to the number of lines containing
> >> "keyword" in each file.
>
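
The distance-function idea quoted above can plausibly be extended from a single keyword to each document's whole vocabulary. The following is only a sketch under stated assumptions: docs is a hypothetical list of in-memory strings standing in for file contents, wordsOf and vec are helper names invented for this example, and cosine distance is one reasonable choice of measure rather than anything prescribed in the thread.

```mathematica
(* Hypothetical documents standing in for the contents of real files *)
docs = {"apples and oranges and pears", "oranges and apples",
   "cars and trucks", "trucks and buses and cars"};

(* Split a document into lowercase words (helper name is an assumption) *)
wordsOf[s_String] := StringCases[ToLowerCase[s], LetterCharacter ..];

(* Vocabulary shared by all documents *)
vocab = Union @@ (wordsOf /@ docs);

(* Word-count vector of a document over the shared vocabulary *)
vec[s_String] := N[Count[wordsOf[s], #] & /@ vocab];

(* Cluster on the count vectors, but return the documents themselves *)
FindClusters[(vec /@ docs) -> docs, DistanceFunction -> CosineDistance]
```

For files on disk, the strings in docs could be replaced by imported file contents. Very common words such as "and" appear in every vector here, so removing them before counting would probably give cleaner clusters.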

