MathGroup Archive 2008

Re: Computing n-grams

  • To: mathgroup at smc.vnet.net
  • Subject: [mg88942] Re: [mg88913] Computing n-grams
  • From: Darren Glosemeyer <darreng at wolfram.com>
  • Date: Thu, 22 May 2008 02:34:17 -0400 (EDT)
  • References: <200805211849.OAA10371@smc.vnet.net>

Coleman, Mark wrote:
> Greetings,
>
> Imagine one has a list such as {a,b,c,d,e,f,g}. I'm trying to find an
> efficient way in Mathematica to compute the n-grams of the list. That is, for
> n=2, the n-grams are all the lists of length 2 consisting of consecutive
> elements, e.g.,
>
> {a,b},{b,c},{c,d},{d,e},...
>
> While for n=3,
>
> {a,b,c},{b,c,d},{c,d,e},..., and so on.
>
> As I understand it, the built-in Mathematica commands such as Subsets or
> Permutations compute all possible lists of size n, without regard to the
> order of the list elements.
>
> Thanks,
>
> Mark
>

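Partition with a block length of n and an offset of 1 will do the trick.
The offset of 1 makes each sublist start one element after the previous
one, so consecutive sublists overlap in n-1 elements.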

In[1]:= Partition[{a, b, c, d, e, f}, 3, 1]

Out[1]= {{a, b, c}, {b, c, d}, {c, d, e}, {d, e, f}}

In[2]:= nGram[list_, n_] := Partition[list, n, 1]

In[3]:= nGram[{a, b, c, d, e, f}, 2]

Out[3]= {{a, b}, {b, c}, {c, d}, {d, e}, {e, f}}
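
As a rough sketch of how the same definition extends beyond symbolic
lists (the input string "banana" is only an illustrative choice, not
part of the original question): splitting a string into characters
first gives character n-grams, and Tally then counts how often each one
occurs. The last input shows that Partition returns an empty list when
n exceeds the length of the list, so nGram needs no special case for
short inputs.

In[4]:= StringJoin /@ nGram[Characters["banana"], 2]

Out[4]= {ba, an, na, an, na}

In[5]:= Tally[%]

Out[5]= {{ba, 1}, {an, 2}, {na, 2}}

In[6]:= nGram[{a, b}, 3]

Out[6]= {}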


Darren Glosemeyer
Wolfram Research

