MathGroup Archive 2011

Counting Matching Patterns in a Large File

  • To: mathgroup at smc.vnet.net
  • Subject: [mg116103] Counting Matching Patterns in a Large File
  • From: "W. Craig Carter" <ccarter at mit.edu>
  • Date: Wed, 2 Feb 2011 06:08:20 -0500 (EST)

MathGroup,

(*
I'm trying to find a more efficient way to check if a file has more than n lines that match a pattern.

As a test, an example file can be generated with:
*)

Export["BigFile.tsv", Map[RandomReal[{0, 1}, {#}] &, RandomInteger[{1, 20}, {10000}]]]

(*
Right now, I am using:
*)

n=5 (*for example*)

Count[Import["BigFile.tsv", "Table"], {a_?NumberQ, b_?NumberQ, c_?NumberQ}] > n

(*
But in many cases, more than n matches *would* be found well before the end of the file is reached, so reading the entire file is wasted effort.

My target files are *much* larger than 10000 lines...

I haven't dealt with Streams very much, but I am guessing that is where the answer lies.

Many Thanks, Craig
*)
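
(*
One possible direction, offered as an untested minimal sketch rather than a definitive answer: open the file as a stream, read one line at a time, and stop as soon as the count exceeds n. The name countExceedsQ is hypothetical. Parsing each line with ImportString[..., "Table"] is an assumption made to mirror the Import call above; it is probably not the fastest way to parse a line.
*)

countExceedsQ[file_String, n_Integer] :=
 Module[{stream = OpenRead[file], count = 0, line},
  (* keep reading only while the answer is still undecided *)
  While[count <= n && (line = Read[stream, String]) =!= EndOfFile,
   (* a line matches if it parses to exactly three numbers *)
   If[MatchQ[ImportString[line, "Table"],
     {{_?NumberQ, _?NumberQ, _?NumberQ}}],
    count++]];
  Close[stream];
  count > n]

countExceedsQ["BigFile.tsv", n]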

