MathGroup Archive 2012

Re: Import HTTP data in asynchronous/parallel way

  • To: mathgroup at
  • Subject: [mg125981] Re: Import HTTP data in asynchronous/parallel way
  • From: Murta <rodrigomurtax at>
  • Date: Wed, 11 Apr 2012 18:17:43 -0400 (EDT)
  • Delivered-to:
  • References: <jlp34d$14s$> <jm0k3v$aga$>

On Apr 10, 3:31 am, David Bailey <d... at> wrote:
> On 07/04/2012 10:58, Rodrigo Murta wrote:
> > Hi All
> >      I'm testing some web scraping using Mathematica and would like to
> > know how to use Import in an asynchronous/parallel way.
> >      Right now I'm using Parallelize for this. It works, but it doesn't
> > look like the best approach because of the limit on the number of kernels.
> >     My code looks like this:
> >     result = Parallelize[Import/@urlList]
> >    How can I do this asynchronously some other way? Something like a
> > background process using & in bash?
> >    Can I do it inside Mathematica? I know it would speed up my
> > scraping a lot.
> > Thanks in advance,
> > Murta
> Since nobody else has offered a better suggestion, I'll suggest that you
> pull the data over using Java - which you can call via J/Link in a
> totally seamless way.
> I'd write a simple Java class that has a public, static method that
> takes a list of URLs to try, and uses separate threads to run them in
> parallel. If it kept a tally of its progress in an array:
> public static boolean[] finished = new boolean[10];
> then you could monitor the progress from Mathematica and read the data
> in the order in which it was delivered.
> To be clear, this would involve writing and compiling a Java class, and
> using AddToClassPath to make it available to Mathematica's J/Link.
> Constructing such a thing in pure J/Link code would be very tough!
> David Bailey

Hi David,
First of all, thanks for your answer.
The problem for me is that I know no Java at all; I had hoped to be able
to solve this inside Mathematica.
I'm a little frustrated to see that Mathematica is multi-kernel
but not multi-threaded within one kernel, in the sense that a single
kernel cannot run more than one task at the same
time. That is essential for handling internet requests.
Please correct me if I'm wrong!
Thanks again
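
For anyone who wants to try David's suggestion, here is a minimal sketch of what such a Java class might look like. It is not code from this thread. The class name ParallelFetcher and all of its method names are hypothetical, and it assumes Java 9 or later (for readAllBytes) and UTF-8 page content. It starts one thread per URL and keeps a "finished" tally that can be polled while downloads are still running.

```java
import java.io.InputStream;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of J/Link or Mathematica.
// To load it from J/Link, declare it public in ParallelFetcher.java.
final class ParallelFetcher {
    private static final Object LOCK = new Object();
    private static boolean[] finished = new boolean[0];
    private static String[] results = new String[0];
    private static String[] errors = new String[0];
    private static Thread[] workers = new Thread[0];

    private ParallelFetcher() {}

    /** Starts one thread per URL and returns immediately.
     *  Call it again only after the previous batch has finished. */
    public static void start(String[] urls) {
        int n = urls.length;
        synchronized (LOCK) {
            finished = new boolean[n];
            results = new String[n];
            errors = new String[n];
        }
        workers = new Thread[n];
        for (int i = 0; i < n; i++) {
            final int slot = i;
            final String url = urls[i];
            Thread t = new Thread(() -> fetchInto(slot, url));
            t.setDaemon(true); // do not keep the JVM alive for a stuck download
            workers[slot] = t;
            t.start();
        }
    }

    // Downloads one URL; records either the body or the error text.
    private static void fetchInto(int slot, String url) {
        String body = null;
        String error = null;
        try (InputStream in = URI.create(url).toURL().openStream()) {
            body = new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (Exception e) {
            error = e.toString();
        }
        synchronized (LOCK) {
            results[slot] = body;
            errors[slot] = error;
            finished[slot] = true; // set for failures too, so polling terminates
        }
    }

    /** True once the download in this slot has succeeded or failed. */
    public static boolean isFinished(int i) {
        synchronized (LOCK) { return finished[i]; }
    }

    /** Number of slots that have completed so far. */
    public static int finishedCount() {
        synchronized (LOCK) {
            int count = 0;
            for (boolean done : finished) if (done) count++;
            return count;
        }
    }

    /** Page content, or null if the slot failed or is still running. */
    public static String getResult(int i) {
        synchronized (LOCK) { return results[i]; }
    }

    /** Error description, or null if the slot succeeded or is still running. */
    public static String getError(int i) {
        synchronized (LOCK) { return errors[i]; }
    }

    /** Blocks until every worker thread of the current batch has ended. */
    public static void awaitAll() {
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }
}
```

Once compiled and made visible with AddToClassPath as David describes, the static methods could be called from Mathematica: start to launch the downloads, then isFinished or finishedCount to poll, and getResult to collect each page as it arrives. Failed downloads are marked finished as well, so check getError before using a result.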
