MathGroup Archive 2012

Problem importing a web site in Mathematica: how to bypass pages asking for login credentials

  • To: mathgroup at smc.vnet.net
  • Subject: [mg126422] Problem Importing web site in Mathematica: How to bypass pages asking for login credentials
  • From: Mark Coleman <markspcoleman at gmail.com>
  • Date: Wed, 9 May 2012 03:51:44 -0400 (EDT)
  • Delivered-to: l-mathgroup@mail-archive0.wolfram.com

Hi,

I'm using Mathematica v8 for text mining and classification of web
sites. As a first step, I Import[] the hyperlinks from a site's home
page into a list, then traverse that list and Import[] each URL in
turn. Some of these pages require the user to enter login
credentials. When my code reaches one, it pops up the site's login
screen and waits for manual input before proceeding. This makes
importing a large set of URLs infeasible.

Is it possible to identify these pages in advance, so that I can
filter them out of my list of URLs and automatically Import[] the
remaining pages?
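
To make the kind of filter I have in mind concrete, here is a rough
sketch. It is written in Python rather than Mathematica, purely to
illustrate the idea, and it is only a heuristic. It assumes that the
HTTP status code, response headers, and page source for each URL have
already been fetched by some other means. The names needs_login and
filter_public are hypothetical, and the regular expressions are guesses
at what a login redirect or login form typically looks like.

```python
import re

# Assumed patterns: a redirect target that looks like a login page,
# and an HTML form containing a password field.
LOGIN_REDIRECT = re.compile(r"(login|signin|sign-in|sign_in)", re.IGNORECASE)
PASSWORD_FIELD = re.compile(r"<input[^>]*type\s*=\s*[\"']?password", re.IGNORECASE)


def needs_login(status, headers=None, body=""):
    """Guess whether a response indicates a page that asks for credentials."""
    headers = {k.lower(): v for k, v in (headers or {}).items()}
    # 401 / 407 are the standard "authentication required" status codes.
    if status in (401, 407):
        return True
    # A WWW-Authenticate header is an HTTP authentication challenge.
    if "www-authenticate" in headers:
        return True
    # Many sites redirect anonymous visitors to a login URL.
    if status in (301, 302, 303, 307, 308):
        if LOGIN_REDIRECT.search(headers.get("location", "")):
            return True
    # Otherwise, look for a password input in the returned HTML.
    return bool(PASSWORD_FIELD.search(body))


def filter_public(responses):
    """Keep the URLs whose (status, headers, body) do not look login-gated."""
    return [
        url
        for url, (status, headers, body) in responses.items()
        if not needs_login(status, headers, body)
    ]
```

A check like this will miss sites that gate content with JavaScript, and
it will wrongly flag ordinary pages that merely embed a login box, so it
is a first-pass filter rather than a reliable test.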

Thanks,

Mark


