MathGroup Archive 2012

Re: Problem Importing web site in Mathematica: How to by pass pages asking for login credentials

  • To: mathgroup at smc.vnet.net
  • Subject: [mg126450] Re: Problem Importing web site in Mathematica: How to by pass pages asking for login credentials
  • From: Armand Tamzarian <mike.honeychurch at gmail.com>
  • Date: Fri, 11 May 2012 00:14:33 -0400 (EDT)
  • Delivered-to: l-mathgroup@mail-archive0.wolfram.com

On May 9, 5:53 pm, Mark Coleman <markspcole... at gmail.com> wrote:
> Hi,
>
> I'm using Mathematica v8 for some text mining/classification analysis of web
> sites. As part of this I first Import[] the hyperlinks from the web
> site's home page into a list, and then systematically traverse this
> list and Import each URL. In some cases, I hit a page or set of pages
> that requires a user to enter login credentials. At this point my code
> pops up the site's login screen and waits for manual input before
> proceeding. This obviously makes importing a large set of URLs
> infeasible.
>
> I'm wondering if it's possible to identify these pages in advance, so
> I can filter them out of my list of URLs, allowing me to automatically
> Import the remaining pages?
>
> Thanks,
>
> Mark

I use wget in combination with Mathematica to work around logins and
cookies.
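
A minimal sketch of that kind of setup, under several assumptions: wget is on the system PATH, cookies.txt is a Netscape-format cookie file exported from a browser session that is already logged in, and fetchPage, the URLs, and the file names are hypothetical placeholders. Run returns the exit status of the shell command, so any non-zero value is treated here as "skip this page"; wget never opens a login dialog, so the loop does not block. Note that a site that answers with an HTML login form and a normal 200 status will still look like a success, so this only filters pages that wget itself reports as failed.

```mathematica
(* Hypothetical helper: download one URL with wget, then Import the local copy. *)
(* Assumes wget is installed and cookies.txt exists in the current directory.   *)
fetchPage[url_String] :=
  Module[{out, cmd, exit},
    (* Temporary file name derived from the URL *)
    out = FileNameJoin[{$TemporaryDirectory, ToString[Hash[url]] <> ".html"}];
    (* -q: quiet; --load-cookies: reuse saved session cookies;                  *)
    (* --tries and --timeout: give up quickly; -O: write to the chosen file     *)
    cmd = "wget -q --tries=1 --timeout=20 --load-cookies cookies.txt -O \"" <>
          out <> "\" \"" <> url <> "\"";
    exit = Run[cmd];
    (* Non-zero exit status means wget reported an error; skip that page *)
    If[exit === 0, Import[out, "Plaintext"], $Failed]
  ]

(* Hypothetical list of URLs gathered earlier, e.g. with Import[home, "Hyperlinks"] *)
urls = {"http://www.example.com/a.html", "http://www.example.com/b.html"};
pages = DeleteCases[fetchPage /@ urls, $Failed];
```

Without a cookie file, dropping --load-cookies and keeping only the exit-status check gives a rough filter for pages that refuse anonymous access.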

Mike


