Problem Importing web site in Mathematica: How to bypass pages asking for login credentials
- To: mathgroup at smc.vnet.net
- Subject: [mg126422] Problem Importing web site in Mathematica: How to bypass pages asking for login credentials
- From: Mark Coleman <markspcoleman at gmail.com>
- Date: Wed, 9 May 2012 03:51:44 -0400 (EDT)
- Delivered-to: l-mathgroup@mail-archive0.wolfram.com
Hi,

I'm using Mathematica v8 for some text mining/classification analysis of web sites. As part of this, I first Import[] the hyperlinks from a web site's home page into a list, then systematically traverse this list and Import each URL.

In some cases I hit a page, or set of pages, that requires the user to enter login credentials. At that point my code pops up the site's login screen and waits for manual input before proceeding, which makes importing a large set of URLs infeasible.

Is it possible to identify these pages in advance, so that I can filter them out of my list of URLs and Import the remaining pages automatically?

Thanks,
Mark
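
For reference, the workflow described above can be sketched roughly as follows. This is a minimal sketch, not the actual code: the URL and the variable names (home, links, loginWords, candidates, pages) are placeholders, and the keyword filter is only a guessed heuristic. It drops links whose URL text looks like a login page; it would not catch pages that demand credentials without any such word in the URL, so it does not solve the problem in general.

```mathematica
(* Hypothetical starting page; substitute the real home page URL. *)
home = "http://www.example.com/";

(* Collect the hyperlinks found on the home page. *)
links = Import[home, "Hyperlinks"];

(* Keep only web links, dropping mailto: and similar entries. *)
links = Select[links, StringMatchQ[#, "http*"] &];

(* Heuristic assumption: skip URLs whose text suggests a login page. *)
loginWords = {"login", "signin", "logon"};
candidates = Select[links, StringFreeQ[#, loginWords, IgnoreCase -> True] &];

(* Import the plain text of each remaining page. *)
pages = Import[#, "Plaintext"] & /@ candidates;
```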