MathGroup Archive 2010

Import html

  • To: mathgroup at smc.vnet.net
  • Subject: [mg108443] Import html
  • From: Scipione Dal Ferro <scipionedalferro at yahoo.it>
  • Date: Thu, 18 Mar 2010 04:31:28 -0500 (EST)

Hi there,

I use Import to extract the hyperlinks from many similar HTML pages without any problem, but for a few pages (such as the example below) it fails.
In more detail, here is the example with its result:

In[1]:= Import["http://www.paginegialle.it/ascensoriromamir.a.m", "Hyperlinks"]

Read::readt: Invalid input found when reading <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
 from C:\Users\scipione.dalferro\AppData\Local\Temp\mFA3E.tmp\ascensoriromamir.a.m. >>

Out[1]= $Failed

The error message states that there is invalid input; however, the page opens correctly in a browser.

I tried changing the element to "Source" and to others, but with the same result.
Similar pages work correctly, for example this one:

In[2]:= Import["http://www.paginegialle.it/esis", "Hyperlinks"]
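
One untested guess: the failing address ends in ".m", so Import may be inferring a Mathematica package format from the extension instead of HTML, which would fit the Read::readt message and the temporary file name above. A minimal check, assuming that is the cause, is to name the format explicitly instead of letting Import infer it:

In[3]:= Import["http://www.paginegialle.it/ascensoriromamir.a.m", {"HTML", "Hyperlinks"}]

If that call succeeds, the extension-based format guess is the likely culprit; if it fails in the same way, the cause lies elsewhere.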

I hope you can help me understand this issue.

Thanks,
Scipione

