MathGroup Archive 2008

Re: Bug in ExportString?

  • To: mathgroup at smc.vnet.net
  • Subject: [mg87382] Re: Bug in ExportString?
  • From: Szabolcs Horvát <szhorvat at gmail.com>
  • Date: Wed, 9 Apr 2008 05:54:34 -0400 (EDT)
  • Organization: University of Bergen
  • References: <ftfee5$bq7$1@smc.vnet.net> <ftfk7f$f9f$1@smc.vnet.net>

dh wrote:
> Hi,
>
> using FullForm on ImportString[ExportString[{1,1}, "PDF"], "PDF"]
> //FullForm it is clear that the output is quite different from the
> input: {1,1}. Is any documentation available or is this a bug?
>
> Daniel


P_ter wrote:
 > I did not formulate a good question about:
 > ImportString[ExportString[" ", "PDF"], "PDF"]
 > My point is that this gives an image, while
 > ImportString[ExportString["t", "PDF"], "PDF"]
 > gives with InputForm a polygon.
 > I think a space is also text (" "). It has an ASCII place. So, there
 > should be no difference in structure with the letter t.
 > with friendly greetings,
 > P_ter
 >


Hi Peter and Daniel,

Why do you think that there is a bug in ImportString or ExportString? 
For all of " ", "t", or {1,1} I get a result that is an image (Graphics 
object) and looks the same as the input.  This is what I expect.

The PDF format (unless it is a "tagged" PDF) does not preserve the 
structure of text, e.g. it does not preserve spaces and newlines as 
characters, or the exact order of lines of text (think about multicolumn 
layouts).  What it does preserve exactly is what the document /looks/ 
like.  PDF readers can work around these limitations partially, and they 
make it possible to copy text from untagged PDFs, but the results are 
usually not perfect ...

It is possible to extract text from PDFs with Mathematica too, but just 
like copying with a PDF reader, this is not very reliable ...  Try 
ImportString[ExportString["abc", "PDF"], {"PDF", "Plaintext"}].  Now try 
ImportString[ExportString["{1,1}", "PDF"], {"PDF", "Plaintext"}] and see 
how { and } are mangled.  I get the same result even if I export the PDF 
to disk and copy the text with Adobe Reader.
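
A minimal sketch of that comparison, for anyone who wants to check it on their own version. The helper name roundTrip is made up for this example; it only wraps the ImportString/ExportString call quoted above, and the text that comes back may differ between versions.

```mathematica
(* roundTrip is a hypothetical helper, not a built-in: it exports a string
   to PDF and immediately reads the text back via the "Plaintext" element. *)
roundTrip[s_String] :=
  ImportString[ExportString[s, "PDF"], {"PDF", "Plaintext"}]

(* Pair each input with whatever text survives the trip.
   InputForm makes stray whitespace or substituted characters visible. *)
{#, roundTrip[#]} & /@ {"abc", "{1,1}", " "} // InputForm
```

Wherever the second element of a pair differs from the first, the text did not survive the trip through PDF.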

To avoid misunderstandings caused by e.g. version differences, here's 
the output of the two original examples:

In[1]:= ImportString[ExportString[" ", "PDF"], "PDF"] // InputForm

Out[1]//InputForm=
{Graphics[{Thickness[0.1203858530857692]}, ImageSize -> {6., 12.},
   PlotRange -> {{0., 5.75}, {0., 12.}}, AspectRatio -> Automatic]}

In[2]:= ImportString[ExportString["t", "PDF"], "PDF"] // InputForm

Out[2]//InputForm=
{Graphics[{Thickness[0.1203858530857692],
    Style[{Polygon[{{2.091796875, 8.509033203125}, {4.220947265625,
       8.509033203125}, {4.421722625851167, 8.451835899501459},
       {4.482421875, 8.303589152176556}, {4.421722625851167,
       8.156801263831468}, {4.220947265625, 8.09814453125},
       {2.091796875, 8.09814453125}, {2.091796875, 5.495071589250973},
       {2.108850370770762, 5.332197101756292}, {2.1600108580830497,
       5.183257225422544}, {2.2452783369368614, 5.0482519602497264},
       {2.364652807332199, 4.927181306237841}, {2.516930475494741,
       4.827340876360347}, {2.7009075476501696, 4.75602628359071},
       {2.9165840237984852, 4.713237527928927}, {3.1639599039396877,
       4.698974609375}, {3.366194069710223, 4.7067809227375665},
       {3.5765993348355423, 4.730199862825267}, {3.7951756993156462,
       4.7692314296381015}, {4.021923163150536, 4.82387562317607},
       {4.239915941290354, 4.888295938810949}, {4.4322282486852504,
       4.956655871914518}, {4.5988600853352235, 5.028955422486776},
       {4.739811451240271, 5.105194590527724}, {4.893894204462548,
       5.1688121276143}, {5.0240478515625, 5.107528935888862},
       {5.080078125, 4.963367506687742}, {5.019378875851167,
       4.8262102535262645}, {4.762354584333962, 4.657900419123906},
       {4.553572510003116, 4.568766279999848}, {4.291277149957441,
       4.476312659593871}, {4.005271056653925, 4.3939634686778035},
       {3.72535678254955, 4.335142618023468}, {3.4515343276443184,
       4.299850107630867}, {3.1838036919382304, 4.2880859375},
       {3.016159801260507, 4.293133587967097}, {2.8572888622951877,
       4.3082765393683875}, {2.565865839501763, 4.368848344973554},
       {2.309534623557955, 4.469801354315495}, {2.0882952144637645,
       4.6111355673942125}, {1.9100633970030552, 4.78668620781212},
       {1.7827549559596911, 4.990288499171632}, {1.706369891333673,
       5.221942441472747}, {1.680908203125, 5.481648034715468},
       {1.680908203125, 8.09814453125}, {0.940258124088035,
       8.09814453125}, {0.7345219593567602, 8.15621760624392},
       {0.67236328125, 8.305923782526751}, {0.7345219593567602,
       8.450960413120136}, {0.940258124088035, 8.509033203125},
       {1.680908203125, 8.509033203125}, {1.680908203125, 9.6669921875},
       {1.7389812781189207, 9.867767547726167}, {1.8840181937013618,
       9.928466796875}, {2.0337240849951357, 9.867767547726167},
       {2.091796875, 9.6669921875}, {2.091796875, 8.509033203125}}]},
     Thickness[0.1203858530857692]]}, ImageSize -> {6., 12.},
   PlotRange -> {{0., 5.75}, {0., 12.}}, AspectRatio -> Automatic]}
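
One way to compare the two results without reading the coordinates is to count the Polygon primitives in each. This is only a sketch: the name countPolygons is made up for this example, and other versions may represent glyphs with different primitives, so the counts need not match the outputs shown above.

```mathematica
(* countPolygons is a hypothetical helper: it round-trips a string through
   PDF and counts Polygon expressions at any depth of the imported result. *)
countPolygons[s_String] :=
  Length[Cases[ImportString[ExportString[s, "PDF"], "PDF"], _Polygon, Infinity]]

(* A space puts no ink on the page, so it should contribute no outline;
   "t" is drawn as a filled outline. *)
countPolygons /@ {" ", "t"}
```

Applied to the two outputs shown above, this gives 0 for the space and 1 for "t": both results are Graphics objects with the same ImageSize and PlotRange, and the only difference is that the space draws nothing.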

