Re: Bug in ExportString?
- To: mathgroup at smc.vnet.net
- Subject: [mg87382] Re: Bug in ExportString?
- From: Szabolcs Horvát <szhorvat at gmail.com>
- Date: Wed, 9 Apr 2008 05:54:34 -0400 (EDT)
- Organization: University of Bergen
- References: <ftfee5$bq7$1@smc.vnet.net> <ftfk7f$f9f$1@smc.vnet.net>
dh wrote:
> Hi,
> using FullForm on ImportString[ExportString[{1,1}, "PDF"], "PDF"] //FullForm
> it is clear that the output is quite different from the input: {1,1}.
> Is any documentation available or is this a bug?
> Daniel
>
P_ter wrote:
> I did not formulate a good question about:
> ImportString[ExportString[" ", "PDF"], "PDF"]
> My point is that this gives an image, while
> ImportString[ExportString["t", "PDF"], "PDF"]
> gives with InputForm a polygon.
> I think a space is also text (" "). It has an ASCII place. So, there
> should be no difference in structure with the letter t.
> with friendly greetings,
> P_ter
>
Hi Peter and Daniel,
Why do you think that there is a bug in ImportString or ExportString?
For all of " ", "t", or {1,1} I get a result that is an image (Graphics
object) and looks the same as the input. This is what I expect.
The PDF format (unless it is a "tagged" PDF) does not preserve the
structure of text. For example, it does not preserve spaces and newlines
as characters, or the exact order of lines of text (think about
multicolumn layouts). What it does preserve exactly is what the document
/looks/ like. PDF readers can work around these limitations partially, and
they make it possible to copy text from untagged PDFs, but the results are
usually not perfect ...
It is possible to extract text from PDFs with Mathematica too, but just
like copying with a PDF reader, this is not very reliable ... Try
ImportString[ExportString["abc", "PDF"], {"PDF", "Plaintext"}]. Now try
ImportString[ExportString["{1,1}", "PDF"], {"PDF", "Plaintext"}] and see
how { and } are mangled. I get the same result even if I export the PDF
to disk and copy the text with Adobe Reader.
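As a minimal sketch of the text-extraction round trip described above, the two calls can be wrapped in a small helper. The name roundTrip is hypothetical and only used for illustration; the ImportString and ExportString calls are the ones quoted in the previous paragraph. The extracted text may differ between versions and may carry extra whitespace, so no particular output is assumed here.

```mathematica
(* roundTrip is a hypothetical helper name, not a built-in function. *)
(* It exports a string to PDF and reads the PDF back as plain text. *)
roundTrip[s_String] :=
  ImportString[ExportString[s, "PDF"], {"PDF", "Plaintext"}]

(* Show the extracted text in InputForm so that whitespace and any *)
(* mangled characters are visible. *)
roundTrip["abc"] // InputForm
roundTrip["{1,1}"] // InputForm

(* Check whether the text survived the round trip unchanged. *)
roundTrip["{1,1}"] === "{1,1}"
```

If the last line returns False, that is consistent with the point above: an untagged PDF records how the text looks, not the characters that produced it.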
To avoid misunderstandings caused by e.g. version differences, here's
the output of the two original examples:
In[1]:= ImportString[ExportString[" ", "PDF"], "PDF"] // InputForm
Out[1]//InputForm=
{Graphics[{Thickness[0.1203858530857692]}, ImageSize -> {6., 12.},
PlotRange -> {{0., 5.75}, {0., 12.}}, AspectRatio -> Automatic]}
In[2]:= ImportString[ExportString["t", "PDF"], "PDF"] // InputForm
Out[2]//InputForm=
{Graphics[{Thickness[0.1203858530857692],
Style[{Polygon[{{2.091796875, 8.509033203125}, {4.220947265625,
8.509033203125}, {4.421722625851167, 8.451835899501459},
{4.482421875, 8.303589152176556}, {4.421722625851167,
8.156801263831468}, {4.220947265625, 8.09814453125},
{2.091796875, 8.09814453125}, {2.091796875, 5.495071589250973},
{2.108850370770762, 5.332197101756292}, {2.1600108580830497,
5.183257225422544}, {2.2452783369368614, 5.0482519602497264},
{2.364652807332199, 4.927181306237841}, {2.516930475494741,
4.827340876360347}, {2.7009075476501696, 4.75602628359071},
{2.9165840237984852, 4.713237527928927}, {3.1639599039396877,
4.698974609375}, {3.366194069710223, 4.7067809227375665},
{3.5765993348355423, 4.730199862825267}, {3.7951756993156462,
4.7692314296381015}, {4.021923163150536, 4.82387562317607},
{4.239915941290354, 4.888295938810949}, {4.4322282486852504,
4.956655871914518}, {4.5988600853352235, 5.028955422486776},
{4.739811451240271, 5.105194590527724}, {4.893894204462548,
5.1688121276143}, {5.0240478515625, 5.107528935888862},
{5.080078125, 4.963367506687742}, {5.019378875851167,
4.8262102535262645}, {4.762354584333962, 4.657900419123906},
{4.553572510003116, 4.568766279999848}, {4.291277149957441,
4.476312659593871}, {4.005271056653925, 4.3939634686778035},
{3.72535678254955, 4.335142618023468}, {3.4515343276443184,
4.299850107630867}, {3.1838036919382304, 4.2880859375},
{3.016159801260507, 4.293133587967097}, {2.8572888622951877,
4.3082765393683875}, {2.565865839501763, 4.368848344973554},
{2.309534623557955, 4.469801354315495}, {2.0882952144637645,
4.6111355673942125}, {1.9100633970030552, 4.78668620781212},
{1.7827549559596911, 4.990288499171632}, {1.706369891333673,
5.221942441472747}, {1.680908203125, 5.481648034715468},
{1.680908203125, 8.09814453125}, {0.940258124088035,
8.09814453125}, {0.7345219593567602, 8.15621760624392},
{0.67236328125, 8.305923782526751}, {0.7345219593567602,
8.450960413120136}, {0.940258124088035, 8.509033203125},
{1.680908203125, 8.509033203125}, {1.680908203125, 9.6669921875},
{1.7389812781189207, 9.867767547726167}, {1.8840181937013618,
9.928466796875}, {2.0337240849951357, 9.867767547726167},
{2.091796875, 9.6669921875}, {2.091796875, 8.509033203125}}]},
Thickness[0.1203858530857692]]}, ImageSize -> {6., 12.},
PlotRange -> {{0., 5.75}, {0., 12.}}, AspectRatio -> Automatic]}
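To compare the two results without reading through the full coordinate list, one can look only at their structure. The sketch below is an illustration, with g1 and g2 as hypothetical variable names; it assumes the imports behave as in the output listed above, which may not hold in other versions.

```mathematica
(* g1 and g2 are hypothetical names for the two imported results. *)
g1 = ImportString[ExportString[" ", "PDF"], "PDF"];
g2 = ImportString[ExportString["t", "PDF"], "PDF"];

(* Both results should be lists of Graphics objects. *)
Head /@ g1
Head /@ g2

(* Count the Polygon primitives at any depth. With the output shown *)
(* above, the space has none and the letter t has one outline. *)
Count[g1, _Polygon, Infinity]
Count[g2, _Polygon, Infinity]
```

Seen this way, both inputs come back as the same kind of object (a Graphics with the same ImageSize and PlotRange); the only difference is that a space has no visible outline to draw.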