Re: Bug in ExportString?
- To: mathgroup at smc.vnet.net
- Subject: [mg87415] Re: Bug in ExportString?
- From: dh <dh at metrohm.ch>
- Date: Thu, 10 Apr 2008 02:11:26 -0400 (EDT)
- References: <ftfee5$bq7$1@smc.vnet.net> <ftfk7f$f9f$1@smc.vnet.net> <fti3s8$ocb$1@smc.vnet.net>
Thanks, Szabolcs.
I interpret this as: on import, PDF can only be used for pictures. The
manual is rather misleading here, because it says:
"Stores text, fonts, images, and 2D vector graphics in a device- and
resolution-independent way".
We can only hope that a future version of Import will work better.
Daniel
Szabolcs Horvát wrote:
> dh wrote:
>> Hi,
>>
>> Using ImportString[ExportString[{1,1}, "PDF"], "PDF"] // FullForm,
>> it is clear that the output is quite different from the input,
>> {1,1}. Is any documentation available, or is this a bug?
>>
>> Daniel
>>
>
>
> P_ter wrote:
> > I did not formulate a good question about:
> > ImportString[ExportString[" ", "PDF"], "PDF"]
> > My point is that this gives an image, while
> > ImportString[ExportString["t", "PDF"], "PDF"]
> > gives a polygon when viewed with InputForm.
> > I think a space is also text (" "). It has an ASCII place. So, there
> > should be no difference in structure with the letter t.
> > with friendly greetings,
> > P_ter
> >
>
>
> Hi Peter and Daniel,
>
> Why do you think that there is a bug in ImportString or ExportString?
> For all of " ", "t", or {1,1} I get a result that is an image (Graphics
> object) and looks the same as the input. This is what I expect.
>
> The PDF format (unless it is a "tagged" PDF) does not preserve the
> structure of text, e.g. it does not preserve spaces and newlines as
> characters, or the exact order of lines of text (think about multicolumn
> layouts). What it does preserve exactly is what the document /looks/
> like. PDF readers can work around these limitations partially, and they
> make it possible to copy text from untagged PDFs, but the results are
> usually not perfect ...
>
> It is possible to extract text from PDFs with Mathematica too, but just
> like copying with a PDF reader, this is not very reliable ... Try
> ImportString[ExportString["abc", "PDF"], {"PDF", "Plaintext"}]. Now try
> ImportString[ExportString["{1,1}", "PDF"], {"PDF", "Plaintext"}] and see
> how { and } are mangled. I get the same result even if I export the PDF
> to disk and copy the text with Adobe Reader.
>
> To avoid misunderstandings caused by e.g. version differences, here's
> the output of the two original examples:
>
> In[1]:= ImportString[ExportString[" ", "PDF"], "PDF"] // InputForm
>
> Out[1]//InputForm=
> {Graphics[{Thickness[0.1203858530857692]}, ImageSize -> {6., 12.},
> PlotRange -> {{0., 5.75}, {0., 12.}}, AspectRatio -> Automatic]}
>
> In[2]:= ImportString[ExportString["t", "PDF"], "PDF"] // InputForm
>
> Out[2]//InputForm=
> {Graphics[{Thickness[0.1203858530857692],
> Style[{Polygon[{{2.091796875, 8.509033203125}, {4.220947265625,
> 8.509033203125}, {4.421722625851167, 8.451835899501459},
> {4.482421875, 8.303589152176556}, {4.421722625851167,
> 8.156801263831468}, {4.220947265625, 8.09814453125},
> {2.091796875, 8.09814453125}, {2.091796875, 5.495071589250973},
> {2.108850370770762, 5.332197101756292}, {2.1600108580830497,
> 5.183257225422544}, {2.2452783369368614, 5.0482519602497264},
> {2.364652807332199, 4.927181306237841}, {2.516930475494741,
> 4.827340876360347}, {2.7009075476501696, 4.75602628359071},
> {2.9165840237984852, 4.713237527928927}, {3.1639599039396877,
> 4.698974609375}, {3.366194069710223, 4.7067809227375665},
> {3.5765993348355423, 4.730199862825267}, {3.7951756993156462,
> 4.7692314296381015}, {4.021923163150536, 4.82387562317607},
> {4.239915941290354, 4.888295938810949}, {4.4322282486852504,
> 4.956655871914518}, {4.5988600853352235, 5.028955422486776},
> {4.739811451240271, 5.105194590527724}, {4.893894204462548,
> 5.1688121276143}, {5.0240478515625, 5.107528935888862},
> {5.080078125, 4.963367506687742}, {5.019378875851167,
> 4.8262102535262645}, {4.762354584333962, 4.657900419123906},
> {4.553572510003116, 4.568766279999848}, {4.291277149957441,
> 4.476312659593871}, {4.005271056653925, 4.3939634686778035},
> {3.72535678254955, 4.335142618023468}, {3.4515343276443184,
> 4.299850107630867}, {3.1838036919382304, 4.2880859375},
> {3.016159801260507, 4.293133587967097}, {2.8572888622951877,
> 4.3082765393683875}, {2.565865839501763, 4.368848344973554},
> {2.309534623557955, 4.469801354315495}, {2.0882952144637645,
> 4.6111355673942125}, {1.9100633970030552, 4.78668620781212},
> {1.7827549559596911, 4.990288499171632}, {1.706369891333673,
> 5.221942441472747}, {1.680908203125, 5.481648034715468},
> {1.680908203125, 8.09814453125}, {0.940258124088035,
> 8.09814453125}, {0.7345219593567602, 8.15621760624392},
> {0.67236328125, 8.305923782526751}, {0.7345219593567602,
> 8.450960413120136}, {0.940258124088035, 8.509033203125},
> {1.680908203125, 8.509033203125}, {1.680908203125, 9.6669921875},
> {1.7389812781189207, 9.867767547726167}, {1.8840181937013618,
> 9.928466796875}, {2.0337240849951357, 9.867767547726167},
> {2.091796875, 9.6669921875}, {2.091796875, 8.509033203125}}]},
> Thickness[0.1203858530857692]]}, ImageSize -> {6., 12.},
> PlotRange -> {{0., 5.75}, {0., 12.}}, AspectRatio -> Automatic]}
>
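
For readers who want to reproduce the comparison discussed above, here is a minimal sketch in Mathematica. It assumes a version in which the default PDF import returns a list of page graphics, as in the output quoted above. The variable names (pdf, pages, text) are illustrative only, and no particular output is claimed, because results may differ between versions.

```mathematica
(* Export a short string to PDF, keeping the PDF data as a string. *)
pdf = ExportString["abc", "PDF"];

(* Default import: in the version quoted above this is a list of Graphics
   objects (glyph outlines), not the original expression.
   Assumption: the result is a list, one element per page. *)
pages = ImportString[pdf, "PDF"];
Head /@ pages

(* Text extraction, using the element named in the quoted message. *)
text = ImportString[pdf, {"PDF", "Plaintext"}];

(* Compare with the input. This may well be False, since an untagged PDF
   does not reliably store characters such as spaces and newlines. *)
text === "abc"
```

Replacing "abc" with "{1,1}" in the first line repeats the second experiment from the quoted message, where the braces were reported to come back mangled.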