Re: Bug in ExportString?
- To: mathgroup at smc.vnet.net
- Subject: [mg87382] Re: Bug in ExportString?
- From: Szabolcs Horvát <szhorvat at gmail.com>
- Date: Wed, 9 Apr 2008 05:54:34 -0400 (EDT)
- Organization: University of Bergen
- References: <ftfee5$bq7$1@smc.vnet.net> <ftfk7f$f9f$1@smc.vnet.net>
dh wrote:
> Hi,
>
> using FullForm on ImportString[ExportString[{1,1}, "PDF"], "PDF"] //FullForm
> it is clear that the output is quite different from the input: {1,1}.
> Is any documentation available or is this a bug?
>
> Daniel
>
> P_ter wrote:
> I did not formulate a good question about:
> ImportString[ExportString[" ", "PDF"], "PDF"]
> My point is that this gives an image, while
> ImportString[ExportString["t", "PDF"], "PDF"]
> gives with InputForm a polygon.
> I think a space is also text (" "). It has an ASCII place. So, there should be no difference in structure with the letter t.
> with friendly greetings,
> P_ter
>

Hi Peter and Daniel,

Why do you think that there is a bug in ImportString or ExportString? For all of " ", "t", or {1,1} I get a result that is an image (Graphics object) and looks the same as the input. This is what I expect.

The PDF format (unless it is a "tagged" PDF) does not preserve the structure of text, e.g. it does not preserve spaces and newlines as characters, or the exact order of lines of text (think about multicolumn layouts). What it does preserve exactly is what the document /looks/ like. PDF readers can work around these limitations partially, and they make it possible to copy text from untagged PDFs, but the results are usually not perfect ...

It is possible to extract text from PDFs with Mathematica too, but just like copying with a PDF reader, this is not very reliable ... Try

ImportString[ExportString["abc", "PDF"], {"PDF", "Plaintext"}]

Now try

ImportString[ExportString["{1,1}", "PDF"], {"PDF", "Plaintext"}]

and see how { and } are mangled. I get the same result even if I export the PDF to disk and copy the text with Adobe Reader.

To avoid misunderstandings caused by e.g.
version differences, here's the output of the two original examples:

In[1]:= ImportString[ExportString[" ", "PDF"], "PDF"] // InputForm

Out[1]//InputForm=
{Graphics[{Thickness[0.1203858530857692]}, ImageSize -> {6., 12.}, PlotRange -> {{0., 5.75}, {0., 12.}}, AspectRatio -> Automatic]}

In[2]:= ImportString[ExportString["t", "PDF"], "PDF"] // InputForm

Out[2]//InputForm=
{Graphics[{Thickness[0.1203858530857692], Style[{Polygon[{{2.091796875, 8.509033203125}, {4.220947265625, 8.509033203125}, {4.421722625851167, 8.451835899501459}, {4.482421875, 8.303589152176556}, {4.421722625851167, 8.156801263831468}, {4.220947265625, 8.09814453125}, {2.091796875, 8.09814453125}, {2.091796875, 5.495071589250973}, {2.108850370770762, 5.332197101756292}, {2.1600108580830497, 5.183257225422544}, {2.2452783369368614, 5.0482519602497264}, {2.364652807332199, 4.927181306237841}, {2.516930475494741, 4.827340876360347}, {2.7009075476501696, 4.75602628359071}, {2.9165840237984852, 4.713237527928927}, {3.1639599039396877, 4.698974609375}, {3.366194069710223, 4.7067809227375665}, {3.5765993348355423, 4.730199862825267}, {3.7951756993156462, 4.7692314296381015}, {4.021923163150536, 4.82387562317607}, {4.239915941290354, 4.888295938810949}, {4.4322282486852504, 4.956655871914518}, {4.5988600853352235, 5.028955422486776}, {4.739811451240271, 5.105194590527724}, {4.893894204462548, 5.1688121276143}, {5.0240478515625, 5.107528935888862}, {5.080078125, 4.963367506687742}, {5.019378875851167, 4.8262102535262645}, {4.762354584333962, 4.657900419123906}, {4.553572510003116, 4.568766279999848}, {4.291277149957441, 4.476312659593871}, {4.005271056653925, 4.3939634686778035}, {3.72535678254955, 4.335142618023468}, {3.4515343276443184, 4.299850107630867}, {3.1838036919382304, 4.2880859375}, {3.016159801260507, 4.293133587967097}, {2.8572888622951877, 4.3082765393683875}, {2.565865839501763, 4.368848344973554}, {2.309534623557955, 4.469801354315495}, {2.0882952144637645, 4.6111355673942125},
{1.9100633970030552, 4.78668620781212}, {1.7827549559596911, 4.990288499171632}, {1.706369891333673, 5.221942441472747}, {1.680908203125, 5.481648034715468}, {1.680908203125, 8.09814453125}, {0.940258124088035, 8.09814453125}, {0.7345219593567602, 8.15621760624392}, {0.67236328125, 8.305923782526751}, {0.7345219593567602, 8.450960413120136}, {0.940258124088035, 8.509033203125}, {1.680908203125, 8.509033203125}, {1.680908203125, 9.6669921875}, {1.7389812781189207, 9.867767547726167}, {1.8840181937013618, 9.928466796875}, {2.0337240849951357, 9.867767547726167}, {2.091796875, 9.6669921875}, {2.091796875, 8.509033203125}}]}, Thickness[0.1203858530857692]]}, ImageSize -> {6., 12.}, PlotRange -> {{0., 5.75}, {0., 12.}}, AspectRatio -> Automatic]}
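
The only structural difference between the two results quoted above is whether the imported page contains a glyph outline. A minimal sketch of that comparison follows. The helper name roundTrip is hypothetical (it is not from the original post), and other Mathematica versions may import glyphs as primitives other than Polygon, so the pattern may need adjusting.

```mathematica
(* Hypothetical helper: export an expression to PDF and import it back as graphics *)
roundTrip[expr_] := ImportString[ExportString[expr, "PDF"], "PDF"]

(* Count the Polygon primitives (glyph outlines) at any depth of the imported result *)
glyphCount[expr_] := Count[roundTrip[expr], _Polygon, Infinity]

(* A space draws nothing, so it is expected to contribute no outline; "t" draws one glyph *)
{glyphCount[" "], glyphCount["t"]}
```

For the two outputs quoted above, this counts no Polygon for " " and one for "t". That matches the point that PDF stores what the page looks like, not the characters that produced it.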