Re: Bug in ExportString?
- To: mathgroup at smc.vnet.net
- Subject: [mg87415] Re: Bug in ExportString?
- From: dh <dh at metrohm.ch>
- Date: Thu, 10 Apr 2008 02:11:26 -0400 (EDT)
- References: <ftfee5$bq7$1@smc.vnet.net> <ftfk7f$f9f$1@smc.vnet.net> <fti3s8$ocb$1@smc.vnet.net>
Thanks Szabolcs,

I interpret this as: on import, PDF can only be used for pictures. The manual is rather misleading here, because it says: "Stores text, fonts, images, and 2D vector graphics in a device- and resolution-independent way". We can only hope that a future version of Import will work better.

Daniel

Szabolcs Horvát wrote:
> dh wrote:
>> Hi,
>>
>> using FullForm on ImportString[ExportString[{1,1}, "PDF"], "PDF"]
>> //FullForm it is clear that the output is quite different from the
>> input: {1,1}. Is any documentation available or is this a bug?
>>
>> Daniel
>>
>
> P_ter wrote:
> > I did not formulate a good question about:
> > ImportString[ExportString[" ", "PDF"], "PDF"]
> > My point is that this gives an image, while
> > ImportString[ExportString["t", "PDF"], "PDF"]
> > gives with InputForm a polygon.
> > I think a space is also text (" "). It has an ASCII place. So, there
> > should be no difference in structure with the letter t.
> > with friendly greetings,
> > P_ter
> >
>
> Hi Peter and Daniel,
>
> Why do you think that there is a bug in ImportString or ExportString?
> For all of " ", "t", or {1,1} I get a result that is an image (Graphics
> object) and looks the same as the input. This is what I expect.
>
> The PDF format (unless it is a "tagged" PDF) does not preserve the
> structure of text, e.g. it does not preserve spaces and newlines as
> characters, or the exact order of lines of text (think about multicolumn
> layouts). What it does preserve exactly is what the document /looks/
> like. PDF readers can work around these limitations partially, and they
> make it possible to copy text from untagged PDFs, but the results are
> usually not perfect ...
>
> It is possible to extract text from PDFs with Mathematica too, but just
> like copying with a PDF reader, this is not very reliable ... Try
> ImportString[ExportString["abc", "PDF"], {"PDF", "Plaintext"}].
> Now try
> ImportString[ExportString["{1,1}", "PDF"], {"PDF", "Plaintext"}] and see
> how { and } are mangled. I get the same result even if I export the PDF
> to disk and copy the text with Adobe Reader.
>
> To avoid misunderstandings caused by e.g. version differences, here's
> the output of the two original examples:
>
> In[1]:= ImportString[ExportString[" ", "PDF"], "PDF"] // InputForm
>
> Out[1]//InputForm=
> {Graphics[{Thickness[0.1203858530857692]}, ImageSize -> {6., 12.},
> PlotRange -> {{0., 5.75}, {0., 12.}}, AspectRatio -> Automatic]}
>
> In[2]:= ImportString[ExportString["t", "PDF"], "PDF"] // InputForm
>
> Out[2]//InputForm=
> {Graphics[{Thickness[0.1203858530857692],
> Style[{Polygon[{{2.091796875, 8.509033203125}, {4.220947265625,
> 8.509033203125}, {4.421722625851167, 8.451835899501459},
> {4.482421875, 8.303589152176556}, {4.421722625851167,
> 8.156801263831468}, {4.220947265625, 8.09814453125},
> {2.091796875, 8.09814453125}, {2.091796875, 5.495071589250973},
> {2.108850370770762, 5.332197101756292}, {2.1600108580830497,
> 5.183257225422544}, {2.2452783369368614, 5.0482519602497264},
> {2.364652807332199, 4.927181306237841}, {2.516930475494741,
> 4.827340876360347}, {2.7009075476501696, 4.75602628359071},
> {2.9165840237984852, 4.713237527928927}, {3.1639599039396877,
> 4.698974609375}, {3.366194069710223, 4.7067809227375665},
> {3.5765993348355423, 4.730199862825267}, {3.7951756993156462,
> 4.7692314296381015}, {4.021923163150536, 4.82387562317607},
> {4.239915941290354, 4.888295938810949}, {4.4322282486852504,
> 4.956655871914518}, {4.5988600853352235, 5.028955422486776},
> {4.739811451240271, 5.105194590527724}, {4.893894204462548,
> 5.1688121276143}, {5.0240478515625, 5.107528935888862},
> {5.080078125, 4.963367506687742}, {5.019378875851167,
> 4.8262102535262645}, {4.762354584333962, 4.657900419123906},
> {4.553572510003116, 4.568766279999848}, {4.291277149957441,
> 4.476312659593871}, {4.005271056653925, 4.3939634686778035},
> {3.72535678254955, 4.335142618023468}, {3.4515343276443184,
> 4.299850107630867}, {3.1838036919382304, 4.2880859375},
> {3.016159801260507, 4.293133587967097}, {2.8572888622951877,
> 4.3082765393683875}, {2.565865839501763, 4.368848344973554},
> {2.309534623557955, 4.469801354315495}, {2.0882952144637645,
> 4.6111355673942125}, {1.9100633970030552, 4.78668620781212},
> {1.7827549559596911, 4.990288499171632}, {1.706369891333673,
> 5.221942441472747}, {1.680908203125, 5.481648034715468},
> {1.680908203125, 8.09814453125}, {0.940258124088035,
> 8.09814453125}, {0.7345219593567602, 8.15621760624392},
> {0.67236328125, 8.305923782526751}, {0.7345219593567602,
> 8.450960413120136}, {0.940258124088035, 8.509033203125},
> {1.680908203125, 8.509033203125}, {1.680908203125, 9.6669921875},
> {1.7389812781189207, 9.867767547726167}, {1.8840181937013618,
> 9.928466796875}, {2.0337240849951357, 9.867767547726167},
> {2.091796875, 9.6669921875}, {2.091796875, 8.509033203125}}]},
> Thickness[0.1203858530857692]]}, ImageSize -> {6., 12.},
> PlotRange -> {{0., 5.75}, {0., 12.}}, AspectRatio -> Automatic]}
>
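
For anyone repeating the experiment, the two behaviours discussed in this thread (graphics import versus text extraction) can be compared side by side. This is only a sketch: roundTrip and pdfText are helper names made up here, not built-in functions, and the {"PDF", "Plaintext"} element is the one used in the quoted message. Results are likely to differ between versions, so no expected output is given.

```mathematica
(* Hypothetical helpers for illustration; neither is a built-in function *)
roundTrip[expr_] := ImportString[ExportString[expr, "PDF"], "PDF"]
pdfText[expr_] := ImportString[ExportString[expr, "PDF"], {"PDF", "Plaintext"}]

(* Heads of the imported elements; the quoted output shows a list of Graphics *)
Head /@ roundTrip["t"]

(* Count Polygon primitives at any depth: the quoted output has one for "t",
   none for " " (other versions may use different primitives) *)
Count[roundTrip["t"], _Polygon, Infinity]
Count[roundTrip[" "], _Polygon, Infinity]

(* Text extraction, which the quoted message describes as unreliable *)
pdfText["abc"]
pdfText["{1,1}"]
```

Comparing the two Count results shows directly why " " and "t" differ in structure: a space draws no glyph outline, so nothing but the page frame survives the round trip.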