Occasionally, I have to convert a PDF file to PostScript (e.g. for subsequent processing with some PostScript utility). On the Linux command line, I know two options: pdf2ps and pdftops. I also know that one of the two has some issues and the other is better. But because their names are so close, I can never remember which one to use. This post should put an end to that!
[Spoiler alert and a questionable mnemonic: pdftops is da top.]
pdf2ps
The pdf2ps tool is based on Ghostscript, a PostScript interpreter. It is actually a fairly thin shell script wrapper around the "gs" tool.
Basic usage is pretty straightforward:
pdf2ps [options] input.pdf [output.ps]
In case you don't specify the output file name, it's extracted from the input file name by replacing the ".pdf" extension with ".ps".
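To make the naming rule concrete, here is a minimal sketch of how such a default name can be derived. The function name default_ps_name is made up for illustration, and I'm assuming the wrapper uses basename for this (worth checking in your installed script), which would mean the output lands in the current directory rather than next to the input file.

```shell
# Sketch of the default output name: drop the directory part and a
# trailing ".pdf", then append ".ps". The use of basename is an
# assumption about how the wrapper script does it.
default_ps_name() {
    printf '%s.ps\n' "$(basename "$1" .pdf)"
}

default_ps_name /home/me/docs/report.pdf   # prints: report.ps
```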
Help: the man page is minimalistic; the only interesting thing is the note about the option -dLanguageLevel=3. Running pdf2ps -h also gives you this info. However, as the tool is based on Ghostscript, you can also give it other options that "gs" accepts (for example -r300 to set the resolution to 300 dpi). Options have to come before the input file name.
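As a rough illustration of passing Ghostscript options through pdf2ps, here is a small wrapper. The function name pdf2ps_at_dpi and its argument order are my own invention; only -dLanguageLevel=3 and the -r resolution option come from the tools themselves.

```shell
# Hypothetical helper: convert with pdf2ps at a given resolution.
# Ghostscript options are single words (e.g. -r300) and go before
# the input file name.
pdf2ps_at_dpi() {
    dpi=$1
    input=$2
    output=$3
    [ -f "$input" ] || { echo "no such file: $input" >&2; return 1; }
    command -v pdf2ps >/dev/null 2>&1 || { echo "pdf2ps not found" >&2; return 127; }
    pdf2ps -dLanguageLevel=3 "-r$dpi" "$input" "$output"
}
```

For example, pdf2ps_at_dpi 300 input.pdf output.ps would request 300 dpi and Level 3 output.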
pdftops
The pdftops tool comes with Xpdf or its successor Poppler (e.g. on Ubuntu 9.10, it's part of the "poppler-utils" package).
Basic usage is straightforward:
pdftops [options] input.pdf [output.ps]
Help: the man page explains a lot of options and pdftops -h gives you a short version of that.
pdf2ps versus pdftops: pdftops wins hands down
Problem with pdf2ps: fonts are converted to bitmap fonts (at a pretty high resolution by default, but configurable with the -r option). For starters, the PDF-to-PS conversion can take quite some time. But more importantly, the resulting file can be huge, which puts a burden on subsequent processing of the PostScript file. I also remember having issues with pdf2ps rather aggressively cropping to bounding boxes.
With pdftops, on the other hand, the conversion takes less time, the resulting file is smaller, and fonts are better preserved. Also, pdftops provides some interesting additional options like -eps to generate an EPS file, -f and -l to limit the page range to convert, and options to control/change the page size: -origpagesizes, -nocrop, -expand, -noshrink. Check out the man page for more info.
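Two of these options in action, as a sketch: the helper names pdf_pages_to_ps and pdf_page_to_eps are hypothetical, while -f, -l and -eps are the pdftops options mentioned above. Since an EPS file holds a single image, the EPS helper pins -f and -l to the same page.

```shell
# Hypothetical helper: convert only pages first..last to PostScript.
pdf_pages_to_ps() {
    first=$1
    last=$2
    input=$3
    output=$4
    [ -f "$input" ] || { echo "no such file: $input" >&2; return 1; }
    command -v pdftops >/dev/null 2>&1 || { echo "pdftops not found" >&2; return 127; }
    pdftops -f "$first" -l "$last" "$input" "$output"
}

# Hypothetical helper: turn one page into an EPS figure.
pdf_page_to_eps() {
    page=$1
    input=$2
    output=$3
    [ -f "$input" ] || { echo "no such file: $input" >&2; return 1; }
    command -v pdftops >/dev/null 2>&1 || { echo "pdftops not found" >&2; return 127; }
    pdftops -eps -f "$page" -l "$page" "$input" "$output"
}
```

So pdf_pages_to_ps 2 5 input.pdf pages2-5.ps would convert pages 2 to 5, and pdf_page_to_eps 1 figure.pdf figure.eps would give an EPS of the first page.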
Even more Poppler goodies
Apart from the pdftops tool, the poppler-utils package also provides some other interesting tools: pdfinfo (PDF document information extractor), pdfimages (PDF image extractor), pdftohtml (PDF to HTML converter), pdftotext (PDF to text converter), and pdffonts (PDF font analyzer). Handy.
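A quick sketch of how two of these can be combined to look at a PDF before converting it, for instance to see which fonts it uses. The function name pdf_report is made up; pdfinfo and pdffonts are called in their plainest form, with just the file name.

```shell
# Hypothetical helper: print document info and the font table of a PDF.
pdf_report() {
    input=$1
    [ -f "$input" ] || { echo "no such file: $input" >&2; return 1; }
    for tool in pdfinfo pdffonts; do
        command -v "$tool" >/dev/null 2>&1 || { echo "$tool not found" >&2; return 127; }
    done
    pdfinfo "$input"     # document metadata such as page count and page size
    pdffonts "$input"    # one row per font used in the document
}
```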
thnx
thnx for sharing, didn't expect it to be made as simple as that.
Good
Good comparison
pdftops produces smaller files
Upon further testing, it appears that "pdftops" produces smaller files than "ps2ps2".
I tested it with the same PDF.
original.pdf is 8.0 KB
output_pdftops.ps is 46.2 KB
output_ps2ps2.ps is 126.4 KB
A significant difference, considering they are text files.
Open each output file with a text editor. The file produced by "ps2ps2" just has the BoundingBox comment, followed by a lot of messy code. The file produced by "pdftops" has more structured comments and code.
The differences in size in the output files clearly indicate that "ps2ps2" and "pdftops" do different things behind the scenes, even if the output file is displayed on the screen essentially the same.
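If you want to repeat this kind of size comparison on your own files, something like the following sketch works. The function name size_kb is hypothetical, and I'm assuming 1 KB = 1024 bytes, which may not be how the figures above were measured.

```shell
# Print each file's size in KB (assuming 1 KB = 1024 bytes), one decimal.
size_kb() {
    for f in "$@"; do
        bytes=$(wc -c < "$f") || return 1
        kb=$(awk -v b="$bytes" 'BEGIN { printf "%.1f", b / 1024 }')
        printf '%s\t%s KB\n' "$f" "$kb"
    done
}
```

Calling size_kb original.pdf output_pdftops.ps output_ps2ps2.ps prints one line per file, with the name and the size separated by a tab.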
Ghostscript does PDF-to-PS conversion
Ghostscript can convert PDF to PS with seemingly the same appearance as "pdftops"; however, this is apparently unknown to many.
People learn about "pdf2ps" and immediately go with that. Despite its name, use "ps2ps2" instead. According to the Ghostscript documentation, it converts both PDF and PostScript Level 3 to PostScript Level 2.
The programs "pdf2ps" or "ps2ps" are equivalent to using the "pswrite" device:
gs -sDEVICE=pswrite ...
which converts fonts into paths, a common complaint.
But "ps2ps2" is equivalent to using the special "ps2write" device, which is based on "pdfwrite" to produce PDF:
gs -sDEVICE=ps2write ...
A caveat is that the "ps2write" device is not Level-1 or DSC-compliant PostScript... whatever that means.
http://www.ghostscript.com/pipermail/gs-devel/2009-April/008328.html
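For reference, a fuller version of the "gs" calls above might look like the sketch below. The function name gs_pdf_to_ps is made up, and the extra switches (-q, -dSAFER, -dBATCH, -dNOPAUSE, -sOutputFile) are the usual ones for non-interactive runs rather than something I checked against the wrapper scripts.

```shell
# Hypothetical helper: convert PDF to PostScript with a chosen
# Ghostscript output device (e.g. ps2write or pswrite).
gs_pdf_to_ps() {
    device=$1
    input=$2
    output=$3
    [ -f "$input" ] || { echo "no such file: $input" >&2; return 1; }
    command -v gs >/dev/null 2>&1 || { echo "gs not found" >&2; return 127; }
    # -dBATCH and -dNOPAUSE make gs exit when done instead of prompting.
    gs -q -dSAFER -dBATCH -dNOPAUSE \
       -sDEVICE="$device" -sOutputFile="$output" "$input"
}
```

For example, gs_pdf_to_ps ps2write input.pdf output.ps would select the "ps2write" device.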
At least in the basic tests that I've done, "ps2ps2" and "pdftops" work the same. Maybe there is some low-level difference, but for basic conversion and for including figures in LaTeX, it seems like a simple solution.
I believe the Ghostscript documentation needs a rewrite to appeal more to the common user. Right now (Ghostscript 8.70), it seems like it was written in 1993 exclusively for professional typesetters and experts in the PostScript language.