I have sometimes come across PDF files which contain images that I would like to extract and use for other purposes. Here are two approaches that work to pull images out of a PDF file.
pdfimages -j file.pdf xxx

This will extract all (well, almost all) images from the PDF file and store them as xxx-nnn.jpg or xxx-nnn.ppm. The string "xxx" is used as a prefix for all of the generated files. Not everything that looks like an image in the PDF file will be extracted, though, and at this point the details are beyond me.
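Here is a slightly more careful version of the same idea, as a sketch rather than a finished tool. It assumes the pdfimages that ships with poppler or xpdf; the function name extract_pdf_images, the default prefix, and the output directory are made up for illustration. One likely reason some pictures get skipped is that pdfimages only pulls out embedded raster images, so anything drawn with vector operations never shows up as an image at all.

```shell
# Sketch only: extract_pdf_images is a made-up helper name, not part of poppler.
# Usage: extract_pdf_images file.pdf [prefix] [outdir]
extract_pdf_images() {
    pdf=$1
    prefix=${2:-xxx}
    outdir=${3:-images}

    if [ ! -f "$pdf" ]; then
        echo "no such file: $pdf" >&2
        return 1
    fi

    # Keep the extracted files out of the current directory.
    mkdir -p "$outdir" || return 1

    # -j writes embedded JPEG data out as .jpg; other images end up as .ppm or .pbm.
    pdfimages -j "$pdf" "$outdir/$prefix" || return 1

    # Show what was extracted.
    ls "$outdir"
}
```

Calling extract_pdf_images file.pdf pic pulled should leave files named pic-nnn.jpg or pic-nnn.ppm in the directory pulled. If your pdfimages is a reasonably recent poppler build, pdfimages -list file.pdf should print a table of the images it can see without writing any files, which is a quick way to check whether the picture you want is there at all.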
convert file.pdf xxx.jpg

This command will convert each page of the PDF file into an image, generating a series of files of the form xxx-n.jpg. This works quite well, and afterwards you can attempt to use the GIMP (good luck!) or some other tool to crop and scale the image you are really after out of the page-sized image.
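The cropping step can also be done with convert itself. The following is a sketch under a few assumptions: convert is the ImageMagick command (newer ImageMagick 7 installs call it magick instead), Ghostscript is installed so that ImageMagick can read PDF files, and the helper names pdf_page_to_jpg and crop_image are invented here. The geometry in the usage comment is only an example; you have to find the real numbers by looking at the page image.

```shell
# Sketch only: pdf_page_to_jpg and crop_image are made-up helper names.

# Render one page (counting from 0) of a PDF to a JPEG at the given resolution.
# Usage: pdf_page_to_jpg file.pdf page out.jpg [dpi]
pdf_page_to_jpg() {
    # -density must come before the input file to affect how the PDF is rasterized.
    convert -density "${4:-150}" "${1}[${2}]" "$3"
}

# Cut a WIDTHxHEIGHT+X+Y region out of an image.
# Usage: crop_image page.jpg 400x300+50+120 picture.jpg
crop_image() {
    # +repage discards the original page offset so the result starts at 0,0.
    convert "$1" -crop "$2" +repage "$3"
}
```

A higher density gives a larger and sharper page image to crop from, at the cost of a slower conversion. Some distributions ship an ImageMagick policy that refuses to read PDF files, so an error mentioning the security policy points at the configuration, not at the command line.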
The command ls /usr/bin/pdf* shows a long list of commands, worthy of investigation, including:
pdftotext pdftops pdfinfo pdftohtml pdffonts pdftosrc
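As a starting point for that investigation, here is a small sketch that runs three of these commands on one file. The function name pdf_overview is made up, and the choice of showing only the first 20 lines of text is arbitrary.

```shell
# Sketch only: pdf_overview is a made-up helper name.
# Usage: pdf_overview file.pdf
pdf_overview() {
    # Page count, page size, producer, and other metadata.
    pdfinfo "$1" || return 1
    # Fonts used in the document and whether they are embedded.
    pdffonts "$1" || return 1
    # Plain text of the document; "-" sends it to standard output.
    pdftotext "$1" - | head -n 20
}
```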
There may be more clever tricks as well; if you find any, let me know.
Adventures in Computing / [email protected]