From Indie IT Wiki

Extract Text From PDF File

pdftotext pdffilename.pdf textfilename.txt
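When only part of a document is needed, pdftotext can be limited to a page range and asked to keep the page layout. The following is a minimal sketch: the function name pdf_pages_to_text is hypothetical (it is not part of pdftotext), while -f, -l and -layout are standard pdftotext options.

```shell
# Hypothetical helper (not part of pdftotext): extract a page range as text.
# Usage: pdf_pages_to_text input.pdf first_page last_page output.txt
pdf_pages_to_text() {
    if [ "$#" -ne 4 ]; then
        echo "usage: pdf_pages_to_text input.pdf first_page last_page output.txt" >&2
        return 2
    fi
    # -f and -l select the first and last page to convert;
    # -layout tries to preserve the physical layout (columns, tables).
    pdftotext -f "$2" -l "$3" -layout "$1" "$4"
}
```

For example, pdf_pages_to_text pdffilename.pdf 2 5 textfilename.txt would convert pages 2 to 5 only.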

Removing PDF Passwords

gs -q -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=out.pdf in.pdf
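If the file needs a password just to be opened, Ghostscript has to be given that password before it can rewrite the document. The sketch below wraps the same pdfwrite invocation in a hypothetical function, gs_rewrite_pdf, and passes an optional password with -sPDFPassword. It leaves out the .setpdfwrite operator, which recent Ghostscript releases are understood to have dropped; treat that as an assumption and check it against the installed version.

```shell
# Hypothetical helper: rewrite a PDF through Ghostscript's pdfwrite device.
# Usage: gs_rewrite_pdf in.pdf out.pdf [password]
gs_rewrite_pdf() {
    if [ "$#" -lt 2 ]; then
        echo "usage: gs_rewrite_pdf in.pdf out.pdf [password]" >&2
        return 2
    fi
    src=$1
    dst=$2
    pw=${3-}
    # Build the argument list in the positional parameters.
    set -- -q -dNOPAUSE -dBATCH -sDEVICE=pdfwrite "-sOutputFile=$dst"
    if [ -n "$pw" ]; then
        # Password used to decode the encrypted input file.
        set -- "$@" "-sPDFPassword=$pw"
    fi
    gs "$@" "$src"
}
```

For example, gs_rewrite_pdf in.pdf out.pdf secret would write an unprotected copy to out.pdf, assuming "secret" is the correct password.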

The PDF Toolkit For Linux


sudo aptitude install pdftk

HOWTO: Join Several Pages Together Into One PDF

pdftk file1.pdf file2.pdf file3.pdf cat output newfile.pdf
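With many files, listing each one by hand gets tedious. A minimal sketch of joining every PDF in a directory follows; the function name join_pdfs_in_dir is hypothetical, and the files are joined in the shell's glob order (alphabetical), which may not be the order you want.

```shell
# Hypothetical helper: join every PDF in a directory, in glob order.
# Usage: join_pdfs_in_dir /path/to/dir newfile.pdf
join_pdfs_in_dir() {
    dir=$1
    out=$2
    # Replace the positional parameters with the matching files.
    set -- "$dir"/*.pdf
    # If nothing matched, the pattern is left unexpanded and does not exist.
    if [ ! -e "$1" ]; then
        echo "no PDF files found in $dir" >&2
        return 1
    fi
    pdftk "$@" cat output "$out"
}
```

Write the output file somewhere outside the input directory, otherwise a later run would pick up the previous result as one of its inputs.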

HOWTO: Extract A Single Page From One PDF

pdftk ~/path/to/big.pdf cat 1 output /tmp/page1.pdf

HOWTO: Extract A Few Pages From One PDF

pdftk ~/path/to/big.pdf cat 9-11 output /tmp/pages.pdf
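The cat operation accepts several page ranges at once, and the keyword end stands for the last page, so non-adjacent pages can be pulled out in one run. A minimal sketch, using a hypothetical wrapper named extract_pages:

```shell
# Hypothetical helper: copy one or more page ranges into a new PDF.
# Usage: extract_pages input.pdf output.pdf RANGE [RANGE...]
extract_pages() {
    if [ "$#" -lt 3 ]; then
        echo "usage: extract_pages input.pdf output.pdf RANGE [RANGE...]" >&2
        return 2
    fi
    src=$1
    dst=$2
    shift 2
    # The remaining arguments are page ranges such as 1-3, 7 or 9-end.
    pdftk "$src" cat "$@" output "$dst"
}
```

For example, extract_pages ~/path/to/big.pdf /tmp/pages.pdf 1-3 7 9-end would copy pages 1 to 3, page 7, and everything from page 9 onwards.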

HOWTO: Extract All Pages Individually From One PDF

pdftk ~/path/to/big.pdf burst
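By default burst writes files named pg_0001.pdf, pg_0002.pdf and so on into the current directory. The sketch below sends the pages to a separate directory by giving burst a printf-style output pattern; the function name burst_into_dir and the page_%04d.pdf pattern are illustrative choices, not pdftk defaults.

```shell
# Hypothetical helper: split a PDF into one file per page inside a directory.
# Usage: burst_into_dir big.pdf /tmp/pages
burst_into_dir() {
    if [ "$#" -ne 2 ]; then
        echo "usage: burst_into_dir input.pdf output_dir" >&2
        return 2
    fi
    src=$1
    dest=$2
    mkdir -p "$dest" || return 1
    # pdftk fills in %04d with the zero-padded page number.
    pdftk "$src" burst output "$dest/page_%04d.pdf"
}
```

pdftk may also write a doc_data.txt report next to the pages or in the current directory, depending on the version; it can be deleted if not needed.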

HOWTO: Extract Pages From An Encrypted Document

Open the PDF with GIMP. The 'Import from PDF' dialog appears, showing a thumbnail of each page.

Click on the page you wish to extract, then click the 'Import' button.

You will now be able to save the page as a PDF with a new file name (in recent GIMP versions, via File > Export As). GIMP imports the page as an image, so text in the resulting PDF can no longer be selected or searched.

NOTE: This has only been shown to work on Linux, but there is no reason to expect it would not work on other platforms.