dumppdf - Man Page

dumppdf – extract PDF structure in XML format

Synopsis

dumppdf [-h] [--version] [--debug] [--extract-toc | --extract-embedded EXTRACT_EMBEDDED] [--page-numbers PAGE_NUMBERS [PAGE_NUMBERS ...]] [--pagenos PAGENOS] [--objects OBJECTS] [--all] [--password PASSWORD] [--outfile OUTFILE] [--raw-stream | --binary-stream | --text-stream] files [files ...]

-h, --help: Show a help message and exit.
--version, -v: Show program’s version number and exit.
--debug, -d: Use debug logging level.
--extract-toc, -T: Extract the structure of the outline.
--extract-embedded EXTRACT_EMBEDDED, -E EXTRACT_EMBEDDED: Extract embedded files.
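
The two extraction modes are bracketed as alternatives in the synopsis, so a single invocation would use one or the other. A minimal sketch in Python that only builds the two command lines without running them; the helper names and the file names report.pdf and attachments are hypothetical, and treating EXTRACT_EMBEDDED as a target directory is an assumption not stated above.

```python
import shlex


def toc_command(pdf_path):
    """Build an argv list that asks dumppdf for the outline structure."""
    return ["dumppdf", "--extract-toc", pdf_path]


def embedded_command(pdf_path, target):
    """Build an argv list that asks dumppdf to extract embedded files.

    `target` is passed as the EXTRACT_EMBEDDED value (assumed here to be
    a directory that receives the extracted files).
    """
    return ["dumppdf", "--extract-embedded", target, pdf_path]


# Print shell-quoted command lines; nothing is executed here.
print(shlex.join(toc_command("report.pdf")))
print(shlex.join(embedded_command("report.pdf", "attachments")))
```

Either list could be handed to subprocess.run on a system where dumppdf is installed.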

Used during PDF parsing.

--page-numbers PAGE_NUMBERS [PAGE_NUMBERS ...]: A space-separated list of page numbers to parse.
--pagenos PAGENOS, -p PAGENOS: A comma-separated list of page numbers to parse. Included for legacy applications; use --page-numbers for more idiomatic argument entry.
--objects OBJECTS, -i OBJECTS: A comma-separated list of object numbers to extract.
--all, -a: Extract the structure of all objects.
--password PASSWORD, -P PASSWORD: The password to use for decrypting the PDF file.
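
The parsing options above can be combined in one call. The sketch below assembles such a command line in Python without running it; parse_command and report.pdf are hypothetical names. Because --page-numbers accepts several values, the sketch puts the file name before it, on the assumption that the parser would otherwise try to read the file name as another page number.

```python
import shlex


def parse_command(pdf_path, pages=None, objects=None, password=None):
    """Build a dumppdf argv list from the parsing options listed above."""
    argv = ["dumppdf"]
    if objects:
        # --objects takes a single comma-separated value.
        argv += ["--objects", ",".join(str(n) for n in objects)]
    if password is not None:
        argv += ["--password", password]
    # The file goes before --page-numbers, which takes a variable number
    # of space-separated values.
    argv.append(pdf_path)
    if pages:
        argv.append("--page-numbers")
        argv += [str(n) for n in pages]
    return argv


cmd = parse_command("report.pdf", pages=[1, 3], objects=[5, 12])
print(shlex.join(cmd))
```

The legacy --pagenos form would instead take the pages as one comma-separated value, in the same way as --objects.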

Used during output generation.

--outfile OUTFILE, -o OUTFILE: Path to the file where output is written, or “-” (default) to write to stdout.
--raw-stream, -r: Write stream objects without encoding.
--binary-stream, -b: Write stream objects with binary encoding.
--text-stream, -t: Write stream objects as plain text.
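
The three stream flags are bracketed as alternatives in the synopsis, so at most one would be passed per run. A minimal sketch that enforces this while building a command line that dumps all objects; dump_all_command, STREAM_FLAGS, and the file names are hypothetical, and the command is only printed, not executed.

```python
import shlex

# Map a short mode name (hypothetical) to the flag from the option list.
STREAM_FLAGS = {
    "raw": "--raw-stream",
    "binary": "--binary-stream",
    "text": "--text-stream",
}


def dump_all_command(pdf_path, outfile="-", stream=None):
    """Build a dumppdf argv list that dumps all objects to `outfile`.

    `stream` selects at most one stream-encoding flag; "-" as `outfile`
    means standard output, matching the documented default.
    """
    if stream is not None and stream not in STREAM_FLAGS:
        raise ValueError(f"unknown stream mode: {stream!r}")
    argv = ["dumppdf", "--all", "--outfile", outfile]
    if stream is not None:
        argv.append(STREAM_FLAGS[stream])
    argv.append(pdf_path)
    return argv


print(shlex.join(dump_all_command("report.pdf", "report.xml", stream="text")))
```

Accepting a single mode name rather than three booleans makes it impossible to request two stream encodings at once.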

October 2021