dumppdf - Man Page

dumppdf – extract PDF structure in XML format


Synopsis

dumppdf [-h] [--version] [--debug] [--extract-toc | --extract-embedded EXTRACT_EMBEDDED] [--page-numbers PAGE_NUMBERS [PAGE_NUMBERS ...]] [--pagenos PAGENOS] [--objects OBJECTS] [--all] [--password PASSWORD] [--outfile OUTFILE] [--raw-stream | --binary-stream | --text-stream] files [files ...]


Positional Arguments


files

One or more paths to PDF files.

Optional Arguments


-h, --help

Show a help message and exit.


--version

Show the program’s version number and exit.


--debug

Use the debug logging level.


--extract-toc

Extract the structure of the outline (table of contents).


--extract-embedded EXTRACT_EMBEDDED

Extract embedded files.
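
In the synopsis, the bracketed form [--extract-toc | --extract-embedded EXTRACT_EMBEDDED] marks the two options as mutually exclusive. The following is a minimal sketch of that convention using Python's argparse. It is not dumppdf's actual source; the program name dumppdf-sketch and the sample arguments are assumptions made for illustration.

```python
import argparse

# Hypothetical parser that mirrors only the mutually exclusive pair
# shown in the synopsis; it is not the real dumppdf parser.
parser = argparse.ArgumentParser(prog="dumppdf-sketch")
group = parser.add_mutually_exclusive_group()
group.add_argument("--extract-toc", action="store_true")
group.add_argument("--extract-embedded")
parser.add_argument("files", nargs="+")

# One option from the group is accepted.
args = parser.parse_args(["--extract-toc", "doc.pdf"])
print(args.extract_toc, args.extract_embedded, args.files)
```

Passing both options to a parser built this way makes argparse print a usage error and exit instead of running.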


The following options are used during PDF parsing.

--page-numbers PAGE_NUMBERS [PAGE_NUMBERS ...]

A space-separated list of page numbers to parse.


--pagenos PAGENOS

A comma-separated list of page numbers to parse. Included for legacy applications; use --page-numbers for more idiomatic argument entry.
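
The two page-selection options differ only in how the list is written: --page-numbers takes one argument per page, while --pagenos takes a single comma-separated value. The following sketch builds both forms from the same list. The helper name page_number_args and the sample page numbers are hypothetical, chosen only for illustration.

```python
def page_number_args(pages, legacy=False):
    """Build the dumppdf arguments that select pages (hypothetical helper).

    pages  -- iterable of integer page numbers
    legacy -- if True, use the comma-separated --pagenos form
    """
    numbers = [str(p) for p in pages]
    if not numbers:
        return []
    if legacy:
        # --pagenos takes a single comma-separated value.
        return ["--pagenos", ",".join(numbers)]
    # --page-numbers takes one argument per page.
    return ["--page-numbers", *numbers]


print(page_number_args([1, 3, 5]))               # ['--page-numbers', '1', '3', '5']
print(page_number_args([1, 3, 5], legacy=True))  # ['--pagenos', '1,3,5']
```

Because --page-numbers accepts several values, it is probably safest to put the file paths before it on the command line so that a path is not read as another page number.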


--objects OBJECTS

A comma-separated list of object numbers to extract.


--all

Extract the structure of all objects.


--password PASSWORD

The password to use for decrypting the PDF file.


The following options are used during output generation.


--outfile OUTFILE

Path to the file where output is written, or “-” (the default) to write to stdout.


--raw-stream

Write stream objects without encoding.


--binary-stream

Write stream objects with binary encoding.


--text-stream

Write stream objects as plain text.
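
As an illustration of how the output options combine, the following sketch assembles a dumppdf command line from the flags listed in the synopsis. The helper build_dumppdf_command, the mode names "raw", "binary" and "text", and the file names doc.pdf and doc.xml are all assumptions made for this example. The sketch only builds and prints the command; it does not run dumppdf.

```python
import shlex

# Maps a hypothetical mode name to the corresponding dumppdf flag.
STREAM_FLAGS = {
    "raw": "--raw-stream",
    "binary": "--binary-stream",
    "text": "--text-stream",
}


def build_dumppdf_command(files, outfile="-", stream=None, dump_all=False):
    """Return an argv list for dumppdf (hypothetical helper)."""
    if not files:
        raise ValueError("at least one PDF path is required")
    cmd = ["dumppdf"]
    if dump_all:
        cmd.append("--all")
    if stream is not None:
        if stream not in STREAM_FLAGS:
            raise ValueError(f"unknown stream mode: {stream!r}")
        # Only one stream flag is ever added, matching the synopsis.
        cmd.append(STREAM_FLAGS[stream])
    cmd += ["--outfile", outfile]
    cmd += list(files)
    return cmd


cmd = build_dumppdf_command(["doc.pdf"], outfile="doc.xml", stream="text", dump_all=True)
print(shlex.join(cmd))  # dumppdf --all --text-stream --outfile doc.xml doc.pdf
```

When dumppdf is installed, the resulting list can be passed to subprocess.run(cmd, check=True).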

