PDF to HTML / Command Line Reference

Availability: API client version >= 5.4.0

Conversion from PDF to HTML.

usage: pdf2html [options] source
source
Source to be converted. It can be a URL, a path to a local file, or '-' to read the input from stdin.
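
The three accepted source forms can be told apart with a small helper. This is only an illustrative sketch: classify_source is a hypothetical function, not part of pdf2html, and it assumes that URLs start with http:// or https://. How the tool itself distinguishes the forms is not stated in this reference.

```shell
# Hypothetical helper: report which of the three documented source forms
# an argument looks like. Assumes URLs begin with http:// or https://.
classify_source() {
    case "$1" in
        -)                  echo "stdin" ;;
        http://*|https://*) echo "url" ;;
        *)                  echo "file" ;;
    esac
}

classify_source "https://example.com/report.pdf"
classify_source "./report.pdf"
classify_source "-"
```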

Options

General Options

-pdf-password

Password to open the encrypted PDF file.

-scale-factor

Set the scaling factor (zoom) for the main page area.

Constraint:
  • Must be a positive integer.
Default:
100
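
A wrapper script may want to check the constraint before starting a conversion. The sketch below is one possible check in POSIX shell; is_positive_int is a hypothetical helper, and it assumes that "positive integer" means a string of decimal digits with a value greater than zero.

```shell
# Hypothetical pre-check for -scale-factor: succeed only for digit strings
# whose value is greater than zero.
is_positive_int() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
    esac
    case "$1" in
        *[1-9]*) return 0 ;;       # at least one non-zero digit
        *)       return 1 ;;       # only zeros
    esac
}

for v in 100 0 1.5; do
    if is_positive_int "$v"; then
        echo "$v: accepted"
    else
        echo "$v: rejected"
    fi
done
```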

-dpi

Set the output graphics DPI.

Availability:
API client >= 5.16.0, converter >= 20.10. See versioning.
Default:
144

-image-mode

Specifies where the images are stored.

Default:
embed
Allowed Values:
  • embed — The images are embedded into the output HTML file.
  • separate — The images are saved to separate files. In this mode the output of the conversion is a zip file containing the HTML and all image files.
  • none — The images are ignored and not converted.

-image-format

Specifies the format for the output images.

Availability:
API client >= 5.17.0, converter >= 20.10. See versioning.
Default:
png
Allowed Values:
  • png
  • jpg
  • svg

-css-mode

Specifies where the style sheets are stored.

Default:
embed
Allowed Values:
  • embed — Style sheets are embedded into the output HTML file.
  • separate — Style sheets are saved to separate files. In this mode the output of the conversion is a zip file containing the HTML and all style sheets.

-font-mode

Specifies where the fonts are stored.

Default:
embed
Allowed Values:
  • embed — The fonts are embedded into the output HTML file.
  • separate — The fonts are saved to separate files. In this mode the output of the conversion is a zip file containing the HTML and all font files.

-type3-mode

Sets the processing mode for handling Type 3 fonts.

Availability:
API client >= 6.2.0, converter >= 24.04. See versioning.
Default:
raster
Allowed Values:
  • raster — Rasterizes Type 3 fonts into images, ensuring an exact visual representation in the HTML output.
  • convert — Attempts to convert Type 3 fonts to web fonts, which yields smaller files but may introduce visual discrepancies.

-split-ligatures

Converts ligatures (two or more letters combined into a single glyph) back into their individual ASCII characters.

Default:
false
Allowed Values:
  • true, 1 or on
  • false, 0 or off

-custom-css

Apply custom CSS to the output HTML document to modify its visual appearance and layout. Tip: use !important in the custom CSS to override conflicting styles.

Availability:
API client >= 6.2.0, converter >= 24.04. See versioning.
Example:
  • Set the main background color to azure: "#page-container { background-color: azure; }"

-html-namespace

Add the specified prefix to all id and class attributes in the HTML content. This creates a namespace that keeps identifiers unique and prevents conflicts when the output is integrated into another HTML document.

Availability:
API client >= 6.3.0, converter >= 24.04. See versioning.
Constraint:
  • Must start with a letter or underscore and contain only letters, digits, hyphens, underscores, or colons.
Examples:
  • Namespace for first PDF embed: "pdf1_"
  • Custom namespace to avoid conflicts: "uniqueID123_"
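
The constraint can be checked before a value is passed to the option. The sketch below is one way to do it with POSIX shell patterns; is_valid_namespace is a hypothetical helper, and it assumes that "letters" means ASCII letters only.

```shell
# Hypothetical pre-check for -html-namespace values.
# Assumes "letters" means ASCII A-Z and a-z.
is_valid_namespace() {
    case "$1" in
        ''|[!A-Za-z_]*)    return 1 ;;   # empty, or bad first character
        *[!A-Za-z0-9_:-]*) return 1 ;;   # contains a disallowed character
    esac
    return 0
}

for ns in "pdf1_" "uniqueID123_" "1pdf" "my ns"; do
    if is_valid_namespace "$ns"; then
        echo "'$ns': accepted"
    else
        echo "'$ns': rejected"
    fi
done
```

Both example values listed for this option pass the check, while a value starting with a digit or containing a space is rejected.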

-force-zip

Enforces the zip output format.

Default:
false
Allowed Values:
  • true, 1 or on
  • false, 0 or off
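
Taken together, the storage-mode options and -force-zip determine whether a script should expect a zip archive or a single HTML file. The sketch below encodes that reading of the descriptions above; output_format is a hypothetical helper, and it assumes the output is a plain HTML file whenever no mode is set to separate and -force-zip is off.

```shell
# Hypothetical helper: predict the output container.
# Arguments: image_mode css_mode font_mode force_zip
output_format() {
    # Any "separate" mode produces a zip file, per the option descriptions.
    if [ "$1" = "separate" ] || [ "$2" = "separate" ] || [ "$3" = "separate" ]; then
        echo "zip"
        return 0
    fi
    # Otherwise the result is a zip only when -force-zip is enabled.
    case "$4" in
        true|1|on) echo "zip" ;;
        *)         echo "html" ;;   # assumption: single HTML file
    esac
}

output_format embed embed embed false
output_format embed separate embed false
output_format embed embed embed on
```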

-title

Set the HTML title. The title from the input PDF is used by default.

-subject

Set the HTML subject. The subject from the input PDF is used by default.

-author

Set the HTML author. The author from the input PDF is used by default.

-keywords

Associate keywords with the HTML document. Keywords from the input PDF are used by default.

Miscellaneous

-debug-log

Turn on debug logging. Details about the conversion are stored in the debug log. The URL of the log can be obtained from the getDebugLogUrl method or from the conversion statistics.

Default:
false
Allowed Values:
  • true, 1 or on
  • false, 0 or off

-tag

Tag the conversion with a custom value. The tag is used in conversion statistics. A value longer than 32 characters is cut off.

Example:
  • Track job in analytics: "client-1234"
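
To see in advance what a long tag will look like in the statistics, the 32-character cut-off can be reproduced locally. This is a sketch only: truncate_tag is a hypothetical helper, and it assumes ASCII tags, since the reference does not say whether the limit counts bytes or characters.

```shell
# Hypothetical helper: keep at most the first 32 characters of a tag.
# Intended for ASCII tags; multi-byte handling depends on the shell.
truncate_tag() {
    printf '%.32s' "$1"
}

truncate_tag "client-1234"
echo
truncate_tag "0123456789012345678901234567890123456789"
echo
```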

-http-proxy

A proxy server used by the conversion process for accessing source URLs that use the HTTP scheme. It can help to circumvent regional restrictions or provide limited access to your intranet.

Constraint:
  • The value must have the format DOMAIN_OR_IP_ADDRESS:PORT.
Examples:
  • Corporate proxy server: "myproxy.com:8080"
  • Direct IP proxy connection: "113.25.84.10:33333"

-https-proxy

A proxy server used by the conversion process for accessing source URLs that use the HTTPS scheme. It can help to circumvent regional restrictions or provide limited access to your intranet.

Constraint:
  • The value must have the format DOMAIN_OR_IP_ADDRESS:PORT.
Examples:
  • Secure proxy for HTTPS: "myproxy.com:443"
  • Direct secure proxy IP: "113.25.84.10:44333"
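
The DOMAIN_OR_IP_ADDRESS:PORT format shared by -http-proxy and -https-proxy can be checked before a value is passed on. The sketch below is one possible check; is_valid_proxy is a hypothetical helper, and it assumes a host made of ASCII letters, digits, dots, and hyphens plus a port in the range 1 to 65535. The tool's own validation may differ.

```shell
# Hypothetical pre-check for proxy values in HOST:PORT form.
# Note: sets the global variables "host" and "port".
is_valid_proxy() {
    host=${1%:*}     # text before the last ':'
    port=${1##*:}    # text after the last ':'
    [ "$host" != "$1" ] || return 1          # no ':' in the value
    case "$host" in
        ''|*[!A-Za-z0-9.-]*) return 1 ;;     # empty or disallowed character
    esac
    case "$port" in
        ''|*[!0-9]*) return 1 ;;             # empty or not all digits
    esac
    [ "$port" -ge 1 ] && [ "$port" -le 65535 ]
}

for p in "myproxy.com:8080" "113.25.84.10:44333" "http://myproxy.com:8080"; do
    if is_valid_proxy "$p"; then
        echo "$p: accepted"
    else
        echo "$p: rejected"
    fi
done
```

A value that includes a scheme prefix such as http:// is rejected by this check, because the format consists of only the host and the port.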