Availability: API client version >= 5.4.0
Conversion from PDF to HTML.
usage: pdf2html [options] source
Source to be converted. It can be a URL, a path to a local
file, or '-' to read the input from stdin.
Options
General Options
Password to open the encrypted PDF file.
Set the scaling factor (zoom) for the main page area.
Constraint:
- Must be a positive integer.
Default: 100
Set the page range to print.
Constraint:
- A comma-separated list of page numbers or ranges.
Examples:
- "2": Just the second page is printed.
- "1,3": The first and the third page are printed.
- "2-": Everything except the first page is printed.
- "-3": Just the first 3 pages are printed.
- "3,6-9": Pages 3, 6, 7, 8 and 9 are printed.
Set the output graphics DPI.
Availability: API client >= 5.16.0, converter >= 20.10. See versioning.
Specifies where the images are stored.
Allowed values:
- embed: The images are embedded into the output HTML file.
- separate: The images are saved to separate files. In this mode the output of the conversion is a zip file containing the HTML and all image files.
- none: The images are ignored and not converted.
Default: embed
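When the images are stored separately, the result is a zip archive rather than a single HTML file. The following is a minimal sketch of how a caller might unpack such an archive in memory. The helper name `read_html_from_zip` is hypothetical, and both the file names inside the archive and the UTF-8 encoding of the HTML are assumptions, since the source does not specify them.

```python
import io
import zipfile


def read_html_from_zip(data: bytes) -> tuple[str, list[str]]:
    """Return the text of the first HTML entry and the names of all other entries."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        names = archive.namelist()
        # Assumption: the HTML entry can be recognized by its file extension.
        html_names = [n for n in names if n.lower().endswith((".html", ".htm"))]
        if not html_names:
            raise ValueError("no HTML file found in the archive")
        # Assumption: the HTML is UTF-8 encoded.
        html = archive.read(html_names[0]).decode("utf-8")
        assets = [n for n in names if n != html_names[0]]
    return html, assets
```

The helper takes the raw bytes of the conversion output, so it works the same whether the archive was read from a file or captured from a pipe.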
Specifies the format for the output images.
Availability: API client >= 5.17.0, converter >= 20.10. See versioning.
Specifies where the style sheets are stored.
Specifies where the fonts are stored.
Sets the processing mode for handling Type 3 fonts.
Availability: API client >= 6.2.0, converter >= 24.04. See versioning.
Allowed values:
- raster: Rasterizes Type 3 fonts into images, ensuring an exact visual representation in the HTML output.
- convert: Attempts to convert Type 3 fonts to a web font, resulting in smaller file sizes with some possible visual discrepancies.
Default: raster
Converts ligatures, two or more letters combined into a single glyph, back into their individual ASCII characters.
Allowed values:
- true, 1 or on
- false, 0 or off
Default: False
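Boolean options in this reference accept three spellings for each state. A wrapper script that forwards user input to the tool might normalize those spellings first. The sketch below shows one way to do that. The function name `parse_flag` is hypothetical, and treating the values as case-insensitive is an assumption not confirmed by the text above.

```python
def parse_flag(value: str) -> bool:
    """Map the documented boolean spellings to a Python bool."""
    text = value.strip().lower()  # Assumption: matching is case-insensitive.
    if text in {"true", "1", "on"}:
        return True
    if text in {"false", "0", "off"}:
        return False
    raise ValueError(f"not a recognized boolean value: {value!r}")
```

Rejecting anything outside the six documented spellings keeps typos from being read as false.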
Apply custom CSS to the output HTML document. It allows you to modify the visual appearance and layout. Tip: Using !important in custom CSS provides a way to prioritize and override conflicting styles.
Availability: API client >= 6.2.0, converter >= 24.04. See versioning.
Add the specified prefix to all id and class attributes in the HTML content, creating a namespace for safe integration into another HTML document. This ensures unique identifiers, preventing conflicts when merging with other HTML.
Availability: API client >= 6.3.0, converter >= 24.04. See versioning.
Constraint:
- Must start with a letter or underscore and contain only letters, numbers, hyphens, underscores, or colons.
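The prefix constraint can be checked before a conversion is submitted. The sketch below expresses it as a regular expression. The name `is_valid_prefix` is hypothetical, and restricting "letters" and "numbers" to ASCII is an assumption, since the source does not say whether other alphabets are accepted.

```python
import re

# Assumption: "letters" and "numbers" mean ASCII letters and digits.
_PREFIX_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_:-]*")


def is_valid_prefix(prefix: str) -> bool:
    """Check a namespace prefix against the documented constraint."""
    return _PREFIX_RE.fullmatch(prefix) is not None
```

For example, `is_valid_prefix("pdf_")` and `is_valid_prefix("ns:item-1")` return `True`, while `is_valid_prefix("1abc")` and `is_valid_prefix("")` return `False`.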
Enforces the zip output format.
Allowed values:
- true, 1 or on
- false, 0 or off
Default: False
Set the HTML title. The title from the input PDF is used by default.
Set the HTML subject. The subject from the input PDF is used by default.
Set the HTML author. The author from the input PDF is used by default.
Associate keywords with the HTML document. Keywords from the input PDF are used by default.
Miscellaneous
Turn on debug logging. Details about the conversion are stored in the debug log. The URL of the log can be obtained from the getDebugLogUrl method or found in the conversion statistics.
Allowed values:
- true, 1 or on
- false, 0 or off
Default: False
Tag the conversion with a custom value. The tag is used in conversion statistics. A value longer than 32 characters is cut off.
A proxy server used by the Pdfcrowd conversion process for accessing source URLs with the HTTP scheme. It can help to circumvent regional restrictions or provide limited access to your intranet.
Constraint:
- The value must have the format DOMAIN_OR_IP_ADDRESS:PORT.
Examples:
- "myproxy.com:8080"
- "113.25.84.10:33333"
A proxy server used by the Pdfcrowd conversion process for accessing source URLs with the HTTPS scheme. It can help to circumvent regional restrictions or provide limited access to your intranet.
Constraint:
- The value must have the format DOMAIN_OR_IP_ADDRESS:PORT.
Examples:
- "myproxy.com:443"
- "113.25.84.10:44333"