PDF to HTML / HTTP API Reference

Conversion Input

url
The address of the PDF to convert.
Constraint:
  • The supported protocols are http:// and https://.
file
The path to a local file to convert.
Constraint:
  • The file must exist and not be empty.
data
The raw data to convert.
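
The constraints on url and file can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: the function name check_input and the wording of its messages are hypothetical, and it only mirrors the two constraints listed above.

```python
import os
from urllib.parse import urlparse


def check_input(url=None, file=None):
    """Return a list of constraint violations for the url and file options."""
    problems = []
    if url is not None:
        # Only the http:// and https:// protocols are supported.
        scheme = urlparse(url).scheme.lower()
        if scheme not in ("http", "https"):
            problems.append("url: only http:// and https:// are supported")
    if file is not None:
        # The local file must exist and must not be empty.
        if not os.path.isfile(file):
            problems.append("file: the file must exist")
        elif os.path.getsize(file) == 0:
            problems.append("file: the file must not be empty")
    return problems
```

An empty list means neither constraint was violated, for example check_input(url="https://example.com/report.pdf").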

Conversion Format

input_format
The format of the input file.
Allowed values:
  • pdf
output_format
The format of the output file.
Allowed values:
  • html
Default: html

Response

output_name
The file name of the created file (max 180 chars). If not specified, the name is auto-generated.
content_disposition
The value of the Content-Disposition HTTP header sent in the response.
Allowed values:
  • attachment
    Forces the browser to pop up a Save As dialog.
  • inline
    The browser will open the result file in the browser window.
Default: attachment

General Options

pdf_password
Password to open the encrypted PDF file.
scale_factor
Set the scaling factor (zoom) for the main page area.
Constraint:
  • Must be a positive integer.
Default: 100
Set the page range to print.
Constraint:
  • A comma separated list of page numbers or ranges.
Examples:
  • Just the second page is printed.
    2
  • The first and the third page are printed.
    1,3
  • Everything except the first page is printed.
    2-
  • Just the first 3 pages are printed.
    -3
  • Pages 3, 6, 7, 8 and 9 are printed.
    3,6-9
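
To illustrate how the page range syntax above reads, here is a small sketch that expands a range string into explicit page numbers. It is a client-side illustration only: the function name expand_page_range and the page_count argument are hypothetical, and clamping open or oversized ranges to the document length is an assumption, not documented converter behavior.

```python
def expand_page_range(spec, page_count):
    """Expand a spec such as "3,6-9" into a sorted list of 1-based page numbers."""
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            first, last = part.split("-", 1)
            # An omitted start means page 1; an omitted end means the last page.
            start = int(first) if first.strip() else 1
            end = int(last) if last.strip() else page_count
        else:
            start = end = int(part)
        # Assumption: pages outside the document are silently dropped.
        pages.update(range(max(start, 1), min(end, page_count) + 1))
    return sorted(pages)
```

With a 10-page document, expand_page_range("3,6-9", 10) yields pages 3, 6, 7, 8 and 9, matching the last example above. A malformed item such as an empty string raises ValueError.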
dpi
Set the output graphics DPI.
Available for converters >= 20.10. See versioning.
Default: 144
image_mode
Specifies where the images are stored.
Allowed values:
  • embed
    The images are embedded into the output HTML file.
  • separate
    The images are saved to separate files. In this mode the output of the conversion is a zip file containing the HTML and all image files.
  • none
    The images are ignored and not converted.
Default: embed
image_format
Specifies the format for the output images.
Available for converters >= 20.10. See versioning.
Allowed values:
  • png
  • jpg
  • svg
Default: png
css_mode
Specifies where the style sheets are stored.
Allowed values:
  • embed
    Style sheets are embedded into the output HTML file.
  • separate
    Style sheets are saved to separate files. In this mode the output of the conversion is a zip file containing the HTML and all style sheets.
Default: embed
font_mode
Specifies where the fonts are stored.
Allowed values:
  • embed
    The fonts are embedded into the output HTML file.
  • separate
    The fonts are saved to separate files. In this mode the output of the conversion is a zip file containing the HTML and all font files.
Default: embed
type3_mode
Sets the processing mode for handling Type 3 fonts.
Available for converters >= 24.04. See versioning.
Allowed values:
  • raster
    Rasterizes Type 3 fonts into images, ensuring an exact visual representation in the HTML output.
  • convert
    Attempts to convert Type 3 fonts to a web font, resulting in smaller file sizes with some possible visual discrepancies.
Default: raster
split_ligatures
Converts ligatures, two or more letters combined into a single glyph, back into their individual ASCII characters.
Allowed values:
  • true, 1 or on
  • false, 0 or off
Default: false
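
Boolean options such as split_ligatures accept three spellings for each state. A client may want to map them to one canonical form before building a request. The sketch below shows one way to do that; normalize_flag is a hypothetical helper, and accepting Python bools and mixed-case input is a convenience assumed here, not something the API is documented to do.

```python
TRUE_WORDS = ("true", "1", "on")
FALSE_WORDS = ("false", "0", "off")


def normalize_flag(value):
    """Map an accepted spelling (or a Python bool/int) to "true" or "false"."""
    # Assumption: input is lowercased for convenience before comparison.
    text = str(value).strip().lower()
    if text in TRUE_WORDS:
        return "true"
    if text in FALSE_WORDS:
        return "false"
    raise ValueError(f"not a boolean option value: {value!r}")
```

Any other value, such as "yes", raises ValueError instead of being passed on.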
custom_css
Apply custom CSS to the output HTML document. It allows you to modify the visual appearance and layout. Tip: Using !important in custom CSS provides a way to prioritize and override conflicting styles.
Available for converters >= 24.04. See versioning.
Example:
  • Set the main background color to azure.
    #page-container { background-color: azure; }
html_namespace
Add the specified prefix to all id and class attributes in the HTML content, creating a namespace for safe integration into another HTML document. This ensures unique identifiers, preventing conflicts when merging with other HTML.
Available for converters >= 24.04. See versioning.
Constraint:
  • Start with a letter or underscore, and use only letters, numbers, hyphens, underscores, or colons.
Examples:
  • pdf1_
  • uniqueID123_
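
The html_namespace constraint can be expressed as a regular expression. This is a sketch under one assumption: "letters" is read as ASCII letters only, which the text above does not state explicitly. The name is_valid_namespace is hypothetical.

```python
import re

# First character: a letter or underscore.
# Remaining characters: letters, numbers, hyphens, underscores, or colons.
NAMESPACE_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_:-]*")


def is_valid_namespace(prefix):
    """Return True if the prefix satisfies the html_namespace constraint."""
    return NAMESPACE_PATTERN.fullmatch(prefix) is not None
```

Both documented examples, pdf1_ and uniqueID123_, pass this check, while a prefix starting with a digit does not.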
force_zip
Enforces the zip output format.
Allowed values:
  • true, 1 or on
  • false, 0 or off
Default: false
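
Because several options switch the result to a zip file, a client that saves the response may want to predict the output type up front. The following sketch derives it from the option descriptions above, using the documented defaults (embed for the three modes, false for force_zip). The function name produces_zip and the use of a plain dict for the options are assumptions made for illustration.

```python
def produces_zip(options):
    """Return True if the given options lead to a zip file instead of plain HTML."""
    # force_zip accepts true, 1 or on.
    if str(options.get("force_zip", "false")).strip().lower() in ("true", "1", "on"):
        return True
    # Any mode set to "separate" yields a zip with the HTML and the extra files.
    modes = ("image_mode", "css_mode", "font_mode")
    return any(options.get(name, "embed") == "separate" for name in modes)
```

For example, produces_zip({"css_mode": "separate"}) is True, while an empty options dict gives False.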
title
Set the HTML title. The title from the input PDF is used by default.
subject
Set the HTML subject. The subject from the input PDF is used by default.
author
Set the HTML author. The author from the input PDF is used by default.
keywords
Associate keywords with the HTML document. Keywords from the input PDF are used by default.

Miscellaneous

debug_log
Turn on debug logging. Details about the conversion are stored in the debug log. The URL of the log is returned in the x-pdfcrowd-debug-log response header and is also available in the conversion statistics.
Allowed values:
  • true, 1 or on
  • false, 0 or off
Default: false
tag
Tag the conversion with a custom value. The tag is used in conversion statistics. A value longer than 32 characters is cut off.
Example:
  • client-1234
http_proxy
A proxy server used by the Pdfcrowd conversion process for accessing source URLs with the HTTP scheme. It can help to circumvent regional restrictions or provide limited access to your intranet.
Constraint:
  • The value must have the format DOMAIN_OR_IP_ADDRESS:PORT.
Examples:
  • myproxy.com:8080
  • 113.25.84.10:33333
https_proxy
A proxy server used by the Pdfcrowd conversion process for accessing source URLs with the HTTPS scheme. It can help to circumvent regional restrictions or provide limited access to your intranet.
Constraint:
  • The value must have the format DOMAIN_OR_IP_ADDRESS:PORT.
Examples:
  • myproxy.com:443
  • 113.25.84.10:44333
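
The DOMAIN_OR_IP_ADDRESS:PORT format used by http_proxy and https_proxy can be checked before a request is made. This is a hedged sketch: split_proxy is a hypothetical helper, the host pattern covers only host names and IPv4 addresses, and the 1 to 65535 port range is the general TCP port range rather than a documented limit of this API.

```python
import re

# Host: letters, digits, dots and hyphens. Port: 1 to 5 digits.
PROXY_PATTERN = re.compile(r"([A-Za-z0-9.-]+):([0-9]{1,5})")


def split_proxy(value):
    """Split "host:port" into (host, port) or raise ValueError."""
    match = PROXY_PATTERN.fullmatch(value)
    if match is None:
        raise ValueError(f"expected DOMAIN_OR_IP_ADDRESS:PORT, got {value!r}")
    host, port = match.group(1), int(match.group(2))
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return host, port
```

A value that includes a scheme, such as http://myproxy.com:8080, is rejected because the format above carries only the host and the port.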