Important: This document describes the beta version of the new Pdfcrowd API. For the stable API version, use the stable API documentation instead.

HTML to PDF API - HTTP Documentation

Overview

The Pdfcrowd API is HTTP-based: all communication is made through normal HTTP requests. You call the API by sending an HTTP request to the API server address with the options passed as POST data.

The POST request's content type must be multipart/form-data if the request includes any local files. Otherwise, application/x-www-form-urlencoded is accepted as well.
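
As a rough sketch of the two content types, assuming curl as the client: --data-urlencode produces an application/x-www-form-urlencoded body, which is enough when no local file is uploaded, while -F switches the request to multipart/form-data. The function names and the PDFCROWD_USER and PDFCROWD_KEY variables are placeholders chosen for this illustration, not part of the API.

```shell
# Hypothetical helpers; PDFCROWD_USER and PDFCROWD_KEY are assumed to hold
# your Pdfcrowd username and API key.
API_URL='https://api.pdfcrowd.com/convert/'

# No local file involved: an application/x-www-form-urlencoded body is enough.
convert_url_urlencoded() {  # usage: convert_url_urlencoded URL OUTPUT.pdf
    curl -f -u "$PDFCROWD_USER:$PDFCROWD_KEY" -o "$2" \
        --data-urlencode "url=$1" \
        "$API_URL"
}

# A local file is uploaded: -F makes curl send multipart/form-data.
convert_file_multipart() {  # usage: convert_file_multipart INPUT.html OUTPUT.pdf
    curl -f -u "$PDFCROWD_USER:$PDFCROWD_KEY" -o "$2" \
        -F "file=@$1" \
        "$API_URL"
}
```

For example, convert_url_urlencoded 'http://www.example.com' example.pdf sends the same url parameter as the multipart examples in this document, only with a different request encoding.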

You can also check out our API client libraries if you want to implement the API in your favorite programming language.

Authentication

Authentication is required to use the Pdfcrowd API. The credentials for accessing the API are your Pdfcrowd username and your API key. You can find the API key on your account page.

The authentication method for user credentials is HTTP Basic Access Authentication. You provide your credentials every time you make a request.
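
To make the mechanism concrete, here is a minimal sketch of what HTTP Basic authentication puts on the wire: an Authorization header carrying the base64 encoding of username:apikey. The helper name basic_auth_header is made up for this example, and it assumes the base64 utility is installed.

```shell
# Build the Authorization header used by HTTP Basic authentication:
# "Basic " followed by base64("username:apikey").
basic_auth_header() {  # usage: basic_auth_header USERNAME APIKEY
    printf 'Authorization: Basic %s\n' \
        "$(printf '%s:%s' "$1" "$2" | base64 | tr -d '\n')"
}

basic_auth_header 'username' 'apikey'
# prints: Authorization: Basic dXNlcm5hbWU6YXBpa2V5
```

With curl you normally do not need this helper, because -u 'username:apikey' builds the same header for you. It is mainly useful when you call the API from an HTTP client without built-in Basic authentication support.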

Server Address

The server address is https://api.pdfcrowd.com/convert/

Both HTTP and HTTPS protocols are supported.

Getting Started

The API lets you convert a web page, a local HTML file, or a string containing HTML.

The best way to start with the API is to pick one of the examples below, get it working, and then adapt it using the parameters described in the API reference.

Input HTML Tips

  • You can use the following classes in your HTML code which hide/remove elements from the output:
    • pdfcrowd-remove - sets display:none on the element
    • pdfcrowd-hide - sets visibility:hidden on the element
  • You can switch to the print version of the page (if it exists) with use_print_media.
  • You can force a page break with
    <div style="page-break-before:always"></div>
  • You can avoid a page break inside an element with the following CSS
    img { page-break-inside:avoid }
  • You can use custom_javascript to alter the HTML contents with custom JavaScript.
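
The tips above can be combined in a single document. The following sketch writes a small sample file; the file name tips-demo.html and its contents are invented for illustration.

```shell
# Write a small sample document that uses the tips above.
cat > tips-demo.html <<'EOF'
<html>
  <head>
    <style>
      /* keep images from being split across two pages */
      img { page-break-inside:avoid }
    </style>
  </head>
  <body>
    <div class="pdfcrowd-remove">Navigation bar, removed from the PDF (display:none)</div>
    <h1>Report</h1>
    <p class="pdfcrowd-hide">Invisible in the PDF, but still occupies space (visibility:hidden)</p>
    <div style="page-break-before:always"></div>
    <h2>This heading starts on a new page</h2>
  </body>
</html>
EOF
```

The file can then be uploaded with -F 'file=@tips-demo.html', as in the local HTML file example in this document.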

Convert a web page to a PDF file

curl -f -u 'username:apikey' \
    -o example.pdf  \
    -F 'url=http://www.example.com' \
    https://api.pdfcrowd.com/convert/

Convert a local HTML file to a PDF file

curl -f -u 'username:apikey' \
    -o MyLayout.pdf  \
    -F 'file=@/path/to/MyLayout.html' \
    https://api.pdfcrowd.com/convert/

Convert a string containing HTML to a PDF file

curl -f -u 'username:apikey' \
    -o HelloWorld.pdf  \
    -F 'text= <html><body><h1>Hello World!</h1></body></html>' \
    https://api.pdfcrowd.com/convert/

# or pipe the HTML from a custom producer ('text=<-' makes curl read the field value from stdin)
html_producer | curl -u 'username:apikey' \
    -o HelloWorld.pdf  \
    -F 'text=<-' \
    https://api.pdfcrowd.com/convert/

HTML to PDF API Reference

Conversion Input

Parameter Description Default
url
The address of the web page to convert.
The supported protocols are http:// and https://.
file
The path to a local file to convert.
The file can be either a single file or an archive (.tar.gz, .tar.bz2, or .zip).
If the HTML document refers to local external assets (images, style sheets, javascript), zip the document together with the assets.
The file must exist and not be empty.
The file name must have a valid extension.
text
The string content to convert.
The string must not be empty.
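
As an illustration of the archive input, the sketch below packs an HTML document together with a style sheet into a .tar.gz file. The site directory and its files are hypothetical. How the service picks the main document inside an archive is not described here, so treat the layout as an assumption.

```shell
# Hypothetical layout: site/index.html refers to site/css/style.css
# through a relative path.
mkdir -p site/css
printf '%s\n' '<html><head><link rel="stylesheet" href="css/style.css"></head><body><h1>Hello</h1></body></html>' > site/index.html
printf '%s\n' 'h1 { color: navy }' > site/css/style.css

# Pack the document together with its assets; -C keeps the paths inside
# the archive relative to the site directory.
tar -czf site.tar.gz -C site index.html css

# List the archive contents to confirm that the assets are included.
tar -tzf site.tar.gz
```

The archive is then uploaded like any other local file, for example with -F 'file=@site.tar.gz'.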

 

Response

Parameter Description Default
output_name
The file name of the created file (max 180 characters). If not specified, the name is auto-generated.
content_disposition
The value of the Content-Disposition HTTP header sent in the response.
Allowed values:
  • attachment
    Forces the browser to pop up a Save As dialog.
  • inline
    The browser will open the result file in the browser window.
Default: attachment
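
A minimal sketch of the response parameters, assuming curl and placeholder credentials in PDFCROWD_USER and PDFCROWD_KEY; the function name is invented. Note that content_disposition matters mostly when a browser receives the response directly. With curl, the local file name is whatever you pass to -o.

```shell
API_URL='https://api.pdfcrowd.com/convert/'

# Ask for a named result that a browser would display inline.
convert_for_browser() {  # usage: convert_for_browser URL OUTPUT_NAME LOCAL_FILE
    curl -f -u "$PDFCROWD_USER:$PDFCROWD_KEY" -o "$3" \
        -F "url=$1" \
        -F "output_name=$2" \
        -F 'content_disposition=inline' \
        "$API_URL"
}
```

For example, convert_for_browser 'http://www.example.com' report.pdf local-copy.pdf requests the name report.pdf and saves the response body to local-copy.pdf.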

 

Page setup

Parameter Description Default
page_size
Set the output page size.
Allowed values:
  • A2
  • A3
  • A4
  • A5
  • A6
  • Letter
Default: A4
page_width
Set the output page width.
Can be specified in inches (in), millimeters (mm), centimeters (cm), or points (pt).
Examples:
  • 300mm
  • 9.5in
Default: 8.27in
page_height
Set the output page height. Use -1 for a single page PDF.
Can be -1 or specified in inches (in), millimeters (mm), centimeters (cm), or points (pt).
Examples:
  • 350mm
  • 15.25in
Default: 11.7in
orientation
Set the output page orientation.
Allowed values:
  • landscape
  • portrait
Default: portrait
margin_top
Set the output page top margin.
Can be specified in inches (in), millimeters (mm), centimeters (cm), or points (pt).
Examples:
  • 1in
  • 2.5cm
Default: 0.4in
margin_right
Set the output page right margin.
Can be specified in inches (in), millimeters (mm), centimeters (cm), or points (pt).
Examples:
  • 1in
  • 2.5cm
Default: 0.4in
margin_bottom
Set the output page bottom margin.
Can be specified in inches (in), millimeters (mm), centimeters (cm), or points (pt).
Examples:
  • 1in
  • 2.5cm
Default: 0.4in
margin_left
Set the output page left margin.
Can be specified in inches (in), millimeters (mm), centimeters (cm), or points (pt).
Examples:
  • 1in
  • 2.5cm
Default: 0.4in
no_margins
Disable margins.
Allowed values:
  • true, 1 or on
  • false, 0 or off
Default: false
header_url
Load HTML code from the specified URL and use it as the page header. The following classes can be used in the HTML; the content of the respective elements is expanded as follows:
  • pdfcrowd-page-count - the total page count of printed pages
  • pdfcrowd-page-number - the current page number
  • pdfcrowd-source-url - the source URL of a converted document
The following attributes can be used:
  • data-pdfcrowd-number-format - specifies the type of the used numerals
    • Arabic numerals are used by default.
    • Roman numerals can be generated by the roman and roman-lowercase values
    • Example: <span class='pdfcrowd-page-number' data-pdfcrowd-number-format='roman'></span>
  • data-pdfcrowd-placement - specifies where to place the source URL. Allowed values:
    • attribute omitted - the URL is inserted into the content
      • Example: <span class='pdfcrowd-source-url'></span>
        will produce <span>http://example.com</span>
    • href - the URL is set to the href attribute
      • Example: <a class='pdfcrowd-source-url' data-pdfcrowd-placement='href'>Link to source</a>
        will produce <a href='http://example.com'>Link to source</a>
    • href-and-content - the URL is set to the href attribute and to the content
      • Example: <a class='pdfcrowd-source-url' data-pdfcrowd-placement='href-and-content'></a>
        will produce <a href='http://example.com'>http://example.com</a>
The supported protocols are http:// and https://.
Examples:
  • http://myserver.com/header.html
header_html
Use the specified HTML code as the page header. The following classes can be used in the HTML; the content of the respective elements is expanded as follows:
  • pdfcrowd-page-count - the total page count of printed pages
  • pdfcrowd-page-number - the current page number
  • pdfcrowd-source-url - the source URL of a converted document
The following attributes can be used:
  • data-pdfcrowd-number-format - specifies the type of the used numerals
    • Arabic numerals are used by default.
    • Roman numerals can be generated by the roman and roman-lowercase values
    • Example: <span class='pdfcrowd-page-number' data-pdfcrowd-number-format='roman'></span>
  • data-pdfcrowd-placement - specifies where to place the source URL. Allowed values:
    • attribute omitted - the URL is inserted into the content
      • Example: <span class='pdfcrowd-source-url'></span>
        will produce <span>http://example.com</span>
    • href - the URL is set to the href attribute
      • Example: <a class='pdfcrowd-source-url' data-pdfcrowd-placement='href'>Link to source</a>
        will produce <a href='http://example.com'>Link to source</a>
    • href-and-content - the URL is set to the href attribute and to the content
      • Example: <a class='pdfcrowd-source-url' data-pdfcrowd-placement='href-and-content'></a>
        will produce <a href='http://example.com'>http://example.com</a>
The string must not be empty.
Examples:
  • It displays the page number and the total page count.
    Page <span class='pdfcrowd-page-number'></span> of <span class='pdfcrowd-page-count'></span> pages
header_height
Set the header height.
Can be specified in inches (in), millimeters (mm), centimeters (cm), or points (pt).
Examples:
  • 30mm
  • 1in
Default: 0.5in
print_page_range
Set the page range to print.
A comma-separated list of page numbers or ranges.
Examples:
  • Just the second page is printed.
    2
  • The first and the third page are printed.
    1,3
  • Everything except the first page is printed.
    2-
  • Just the first 3 pages are printed.
    -3
  • Pages 3, 6, 7, 8 and 9 are printed.
    3,6-9
page_watermark
Apply the first page of the watermark PDF to every page of the output PDF.
The file must exist and not be empty.
multipage_watermark
Apply each page of the specified watermark PDF to the corresponding page of the output PDF.
The file must exist and not be empty.
page_background
Apply the first page of the specified PDF to the background of every page of the output PDF.
The file must exist and not be empty.
multipage_background
Apply each page of the specified PDF to the background of the corresponding page of the output PDF.
The file must exist and not be empty.
exclude_header_on_pages
The page header is not printed on the specified pages.
A comma-separated list of page numbers.
Examples:
  • The header is not printed on the second page.
    2
  • The header is not printed on the first and the last page.
    1,-1
page_numbering_offset
Set an offset between physical and logical page numbers.
Examples:
  • The page numbering will start with 0. Set exclude_header_on_pages to "1" and the page numbering will start on the second page with 1.
    1
  • The page numbering will start with 11 on the first page. It can be useful for joining documents.
    -10
Default: 0
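
The page setup parameters can be combined in one request. The sketch below assumes curl and placeholder credentials in PDFCROWD_USER and PDFCROWD_KEY; the function name and the chosen values are only examples. The header HTML is passed with --form-string so that curl sends it literally, without interpreting characters such as ; or a leading < in the value.

```shell
API_URL='https://api.pdfcrowd.com/convert/'

# Landscape Letter pages with custom margins and a page-number header.
# The header is left off the first page and, with an offset of 1, the
# numbering starts on the second page with 1.
convert_landscape_letter() {  # usage: convert_landscape_letter URL OUTPUT.pdf
    curl -f -u "$PDFCROWD_USER:$PDFCROWD_KEY" -o "$2" \
        -F "url=$1" \
        -F 'page_size=Letter' \
        -F 'orientation=landscape' \
        -F 'margin_top=1in' \
        -F 'margin_bottom=2.5cm' \
        --form-string "header_html=Page <span class='pdfcrowd-page-number'></span> of <span class='pdfcrowd-page-count'></span> pages" \
        -F 'header_height=30mm' \
        -F 'exclude_header_on_pages=1' \
        -F 'page_numbering_offset=1' \
        "$API_URL"
}
```

A call such as convert_landscape_letter 'http://www.example.com' example.pdf would send all of these options in a single multipart request.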

 

General Options

Parameter Description Default
no_background
Do not print the background graphics.
Allowed values:
  • true, 1 or on
  • false, 0 or off
Default: false
disable_javascript
Do not execute JavaScript.
Allowed values:
  • true, 1 or on
  • false, 0 or off
Default: false
disable_image_loading
Do not load images.
Allowed values:
  • true, 1 or on
  • false, 0 or off
Default: false
disable_remote_fonts
Disable loading fonts from remote sources.
Allowed values:
  • true, 1 or on
  • false, 0 or off
Default: false
block_ads
Try to block ads. Enabling this option can produce smaller output and speed up the conversion.
Allowed values:
  • true, 1 or on
  • false, 0 or off
Default: false
default_encoding
Set the default HTML content text encoding.
Default: auto detect
http_auth_user_name
Set the HTTP authentication user name.
http_auth_password
Set the HTTP authentication password.
use_print_media
Use the print version of the page if available (@media print).
Allowed values:
  • true, 1 or on
  • false, 0 or off
Default: false
no_xpdfcrowd_header
Do not send the X-Pdfcrowd HTTP header in Pdfcrowd HTTP requests.
Allowed values:
  • true, 1 or on
  • false, 0 or off
Default: false
cookies
Set cookies that are sent in Pdfcrowd HTTP requests.
Examples:
  • session=6d7184b3bf35;token=2710
verify_ssl_certificates
Do not allow insecure HTTPS connections.
Allowed values:
  • true, 1 or on
  • false, 0 or off
Default: false
fail_on_main_url_error
Abort the conversion if the main URL HTTP status code is greater than or equal to 400.
Allowed values:
  • true, 1 or on
  • false, 0 or off
Default: false
fail_on_any_url_error
Abort the conversion if the HTTP status code of any sub-request is greater than or equal to 400.
Allowed values:
  • true, 1 or on
  • false, 0 or off
Default: false
custom_javascript
Run a custom JavaScript after the document is loaded. The script is intended for post-load DOM manipulation (add/remove elements, update CSS, ...).
The string must not be empty.
custom_http_header
Set a custom HTTP header that is sent in Pdfcrowd HTTP requests.
A string containing the header name and value separated by a colon.
Examples:
  • X-My-Client-ID:k2017-12345
javascript_delay
Wait the specified number of milliseconds to finish all JavaScript after the document is loaded. The maximum value is determined by your API license.
Must be a positive integer number or 0.
Default: 200
element_to_convert
Convert only the specified element and its children. The element is specified by one or more CSS selectors. If the element is not found, the conversion fails. If multiple elements are found, the first one is used.
The string must not be empty.
Examples:
  • The first element with the id main-content is converted.
    #main-content
  • The first element with the class name main-content is converted.
    .main-content
  • The first element with the tag name table is converted.
    table
  • The first element with the tag name table or with the id main-content is converted.
    table, #main-content
  • The first element <p class="article"> within <div class="user-panel main"> is converted.
    div.user-panel.main p.article
element_to_convert_mode
Specify the DOM handling when only a part of the document is converted.
Allowed values:
  • cut-out
    The element and its children are cut out of the document.
  • remove-siblings
    All siblings of the element are removed.
  • hide-siblings
    All siblings of the element are hidden.
Default: cut-out
wait_for_element
Wait for the specified element in a source document. The element is specified by one or more CSS selectors. If the element is not found, the conversion fails.
The string must not be empty.
Examples:
  • Wait until an element with the id main-content is found.
    #main-content
  • Wait until an element with the class name main-content is found.
    .main-content
  • Wait until an element with the tag name table is found.
    table
  • Wait until an element with the tag name table or with the id main-content is found.
    table, #main-content
  • Wait until <p class="article"> is found within <div class="user-panel main">.
    div.user-panel.main p.article
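
Several of the general options are typically used together when only part of a dynamic page should be converted. The sketch below assumes curl and placeholder credentials in PDFCROWD_USER and PDFCROWD_KEY. The function name, the .ad class and the delay of 500 ms are hypothetical (the allowed maximum delay depends on your license). Values that contain a semicolon are passed with --form-string, because curl's -F would otherwise treat the text after ; as a field attribute and drop it.

```shell
API_URL='https://api.pdfcrowd.com/convert/'

# Wait for #main-content, strip hypothetical .ad elements with custom
# JavaScript, then convert only #main-content.
convert_main_content() {  # usage: convert_main_content URL OUTPUT.pdf
    curl -f -u "$PDFCROWD_USER:$PDFCROWD_KEY" -o "$2" \
        -F "url=$1" \
        -F 'wait_for_element=#main-content' \
        -F 'element_to_convert=#main-content' \
        -F 'element_to_convert_mode=cut-out' \
        -F 'javascript_delay=500' \
        --form-string 'custom_javascript=document.querySelectorAll(".ad").forEach(function (e) { e.remove(); });' \
        --form-string 'cookies=session=6d7184b3bf35;token=2710' \
        --form-string 'custom_http_header=X-My-Client-ID:k2017-12345' \
        "$API_URL"
}
```

The cookie and header values are the ones from the examples in the table above.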

 

Print Resolution

Parameter Description Default
viewport_width
Set the viewport width in pixels. The viewport is the user's visible area of the page.
The value must be in the range 96-7680.
Default: 1024
viewport_height
Set the viewport height in pixels. The viewport is the user's visible area of the page.
Must be a positive integer number.
Default: 768
rendering_mode
Set the rendering mode.
Allowed values:
  • default
    This mode is compatible with the Chrome preview.
  • viewport
    Takes the viewport width into account.
Default: default
scale_factor
Set the scaling factor (zoom) for the main page area.
The value must be in the range 10-500.
Default: 100
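
Because out-of-range values are rejected, it can be convenient to check them before sending a request. The following is a small client-side sketch with invented function names; it assumes that both ranges are inclusive, which the table does not state explicitly.

```shell
# Return 0 when VALUE is a non-negative integer within [MIN, MAX].
in_range() {  # usage: in_range VALUE MIN MAX
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
    esac
    [ "$1" -ge "$2" ] && [ "$1" -le "$3" ]
}

# Limits taken from the table above (assumed inclusive).
check_print_resolution() {  # usage: check_print_resolution VIEWPORT_WIDTH SCALE_FACTOR
    in_range "$1" 96 7680 || { echo "viewport_width out of range: $1" >&2; return 1; }
    in_range "$2" 10 500  || { echo "scale_factor out of range: $2" >&2; return 1; }
}

check_print_resolution 1280 150 && echo 'ok'
# prints: ok
```

Validated values can then be passed on as, for example, -F 'viewport_width=1280' -F 'scale_factor=150'.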

 

PDF format

Miscellaneous options for the PDF output.

Parameter Description Default
linearize
Create a linearized PDF. This is also known as Fast Web View.
Allowed values:
  • true, 1 or on
  • false, 0 or off
Default: false
encrypt
Encrypt the PDF. This prevents search engines from indexing the contents.
Allowed values:
  • true, 1 or on
  • false, 0 or off
Default: false
user_password
Protect the PDF with a user password. When a PDF has a user password, it must be supplied in order to view the document and to perform operations allowed by the access permissions.
owner_password
Protect the PDF with an owner password. Supplying an owner password grants unlimited access to the PDF including changing the passwords and access permissions.
no_print
Disallow printing of the output PDF.
Allowed values:
  • true, 1 or on
  • false, 0 or off
Default: false
no_modify
Disallow modification of the output PDF.
Allowed values:
  • true, 1 or on
  • false, 0 or off
Default: false
no_copy
Disallow text and graphics extraction from the output PDF.
Allowed values:
  • true, 1 or on
  • false, 0 or off
Default: false
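
A sketch of a protected PDF request, assuming curl and placeholder credentials in PDFCROWD_USER and PDFCROWD_KEY. The function name and the PDF_USER_PASSWORD and PDF_OWNER_PASSWORD variables are invented for this example. Sending encrypt together with the passwords is an assumption; the table does not say whether the passwords imply encryption. The passwords are passed with --form-string so that curl sends them literally.

```shell
API_URL='https://api.pdfcrowd.com/convert/'

# Encrypted PDF that needs a password to open and forbids printing and
# text or graphics extraction.
convert_protected() {  # usage: convert_protected URL OUTPUT.pdf
    curl -f -u "$PDFCROWD_USER:$PDFCROWD_KEY" -o "$2" \
        -F "url=$1" \
        -F 'encrypt=true' \
        --form-string "user_password=$PDF_USER_PASSWORD" \
        --form-string "owner_password=$PDF_OWNER_PASSWORD" \
        -F 'no_print=true' \
        -F 'no_copy=true' \
        "$API_URL"
}
```

Anyone who opens the result with the owner password has unlimited access, so the two passwords should differ.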

 

Miscellaneous

Parameter Description Default
debug_log
Turn on the debug logging. The URL of the log is returned in the X-Pdfcrowd-Debug-Log response header.
Allowed values:
  • true, 1 or on
  • false, 0 or off
Default: false

 

Response Headers

The HTTP response can contain the following headers.
You can find details about each conversion in your conversion log.

Name Description
X-Pdfcrowd-Debug-Log - URL to the debug log
X-Pdfcrowd-Remaining-Credits - the number of available conversion credits in your account
X-Pdfcrowd-Consumed-Credits - the number of credits consumed by the conversion
X-Pdfcrowd-Job-Id - the unique ID of the conversion
X-Pdfcrowd-Pages - the total number of pages in the output document
X-Pdfcrowd-Output-Size - the size of the output in bytes
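
One way to read these headers with curl is to add -D headers.txt to a conversion request, which writes the response headers to a file, and then look up individual values. The helper below is a sketch with an invented name. The sample header dump is made up purely to demonstrate the parsing, and real values will differ.

```shell
# Print the value of one response header from a dump written by curl -D.
# Header names are matched case-insensitively; carriage returns are stripped.
header_value() {  # usage: header_value HEADER_FILE HEADER_NAME
    tr -d '\r' < "$1" | awk -v name="$2" '
        BEGIN { want = tolower(name) ":" }
        index(tolower($0), want) == 1 {
            sub(/^[^:]*:[ \t]*/, "")   # drop the name, colon and padding
            print
            exit
        }'
}

# Made-up sample dump, only to show the parsing.
printf 'HTTP/1.1 200 OK\r\nX-Pdfcrowd-Remaining-Credits: 998\r\nX-Pdfcrowd-Pages: 3\r\n\r\n' > headers.txt

header_value headers.txt 'X-Pdfcrowd-Pages'
# prints: 3
```

The same helper can be used to fetch the X-Pdfcrowd-Debug-Log URL after a request made with debug_log enabled.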