HTML to PDF API - HTTP Documentation

Overview

The Pdfcrowd API is HTTP-based: all communication takes place through standard HTTP requests. You call the API by sending an HTTP request to the API server address, with options passed as POST data.

The POST request's content type must be multipart/form-data if the request includes any local files. Otherwise it can be application/x-www-form-urlencoded too.
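As a rough illustration of the two content types: curl sends multipart/form-data when fields are passed with -F, and application/x-www-form-urlencoded when they are passed with --data-urlencode. The function names below are made up for this sketch and the credentials are placeholders.

```shell
# Sketch only: function names are hypothetical, credentials are placeholders.

# No local file is involved, so a URL-encoded POST body is sufficient.
# curl's --data-urlencode sends application/x-www-form-urlencoded.
# $1 = URL to convert, $2 = output PDF path.
convert_url_urlencoded() {
    curl -f -u "your_username:your_apikey" \
        -o "$2" \
        --data-urlencode "url=$1" \
        https://api.pdfcrowd.com/convert/
}

# A local file is uploaded, so the request must be multipart/form-data.
# curl's -F switches the request to that content type.
# $1 = local HTML file, $2 = output PDF path.
convert_file_multipart() {
    curl -f -u "your_username:your_apikey" \
        -o "$2" \
        -F "file=@$1" \
        https://api.pdfcrowd.com/convert/
}

# Usage:
#   convert_url_urlencoded "http://www.example.com" example.pdf
#   convert_file_multipart /path/to/MyLayout.html MyLayout.pdf
```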

You can also check out our API client libraries if you want to use the API from your favorite programming language.

Authentication

Authentication is required to use the Pdfcrowd API. The credentials are your Pdfcrowd username and API key. You can sign up for the Pdfcrowd API on the Pdfcrowd website.

The authentication method for user credentials is HTTP Basic Access Authentication. You provide your credentials every time you make a request.
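One way to avoid hard-coding the credentials in every command is to read them from the environment. The sketch below assumes two environment variables, PDFCROWD_USERNAME and PDFCROWD_APIKEY; these names and both helper functions are invented for this example and are not defined by the API.

```shell
# Sketch only: PDFCROWD_USERNAME and PDFCROWD_APIKEY are hypothetical
# environment variable names chosen for this example.

# Print "username:apikey" for curl's -u option, or fail if either is unset.
pdfcrowd_credentials() {
    if [ -z "${PDFCROWD_USERNAME:-}" ] || [ -z "${PDFCROWD_APIKEY:-}" ]; then
        echo "PDFCROWD_USERNAME and PDFCROWD_APIKEY must be set" >&2
        return 1
    fi
    printf '%s:%s' "$PDFCROWD_USERNAME" "$PDFCROWD_APIKEY"
}

# Convert a web page. The credentials are sent with every request
# through HTTP Basic Access Authentication (curl's -u option).
# $1 = URL to convert, $2 = output PDF path.
convert_url() {
    creds=$(pdfcrowd_credentials) || return 1
    curl -f -u "$creds" \
        -o "$2" \
        -F "url=$1" \
        https://api.pdfcrowd.com/convert/
}

# Usage:
#   export PDFCROWD_USERNAME=your_username PDFCROWD_APIKEY=your_apikey
#   convert_url "http://www.example.com" example.pdf
```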

Server Address

The server address is https://api.pdfcrowd.com/convert/

Both HTTP and HTTPS protocols are supported.

Examples

Convert a web page to a PDF file
curl -f -u "your_username:your_apikey" \
    -o example.pdf  \
    -F "url=http://www.example.com" \
    https://api.pdfcrowd.com/convert/

Convert a local HTML file to a PDF file
curl -f -u "your_username:your_apikey" \
    -o MyLayout.pdf  \
    -F "file=@/path/to/MyLayout.html" \
    https://api.pdfcrowd.com/convert/

Convert a string containing HTML to a PDF file
curl -f -u "your_username:your_apikey" \
    -o HelloWorld.pdf  \
    -F "text= <html><body><h1>Hello World!</h1></body></html>" \
    https://api.pdfcrowd.com/convert/

# or use custom HTML producer
html_producer | curl -u "your_username:your_apikey" \
    -o HelloWorld.pdf  \
    -F "text=<-" \
    https://api.pdfcrowd.com/convert/

Advanced Examples

Customize the page size and the orientation
curl -f -u "your_username:your_apikey" \
    -o letter_landscape.pdf  \
    -F "page_size=Letter"  \
    -F "orientation=landscape"  \
    -F "no_margins=True"  \
    -F "url=http://www.example.com" \
    https://api.pdfcrowd.com/convert/

Put the source URL in the header and the page number in the footer
curl -f -u "your_username:your_apikey" \
    -o header_footer.pdf  \
    -F "header_height=15mm"  \
    -F "footer_height=10mm"  \
    --form-string "header_html=<a class='pdfcrowd-source-url' data-pdfcrowd-placement='href-and-content'></a>"  \
    --form-string "footer_html=<center><span class='pdfcrowd-page-number'></span></center>"  \
    -F "margin_top=0mm"  \
    -F "margin_bottom=0mm"  \
    -F "url=http://www.example.com" \
    https://api.pdfcrowd.com/convert/

Zoom the HTML document
curl -f -u "your_username:your_apikey" \
    -o zoom_300.pdf  \
    -F "scale_factor=300"  \
    -F "url=http://www.example.com" \
    https://api.pdfcrowd.com/convert/

Set PDF metadata
curl -f -u "your_username:your_apikey" \
    -o with_metadata.pdf  \
    -F "author=Pdfcrowd"  \
    -F "title=Hello"  \
    -F "subject=Demo"  \
    -F "keywords=Pdfcrowd,demo"  \
    -F "url=http://www.example.com" \
    https://api.pdfcrowd.com/convert/

Create a PowerPoint-like presentation from an HTML document
curl -f -u "your_username:your_apikey" \
    -o slide_show.pdf  \
    --form-string "page_layout=single-page"  \
    --form-string "page_mode=full-screen"  \
    --form-string "initial_zoom_type=fit-page"  \
    -F "orientation=landscape"  \
    -F "no_margins=True"  \
    -F "url=https://pdfcrowd.com/doc/api/" \
    https://api.pdfcrowd.com/convert/

Convert an HTML document section
curl -f -u "your_username:your_apikey" \
    -o html_part.pdf  \
    -F "element_to_convert=#main"  \
    -F "url=https://pdfcrowd.com/doc/api/" \
    https://api.pdfcrowd.com/convert/

Inject HTML code
curl -f -u "your_username:your_apikey" \
    -o html_inject.pdf  \
    --form-string "custom_javascript=el=document.createElement('h2'); el.textContent='Hello from Pdfcrowd API'; el.style.color='red'; el_before=document.getElementsByTagName('h1')[0]; el_before.parentNode.insertBefore(el, el_before.nextSibling)"  \
    -F "url=http://www.example.com" \
    https://api.pdfcrowd.com/convert/

Renderer debugging - highlight HTML elements
curl -f -u "your_username:your_apikey" \
    -o highlight_background.pdf  \
    -F "custom_javascript=libPdfcrowd.highlightHtml(false, true, true, false)"  \
    -F "url=http://www.example.com" \
    https://api.pdfcrowd.com/convert/

Renderer debugging - borders with spacing around HTML elements
curl -f -u "your_username:your_apikey" \
    -o highlight_borders.pdf  \
    -F "custom_javascript=libPdfcrowd.highlightHtml(true, false, true, true)"  \
    -F "url=http://www.example.com" \
    https://api.pdfcrowd.com/convert/

Tips & Tricks

The API lets you convert a web page, a local HTML file, or a string containing HTML.

The best way to start with the API is to choose one of the examples, get it working, and then adjust it with the parameters described in the API reference below.

You can also use these HTML-related features:

  • You can use the following classes in your HTML code to hide or remove elements from the output:
    • pdfcrowd-remove - sets display:none on the element
    • pdfcrowd-hide - sets visibility:hidden on the element
  • You can switch to the print version of the page (if it exists) with use_print_media.
  • You can force a page break with
    <div style="page-break-before:always"></div>
  • You can avoid a page break inside an element with the following CSS
    img { page-break-inside:avoid }
  • You can use custom_javascript to alter the HTML contents with custom JavaScript.
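The HTML features above can be combined in one document. The following sketch generates such a document and shows how it could be piped into the API through the text field, as in the custom HTML producer example. The html_producer function body and the output file name are invented for this illustration.

```shell
# Sketch only: html_producer is a hypothetical function; the credentials
# in the usage note are placeholders.

# Emit a small HTML document that uses the features listed above.
html_producer() {
    cat <<'EOF'
<html>
  <head>
    <style>
      /* Keep images from being split across two pages. */
      img { page-break-inside:avoid }
    </style>
  </head>
  <body>
    <div class="pdfcrowd-remove">Navigation menu, removed from the output</div>
    <h1>First page</h1>
    <p class="pdfcrowd-hide">Hidden in the output, but still occupies space</p>
    <div style="page-break-before:always"></div>
    <h1>Second page</h1>
  </body>
</html>
EOF
}

# Usage: pipe the generated HTML into the API as the "text" field.
#   html_producer | curl -f -u "your_username:your_apikey" \
#       -o tips.pdf \
#       -F "text=<-" \
#       https://api.pdfcrowd.com/convert/
```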

HTML to PDF API Reference

Conversion Input

Parameter Description Default
url
The address of the web page to convert.
The supported protocols are http:// and https://.
file
The path to a local file to convert.
The file can be either a single file or an archive (.tar.gz, .tar.bz2, or .zip).
If the HTML document refers to local external assets (images, style sheets, JavaScript), zip the document together with the assets.
The file must exist and not be empty.
The file name must have a valid extension.
text
The string content to convert.
The string must not be empty.
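When the HTML document depends on local assets, it can be packed into one of the supported archive formats before upload. The sketch below uses tar to build a .tar.gz archive and sends it through the file parameter. Both helper names are invented for this example, and how the API picks the main document inside the archive is not covered here.

```shell
# Sketch only: bundle_html_dir and convert_archive are hypothetical helper
# names; the credentials are placeholders.

# Pack an HTML document and its assets into a .tar.gz archive.
# $1 = directory holding the document and assets, $2 = archive to create.
bundle_html_dir() {
    tar -czf "$2" -C "$1" .
}

# Upload the archive through the "file" parameter.
# $1 = archive path, $2 = output PDF path.
convert_archive() {
    curl -f -u "your_username:your_apikey" \
        -o "$2" \
        -F "file=@$1" \
        https://api.pdfcrowd.com/convert/
}

# Usage:
#   bundle_html_dir /path/to/site /tmp/site.tar.gz
#   convert_archive /tmp/site.tar.gz site.pdf
```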

Response

Parameter Description Default
output_name
The file name of the created file (max 180 chars). If not specified then the name is auto-generated.
content_disposition
The value of the Content-Disposition HTTP header sent in the response.
Allowed values:
  • attachment
    Forces the browser to pop up a Save As dialog.
  • inline
    The browser will open the result file in the browser window.
attachment

Page setup

Parameter Description Default
page_size
Set the output page size.
Allowed values:
  • A2
  • A3
  • A4
  • A5
  • A6
  • Letter
A4
page_width
Set the output page width. The safe maximum is 200in; with larger values, some PDF viewers may be unable to open the PDF.
Can be specified in inches (in), millimeters (mm), centimeters (cm), or points (pt).
Examples:
  • 300mm
  • 9.5in
8.27in
page_height
Set the output page height. Use -1 for a single page PDF. The safe maximum is 200in; with larger values, some PDF viewers may be unable to open the PDF.
Can be -1 or specified in inches (in), millimeters (mm), centimeters (cm), or points (pt).
Examples:
  • 350mm
  • 15.25in
  • The height of the page is calculated automatically so that the whole document fits into it.
    -1
11.7in
orientation
Set the output page orientation.
Allowed values:
  • landscape
  • portrait
portrait
margin_top
Set the output page top margin.
Can be specified in inches (in), millimeters (mm), centimeters (cm), or points (pt).
Examples:
  • 1in
  • 2.5cm
0.4in
margin_right
Set the output page right margin.
Can be specified in inches (in), millimeters (mm), centimeters (cm), or points (pt).
Examples:
  • 1in
  • 2.5cm
0.4in
margin_bottom
Set the output page bottom margin.
Can be specified in inches (in), millimeters (mm), centimeters (cm), or points (pt).
Examples:
  • 1in
  • 2.5cm
0.4in
margin_left
Set the output page left margin.
Can be specified in inches (in), millimeters (mm), centimeters (cm), or points (pt).
Examples:
  • 1in
  • 2.5cm
0.4in
no_margins
Disable margins.
Allowed values:
  • true, 1 or on
  • false, 0 or off
false
header_url
Load HTML code from the specified URL and use it as the page header. The following classes can be used in the HTML. The content of the respective elements will be expanded as follows:
  • pdfcrowd-page-count - the total page count of printed pages
  • pdfcrowd-page-number - the current page number
  • pdfcrowd-source-url - the source URL of a converted document
The following attributes can be used:
  • data-pdfcrowd-number-format - specifies the type of numerals used
    • Arabic numerals are used by default.
    • Roman numerals can be generated by the roman and roman-lowercase values
    • Example: <span class='pdfcrowd-page-number' data-pdfcrowd-number-format='roman'></span>
  • data-pdfcrowd-placement - specifies where to place the source URL, allowed values:
    • attribute omitted - the URL is inserted into the content
      • Example: <span class='pdfcrowd-source-url'></span>
        will produce <span>http://example.com</span>
    • href - the URL is set to the href attribute
      • Example: <a class='pdfcrowd-source-url' data-pdfcrowd-placement='href'>Link to source</a>
        will produce <a href='http://example.com'>Link to source</a>
    • href-and-content - the URL is set to the href attribute and to the content
      • Example: <a class='pdfcrowd-source-url' data-pdfcrowd-placement='href-and-content'></a>
        will produce <a href='http://example.com'>http://example.com</a>
The supported protocols are http:// and https://.
Examples:
  • http://myserver.com/header.html
header_html
Use the specified HTML code as the page header. The following classes can be used in the HTML. The content of the respective elements will be expanded as follows:
  • pdfcrowd-page-count - the total page count of printed pages
  • pdfcrowd-page-number - the current page number
  • pdfcrowd-source-url - the source URL of a converted document
The following attributes can be used:
  • data-pdfcrowd-number-format - specifies the type of numerals used
    • Arabic numerals are used by default.
    • Roman numerals can be generated by the roman and roman-lowercase values
    • Example: <span class='pdfcrowd-page-number' data-pdfcrowd-number-format='roman'></span>
  • data-pdfcrowd-placement - specifies where to place the source URL, allowed values:
    • attribute omitted - the URL is inserted into the content
      • Example: <span class='pdfcrowd-source-url'></span>
        will produce <span>http://example.com</span>
    • href - the URL is set to the href attribute
      • Example: <a class='pdfcrowd-source-url' data-pdfcrowd-placement='href'>Link to source</a>
        will produce <a href='http://example.com'>Link to source</a>
    • href-and-content - the URL is set to the href attribute and to the content
      • Example: <a class='pdfcrowd-source-url' data-pdfcrowd-placement='href-and-content'></a>
        will produce <a href='http://example.com'>http://example.com</a>
The string must not be empty.
Examples:
  • It displays the page number and the total page count.
    Page <span class='pdfcrowd-page-number'></span> of <span class='pdfcrowd-page-count'></span> pages
header_height
Set the header height.
Can be specified in inches (in), millimeters (mm), centimeters (cm), or points (pt).
Examples:
  • 30mm
  • 1in
0.5in
print_page_range
Set the page range to print.
A comma-separated list of page numbers or ranges.
Examples:
  • Just the second page is printed.
    2
  • The first and the third page are printed.
    1,3
  • Everything except the first page is printed.
    2-
  • Just the first 3 pages are printed.
    -3
  • Pages 3, 6, 7, 8 and 9 are printed.
    3,6-9
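To make the range syntax concrete, the following sketch expands a print_page_range string into the page numbers it selects. The expand_page_range helper is invented for this illustration and is not part of the API. It takes the total page count as a second argument because open-ended ranges such as 2- need it.

```shell
# Sketch only: expand_page_range is a hypothetical helper that mirrors the
# print_page_range examples above; it is not part of the API.

# Print the page numbers selected by a range string.
# $1 = range string (e.g. "3,6-9"), $2 = total number of pages.
expand_page_range() {
    spec=$1
    total=$2
    pages=""
    old_ifs=$IFS
    IFS=,
    for part in $spec; do
        case $part in
            *-*)
                first=${part%%-*}   # text before the dash, may be empty
                last=${part#*-}     # text after the dash, may be empty
                [ -n "$first" ] || first=1
                [ -n "$last" ] || last=$total
                ;;
            *)
                first=$part
                last=$part
                ;;
        esac
        n=$first
        while [ "$n" -le "$last" ]; do
            pages="$pages $n"
            n=$((n + 1))
        done
    done
    IFS=$old_ifs
    echo "${pages# }"
}

# Usage:
#   expand_page_range "3,6-9" 10   # prints: 3 6 7 8 9
#   expand_page_range "2-" 5       # prints: 2 3 4 5
#   expand_page_range "-3" 5       # prints: 1 2 3
```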
page_watermark
Apply the first page of the watermark PDF to every page of the output PDF.
The file must exist and not be empty.
multipage_watermark
Apply each page of the specified watermark PDF to the corresponding page of the output PDF.
The file must exist and not be empty.
page_background
Apply the first page of the specified PDF to the background of every page of the output PDF.
The file must exist and not be empty.
multipage_background
Apply each page of the specified PDF to the background of the corresponding page of the output PDF.
The file must exist and not be empty.
exclude_header_on_pages
The page header is not printed on the specified pages.
A comma-separated list of page numbers.
Examples:
  • The header is not printed on the second page.
    2
  • The header is not printed on the first and the last page.
    1,-1
page_numbering_offset
Set an offset between physical and logical page numbers.
Examples:
  • The page numbering will start with 0. Set exclude_header_on_pages to "1" and the page numbering will start on the second page with 1.
    1
  • The page numbering will start with 11 on the first page. It can be useful for joining documents.
    -10
0
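The two page_numbering_offset examples are consistent with subtracting the offset from the physical page number. The sketch below works through that arithmetic. The formula is inferred from the examples rather than taken from a specification, and logical_page_number is a made-up helper name.

```shell
# Sketch only: logical_page_number is a hypothetical helper. The formula is
# inferred from the page_numbering_offset examples, not from a specification.

# Print the number shown on a page.
# $1 = physical page number (1-based), $2 = page_numbering_offset.
logical_page_number() {
    echo $(( $1 - $2 ))
}

# Usage:
#   logical_page_number 1 1      # prints 0
#   logical_page_number 2 1      # prints 1
#   logical_page_number 1 -10    # prints 11
```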

General Options

Parameter Description Default
no_background
Do not print the background graphics.
Allowed values:
  • true, 1 or on
  • false, 0 or off
false
disable_javascript
Do not execute JavaScript.
Allowed values:
  • true, 1 or on
  • false, 0 or off
false
disable_image_loading
Do not load images.
Allowed values:
  • true, 1 or on
  • false, 0 or off
false
disable_remote_fonts
Disable loading fonts from remote sources.
Allowed values:
  • true, 1 or on
  • false, 0 or off
false
block_ads
Try to block ads. Enabling this option can produce smaller output and speed up the conversion.
Allowed values:
  • true, 1 or on
  • false, 0 or off
false
default_encoding
Set the default HTML content text encoding.
auto detect
http_auth_user_name
Set the HTTP authentication user name.
http_auth_password
Set the HTTP authentication password.
use_print_media
Use the print version of the page if available (@media print).
Allowed values:
  • true, 1 or on
  • false, 0 or off
false
no_xpdfcrowd_header
Do not send the X-Pdfcrowd HTTP header in Pdfcrowd HTTP requests.
Allowed values:
  • true, 1 or on
  • false, 0 or off
false
cookies
Set cookies that are sent in Pdfcrowd HTTP requests.
Examples:
  • session=6d7184b3bf35;token=2710
verify_ssl_certificates
Do not allow insecure HTTPS connections.
Allowed values:
  • true, 1 or on
  • false, 0 or off
false
fail_on_main_url_error
Abort the conversion if the main URL HTTP status code is greater than or equal to 400.
Allowed values:
  • true, 1 or on
  • false, 0 or off
false
fail_on_any_url_error
Abort the conversion if the HTTP status code of any sub-request is greater than or equal to 400.
Allowed values:
  • true, 1 or on
  • false, 0 or off
false
custom_javascript
Run custom JavaScript after the document is loaded. The script is intended for post-load DOM manipulation (add/remove elements, update CSS, ...).
The string must not be empty.
custom_http_header
Set a custom HTTP header that is sent in Pdfcrowd HTTP requests.
A string containing the header name and value separated by a colon.
Examples:
  • X-My-Client-ID:k2017-12345
javascript_delay
Wait the specified number of milliseconds for all JavaScript to finish after the document is loaded. The maximum value is determined by your API license.
Must be a positive integer number or 0.
200
element_to_convert
Convert only the specified element from the main document and its children. The element is specified by one or more CSS selectors. If the element is not found, the conversion fails. If multiple elements are found, the first one is used.
The string must not be empty.
Examples:
  • The first element with the id main-content is converted.
    #main-content
  • The first element with the class name main-content is converted.
    .main-content
  • The first element with the tag name table is converted.
    table
  • The first element with the tag name table or with the id main-content is converted.
    table, #main-content
  • The first element <p class="article"> within <div class="user-panel main"> is converted.
    div.user-panel.main p.article
element_to_convert_mode
Specify the DOM handling when only a part of the document is converted.
Allowed values:
  • cut-out
    The element and its children are cut out of the document.
  • remove-siblings
    All of the element's siblings are removed.
  • hide-siblings
    All of the element's siblings are hidden.
cut-out
wait_for_element
Wait for the specified element in a source document. The element is specified by one or more CSS selectors. The element is searched for in the main document and all iframes. If the element is not found, the conversion fails.
The string must not be empty.
Examples:
  • Wait until an element with the id main-content is found.
    #main-content
  • Wait until an element with the class name main-content is found.
    .main-content
  • Wait until an element with the tag name table is found.
    table
  • Wait until an element with the tag name table or with the id main-content is found.
    table, #main-content
  • Wait until <p class="article"> is found within <div class="user-panel main">.
    div.user-panel.main p.article
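The selector-based options can be used together, for example to wait for an element and then convert only that element. The sketch below is one possible combination. convert_element is a made-up helper name, and --form-string is used so that curl passes the selector through literally.

```shell
# Sketch only: convert_element is a hypothetical helper; the credentials
# are placeholders.

# Convert one element of a page once it has appeared in the document.
# $1 = page URL, $2 = CSS selector, $3 = output PDF path.
convert_element() {
    curl -f -u "your_username:your_apikey" \
        -o "$3" \
        --form-string "wait_for_element=$2" \
        --form-string "element_to_convert=$2" \
        --form-string "element_to_convert_mode=cut-out" \
        -F "url=$1" \
        https://api.pdfcrowd.com/convert/
}

# Usage:
#   convert_element "https://pdfcrowd.com/doc/api/" "#main" html_part.pdf
```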

Print Resolution

Parameter Description Default
viewport_width
Set the viewport width in pixels. The viewport is the user's visible area of the page.
The value must be in the range 96-7680.
1024
viewport_height
Set the viewport height in pixels. The viewport is the user's visible area of the page.
Must be a positive integer number.
768
rendering_mode
Sets the rendering mode.
Allowed values:
  • default
    This mode is compatible with the Chrome preview.
  • viewport
    Takes the viewport width into account.
default
scale_factor
Set the scaling factor (zoom) for the main page area.
The value must be in the range 10-500.
100
disable_smart_shrinking
Disable the intelligent shrinking strategy that tries to optimally fit the HTML contents to a PDF page.
Allowed values:
  • true, 1 or on
  • false, 0 or off
false

PDF Format

Miscellaneous values for PDF output.

Parameter Description Default
linearize
Create a linearized PDF. This is also known as Fast Web View.
Allowed values:
  • true, 1 or on
  • false, 0 or off
false
encrypt
Encrypt the PDF. This prevents search engines from indexing the contents.
Allowed values:
  • true, 1 or on
  • false, 0 or off
false
user_password
Protect the PDF with a user password. When a PDF has a user password, it must be supplied in order to view the document and to perform operations allowed by the access permissions.
owner_password
Protect the PDF with an owner password. Supplying an owner password grants unlimited access to the PDF including changing the passwords and access permissions.
no_print
Disallow printing of the output PDF.
Allowed values:
  • true, 1 or on
  • false, 0 or off
false
no_modify
Disallow modification of the output PDF.
Allowed values:
  • true, 1 or on
  • false, 0 or off
false
no_copy
Disallow text and graphics extraction from the output PDF.
Allowed values:
  • true, 1 or on
  • false, 0 or off
false
title
Set the title of the PDF.
subject
Set the subject of the PDF.
author
Set the author of the PDF.
keywords
Associate keywords with the document.
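The protection options can be combined in a single request. The sketch below sets both passwords and disallows printing and copying. convert_protected is a made-up helper name. This reference does not say whether encrypt must be enabled for the passwords and permissions to take effect, so the sketch sets it explicitly.

```shell
# Sketch only: convert_protected is a hypothetical helper; the credentials
# are placeholders.

# Convert a web page to a password-protected PDF with restricted permissions.
# $1 = page URL, $2 = output PDF path, $3 = user password, $4 = owner password.
convert_protected() {
    curl -f -u "your_username:your_apikey" \
        -o "$2" \
        -F "encrypt=true" \
        --form-string "user_password=$3" \
        --form-string "owner_password=$4" \
        -F "no_print=true" \
        -F "no_copy=true" \
        -F "url=$1" \
        https://api.pdfcrowd.com/convert/
}

# Usage:
#   convert_protected "http://www.example.com" protected.pdf "view-pass" "admin-pass"
```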

Viewer Preferences

These preferences specify how a PDF viewer should present the document. The preferences may be ignored by some PDF viewers.

Parameter Description Default
page_layout
Specify the page layout to be used when the document is opened.
Allowed values:
  • single-page
    Display one page at a time.
  • one-column
    Display the pages in one column.
  • two-column-left
    Display the pages in two columns, with odd-numbered pages on the left.
  • two-column-right
    Display the pages in two columns, with odd-numbered pages on the right.
page_mode
Specify how the document should be displayed when opened.
Allowed values:
  • full-screen
    Full-screen mode.
  • thumbnails
    Thumbnail images are visible.
  • outlines
    Document outline is visible.
initial_zoom_type
Specify how the page should be displayed when opened.
Allowed values:
  • fit-width
    The page content is magnified just enough to fit the entire width of the page within the window.
  • fit-height
    The page content is magnified just enough to fit the entire height of the page within the window.
  • fit-page
    The page content is magnified just enough to fit the entire page within the window both horizontally and vertically. If the required horizontal and vertical magnification factors are different, use the smaller of the two, centering the page within the window in the other dimension.
initial_page
Display the specified page when the document is opened.
Must be a positive integer number.
initial_zoom
Specify the initial page zoom in percent when the document is opened.
Must be a positive integer number.
hide_toolbar
Specify whether to hide the viewer application's tool bars when the document is active.
Allowed values:
  • true, 1 or on
  • false, 0 or off
false
hide_menubar
Specify whether to hide the viewer application's menu bar when the document is active.
Allowed values:
  • true, 1 or on
  • false, 0 or off
false
hide_window_ui
Specify whether to hide user interface elements in the document's window (such as scroll bars and navigation controls), leaving only the document's contents displayed.
Allowed values:
  • true, 1 or on
  • false, 0 or off
false
fit_window
Specify whether to resize the document's window to fit the size of the first displayed page.
Allowed values:
  • true, 1 or on
  • false, 0 or off
false
center_window
Specify whether to position the document's window in the center of the screen.
Allowed values:
  • true, 1 or on
  • false, 0 or off
false
display_title
Specify whether the window's title bar should display the document title. If false, the title bar should instead display the name of the PDF file containing the document.
Allowed values:
  • true, 1 or on
  • false, 0 or off
false
right_to_left
Set the predominant reading order for text to right-to-left. This option has no direct effect on the document's contents or page numbering but can be used to determine the relative positioning of pages when displayed side by side or printed n-up.
Allowed values:
  • true, 1 or on
  • false, 0 or off
false

Miscellaneous

Parameter Description Default
debug_log
Turn on the debug logging. Details about the conversion are stored in the debug log. The URL of the log is returned in the X-Pdfcrowd-Debug-Log response header.
Allowed values:
  • true, 1 or on
  • false, 0 or off
false

Response Headers

The HTTP response can contain the following headers.
You can find details about each conversion in your conversion log.

Name                          Description
X-Pdfcrowd-Debug-Log          URL to the debug log
X-Pdfcrowd-Remaining-Credits  the number of available conversion credits in your account
X-Pdfcrowd-Consumed-Credits   the number of credits consumed by the conversion
X-Pdfcrowd-Job-Id             the unique ID of the conversion
X-Pdfcrowd-Pages              the total number of pages in the output document
X-Pdfcrowd-Output-Size        the size of the output in bytes
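These headers can be captured with curl's -D option, which writes the response headers to a file. The sketch below adds a small helper for reading one header from that file. header_value and the file name headers.txt are invented for this example.

```shell
# Sketch only: header_value is a hypothetical helper; the credentials in the
# usage note are placeholders.

# Print the value of a response header from a file written by curl -D.
# $1 = header name (matched case-insensitively), $2 = header dump file.
header_value() {
    awk -v name="$1" '
        BEGIN { name = tolower(name) ":" }
        tolower(substr($0, 1, length(name))) == name {
            value = substr($0, length(name) + 1)
            gsub(/^[ \t]+|[ \t\r]+$/, "", value)   # trim spaces and the CR
            print value
            exit
        }
    ' "$2"
}

# Usage: save the headers next to the PDF, then read individual values.
#   curl -f -u "your_username:your_apikey" \
#       -D headers.txt \
#       -o example.pdf \
#       -F "url=http://www.example.com" \
#       https://api.pdfcrowd.com/convert/
#   header_value X-Pdfcrowd-Remaining-Credits headers.txt
#   header_value X-Pdfcrowd-Pages headers.txt
```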

Troubleshooting

  • Check the API Status Codes if an error code is returned.
  • You can use debug_log to get detailed information about the conversion, such as conversion errors, time, and console output.
  • You can use our JavaScript library to resolve rendering problems, such as missing content or blank pages.
    Use custom_javascript with the libPdfcrowd.highlightHtml(borders, backgrounds, labels, noZeroSpace) method call to visualize all HTML elements. See the renderer debugging examples above.
  • Take a look at the FAQ section.