HTML to PDF in Python

Overview

This page serves as a guide for using the PDFCrowd API to convert web pages and HTML content to PDF in Python applications.

Installation

You can install the client library from PyPI:

pip install pdfcrowd

Alternatively, download pdfcrowd-6.5.1-python.zip, extract the archive, and run the following commands:
```
cd pdfcrowd-6.5.1
pip install .
```

You can also clone pdfcrowd-python from GitHub and install the library:

git clone https://github.com/pdfcrowd/pdfcrowd-python
cd pdfcrowd-python
pip install .
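
Whichever installation method you use, it can help to confirm that the package is visible to the interpreter you intend to run. The following is a minimal sketch that uses only the Python standard library (`importlib.metadata`, Python 3.8+). The helper name `installed_version` is hypothetical and not part of the PDFCrowd library.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Prints the installed version string, or None when the package is missing.
print(installed_version('pdfcrowd'))
```

If this prints `None`, the install most likely went into a different Python environment than the one running the script.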

Quick Start

The Python examples below help you get started with the API quickly. They convert a web page, a local HTML file, and an HTML string to PDF, in that order. Explore our additional examples for more insights.

import pdfcrowd
import sys

try:
    # Create an API client instance.
    client = pdfcrowd.HtmlToPdfClient('demo', 'ce544b6ea52a5621fb9d55f8b542d14d')

    # Specify the mapping of HTML content width to the PDF page width.
    # To fine-tune the layout, you can specify an exact viewport width, such as '960px'.
    client.setContentViewportWidth('balanced')

    # Run the conversion and save the result to a file.
    client.convertUrlToFile('http://www.example.com', 'example.pdf')

except pdfcrowd.Error as why:
    sys.stderr.write('PDFCrowd Error: {}\n'.format(why))
    raise

import pdfcrowd
import sys

try:
    # Create an API client instance.
    client = pdfcrowd.HtmlToPdfClient('demo', 'ce544b6ea52a5621fb9d55f8b542d14d')

    # Specify the mapping of HTML content width to the PDF page width.
    # To fine-tune the layout, you can specify an exact viewport width, such as '960px'.
    client.setContentViewportWidth('balanced')

    # Run the conversion and save the result to a file.
    client.convertFileToFile('/path/to/MyLayout.html', 'MyLayout.pdf')

except pdfcrowd.Error as why:
    sys.stderr.write('PDFCrowd Error: {}\n'.format(why))
    raise

import pdfcrowd
import sys

try:
    # Create an API client instance.
    client = pdfcrowd.HtmlToPdfClient('demo', 'ce544b6ea52a5621fb9d55f8b542d14d')

    # Specify the mapping of HTML content width to the PDF page width.
    # To fine-tune the layout, you can specify an exact viewport width, such as '960px'.
    client.setContentViewportWidth('balanced')

    # Run the conversion and save the result to a file.
    client.convertStringToFile('<html><body><h1>Hello World!</h1></body></html>', 'HelloWorld.pdf')

except pdfcrowd.Error as why:
    sys.stderr.write('PDFCrowd Error: {}\n'.format(why))
    raise

Authentication

To access the API, you will need to use your PDFCrowd username and API key. For initial testing, you may use the following demo credentials without registering:

Username: demo
API key: ce544b6ea52a5621fb9d55f8b542d14d

To obtain your personal API credentials, start a free trial or purchase the API license.
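
Once you have personal credentials, you will probably want to keep them out of your source code. The sketch below reads them from environment variables and falls back to the demo credentials above. The variable names `PDFCROWD_USERNAME` and `PDFCROWD_API_KEY` and the helper `load_credentials` are assumptions made for illustration; the PDFCrowd library does not define them.

```python
import os

# Demo credentials quoted from this guide; intended for initial testing only.
DEMO_USERNAME = 'demo'
DEMO_API_KEY = 'ce544b6ea52a5621fb9d55f8b542d14d'


def load_credentials(env=None):
    """Return (username, api_key) from the environment, or the demo pair.

    The environment variable names are hypothetical; choose whatever fits
    your deployment.
    """
    env = os.environ if env is None else env
    username = env.get('PDFCROWD_USERNAME')
    api_key = env.get('PDFCROWD_API_KEY')
    # Use the personal credentials only when both values are present.
    if username and api_key:
        return username, api_key
    return DEMO_USERNAME, DEMO_API_KEY
```

The returned pair can be passed straight to the client constructor, for example `pdfcrowd.HtmlToPdfClient(*load_credentials())`.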

Customization

The table below highlights the most common customizations you might find useful. Refer to the Option Reference for a detailed description of all available options. For an interactive experience, explore these options in the API Playground. You can also test out different PDF layouts in our PDF Layout Preview tool.

For additional customization options and troubleshooting help, visit the FAQ section of our website, which answers frequent questions and covers common issues.

Page Size	Adjust the page size using setPageSize() or setPageDimensions(). To create a single-page PDF with the page height automatically adjusted to the HTML content, pass -1 to setPageHeight().
Page Orientation	Change the page orientation to landscape using setOrientation().
Page Margins	Adjust the page margins with setPageMargins(). To eliminate all margins use setNoMargins().
Headers and Footers	Add custom headers and footers using setHeaderHtml() and setFooterHtml(). To adjust the height of headers or footers, use setHeaderHeight() and setFooterHeight() respectively. For detailed guidance on implementing these features, refer to this tutorial.
HTML Content Fitting	Set the viewport width for formatting the HTML content using setContentViewportWidth(). Use setContentFitMode() to control how the content is fitted into the PDF page. You can also adjust the zoom level of HTML content by using setScaleFactor(), which allows you to scale the content up or down.
Per-page Settings	Use setConversionConfig() to specify the PDF page size, orientation, margins, and the presence and appearance of headers and footers on a per-page basis.
Hide or Remove Contents	Use the following classes in your HTML code to hide or remove elements from the output: `pdfcrowd-remove`: This class applies `display:none!important` to the element, effectively removing it from the layout. `pdfcrowd-hide`: This class applies `visibility:hidden!important` to the element, making it invisible but still occupying space in the layout. For additional methods and detailed explanation, refer to this FAQ article.
Custom CSS Styling	To customize CSS styling specifically for the conversion, use setCustomCss() to inject additional styles. Alternatively, you can directly incorporate conversion-specific styling into your main stylesheet. Just prefix your CSS selectors with `.pdfcrowd-body` to ensure the styles apply only during the conversion process. For example: `.pdfcrowd-body h1 { font-size: 48px; }` `.pdfcrowd-body footer { display: none; }`
Avoid Page Break	To prevent page breaks within specific elements, use the `page-break-inside:avoid;` CSS property. Apply this property to elements where you do not want a page break, such as tables, table cells, and images. Here is an example: `th, td, img { page-break-inside:avoid }`
Force Page Break	You can force a page break in your document by incorporating a specific style in an HTML div tag. Insert the following code where you want the page break to occur: `<div style="page-break-before:always"></div>`
Use `@media print`	Activate the print version of a webpage (if available) using setUsePrintMedia(). This function instructs the API to apply the CSS rules defined within the `@media print` stylesheet, ensuring the output mirrors the print-optimized version of the webpage.
Inject Custom JavaScript	Use setCustomJavascript() or setOnLoadJavascript() to modify HTML content using custom JavaScript scripts. These scripts run when the page loads, allowing you to dynamically alter elements, styles, or behavior. In addition to the standard browser JavaScript APIs, your scripts can leverage helper functions provided by our JavaScript library.
Add PDF Signature	Enable the creation of PDFs with a digital signature field. This feature allows the PDF to be digitally signed using applications such as Adobe Acrobat or Preview. For more detailed instructions refer to our Create Digital Signature in PDF guide.
Fillable PDF Form	Create fillable PDFs equipped with interactive fields and buttons. This functionality allows users to enter data directly into the PDF, making it ideal for applications such as forms and surveys. For detailed instructions on how to implement this feature, please refer to our Create Fillable PDF Form guide.
HTML Templates	Add data to your HTML templates for dynamic generation tailored to specific content needs, such as reports, invoices, and personalized documents. For details, refer to HTML Template to PDF.
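
Several of the options in the table above can be combined on one client. The following is a minimal sketch, not a definitive setup. The helper names `join_sections` and `configure_report_layout` are hypothetical, as are the specific values ('A4', 'landscape', the millimeter sizes, and the header and footer markup). The argument order for setPageMargins() (top, right, bottom, left) and the use of unit strings are assumptions to check against the Option Reference.

```python
# Page-break markup quoted from the "Force Page Break" row above.
PAGE_BREAK = '<div style="page-break-before:always"></div>'


def join_sections(sections):
    """Join HTML fragments so that each one after the first starts on a new page."""
    return PAGE_BREAK.join(sections)


def configure_report_layout(client):
    """Apply an example layout to an HtmlToPdfClient-like object.

    All values are illustrative assumptions, not recommended defaults.
    """
    client.setPageSize('A4')
    client.setOrientation('landscape')
    # Assumed argument order: top, right, bottom, left.
    client.setPageMargins('10mm', '10mm', '10mm', '10mm')
    client.setHeaderHtml('<div style="text-align:center">Monthly Report</div>')
    client.setHeaderHeight('15mm')
    client.setFooterHtml('<div style="text-align:center">Internal use only</div>')
    client.setFooterHeight('10mm')
    # Keep table cells and images from being split across pages.
    client.setCustomCss('th, td, img { page-break-inside:avoid }')
    return client
```

With a real client, the result of `join_sections([...])` could be passed to `convertStringToFile()` after calling `configure_report_layout(client)`.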

Error Handling

Implement error handling to catch errors the API may return. It keeps your application stable and gives you clearer diagnostics. See the example code below for guidance, and refer to this list of status codes for more information.

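import pdfcrowd
import sys

try:
    # Call the API
    client = pdfcrowd.HtmlToPdfClient('demo', 'ce544b6ea52a5621fb9d55f8b542d14d')
    client.convertUrlToFile('http://www.example.com', 'example.pdf')
except pdfcrowd.Error as why: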
    # Log the complete error
    sys.stderr.write('PDFCrowd Error: {}\n'.format(why))

    # Log the HTTP status code
    sys.stderr.write('Status Code: {}\n'.format(why.getStatusCode()))

    # Log the reason code
    sys.stderr.write('Reason Code: {}\n'.format(why.getReasonCode()))

    # Log the error message
    sys.stderr.write('Error Message: {}\n'.format(why.getMessage()))

    # Log the documentation link
    sys.stderr.write('Documentation Link: {}\n'.format(why.getDocumentationLink()))
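
If your application uses structured logs, the same error fields can be collected into a single record. The sketch below assumes only the four accessor methods used in the example above. The helpers `describe_error` and `log_error` are hypothetical names, and writing one JSON object per line is a design choice made for this illustration, not something the library prescribes.

```python
import json
import sys


def describe_error(err):
    """Collect the fields of a pdfcrowd.Error-like object into a dictionary."""
    return {
        'status_code': err.getStatusCode(),
        'reason_code': err.getReasonCode(),
        'message': err.getMessage(),
        'documentation_link': err.getDocumentationLink(),
    }


def log_error(err, stream=None):
    """Write the error details to the stream as one JSON line (stderr by default)."""
    stream = sys.stderr if stream is None else stream
    stream.write(json.dumps(describe_error(err)) + '\n')
```

Inside an `except pdfcrowd.Error as why:` block, calling `log_error(why)` would replace the separate `sys.stderr.write()` calls.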

Troubleshooting

If you are receiving an error, refer to the API Status Codes for more information.
Use setDebugLog() and getDebugLogUrl() to obtain detailed information about the conversion process, including load errors, load times, browser console output, etc.
Consult the FAQ for answers to common questions.
Contact us if you need assistance or if a feature you need is missing.
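
The debug log mentioned above can be requested for a single conversion and its URL kept for later inspection. This is a minimal sketch that assumes setDebugLog() takes a boolean and that getDebugLogUrl() is called after the conversion has finished. The wrapper name `convert_url_with_debug_log` is hypothetical.

```python
def convert_url_with_debug_log(client, url, output_path):
    """Convert a URL to a PDF file with the debug log enabled.

    Returns the debug log URL reported by the client after the conversion.
    """
    # Assumption: a boolean flag turns the debug log on.
    client.setDebugLog(True)
    client.convertUrlToFile(url, output_path)
    return client.getDebugLogUrl()
```

With a real client, `convert_url_with_debug_log(client, 'http://www.example.com', 'example.pdf')` would return a link you can open or attach to a support request.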

API Method Reference

Refer to the HTML to PDF Python Reference for a description of all API methods.
