Legacy Pdfcrowd API v1 for Python

This is the documentation of the Python client library for the legacy Pdfcrowd API v1. We strongly recommend the new improved Pdfcrowd API v2 for new integrations.

Installation

Install the Pdfcrowd API client library for Python.

Getting Started

In the following examples, replace "your_username" and "your_apikey" with your Pdfcrowd username and API key.

HTML to PDF Example Application

The following code shows how to convert a web page, raw HTML code, and a local HTML file:

import pdfcrowd

try:
    # create an API client instance
    client = pdfcrowd.Client("your_username", "your_apikey")

    # convert a web page and store the generated PDF into a pdf variable
    pdf = client.convertURI('http://www.google.com')

    # convert an HTML string and save the result to a file
    output_file = open('html.pdf', 'wb')
    html="<head></head><body>My HTML Layout</body>"
    client.convertHtml(html, output_file)
    output_file.close()

    # convert an HTML file
    output_file = open('file.pdf', 'wb')
    client.convertFile('/path/to/MyLayout.html', output_file)
    output_file.close()

except pdfcrowd.Error as why:
    print('Failed: {}'.format(why))

HTML to PDF in Django

The following code shows how to generate PDF from a web page in a Django view function:

import pdfcrowd
from django.http import HttpResponse

def generate_pdf_view(request):
    try:
        # create an API client instance
        client = pdfcrowd.Client("your_username", "your_apikey")

        # convert a web page and store the generated PDF to a variable
        pdf = client.convertURI("http://www.google.com")

        # set HTTP response headers
        response = HttpResponse(content_type="application/pdf")
        response["Cache-Control"] = "max-age=0"
        response["Accept-Ranges"] = "none"
        response["Content-Disposition"] = "attachment; filename=google.pdf"

        # send the generated PDF
        response.write(pdf)
    except pdfcrowd.Error as why:
        # report the failure as plain text
        response = HttpResponse(content_type="text/plain")
        response.write(str(why))
    return response

To convert raw HTML code, use the convertHtml() method instead of convertURI():

pdf = client.convertHtml("<head></head><body>My HTML Layout</body>")

The API also lets you convert a local HTML file:

pdf = client.convertFile("/path/to/MyLayout.html")

Error Handling

try:
    # call the API
    pdf = client.convertURI("http://www.google.com")
except pdfcrowd.Error as why:
    # handle the error
    print('Failed: {}'.format(why))

API Reference

class pdfcrowd.Client

Provides access to the Pdfcrowd API from your Python applications.

Constructor

def __init__(self, username, apikey)
username is your Pdfcrowd username and apikey is your API key, which can be found in your account.

Conversion

def convertHtml(self, html, outstream=None)
Converts the html string to PDF and writes the result to outstream. outstream can be any object having a write(str) method. If outstream is not provided then the return value is a string containing the created PDF.
def convertFile(self, fpath, outstream=None)
Converts a local file fpath to PDF and writes the result to outstream. The file can be either an HTML document or a .zip, .tar.gz, or .tar.bz2 archive which can contain external resources such as images, stylesheets, etc.
outstream can be any object having a write(str) method. If outstream is not provided then the return value is a string containing the created PDF.
def convertURI(self, url, outstream=None)
Converts a web page at url to PDF and writes the result to outstream. outstream can be any object having a write(str) method. If outstream is not provided then the return value is a string containing the created PDF.
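
Because outstream only needs a write() method, an in-memory buffer can stand in for a file. The following is a minimal sketch, not part of the library: convert_html_to_buffer is a hypothetical helper name, and it assumes the client hands the PDF data to outstream.write() as bytes (the str type on Python 2).

```python
import io


def convert_html_to_buffer(client, html):
    """Convert an HTML string and return the PDF in an in-memory buffer.

    `client` is expected to be a pdfcrowd.Client instance. This helper is
    hypothetical and only illustrates the outstream argument.
    """
    buffer = io.BytesIO()
    # outstream can be any object that has a write() method
    client.convertHtml(html, buffer)
    buffer.seek(0)  # rewind so the caller can read from the start
    return buffer


# Usage (requires the pdfcrowd package and valid credentials):
#   client = pdfcrowd.Client("your_username", "your_apikey")
#   pdf_data = convert_html_to_buffer(client, "<body>Hello</body>").read()
```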

Page Setup

def setPageWidth(self, value)
Sets PDF page width in units.
def setPageHeight(self, value)
Sets PDF page height in units. Use -1 for a single page PDF.
def setPageMargins(self, top, right, bottom, left)
Sets PDF page margins in units.
def setHorizontalMargin(self, value)
Deprecated. Use setPageMargins instead.
def setVerticalMargin(self, value)
Deprecated. Use setPageMargins instead.
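
The page setup methods above might be combined as in the sketch below. apply_a4_layout is a hypothetical helper name, and the A4 dimensions and 10mm default margin are illustrative choices, not library defaults.

```python
def apply_a4_layout(client, margin="10mm", single_page=False):
    """Configure A4 page size and uniform margins on a Pdfcrowd client.

    Hypothetical helper; `client` is expected to be a pdfcrowd.Client.
    """
    client.setPageWidth("210mm")
    # -1 requests a single page PDF, per the setPageHeight description
    client.setPageHeight(-1 if single_page else "297mm")
    # argument order is top, right, bottom, left
    client.setPageMargins(margin, margin, margin, margin)
    return client


# Usage (requires the pdfcrowd package and valid credentials):
#   client = apply_a4_layout(pdfcrowd.Client("your_username", "your_apikey"))
```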

Header and Footer

def setFooterHtml(self, html)
Places the specified html code inside the page footer. The following variables are expanded:
  • %u - URL to convert.
  • %p - The current page number.
  • %n - Total number of pages.
def setFooterUrl(self, url)
Loads HTML code from the specified url and places it inside the page footer. See setFooterHtml for the list of variables that are expanded.
def setHeaderHtml(self, html)
Places the specified html code inside the page header. See setFooterHtml for the list of variables that are expanded.
def setHeaderUrl(self, url)
Loads HTML code from the specified url and places it inside the page header. See setFooterHtml for the list of variables that are expanded.
def setHeaderFooterPageExcludeList(self, exclist)
exclist is a comma-separated list of physical page numbers on which the header and footer are not printed. Negative numbers count backwards from the last page: -1 is the last page, -2 is the last but one page, and so on.
Example: "1,-1" will not print the header and footer on the first and the last page.
def setPageNumberingOffset(self, offset)
An offset between physical and logical page numbers. The default value is 0.
Example: if set to "1" then the page numbering will start with 1 on the second page.
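
The exclusion list and numbering offset described above can be modeled locally. The sketch below only illustrates the documented semantics; it is not how the service is implemented, both function names are hypothetical, and the offset arithmetic (logical = physical - offset) is a reading of the example given for setPageNumberingOffset.

```python
def excluded_pages(exclist, total_pages):
    """Resolve an exclusion list such as "1,-1" to physical page numbers."""
    pages = set()
    for item in exclist.split(","):
        item = item.strip()
        if not item:
            continue
        number = int(item)
        # negative numbers count backwards: -1 is the last page
        page = total_pages + number + 1 if number < 0 else number
        if 1 <= page <= total_pages:
            pages.add(page)
    return pages


def logical_page_number(physical_page, offset=0):
    """Logical page number for a physical page (assumed: physical - offset)."""
    return physical_page - offset


print(sorted(excluded_pages("1,-1", 5)))  # first and last of five pages
print(logical_page_number(2, offset=1))   # numbering starts on the second page
```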

HTML options

def enableImages(self, value)
Set value to False to disable printing images to the PDF. The default is True.
def enableBackgrounds(self, value)
Set value to False to disable printing backgrounds to the PDF. The default is True.
def setHtmlZoom(self, value)
Sets the HTML zoom in percent. It determines the precision used for rendering the HTML content. Despite its name, it does not zoom the HTML content. Higher values can improve glyph positioning and can lead to a better overall visual appearance of the generated PDF. The default value is 200. See also setPdfScalingFactor().
def enableJavaScript(self, value)
Set value to False to disable JavaScript in web pages. The default is True.
def enableHyperlinks(self, value)
Set value to False to disable hyperlinks in the PDF. The default is True.
def setDefaultTextEncoding(self, value)
value is the text encoding used when none is specified in a web page. The default is utf-8.
def usePrintMedia(self, value)
If value is True then the print CSS media type is used (if available).
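
As an illustration of the HTML options above, the following sketch configures a client to render with the print stylesheet and without scripts or backgrounds. configure_static_print is a hypothetical helper name, and the chosen combination of settings is only an example.

```python
def configure_static_print(client):
    """Use print CSS, turn JavaScript off, and skip backgrounds.

    Hypothetical helper; `client` is expected to be a pdfcrowd.Client.
    """
    client.usePrintMedia(True)       # use the print CSS media type if available
    client.enableJavaScript(False)   # the default is True
    client.enableBackgrounds(False)  # the default is True
    client.setDefaultTextEncoding("utf-8")  # same as the documented default
    return client


# Usage (requires the pdfcrowd package and valid credentials):
#   client = configure_static_print(pdfcrowd.Client("your_username", "your_apikey"))
```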

PDF options

def setEncrypted(self, value)
If value is set to True then the PDF is encrypted. This prevents search engines from indexing the document. The default is False.
def setUserPassword(self, pwd)
Protects the PDF with a user password. When a PDF has a user password, it must be supplied in order to view the document and to perform operations allowed by the access permissions. At most 32 characters.
def setOwnerPassword(self, pwd)
Protects the PDF with an owner password. Supplying an owner password grants unlimited access to the PDF including changing the passwords and access permissions. At most 32 characters.
def setNoPrint(self, value)
Set value to True to disable printing the generated PDF. The default is False.
def setNoModify(self, value)
Set value to True to disable modifying the PDF. The default is False.
def setNoCopy(self, value)
Set value to True to disable extracting text and graphics from the PDF. The default is False.
def setPageLayout(self, value)
Specifies the initial page layout when the PDF is opened in a viewer.
  • SINGLE_PAGE
  • CONTINUOUS
  • CONTINUOUS_FACING
def setPageMode(self, value)
Specifies the appearance of the PDF when opened.
  • FULLSCREEN - Full-screen mode.
def setInitialPdfZoomType(self, value)
value specifies the initial zoom type of the PDF when opened.
  • FIT_WIDTH
  • FIT_HEIGHT
  • FIT_PAGE
def setInitialPdfExactZoom(self, value)
value specifies the initial page zoom of the PDF when opened.
def setPdfScalingFactor(self, value)
The scaling factor used to convert between HTML and PDF. The default value is 1.0.
def setPageBackgroundColor(self, value)
The page background color in RRGGBB hexadecimal format.
def setTransparentBackground(self, value)
Does not print the body background. Requires the following CSS rule to be declared:
body {background-color:rgba(255,255,255,0.0);}
def setAuthor(self, author)
Sets the author field in the created PDF.
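
The encryption, password, and permission setters in this section might be applied together as sketched below. protect_pdf is a hypothetical helper name. The local length check mirrors the 32-character limit stated for both passwords; how the service itself reacts to longer passwords is not covered here.

```python
MAX_PASSWORD_LENGTH = 32  # limit stated for setUserPassword and setOwnerPassword


def protect_pdf(client, user_pwd, owner_pwd):
    """Encrypt the PDF, set both passwords, and restrict permissions.

    Hypothetical helper; `client` is expected to be a pdfcrowd.Client.
    """
    for label, pwd in (("user", user_pwd), ("owner", owner_pwd)):
        if len(pwd) > MAX_PASSWORD_LENGTH:
            raise ValueError(
                "%s password exceeds %d characters" % (label, MAX_PASSWORD_LENGTH)
            )
    client.setEncrypted(True)
    client.setUserPassword(user_pwd)    # required to view the document
    client.setOwnerPassword(owner_pwd)  # grants unlimited access
    client.setNoPrint(True)
    client.setNoModify(True)
    client.setNoCopy(True)
    return client


# Usage (requires the pdfcrowd package and valid credentials):
#   client = protect_pdf(pdfcrowd.Client("your_username", "your_apikey"),
#                        "reader-secret", "owner-secret")
```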

Watermark

def setWatermark(self, url, offset_x=0, offset_y=0)
url is a public absolute URL of the watermark image (it must start with either http:// or https://). The supported formats are PNG and JPEG. offset_x and offset_y are the watermark offsets in units. The default offset is (0,0).
def setWatermarkRotation(self, angle)
Rotates the watermark by angle degrees.
def setWatermarkInBackground(self, value)
When value is set to True, the watermark is placed in the background. By default, the watermark is placed in the foreground.
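
A watermark setup based on the three methods above could look like the following sketch. add_watermark is a hypothetical helper name, and the local URL check simply mirrors the stated requirement that the URL start with http:// or https://.

```python
def add_watermark(client, url, offset_x=0, offset_y=0, angle=0, in_background=False):
    """Attach a PNG or JPEG watermark to the pages produced by `client`.

    Hypothetical helper; `client` is expected to be a pdfcrowd.Client.
    """
    if not url.startswith(("http://", "https://")):
        raise ValueError("watermark URL must start with http:// or https://")
    # offsets are passed positionally, in units (for example "10mm")
    client.setWatermark(url, offset_x, offset_y)
    if angle:
        client.setWatermarkRotation(angle)
    if in_background:
        client.setWatermarkInBackground(True)
    return client


# Usage (requires the pdfcrowd package and valid credentials; the image URL
# is a placeholder):
#   add_watermark(client, "https://example.com/logo.png", "10mm", "10mm", angle=45)
```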

Miscellaneous

def useSSL(self, use_ssl)
Set to True to call the API over a secure connection. The default is False.
def numTokens(self)
Returns the number of available conversion credits in your account.
def setMaxPages(self, npages)
Prints at most npages pages.
def setFailOnNon200(self, value)
If value is True then the conversion will fail when the source URI returns 4xx or 5xx HTTP status code. The default is False.
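
The miscellaneous methods can be combined into a guarded conversion, as in this sketch. convert_url_checked is a hypothetical helper name, and returning None when no credits remain is an illustrative design choice rather than library behavior.

```python
def convert_url_checked(client, url, max_pages=None):
    """Convert `url` only if conversion credits remain.

    Returns the PDF data, or None when the account has no credits left.
    Hypothetical helper; `client` is expected to be a pdfcrowd.Client.
    """
    client.useSSL(True)           # call the API over a secure connection
    client.setFailOnNon200(True)  # fail on 4xx and 5xx responses from the source
    if max_pages is not None:
        client.setMaxPages(max_pages)
    if client.numTokens() <= 0:
        return None
    return client.convertURI(url)


# Usage (requires the pdfcrowd package and valid credentials):
#   pdf = convert_url_checked(client, "http://www.google.com", max_pages=10)
```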

class pdfcrowd.Error

Derived from the standard Exception class. It is raised when an error occurs.

Units

Page dimensions and margins can be specified in inches (in), millimeters (mm), centimeters (cm), or points (pt). If no units are specified, points are assumed. Examples: "210mm", "8.5in".
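
To make the unit rules concrete, the sketch below converts a length value to points locally. It is an illustration only: to_points is a hypothetical helper name, the parsing is not taken from the library, and it assumes the usual definition of 1 point as 1/72 inch.

```python
import re

# points per unit, assuming 1 pt = 1/72 in and 1 in = 25.4 mm
_POINTS_PER_UNIT = {"pt": 1.0, "in": 72.0, "mm": 72.0 / 25.4, "cm": 72.0 / 2.54}
_LENGTH_RE = re.compile(r"^\s*(-?\d+(?:\.\d+)?)\s*(in|mm|cm|pt)?\s*$")


def to_points(value):
    """Convert a length such as "210mm", "8.5in", or 100 to points."""
    match = _LENGTH_RE.match(str(value))
    if match is None:
        raise ValueError("unrecognized length: %r" % (value,))
    number, unit = match.groups()
    # if no units are specified, points are assumed
    return float(number) * _POINTS_PER_UNIT[unit or "pt"]


print(to_points("8.5in"))
print(round(to_points("210mm"), 2))
```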

Important: This document is for the legacy Pdfcrowd API v1. For new integrations, see the documentation for the new API v2.