PDF to PDF API - Python SDK

Join multiple PDF files in Python using the Pdfcrowd API v2. The API is easy to use and the integration takes only a couple of lines of code.

Installation

Install the client library from PyPI
 $ pip install pdfcrowd

You can learn more about other install options here.

Authentication

The Pdfcrowd API requires authentication. The credentials are your Pdfcrowd username and your API key. You can sign up for the Pdfcrowd API here.
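To keep the username and API key out of source files, they can be read from the environment. The following is a minimal sketch, not part of the SDK: the helper load_credentials and the variable names PDFCROWD_USERNAME and PDFCROWD_API_KEY are hypothetical.

```python
import os


def load_credentials(env=None):
    """Return (username, api_key) read from an environment-style mapping.

    The variable names are an assumption made for this sketch.
    """
    env = os.environ if env is None else env
    try:
        return env["PDFCROWD_USERNAME"], env["PDFCROWD_API_KEY"]
    except KeyError as missing:
        # report which variable is absent instead of a bare KeyError
        raise RuntimeError(
            "missing environment variable: {}".format(missing.args[0]))
```

The returned pair can then be passed to the client constructor, for example pdfcrowd.PdfToPdfClient(*load_credentials()).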

Examples

Join 4 local PDF files together to a PDF file
import pdfcrowd
import sys

try:
    # create the API client instance
    client = pdfcrowd.PdfToPdfClient('your_username', 'your_apikey')

    # configure the conversion
    client.addPdfFile('/path/to/cover.pdf')
    client.addPdfFile('/path/to/proposal.pdf')
    client.addPdfFile('/path/to/price.pdf')
    client.addPdfFile('/path/to/contact.pdf')

    # run the conversion and write the result to a file
    client.convertToFile('offer.pdf')
except pdfcrowd.Error as why:
    # report the error
    sys.stderr.write('Pdfcrowd Error: {}\n'.format(why))

    # handle the exception here or rethrow and handle it at a higher level
    raise
Join 4 local PDF files into an in-memory PDF
import pdfcrowd
import sys

try:
    # create the API client instance
    client = pdfcrowd.PdfToPdfClient('your_username', 'your_apikey')

    # configure the conversion
    client.addPdfFile('/path/to/cover.pdf')
    client.addPdfFile('/path/to/proposal.pdf')
    client.addPdfFile('/path/to/price.pdf')
    client.addPdfFile('/path/to/contact.pdf')

    # run the conversion and store the result into a pdf variable
    pdf = client.convert()

    # write the pdf into the output file; the with block closes the file
    # even if the write fails
    with open('offer.pdf', 'wb') as output_file:
        output_file.write(pdf)
except pdfcrowd.Error as why:
    # report the error
    sys.stderr.write('Pdfcrowd Error: {}\n'.format(why))

    # handle the exception here or rethrow and handle it at a higher level
    raise
Join 4 local PDF files together and write the resulting PDF to an output stream
import pdfcrowd
import sys

try:
    # create the API client instance
    client = pdfcrowd.PdfToPdfClient('your_username', 'your_apikey')

    # configure the conversion
    client.addPdfFile('/path/to/cover.pdf')
    client.addPdfFile('/path/to/proposal.pdf')
    client.addPdfFile('/path/to/price.pdf')
    client.addPdfFile('/path/to/contact.pdf')

    # create output stream for conversion result; the with block closes
    # the stream even if the conversion fails
    with open('offer.pdf', 'wb') as output_stream:
        # run the conversion and write the result into the output stream
        client.convertToStream(output_stream)
except pdfcrowd.Error as why:
    # report the error
    sys.stderr.write('Pdfcrowd Error: {}\n'.format(why))

    # handle the exception here or rethrow and handle it at a higher level
    raise
Join 4 in-memory PDFs together to a PDF file
import pdfcrowd
import sys

try:
    # create the API client instance
    client = pdfcrowd.PdfToPdfClient('your_username', 'your_apikey')

    # configure the conversion
    client.addPdfRawData(open('/path/to/cover.pdf', 'rb').read())
    client.addPdfRawData(open('/path/to/proposal.pdf', 'rb').read())
    client.addPdfRawData(open('/path/to/price.pdf', 'rb').read())
    client.addPdfRawData(open('/path/to/contact.pdf', 'rb').read())

    # run the conversion and write the result to a file
    client.convertToFile('offer.pdf')
except pdfcrowd.Error as why:
    # report the error
    sys.stderr.write('Pdfcrowd Error: {}\n'.format(why))

    # handle the exception here or rethrow and handle it at a higher level
    raise
Join 2 in-memory PDFs together with 2 local PDF files to a PDF file
import pdfcrowd
import sys

try:
    # create the API client instance
    client = pdfcrowd.PdfToPdfClient('your_username', 'your_apikey')

    # configure the conversion
    client.addPdfRawData(open('/path/to/cover.pdf', 'rb').read())
    client.addPdfFile('/path/to/proposal.pdf')
    client.addPdfRawData(open('/path/to/price.pdf', 'rb').read())
    client.addPdfFile('/path/to/contact.pdf')

    # run the conversion and write the result to a file
    client.convertToFile('offer.pdf')
except pdfcrowd.Error as why:
    # report the error
    sys.stderr.write('Pdfcrowd Error: {}\n'.format(why))

    # handle the exception here or rethrow and handle it at a higher level
    raise
Get info about the current conversion
import pdfcrowd
import sys

try:
    # create the API client instance
    client = pdfcrowd.PdfToPdfClient('your_username', 'your_apikey')

    # configure the conversion
    client.setDebugLog(True)
    client.addPdfRawData(open('/path/to/cover.pdf', 'rb').read())
    client.addPdfRawData(open('/path/to/proposal.pdf', 'rb').read())

    # run the conversion and write the result to a file
    client.convertToFile('offer.pdf')
    
    # print URL to the debug log
    print('Debug log url: {}'.format(client.getDebugLogUrl()))
    
    # print the number of available conversion credits in your account
    print('Remaining credit count: {}'.format(client.getRemainingCreditCount()))
    
    # print the number of credits consumed by the conversion
    print('Consumed credit count: {}'.format(client.getConsumedCreditCount()))
    
    # print the unique ID of the conversion
    print('Job id: {}'.format(client.getJobId()))
    
    # print the total number of pages in the output document
    print('Page count: {}'.format(client.getPageCount()))
    
    # print the size of the output in bytes
    print('Output size: {}'.format(client.getOutputSize()))
except pdfcrowd.Error as why:
    # report the error
    sys.stderr.write('Pdfcrowd Error: {}\n'.format(why))

    # handle the exception here or rethrow and handle it at a higher level
    raise

API Reference - class PdfToPdfClient

Conversion from PDF to PDF.

Constructor

def __init__(self, user_name, api_key)
Constructor for the Pdfcrowd API client.
Parameters
  • user_name - Your username at Pdfcrowd.
  • api_key - Your API key.

 

PDF Manipulation

def setAction(self, action)
Specifies the action to be performed on the input PDFs.
Parameters
  • action - Allowed values:
      • join - Concatenate the input PDFs into a single one.
      • shuffle - Collate pages from the input PDFs into a single one, taking one page at a time from each input PDF. This is useful when combining two scanned documents, one containing the odd pages and the other the even pages.
    Default: join
Returns
  • PdfToPdfClient - The converter object.
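The page order produced by shuffle can be pictured with a small local simulation. This sketch does not call the API; it only models "one page at a time from each input" on lists of page labels. The function name shuffle_order is hypothetical, and the behavior for inputs of unequal length is not stated above, so the sketch simply continues with the inputs that still have pages (an assumption).

```python
from itertools import zip_longest

_MISSING = object()  # marks positions where a shorter input has run out


def shuffle_order(*documents):
    """Interleave pages round-robin: first page of each input, then second, ..."""
    result = []
    for group in zip_longest(*documents, fillvalue=_MISSING):
        result.extend(page for page in group if page is not _MISSING)
    return result


odd_pages = ["p1", "p3", "p5"]
even_pages = ["p2", "p4", "p6"]
print(shuffle_order(odd_pages, even_pages))
# ['p1', 'p2', 'p3', 'p4', 'p5', 'p6']
```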
def convert(self)
Perform an action on the input files.
Returns
  • bytes - Bytes object containing the output PDF.
def convertToStream(self, out_stream)
Perform an action on the input files and write the output PDF to an output stream.
Parameters
  • out_stream - The output stream that will contain the output PDF.
def convertToFile(self, file_path)
Perform an action on the input files and write the output PDF to a file.
Parameters
  • file_path - The output file path. The string must not be empty.
def addPdfFile(self, file_path)
Add a PDF file to the list of the input PDFs.
Parameters
  • file_path - The file path to a local PDF file. The file must exist and not be empty.
Returns
  • PdfToPdfClient - The converter object.
def addPdfRawData(self, pdf_raw_data)
Add in-memory raw PDF data to the list of the input PDFs.
It is typically used to add a PDF created by another Pdfcrowd converter.

Example in PHP:
$clientPdf2Pdf->addPdfRawData($clientHtml2Pdf->convertUrl('http://www.example.com'));
Parameters
  • pdf_raw_data - The raw PDF data. The input data must be PDF content.
Returns
  • PdfToPdfClient - The converter object.
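In Python, the same idea as the PHP one-liner might look like the sketch below. It assumes html_client is a pdfcrowd.HtmlToPdfClient, whose convertUrl returns the PDF as bytes, and pdf_client is a PdfToPdfClient. The helper name append_converted_url and its parameters are hypothetical.

```python
def append_converted_url(pdf_client, html_client, url):
    """Convert a web page to PDF in memory and queue it as a join input."""
    pdf_bytes = html_client.convertUrl(url)
    # addPdfRawData returns the client, so the call can be chained further
    return pdf_client.addPdfRawData(pdf_bytes)
```

Because the client is returned, a call such as append_converted_url(pdf_client, html_client, 'http://www.example.com').addPdfFile('/path/to/contact.pdf') keeps adding inputs in order.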

 

Miscellaneous

def setDebugLog(self, debug_log)
Turn on the debug logging. Details about the conversion are stored in the debug log. The URL of the log is returned by the getDebugLogUrl method and is also available in the conversion statistics.
Parameters
  • debug_log - Set to True to enable the debug logging. Default: False
Returns
  • PdfToPdfClient - The converter object.
def getDebugLogUrl(self)
Get the URL of the debug log for the last conversion.
Returns
  • string - The link to the debug log.
def getRemainingCreditCount(self)
Get the number of conversion credits available in your account.
The returned value can differ from the actual count if you run parallel conversions.
The special value 999999 is returned if the information is not available.
Returns
  • int - The number of credits.
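Since 999999 is a placeholder rather than a real balance, callers may want to translate it before using the number. A minimal sketch, assuming client is a PdfToPdfClient on which a conversion has already run; the helper and constant names are hypothetical.

```python
UNKNOWN_CREDITS = 999999  # special value documented above


def remaining_credits_or_none(client):
    """Return the remaining credit count, or None when it is not available."""
    count = client.getRemainingCreditCount()
    return None if count == UNKNOWN_CREDITS else count
```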
def getConsumedCreditCount(self)
Get the number of credits consumed by the last conversion.
Returns
  • int - The number of credits.
def getJobId(self)
Get the job id.
Returns
  • string - The unique job identifier.
def getPageCount(self)
Get the total number of pages in the output document.
Returns
  • int - The page count.
def getOutputSize(self)
Get the size of the output in bytes.
Returns
  • int - The count of bytes.
def setTag(self, tag)
Tag the conversion with a custom value. The tag is used in conversion statistics. A value longer than 32 characters is cut off.
Parameters
  • tag - A string with the custom tag.
Returns
  • PdfToPdfClient - The converter object.
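Because a tag longer than 32 characters is cut off rather than rejected, a caller may prefer to fail early. A minimal sketch, assuming client is a PdfToPdfClient; tag_conversion and MAX_TAG_LENGTH are hypothetical names.

```python
MAX_TAG_LENGTH = 32  # longer tags are cut off, per the description above


def tag_conversion(client, tag):
    """Set the tag, rejecting values that would otherwise be truncated."""
    if len(tag) > MAX_TAG_LENGTH:
        raise ValueError(
            "tag is {} characters long; the limit is {}".format(
                len(tag), MAX_TAG_LENGTH))
    # setTag returns the client, so further setters can be chained
    return client.setTag(tag)
```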

 

API Client Options

def setUseHttp(self, use_http)
Specifies whether the client communicates with the Pdfcrowd API over HTTP or HTTPS.
Parameters
  • use_http - Set to True to use HTTP. Default: False
Returns
  • PdfToPdfClient - The converter object.
def setUserAgent(self, user_agent)
Set a custom user agent HTTP header. It can be useful if you are behind a proxy or a firewall.
Parameters
  • user_agent - The user agent string. Default: pdfcrowd_python_client/4.3.5 (http://pdfcrowd.com)
Returns
  • PdfToPdfClient - The converter object.
def setProxy(self, host, port, user_name, password)
Specifies an HTTP proxy that the API client library will use to connect to the internet.
Parameters
  • host - The proxy hostname.
  • port - The proxy port.
  • user_name - The username.
  • password - The password.
Returns
  • PdfToPdfClient - The converter object.
def setRetryCount(self, retry_count)
Specifies the number of retries when the 502 HTTP status code is received. The 502 status code indicates a temporary network issue. Set the value to 0 to disable retrying.
Parameters
  • retry_count - The number of retries. Default: 1
Returns
  • PdfToPdfClient - The converter object.

 

Error Handling

try:
    # call the API here, e.g. client.convertToFile('offer.pdf')
    pass
except pdfcrowd.Error as why:
    # print the full error
    sys.stderr.write('Pdfcrowd Error: {}\n'.format(why))

    # print just the error code
    sys.stderr.write('Pdfcrowd Error Code: {}\n'.format(why.getCode()))

    # print just the error message
    sys.stderr.write('Pdfcrowd Error Message: {}\n'.format(why.getMessage()))

    # or handle the error in your own way
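The error code from getCode() can also drive an application-level retry. The sketch below is one possible shape, not something the SDK provides: the helper name, its parameters, and the choice of which codes count as transient are assumptions of this example. The SDK itself only documents automatic retries for the 502 status code (see setRetryCount).

```python
import time


def convert_with_retry(convert, error_type, retriable_codes,
                       attempts=3, delay=1.0):
    """Call convert() and retry when the error code is in retriable_codes.

    convert      - zero-argument callable that performs the conversion
    error_type   - exception class to catch, e.g. pdfcrowd.Error
    """
    for attempt in range(1, attempts + 1):
        try:
            return convert()
        except error_type as why:
            # give up on non-transient errors or after the last attempt
            if why.getCode() not in retriable_codes or attempt == attempts:
                raise
            time.sleep(delay)
```

A call might look like convert_with_retry(lambda: client.convertToFile('offer.pdf'), pdfcrowd.Error, codes), where codes is the set of error codes you consider transient.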

Troubleshooting