PDF to Text in Python

This page describes how to use our cloud-based API to convert PDF to text in Python. The API is user-friendly and can be integrated into your application with just a few lines of code.

Installation

The Python API client library provides easy access to the Pdfcrowd API. No third-party libraries are required.

Install the client library from PyPI:
pip install pdfcrowd

We also offer other installation options.

Authentication

To access the API you need your Pdfcrowd username and API key. You can try out the API without registering by using the following demo credentials:

  • Username: demo
  • API key: ce544b6ea52a5621fb9d55f8b542d14d

To get your personal API credentials, you can start a free API trial or buy an API license.
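
One way to keep personal credentials out of source code is to read them from environment variables and fall back to the demo credentials listed above. The following is only a minimal sketch: the helper load_credentials and the variable names PDFCROWD_USERNAME and PDFCROWD_API_KEY are assumptions made for this illustration and are not part of the Pdfcrowd client library.

```python
import os

# Demo credentials published on this page.
DEMO_USERNAME = 'demo'
DEMO_API_KEY = 'ce544b6ea52a5621fb9d55f8b542d14d'


def load_credentials(env=None):
    """Return a (username, api_key) pair.

    PDFCROWD_USERNAME and PDFCROWD_API_KEY are hypothetical variable names
    chosen for this sketch; the client library does not read them itself.
    The demo credentials are used unless both variables are set.
    """
    env = os.environ if env is None else env
    username = env.get('PDFCROWD_USERNAME')
    api_key = env.get('PDFCROWD_API_KEY')
    if username and api_key:
        return username, api_key
    return DEMO_USERNAME, DEMO_API_KEY
```

The returned pair can then be passed to the client constructor, for example pdfcrowd.PdfToTextClient(*load_credentials()).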

API Method Reference

Refer to the PDF to Text Python Reference for a description of all API methods.

Code Examples

Here are a few Python examples to get you started quickly with the API. See more examples.

Convert a local PDF file:

import pdfcrowd
import sys

try:
    # create the API client instance
    client = pdfcrowd.PdfToTextClient('demo', 'ce544b6ea52a5621fb9d55f8b542d14d')

    # run the conversion and write the result to a file
    client.convertFileToFile('/path/to/invoice.pdf', 'invoice.txt')
except pdfcrowd.Error as why:
    # report the error
    sys.stderr.write('Pdfcrowd Error: {}\n'.format(why))

    # rethrow or handle the exception
    raise

Convert a PDF from a URL:

import pdfcrowd
import sys

try:
    # create the API client instance
    client = pdfcrowd.PdfToTextClient('demo', 'ce544b6ea52a5621fb9d55f8b542d14d')

    # run the conversion and write the result to a file
    client.convertUrlToFile('https://pdfcrowd.com/static/pdf/apisamples/invoice.pdf', 'invoice.txt')
except pdfcrowd.Error as why:
    # report the error
    sys.stderr.write('Pdfcrowd Error: {}\n'.format(why))

    # rethrow or handle the exception
    raise

Convert PDF data held in memory:

import pdfcrowd
import sys

try:
    # create the API client instance
    client = pdfcrowd.PdfToTextClient('demo', 'ce544b6ea52a5621fb9d55f8b542d14d')

    # read the PDF bytes, closing the input file once they are in memory
    with open('/path/to/hello_world.pdf', 'rb') as pdf_file:
        pdf_data = pdf_file.read()

    # run the conversion and write the result to a file
    client.convertRawDataToFile(pdf_data, 'hello_world.txt')
except pdfcrowd.Error as why:
    # report the error
    sys.stderr.write('Pdfcrowd Error: {}\n'.format(why))

    # rethrow or handle the exception
    raise
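
The local-file example above can be extended to a whole folder of PDFs. The following is a rough sketch rather than an official recipe: convert_directory is a hypothetical helper written for this illustration, and the only client method it relies on is convertFileToFile as used in the examples above. The client is passed in as an argument, so the helper itself does not need to know about credentials.

```python
from pathlib import Path


def convert_directory(client, source_dir, output_dir):
    """Convert every *.pdf file in source_dir to a .txt file in output_dir.

    `client` is expected to be a pdfcrowd.PdfToTextClient, or any object
    with a compatible convertFileToFile(input_path, output_path) method.
    Returns the list of output paths, ordered by input file name.
    """
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)

    written = []
    for pdf_path in sorted(Path(source_dir).glob('*.pdf')):
        txt_path = output_dir / (pdf_path.stem + '.txt')
        # one conversion call per input file, as in the single-file example
        client.convertFileToFile(str(pdf_path), str(txt_path))
        written.append(txt_path)
    return written
```

A call such as convert_directory(client, 'invoices', 'invoices_txt') (both folder names are placeholders) can be wrapped in the same try/except pdfcrowd.Error block as the examples above. In this sketch an error on one file stops the loop; catching the exception inside the loop instead would let the remaining files proceed.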

Troubleshooting