HTML to PDF Python Reference
class HtmlToPdfClient
All setter methods return the HtmlToPdfClient object unless specified otherwise.
Constructor
def __init__(self, user_name, api_key)
Constructor for the Pdfcrowd API client.
user_name
Your username at Pdfcrowd.
def convertUrl(self, url)
Convert a web page.
url
The address of the web page to convert.
The supported protocols are http:// and https://.
Returns
byte[] - Byte array containing the conversion output.
def convertUrlToStream(self, url, out_stream)
Convert a web page and write the result to an output stream.
url
The address of the web page to convert.
The supported protocols are http:// and https://.
out_stream
The output stream that will contain the conversion output.
def convertUrlToFile(self, url, file_path)
Convert a web page and write the result to a local file.
url
The address of the web page to convert.
The supported protocols are http:// and https://.
file_path
The output file path.
def convertFile(self, file)
Convert a local file.
file
The path to a local file to convert.
The file can be either a single file or an archive (.tar.gz, .tar.bz2, or .zip).
If the HTML document refers to local external assets (images, style sheets, javascript), zip the document together with the assets.
The file must exist and not be empty.
The file name must have a valid extension.
Returns
byte[] - Byte array containing the conversion output.
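The zipping step described for convertFile could be sketched with the standard zipfile module. The helper name zip_html_with_assets and the directory layout are illustrative assumptions, not part of the Pdfcrowd client. The archive path should lie outside source_dir so the archive does not include itself.

```python
import os
import zipfile


def zip_html_with_assets(source_dir, archive_path):
    """Pack every file under source_dir into a .zip, keeping relative paths."""
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(source_dir):
            for name in sorted(files):
                full_path = os.path.join(root, name)
                # Store paths relative to source_dir so references such as
                # "css/style.css" inside the HTML still resolve in the archive.
                archive.write(full_path, os.path.relpath(full_path, source_dir))
    return archive_path
```

The returned path could then be passed to convertFile, convertFileToStream, or convertFileToFile.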
def convertFileToStream(self, file, out_stream)
Convert a local file and write the result to an output stream.
file
The path to a local file to convert.
The file can be either a single file or an archive (.tar.gz, .tar.bz2, or .zip).
If the HTML document refers to local external assets (images, style sheets, javascript), zip the document together with the assets.
The file must exist and not be empty.
The file name must have a valid extension.
out_stream
The output stream that will contain the conversion output.
def convertFileToFile(self, file, file_path)
Convert a local file and write the result to a local file.
file
The path to a local file to convert.
The file can be either a single file or an archive (.tar.gz, .tar.bz2, or .zip).
If the HTML document refers to local external assets (images, style sheets, javascript), zip the document together with the assets.
The file must exist and not be empty.
The file name must have a valid extension.
file_path
The output file path.
def convertString(self, text)
Convert a string.
text
The string content to convert.
Returns
byte[] - Byte array containing the conversion output.
def convertStringToStream(self, text, out_stream)
Convert a string and write the output to an output stream.
text
The string content to convert.
out_stream
The output stream that will contain the conversion output.
def convertStringToFile(self, text, file_path)
Convert a string and write the output to a file.
text
The string content to convert.
file_path
The output file path.
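A minimal sketch of the stream variant, collecting the output in memory instead of a file. It assumes client is an already constructed HtmlToPdfClient and that out_stream only needs a write() method accepting bytes; the helper name html_to_pdf_bytes is hypothetical.

```python
import io


def html_to_pdf_bytes(client, html):
    """Convert an HTML string and return the output via an in-memory stream."""
    buffer = io.BytesIO()
    # convertStringToStream writes the conversion output into out_stream.
    client.convertStringToStream(html, buffer)
    return buffer.getvalue()
```

This can be handy when the PDF is sent straight to an HTTP response rather than stored on disk.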
def convertStream(self, in_stream)
Convert the contents of an input stream.
in_stream
The input stream with source data.
The stream can contain either HTML code or an archive (.zip, .tar.gz, .tar.bz2).
The archive can contain HTML code and its external assets (images, style sheets, javascript).
Returns
byte[] - Byte array containing the conversion output.
def convertStreamToStream(self, in_stream, out_stream)
Convert the contents of an input stream and write the result to an output stream.
in_stream
The input stream with source data.
The stream can contain either HTML code or an archive (.zip, .tar.gz, .tar.bz2).
The archive can contain HTML code and its external assets (images, style sheets, javascript).
out_stream
The output stream that will contain the conversion output.
def convertStreamToFile(self, in_stream, file_path)
Convert the contents of an input stream and write the result to a local file.
in_stream
The input stream with source data.
The stream can contain either HTML code or an archive (.zip, .tar.gz, .tar.bz2).
The archive can contain HTML code and its external assets (images, style sheets, javascript).
file_path
The output file path.
def setZipMainFilename(self, filename)
Set the file name of the main HTML document stored in the input archive. If not specified, the first HTML file in the archive is used for conversion. Use this method if the input archive contains multiple HTML documents.
Page Setup
def setPageSize(self, size)
Set the output page size.
size
Allowed values:
- A0
- A1
- A2
- A3
- A4
- A5
- A6
- Letter
Default: A4
def setPageWidth(self, width)
Set the output page width. The safe maximum is 200in; otherwise, some PDF viewers may be unable to open the PDF.
width
The value must be specified in inches "in", millimeters "mm", centimeters "cm", pixels "px", or points "pt".
Default: 8.27in
Examples:
- setPageWidth("300mm")
- setPageWidth("9.5in")
def setPageHeight(self, height)
Set the output page height. Use -1 for a single page PDF. The safe maximum is 200in; otherwise, some PDF viewers may be unable to open the PDF.
height
The value must be -1 or specified in inches "in", millimeters "mm", centimeters "cm", pixels "px", or points "pt".
Default: 11.7in
def setPageDimensions(self, width, height)
Set the output page dimensions.
width
Set the output page width. The safe maximum is 200in; otherwise, some PDF viewers may be unable to open the PDF.
The value must be specified in inches "in", millimeters "mm", centimeters "cm", pixels "px", or points "pt".
Default: 8.27in
height
Set the output page height. Use -1 for a single page PDF. The safe maximum is 200in; otherwise, some PDF viewers may be unable to open the PDF.
The value must be -1 or specified in inches "in", millimeters "mm", centimeters "cm", pixels "px", or points "pt".
Default: 11.7in
Examples:
- setPageDimensions("300mm", "350mm")
- setPageDimensions("9.5in", "15.25in")
- setPageDimensions("372mm", "520pt")
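The unit rule and the 200in safe maximum could be checked on the client side before a value is passed to a setter. This is only a sketch: the accepted number syntax is assumed to be plain decimals, the 96 pixels per inch factor is the usual CSS convention rather than something this reference states, and the -1 single-page height would need separate handling. Both helper names are hypothetical.

```python
import re

# A decimal number followed by one of the documented units.
_LENGTH = re.compile(r"(\d+(?:\.\d+)?)(in|mm|cm|px|pt)")

# Units per inch; the px factor is an assumption (CSS reference pixel).
_PER_INCH = {"in": 1.0, "mm": 25.4, "cm": 2.54, "pt": 72.0, "px": 96.0}


def length_in_inches(value):
    """Convert a length such as "300mm" to inches, or raise ValueError."""
    match = _LENGTH.fullmatch(value.strip())
    if match is None:
        raise ValueError("not a valid length: %r" % (value,))
    number, unit = match.groups()
    return float(number) / _PER_INCH[unit]


def within_safe_maximum(value, limit_in=200.0):
    """Return True if the length does not exceed the safe maximum."""
    return length_in_inches(value) <= limit_in
```

For example, within_safe_maximum("300mm") holds, while a 6000mm page would exceed the limit.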
def setOrientation(self, orientation)
Set the output page orientation.
orientation
Default: portrait
def setMarginTop(self, top)
Set the output page top margin.
top
The value must be specified in inches "in", millimeters "mm", centimeters "cm", pixels "px", or points "pt".
Default: 0.4in
Examples:
- setMarginTop("1in")
- setMarginTop("2.5cm")
def setMarginRight(self, right)
Set the output page right margin.
right
The value must be specified in inches "in", millimeters "mm", centimeters "cm", pixels "px", or points "pt".
Default: 0.4in
Examples:
- setMarginRight("1in")
- setMarginRight("2.5cm")
def setMarginBottom(self, bottom)
Set the output page bottom margin.
bottom
The value must be specified in inches "in", millimeters "mm", centimeters "cm", pixels "px", or points "pt".
Default: 0.4in
Examples:
- setMarginBottom("1in")
- setMarginBottom("2.5cm")
def setMarginLeft(self, left)
Set the output page left margin.
left
The value must be specified in inches "in", millimeters "mm", centimeters "cm", pixels "px", or points "pt".
Default: 0.4in
Examples:
- setMarginLeft("1in")
- setMarginLeft("2.5cm")
def setNoMargins(self, value)
Disable page margins.
value
Set to True to disable margins.
Default: False
def setPageMargins(self, top, right, bottom, left)
Set the output page margins.
top
Set the output page top margin.
The value must be specified in inches "in", millimeters "mm", centimeters "cm", pixels "px", or points "pt".
Default: 0.4in
right
Set the output page right margin.
The value must be specified in inches "in", millimeters "mm", centimeters "cm", pixels "px", or points "pt".
Default: 0.4in
bottom
Set the output page bottom margin.
The value must be specified in inches "in", millimeters "mm", centimeters "cm", pixels "px", or points "pt".
Default: 0.4in
left
Set the output page left margin.
The value must be specified in inches "in", millimeters "mm", centimeters "cm", pixels "px", or points "pt".
Default: 0.4in
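Because setters return the client object, page setup calls can be chained. The sketch below assumes client is an already constructed HtmlToPdfClient; the helper name apply_page_setup and the chosen values are illustrative only.

```python
def apply_page_setup(client):
    """Chain several page setup setters and return the same client."""
    return (
        client.setPageSize("Letter")
        # Arguments are top, right, bottom, left.
        .setPageMargins("1in", "0.5in", "1in", "0.5in")
        .setPrintPageRange("1-3")
    )
```

The returned object is the client itself, so a convert call can follow directly.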
def setPrintPageRange(self, pages)
Set the page range to print.
pages
A comma-separated list of page numbers or ranges.
Examples:
- Just the second page is printed.
  setPrintPageRange("2")
- The first and the third page are printed.
  setPrintPageRange("1,3")
- Everything except the first page is printed.
  setPrintPageRange("2-")
- Just the first 3 pages are printed.
  setPrintPageRange("-3")
- Pages 3, 6, 7, 8 and 9 are printed.
  setPrintPageRange("3,6-9")
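The range syntax can be illustrated by expanding a specification into explicit page numbers. This is a client-side sketch of the semantics shown in the examples, not the converter's implementation; the function name expand_page_range is hypothetical, and dropping out-of-range pages and duplicates is an assumption.

```python
def expand_page_range(spec, page_count):
    """Expand a range such as "3,6-9" into a list of 1-based page numbers."""
    pages = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            # An open start means "from the first page";
            # an open end means "to the last page".
            start = int(start_text) if start_text else 1
            end = int(end_text) if end_text else page_count
        else:
            start = end = int(part)
        for page in range(start, min(end, page_count) + 1):
            if page >= 1 and page not in pages:
                pages.append(page)
    return pages
```

For a 10-page document, "3,6-9" expands to pages 3, 6, 7, 8 and 9, matching the last example.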
def setContentViewportWidth(self, width)
Set the viewport width for formatting the HTML content when generating a PDF. By specifying a viewport width, you can control how the content is rendered, ensuring it mimics the appearance on various devices or matches specific design requirements.
Availability:
API client >= 6.0.0, converter >= 24.04. See versioning.
width
The width of the viewport.
The value must be "balanced", "small", "medium", "large", "extra-large", or a number in the range 96-65000.
Allowed values:
- balanced - The smart option to adjust the viewport width dynamically to fit the print area, ensuring an optimal layout.
- small - A compact layout where less text fits on each PDF page, ideal for detailed sections or mobile views.
- medium - A balanced amount of text per page, striking a good compromise between readability and content density.
- large - A broader layout that accommodates more text per page, perfect for reducing page count and enhancing flow.
- extra-large - Maximizes the text per page, creating a spacious and content-rich PDF, akin to a widescreen experience.
- A precise viewport width in pixels, such as 1000, to tailor the PDF's text density to your specific requirements. The value must be in the range 96-65000.
Default: medium
Examples:
- Use the "large" viewport.
  setContentViewportWidth("large")
- Use an 800-pixel-wide viewport.
  setContentViewportWidth("800")
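The value rule for the width (a named preset or a pixel count in the range 96-65000) could be validated before calling the setter. A small sketch; the helper name is hypothetical and the check mirrors only the rule stated above.

```python
NAMED_WIDTHS = {"balanced", "small", "medium", "large", "extra-large"}


def is_valid_content_viewport_width(value):
    """Return True for a named preset or a pixel width in 96-65000."""
    text = str(value)
    if text in NAMED_WIDTHS:
        return True
    # Accept plain ASCII digits only, then check the documented range.
    return text.isascii() and text.isdigit() and 96 <= int(text) <= 65000
```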
def setContentViewportHeight(self, height)
Set the viewport height for formatting the HTML content when generating a PDF. By specifying a viewport height, you can enforce loading of lazy-loaded images and also affect vertical positioning of absolutely positioned elements within the content.
Availability:
API client >= 6.0.0, converter >= 24.04. See versioning.
height
The viewport height.
The value must be "auto", "large", or a number.
Allowed values:
- auto - The height of the print area is used.
- large - Value suitable for documents with extensive lazy-loaded content.
- A specific numerical value to set as the window height, allowing precise control based on the document's requirements.
Default: auto
def setContentFitMode(self, mode)
Specifies the mode for fitting the HTML content to the print area by upscaling or downscaling it.
Availability:
API client >= 6.0.0, converter >= 24.04. See versioning.
mode
The fitting mode.
Allowed values:
- auto - Automatic mode.
- smart-scaling - Smart scaling to fit more content into the print area.
- no-scaling - No scaling is performed.
- viewport-width - The viewport width fits the print area width.
- content-width - The HTML content width fits the print area width.
- single-page - The entire HTML content fits the print area of a single page.
- single-page-ratio - The entire HTML content fits the print area of a single page, maintaining the aspect ratio of the page height and width.
Default: auto
def setRemoveBlankPages(self, pages)
Specifies which blank pages to exclude from the output document.
Availability:
API client >= 5.13.0, converter >= 20.10. See versioning.
pages
The empty page behavior.
Allowed values:
- trailing - Trailing blank pages are removed from the document.
- all - All empty pages are removed from the document. Availability: API client >= 6.0.0, converter >= 24.04.
- none - No blank page is removed from the document.
Default: trailing
Load HTML code from the specified URL and use it as the page header. The following classes can be used in the HTML. The content of the respective elements will be expanded as follows:
- pdfcrowd-page-count - the total page count of printed pages
- pdfcrowd-page-number - the current page number
- pdfcrowd-source-url - the source URL of the converted document
- pdfcrowd-source-title - the title of the converted document
The following attributes can be used:
- data-pdfcrowd-number-format - specifies the type of the used numerals. Allowed values:
  - arabic - Arabic numerals, they are used by default
  - roman - Roman numerals
  - eastern-arabic - Eastern Arabic numerals
  - bengali - Bengali numerals
  - devanagari - Devanagari numerals
  - thai - Thai numerals
  - east-asia - Chinese, Vietnamese, Japanese and Korean numerals
  - chinese-formal - Chinese formal numerals
  Please contact us if you need another type of numerals.
  Example: <span class='pdfcrowd-page-number' data-pdfcrowd-number-format='roman'></span>
- data-pdfcrowd-placement - specifies where to place the source URL. Allowed values:
  - The URL is inserted to the content
    Example: <span class='pdfcrowd-source-url'></span> will produce <span>http://example.com</span>
  - href - the URL is set to the href attribute
    Example: <a class='pdfcrowd-source-url' data-pdfcrowd-placement='href'>Link to source</a> will produce <a href='http://example.com'>Link to source</a>
  - href-and-content - the URL is set to the href attribute and to the content
    Example: <a class='pdfcrowd-source-url' data-pdfcrowd-placement='href-and-content'></a> will produce <a href='http://example.com'>http://example.com</a>
Use the specified HTML code as the page header. The following classes can be used in the HTML. The content of the respective elements will be expanded as follows:
- pdfcrowd-page-count - the total page count of printed pages
- pdfcrowd-page-number - the current page number
- pdfcrowd-source-url - the source URL of the converted document
- pdfcrowd-source-title - the title of the converted document
The following attributes can be used:
- data-pdfcrowd-number-format - specifies the type of the used numerals. Allowed values:
  - arabic - Arabic numerals, they are used by default
  - roman - Roman numerals
  - eastern-arabic - Eastern Arabic numerals
  - bengali - Bengali numerals
  - devanagari - Devanagari numerals
  - thai - Thai numerals
  - east-asia - Chinese, Vietnamese, Japanese and Korean numerals
  - chinese-formal - Chinese formal numerals
  Please contact us if you need another type of numerals.
  Example: <span class='pdfcrowd-page-number' data-pdfcrowd-number-format='roman'></span>
- data-pdfcrowd-placement - specifies where to place the source URL. Allowed values:
  - The URL is inserted to the content
    Example: <span class='pdfcrowd-source-url'></span> will produce <span>http://example.com</span>
  - href - the URL is set to the href attribute
    Example: <a class='pdfcrowd-source-url' data-pdfcrowd-placement='href'>Link to source</a> will produce <a href='http://example.com'>Link to source</a>
  - href-and-content - the URL is set to the href attribute and to the content
    Example: <a class='pdfcrowd-source-url' data-pdfcrowd-placement='href-and-content'></a> will produce <a href='http://example.com'>http://example.com</a>
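A header fragment combining these classes and attributes might be assembled as follows. The helper name build_header_html and the surrounding markup are illustrative assumptions. Only the pdfcrowd-* class names, the attribute names, and the numeral formats come from the list above.

```python
NUMERAL_FORMATS = (
    "arabic", "roman", "eastern-arabic", "bengali",
    "devanagari", "thai", "east-asia", "chinese-formal",
)


def build_header_html(numeral_format="arabic"):
    """Return header HTML with a source link and "Page N of M" numbering."""
    if numeral_format not in NUMERAL_FORMATS:
        raise ValueError("unsupported numeral format: %r" % (numeral_format,))
    return (
        "<div style='text-align: right'>"
        # The converter sets the href of this link to the source URL.
        "<a class='pdfcrowd-source-url' data-pdfcrowd-placement='href'>Source</a>"
        " | Page <span class='pdfcrowd-page-number'"
        " data-pdfcrowd-number-format='%s'></span>"
        " of <span class='pdfcrowd-page-count'></span>"
        "</div>"
    ) % numeral_format
```

The resulting string would be passed as the HTML code of the page header; the same fragment works for a footer.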
Set the header height.
Examples:
- setHeaderHeight("30mm")
- setHeaderHeight("1in")
Set the file name of the header HTML document stored in the input archive. Use this method if the input archive contains multiple HTML documents.
Load HTML code from the specified URL and use it as the page footer. The following classes can be used in the HTML. The content of the respective elements will be expanded as follows:
- pdfcrowd-page-count - the total page count of printed pages
- pdfcrowd-page-number - the current page number
- pdfcrowd-source-url - the source URL of the converted document
- pdfcrowd-source-title - the title of the converted document
The following attributes can be used:
- data-pdfcrowd-number-format - specifies the type of the used numerals. Allowed values:
  - arabic - Arabic numerals, they are used by default
  - roman - Roman numerals
  - eastern-arabic - Eastern Arabic numerals
  - bengali - Bengali numerals
  - devanagari - Devanagari numerals
  - thai - Thai numerals
  - east-asia - Chinese, Vietnamese, Japanese and Korean numerals
  - chinese-formal - Chinese formal numerals
  Please contact us if you need another type of numerals.
  Example: <span class='pdfcrowd-page-number' data-pdfcrowd-number-format='roman'></span>
- data-pdfcrowd-placement - specifies where to place the source URL. Allowed values:
  - The URL is inserted to the content
    Example: <span class='pdfcrowd-source-url'></span> will produce <span>http://example.com</span>
  - href - the URL is set to the href attribute
    Example: <a class='pdfcrowd-source-url' data-pdfcrowd-placement='href'>Link to source</a> will produce <a href='http://example.com'>Link to source</a>
  - href-and-content - the URL is set to the href attribute and to the content
    Example: <a class='pdfcrowd-source-url' data-pdfcrowd-placement='href-and-content'></a> will produce <a href='http://example.com'>http://example.com</a>
Use the specified HTML as the page footer. The following classes can be used in the HTML. The content of the respective elements will be expanded as follows:
- pdfcrowd-page-count - the total page count of printed pages
- pdfcrowd-page-number - the current page number
- pdfcrowd-source-url - the source URL of the converted document
- pdfcrowd-source-title - the title of the converted document
The following attributes can be used:
- data-pdfcrowd-number-format - specifies the type of the used numerals. Allowed values:
  - arabic - Arabic numerals, they are used by default
  - roman - Roman numerals
  - eastern-arabic - Eastern Arabic numerals
  - bengali - Bengali numerals
  - devanagari - Devanagari numerals
  - thai - Thai numerals
  - east-asia - Chinese, Vietnamese, Japanese and Korean numerals
  - chinese-formal - Chinese formal numerals
  Please contact us if you need another type of numerals.
  Example: <span class='pdfcrowd-page-number' data-pdfcrowd-number-format='roman'></span>
- data-pdfcrowd-placement - specifies where to place the source URL. Allowed values:
  - The URL is inserted to the content
    Example: <span class='pdfcrowd-source-url'></span> will produce <span>http://example.com</span>
  - href - the URL is set to the href attribute
    Example: <a class='pdfcrowd-source-url' data-pdfcrowd-placement='href'>Link to source</a> will produce <a href='http://example.com'>Link to source</a>
  - href-and-content - the URL is set to the href attribute and to the content
    Example: <a class='pdfcrowd-source-url' data-pdfcrowd-placement='href-and-content'></a> will produce <a href='http://example.com'>http://example.com</a>
Set the footer height.
Examples:
- setFooterHeight("30mm")
- setFooterHeight("1in")
Set the file name of the footer HTML document stored in the input archive. Use this method if the input archive contains multiple HTML documents.
Disable horizontal page margins for header and footer. The header/footer contents width will be equal to the physical page width.
Availability:
API client >= 5.0.0, converter >= 20.10. See versioning.
The page header is not printed on the specified pages.
Examples:
- The header is not printed on the second page.
  setExcludeHeaderOnPages("2")
- The header is not printed on the first and the last page.
  setExcludeHeaderOnPages("1,-1")
The page footer is not printed on the specified pages.
Examples:
- The footer is not printed on the second page.
  setExcludeFooterOnPages("2")
- The footer is not printed on the first and the last page.
  setExcludeFooterOnPages("1,-1")
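The page list used by the exclude options could be resolved to concrete page numbers as sketched below. The examples only establish that -1 denotes the last page; treating other negative numbers as counting back from the end is an assumption, and the function name resolve_excluded_pages is hypothetical.

```python
def resolve_excluded_pages(spec, page_count):
    """Map a list such as "1,-1" to sorted 1-based page numbers."""
    resolved = set()
    for part in spec.split(","):
        number = int(part.strip())
        # Negative numbers count from the end: -1 is the last page.
        page = number if number > 0 else page_count + 1 + number
        if 1 <= page <= page_count:
            resolved.add(page)
    return sorted(resolved)
```

For a 5-page document, "1,-1" resolves to pages 1 and 5.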
Set the scaling factor (zoom) for the header and footer.
def setPageNumberingOffset(self, offset)
Set an offset between physical and logical page numbers.
offset
Integer specifying page offset.
Default: 0
Examples:
- The page numbering will start with 0.
  setPageNumberingOffset(1)
- The page numbering will start with 11 on the first page. It can be useful for joining documents.
  setPageNumberingOffset(-10)
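Both examples are consistent with the printed (logical) number being the physical page number minus the offset. A one-line sketch of that arithmetic, with a hypothetical function name:

```python
def logical_page_number(physical_page, offset=0):
    """Return the number printed on a page for a given numbering offset."""
    # Inferred from the examples: offset 1 makes page 1 print as 0,
    # and offset -10 makes page 1 print as 11.
    return physical_page - offset
```

When joining a document after a 10-page one, an offset of -10 makes its first page print as 11.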
Watermark & Background
def setPageWatermark(self, watermark)
Apply a watermark to each page of the output PDF file. A watermark can be either a PDF or an image. If a multi-page file (PDF or TIFF) is used, the first page is used as the watermark.
watermark
The file path to a local file.
The file must exist and not be empty.
def setPageWatermarkUrl(self, url)
Load a file from the specified URL and apply the file as a watermark to each page of the output PDF. A watermark can be either a PDF or an image. If a multi-page file (PDF or TIFF) is used, the first page is used as the watermark.
url
The supported protocols are http:// and https://.
def setMultipageWatermark(self, watermark)
Apply each page of a watermark to the corresponding page of the output PDF. A watermark can be either a PDF or an image.
watermark
The file path to a local file.
The file must exist and not be empty.
def setMultipageWatermarkUrl(self, url)
Load a file from the specified URL and apply each page of the file as a watermark to the corresponding page of the output PDF. A watermark can be either a PDF or an image.
url
The supported protocols are http:// and https://.
def setPageBackground(self, background)
Apply a background to each page of the output PDF file. A background can be either a PDF or an image. If a multi-page file (PDF or TIFF) is used, the first page is used as the background.
background
The file path to a local file.
The file must exist and not be empty.
def setPageBackgroundUrl(self, url)
Load a file from the specified URL and apply the file as a background to each page of the output PDF. A background can be either a PDF or an image. If a multi-page file (PDF or TIFF) is used, the first page is used as the background.
url
The supported protocols are http:// and https://.
def setMultipageBackground(self, background)
Apply each page of a background to the corresponding page of the output PDF. A background can be either a PDF or an image.
background
The file path to a local file.
The file must exist and not be empty.
def setMultipageBackgroundUrl(self, url)
Load a file from the specified URL and apply each page of the file as a background to the corresponding page of the output PDF. A background can be either a PDF or an image.
url
The supported protocols are http:// and https://.
def setPageBackgroundColor(self, color)
The page background color in RGB or RGBA hexadecimal format. The color fills the entire page regardless of the margins.
color
The value must be in RRGGBB or RRGGBBAA hexadecimal format.
Examples:
- red color
  setPageBackgroundColor("FF0000")
- green color
  setPageBackgroundColor("00ff00")
- green color with 50% opacity
  setPageBackgroundColor("00ff0080")
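The RRGGBB and RRGGBBAA formats can be checked and decoded with a few lines of Python. This is a client-side sketch with a hypothetical function name. It treats a missing alpha pair as fully opaque, which is an assumption based on the examples.

```python
import re

# Six hex digits, optionally followed by a two-digit alpha channel.
_COLOR = re.compile(r"[0-9a-fA-F]{6}(?:[0-9a-fA-F]{2})?")


def parse_background_color(value):
    """Return (red, green, blue, alpha) as integers in the range 0-255."""
    if _COLOR.fullmatch(value) is None:
        raise ValueError("expected RRGGBB or RRGGBBAA, got %r" % (value,))
    channels = [int(value[i:i + 2], 16) for i in range(0, len(value), 2)]
    if len(channels) == 3:
        channels.append(255)  # no alpha given: assume fully opaque
    return tuple(channels)
```

In "00ff0080" the alpha byte 0x80 is 128 of 255, which is the roughly 50% opacity from the last example.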
General Options
Use the print version of the page if available (@media print).
def setNoBackground(self, value)
Do not print the background graphics.
value
Set to True to disable the background graphics.
Default: False
def setDisableJavascript(self, value)
Do not execute JavaScript.
value
Set to True to disable JavaScript in web pages.
Default: False
def setDisableImageLoading(self, value)
Do not load images.
value
Set to True to disable loading of images.
Default: False
def setDisableRemoteFonts(self, value)
Disable loading fonts from remote sources.
value
Set to True to disable loading remote fonts.
Default: False
def setUseMobileUserAgent(self, value)
Use a mobile user agent.
Availability:
API client >= 5.3.0, converter >= 20.10. See versioning.
value
Set to True to use a mobile user agent.
Default: False
def setLoadIframes(self, iframes)
Specifies how iframes are handled.
Availability:
API client >= 5.0.0, converter >= 20.10. See versioning.
def setBlockAds(self, value)
Try to block ads. Enabling this option can produce smaller output and speed up the conversion.
value
Set to True to block ads in web pages.
Default: False
def setDefaultEncoding(self, encoding)
Set the default HTML content text encoding.
encoding
The text encoding of the HTML content.
Default: auto detect
Examples:
- Set to use Latin-2 encoding.
  setDefaultEncoding("iso8859-2")
- Set to use UTF-8 encoding.
  setDefaultEncoding("utf-8")
def setLocale(self, locale)
Set the locale for the conversion. This may affect the output format of dates, times and numbers.
Availability:
API client >= 5.0.0, converter >= 20.10. See versioning.
locale
The locale code according to ISO 639.
Default: en-US
def setHttpAuth(self, user_name, password)
Set credentials to access websites protected by HTTP basic authentication.
user_name
Set the HTTP authentication user name.
password
Set the HTTP authentication password.
def setCookies(self, cookies)
Set cookies that are sent in Pdfcrowd HTTP requests.
cookies
The cookie string.
def setVerifySslCertificates(self, value)
Do not allow insecure HTTPS connections.
value
Set to True to enable SSL certificate verification.
Default: False
def setFailOnMainUrlError(self, fail_on_error)
Abort the conversion if the main URL HTTP status code is greater than or equal to 400.
fail_on_error
Set to True to abort the conversion.
Default: False
def setFailOnAnyUrlError(self, fail_on_error)
Abort the conversion if the HTTP status code of any sub-request is greater than or equal to 400 or if some sub-requests are still pending. See the debug log for details.
fail_on_error
Set to True to abort the conversion.
Default: False
Do not send the X-Pdfcrowd HTTP header in Pdfcrowd HTTP requests.
def setCssPageRuleMode(self, mode)
Specifies behavior in presence of CSS @page rules. It may affect the page size, margins and orientation.
Availability:
API client >= 5.0.0, converter >= 20.10. See versioning.
mode
The page rule mode.
Allowed values:
- default - The Pdfcrowd API page settings are preferred.
- mode1 - The converter version 18.10 mode.
- mode2 - CSS @page rule is preferred.
Default: default
def setCustomCss(self, css)
Apply custom CSS to the input HTML document. It allows you to modify the visual appearance and layout of your HTML content dynamically. Tip: Using !important in custom CSS provides a way to prioritize and override conflicting styles.
Availability:
API client >= 5.14.0, converter >= 20.10. See versioning.
css
A string containing valid CSS.
Examples:
- Set the page background color to gray.
  setCustomCss("body { background-color: gray; }")
- Do not show nav HTML elements and the element with ad-block ID in the output PDF.
  setCustomCss("nav, #ad-block { display: none !important; }")
def setCustomJavascript(self, javascript)
Run custom JavaScript after the document is loaded and ready to print. The script is intended for post-load DOM manipulation (add/remove elements, update CSS, ...). In addition to the standard browser APIs, the custom JavaScript code can use helper functions from our JavaScript library.
javascript
A string containing JavaScript code.
Example:
- Set the page background color to gray.
  setCustomJavascript("document.body.style.setProperty('background-color', 'gray', 'important')")
def setOnLoadJavascript(self, javascript)
Run custom JavaScript right after the document is loaded. The script is intended for early DOM manipulation (add/remove elements, update CSS, ...). In addition to the standard browser APIs, the custom JavaScript code can use helper functions from our JavaScript library.
javascript
A string containing JavaScript code.
Example:
- Set the page background color to gray.
  setOnLoadJavascript("document.body.style.setProperty('background-color', 'gray', 'important')")
Set a custom HTTP header that is sent in Pdfcrowd HTTP requests.
def setJavascriptDelay(self, delay)
Wait the specified number of milliseconds for all JavaScript to finish after the document is loaded. Your API license defines the maximum wait time via the "Max Delay" parameter.
delay
The number of milliseconds to wait.
Must be a positive integer number or 0.
Default: 200
Example:
- Wait for 2 seconds.
  setJavascriptDelay(2000)
def setElementToConvert(self, selectors)
Convert only the specified element from the main document and its children. The element is specified by one or more CSS selectors. If the element is not found, the conversion fails. If multiple elements are found, the first one is used.
Examples:
- The first element with the id main-content is converted.
  setElementToConvert("#main-content")
- The first element with the class name main-content is converted.
  setElementToConvert(".main-content")
- The first element with the tag name table is converted.
  setElementToConvert("table")
- The first element with the tag name table or with the id main-content is converted.
  setElementToConvert("table, #main-content")
- The first element <p class="article"> within <div class="user-panel main"> is converted.
  setElementToConvert("div.user-panel.main p.article")
def setElementToConvertMode(self, mode)
Specify the DOM handling when only a part of the document is converted. This can affect the CSS rules used.
mode
Allowed values:
- cut-out - The element and its children are cut out of the document.
- remove-siblings - All element's siblings are removed.
- hide-siblings - All element's siblings are hidden.
Default: cut-out
def setWaitForElement(self, selectors)
Wait for the specified element in a source document. The element is specified by one or more CSS selectors. The element is searched for in the main document and all iframes. If the element is not found, the conversion fails. Your API license defines the maximum wait time via the "Max Delay" parameter.
Examples:
- Wait until an element with the id main-content is found.
  setWaitForElement("#main-content")
- Wait until an element with the class name main-content is found.
  setWaitForElement(".main-content")
- Wait until an element with the tag name table is found.
  setWaitForElement("table")
- Wait until an element with the tag name table or with the id main-content is found.
  setWaitForElement("table, #main-content")
- Wait until <p class="article"> is found within <div class="user-panel main">.
  setWaitForElement("div.user-panel.main p.article")
def setAutoDetectElementToConvert(self, value)
The main HTML element for conversion is detected automatically.
Availability:
API client >= 5.5.0, converter >= 20.10. See versioning.
value
Set to True to detect the main element.
Default: False
def setReadabilityEnhancements(self, enhancements)
The input HTML is automatically enhanced to improve the readability.
Availability:
API client >= 5.5.0, converter >= 20.10. See versioning.
enhancements
Allowed values:
- none - No enhancements are used.
- readability-v1 - Version 1 of the enhancements is used.
- readability-v2 - Version 2 of the enhancements is used.
- readability-v3 - Version 3 of the enhancements is used.
- readability-v4 - Version 4 of the enhancements is used.
Default: none
Print Resolution
def setViewportWidth(self, width)
Set the viewport width in pixels. The viewport is the user's visible area of the page.
width
The value must be in the range 96-65000.
Example:
- Full HD width.
  setViewportWidth(1920)
def setViewportHeight(self, height)
Set the viewport height in pixels. The viewport is the user's visible area of the page. If the input HTML uses lazily loaded images, try using a large value that covers the entire height of the HTML, e.g. 100000.
height
Must be a positive integer number.
def setViewport(self, width, height)
Set the viewport size. The viewport is the user's visible area of the page.
width
Set the viewport width in pixels. The viewport is the user's visible area of the page.
The value must be in the range 96-65000.
height
Set the viewport height in pixels. The viewport is the user's visible area of the page. If the input HTML uses lazily loaded images, try using a large value that covers the entire height of the HTML, e.g. 100000.
Must be a positive integer number.
def setRenderingMode(self, mode)
Set the rendering mode of the page, allowing control over how content is displayed.
mode
The rendering mode.
Allowed values:
- default - The mode based on the standard browser print functionality.
- viewport - Adapts the rendering according to the specified viewport width, influencing the @media (min-width) and @media (max-width) CSS properties. This mode is ideal for previewing different responsive designs of a web page, such as mobile or desktop views, by choosing the appropriate viewport size.
def setSmartScalingMode(self, mode)
Specifies the scaling mode used for fitting the HTML contents to the print area.
mode
The smart scaling mode.
Allowed values:
-
default
The mode based on the standard browser print functionality.
-
disabled
No smart scaling is performed.
-
viewport-fit
The viewport width fits the print area width.
-
content-fit
The HTML contents width fits the print area width.
-
single-page-fit
The whole HTML contents fits the print area of a single page.
-
single-page-fit-ex
The whole HTML contents fits the print area of a single page with respect to the page height/width ratio.
-
mode1
Scaling mode 1 is applied.
def setScaleFactor(self, factor)
Set the scaling factor (zoom) for the main page area.
factor
The percentage value.
The value must be in the range 10-500.
Default: 100
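The allowed smart scaling modes and the scale factor range can be checked locally before they are sent to the API. The following is a sketch under that assumption; apply_scaling and SMART_SCALING_MODES are hypothetical names, and client is assumed to be an HtmlToPdfClient instance.

```python
# Allowed values copied from the setSmartScalingMode entry above.
SMART_SCALING_MODES = (
    "default", "disabled", "viewport-fit", "content-fit",
    "single-page-fit", "single-page-fit-ex", "mode1",
)


def apply_scaling(client, mode="default", factor=100):
    """Check the mode and the factor locally, then call the documented setters."""
    if mode not in SMART_SCALING_MODES:
        raise ValueError("unknown smart scaling mode: %r" % (mode,))
    # The scale factor is a percentage in the range 10-500.
    if not 10 <= factor <= 500:
        raise ValueError("scale factor must be in the range 10-500")
    client.setSmartScalingMode(mode)
    client.setScaleFactor(factor)
    return client
```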
def setDisableSmartShrinking(self, value)
Disable the intelligent shrinking strategy that tries to optimally fit the HTML contents to a PDF page.
API client < 5.0.0. Smart scaling mode1 can be used instead.
See
versioning.
value
Set to True to disable the intelligent shrinking strategy.
Default: False
def setJpegQuality(self, quality)
Set the quality of embedded JPEG images. A lower quality results in a smaller PDF file but can lead to compression artifacts.
quality
The percentage value.
The value must be in the range 1-100.
Default: 100
def setConvertImagesToJpeg(self, images)
Specify which image types will be converted to JPEG. Converting lossless compression image formats (PNG, GIF, ...) to JPEG may result in a smaller PDF file.
images
The image category.
Allowed values:
-
none
No image conversion is done.
-
opaque
Only opaque images are converted to JPEG images.
-
all
All images are converted to JPEG images. The JPEG format does not support transparency so the transparent color is replaced by a PDF page background color.
Default: none
def setImageDpi(self, dpi)
Set the DPI of images in PDF. A lower DPI may result in a smaller PDF file. If the specified DPI is higher than the actual image DPI, the original image DPI is retained (no upscaling is performed). Use 0 to leave the images unaltered.
dpi
The DPI value.
Must be a positive integer number or 0.
Default: 0
Examples:
-
No change of the source image is done.
setImageDpi(0)
-
Screen-only view lower DPI.
setImageDpi(72)
-
Screen-only view recommended DPI.
setImageDpi(96)
-
Ebook typical DPI.
setImageDpi(150)
-
Printer standard DPI.
setImageDpi(300)
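The image options above are often set together. A possible sketch follows; the preset names and the helper apply_image_preset are assumptions made for illustration, while the DPI values are taken from the examples above. client is assumed to be an HtmlToPdfClient instance.

```python
# DPI values from the setImageDpi examples; the preset names are hypothetical.
IMAGE_DPI_PRESETS = {
    "original": 0,     # no change of the source image
    "screen-low": 72,
    "screen": 96,
    "ebook": 150,
    "print": 300,
}


def apply_image_preset(client, preset, jpeg_quality=100, convert="none"):
    """Apply the image DPI, JPEG quality and JPEG conversion in one step."""
    dpi = IMAGE_DPI_PRESETS[preset]  # raises KeyError for an unknown preset
    if not 1 <= jpeg_quality <= 100:
        raise ValueError("JPEG quality must be in the range 1-100")
    if convert not in ("none", "opaque", "all"):
        raise ValueError("convert must be one of: none, opaque, all")
    client.setImageDpi(dpi)
    client.setJpegQuality(jpeg_quality)
    client.setConvertImagesToJpeg(convert)
    return client
```

For example, apply_image_preset(client, "ebook", jpeg_quality=80, convert="opaque") trades some image quality for a smaller file; the quality value 80 is an arbitrary choice.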
Miscellaneous values for PDF output.
Convert HTML forms to fillable PDF forms. Details can be found in the
blog post.
def setLinearize(self, value)
Create a linearized PDF. This is also known as Fast Web View.
value
Set to True to create a linearized PDF.
Default: False
def setEncrypt(self, value)
Encrypt the PDF. This prevents search engines from indexing the contents.
value
Set to True to enable PDF encryption.
Default: False
def setUserPassword(self, password)
Protect the PDF with a user password. When a PDF has a user password, it must be supplied in order to view the document and to perform operations allowed by the access permissions.
password
The user password.
def setOwnerPassword(self, password)
Protect the PDF with an owner password. Supplying an owner password grants unlimited access to the PDF including changing the passwords and access permissions.
password
The owner password.
def setNoPrint(self, value)
Disallow printing of the output PDF.
value
Set to True to set the no-print flag in the output PDF.
Default: False
def setNoModify(self, value)
Disallow modification of the output PDF.
value
Set to True to set the read-only flag in the output PDF.
Default: False
def setNoCopy(self, value)
Disallow text and graphics extraction from the output PDF.
value
Set to True to set the no-copy flag in the output PDF.
Default: False
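A sketch of how the encryption, password and permission setters might be applied together. The helper protect_pdf and its keyword arguments are hypothetical; client is assumed to be an HtmlToPdfClient instance. Whether setEncrypt is required alongside the passwords is not stated here, so the sketch enables it explicitly.

```python
def protect_pdf(client, owner_password, user_password=None,
                allow_print=True, allow_modify=False, allow_copy=False):
    """Encrypt the output and translate allow_* flags into the no-* setters."""
    client.setEncrypt(True)
    # The owner password grants unlimited access, including changing permissions.
    client.setOwnerPassword(owner_password)
    # The user password is optional; when set, it is needed to view the document.
    if user_password is not None:
        client.setUserPassword(user_password)
    # The API exposes "disallow" flags, so the allow_* values are inverted.
    client.setNoPrint(not allow_print)
    client.setNoModify(not allow_modify)
    client.setNoCopy(not allow_copy)
    return client
```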
def setTitle(self, title)
Set the title of the PDF.
def setSubject(self, subject)
Set the subject of the PDF.
def setAuthor(self, author)
Set the author of the PDF.
def setKeywords(self, keywords)
Associate keywords with the document.
keywords
The string with the keywords.
Extract meta tags (author, keywords and description) from the input HTML and use them in the output PDF.
Viewer Preferences
These preferences specify how a PDF viewer should present the document. The preferences may be ignored by some PDF viewers.
def setPageLayout(self, layout)
Specify the page layout to be used when the document is opened.
layout
Allowed values:
-
single-page
Display one page at a time.
-
one-column
Display the pages in one column.
-
two-column-left
Display the pages in two columns, with odd-numbered pages on the left.
-
two-column-right
Display the pages in two columns, with odd-numbered pages on the right.
def setPageMode(self, mode)
Specify how the document should be displayed when opened.
def setInitialZoomType(self, zoom_type)
Specify how the page should be displayed when opened.
zoom_type
Allowed values:
-
fit-width
The page content is magnified just enough to fit the entire width of the page within the window.
-
fit-height
The page content is magnified just enough to fit the entire height of the page within the window.
-
fit-page
The page content is magnified just enough to fit the entire page within the window both horizontally and vertically. If the required horizontal and vertical magnification factors are different, use the smaller of the two, centering the page within the window in the other dimension.
def setInitialPage(self, page)
Display the specified page when the document is opened.
page
Must be a positive integer number.
def setInitialZoom(self, zoom)
Specify the initial page zoom in percent when the document is opened.
zoom
Must be a positive integer number.
Specify whether to hide the viewer application's tool bars when the document is active.
Specify whether to hide the viewer application's menu bar when the document is active.
def setHideWindowUi(self, value)
Specify whether to hide user interface elements in the document's window (such as scroll bars and navigation controls), leaving only the document's contents displayed.
value
Set to True to hide UI elements.
Default: False
def setFitWindow(self, value)
Specify whether to resize the document's window to fit the size of the first displayed page.
value
Set to True to resize the window.
Default: False
def setCenterWindow(self, value)
Specify whether to position the document's window in the center of the screen.
value
Set to True to center the window.
Default: False
def setDisplayTitle(self, value)
Specify whether the window's title bar should display the document title. If false, the title bar should instead display the name of the PDF file containing the document.
value
Set to True to display the title.
Default: False
def setRightToLeft(self, value)
Set the predominant reading order for text to right-to-left. This option has no direct effect on the document's contents or page numbering but can be used to determine the relative positioning of pages when displayed side by side or printed n-up.
value
Set to True to set right-to-left reading order.
Default: False
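The viewer preferences can be grouped into one call. The sketch below checks the documented allowed values before applying them; apply_viewer_preferences and the two constant tuples are hypothetical names, and client is assumed to be an HtmlToPdfClient instance. As noted above, a PDF viewer may ignore these preferences.

```python
# Allowed values copied from the setPageLayout and setInitialZoomType entries.
PAGE_LAYOUTS = ("single-page", "one-column", "two-column-left", "two-column-right")
ZOOM_TYPES = ("fit-width", "fit-height", "fit-page")


def apply_viewer_preferences(client, layout="single-page", zoom_type="fit-page",
                             initial_page=1, right_to_left=False):
    """Validate the viewer preferences and pass them to the documented setters."""
    if layout not in PAGE_LAYOUTS:
        raise ValueError("unknown page layout: %r" % (layout,))
    if zoom_type not in ZOOM_TYPES:
        raise ValueError("unknown zoom type: %r" % (zoom_type,))
    if not isinstance(initial_page, int) or initial_page <= 0:
        raise ValueError("initial page must be a positive integer")
    client.setPageLayout(layout)
    client.setInitialZoomType(zoom_type)
    client.setInitialPage(initial_page)
    client.setRightToLeft(right_to_left)
    return client
```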
Data
Methods related to HTML template rendering.
def setDataString(self, data_string)
Set the input data for template rendering. The data format can be JSON, XML, YAML or CSV.
data_string
The input data string.
def setDataFile(self, data_file)
Load the input data for template rendering from the specified file. The data format can be JSON, XML, YAML or CSV.
data_file
The file path to a local file containing the input data.
Specify the input data format.
def setDataEncoding(self, encoding)
encoding
The data file encoding.
Default: utf-8
def setDataIgnoreUndefined(self, value)
Ignore undefined variables in the HTML template. The default mode is strict so any undefined variable causes the conversion to fail. You can use {% if variable is defined %} to check if the variable is defined.
value
Set to True to ignore undefined variables.
Default: False
def setDataAutoEscape(self, value)
Auto escape HTML symbols in the input data before placing them into the output.
value
Set to True to turn auto escaping on.
Default: False
def setDataTrimBlocks(self, value)
Auto trim whitespace around each template command block.
value
Set to True to turn auto trimming on.
Default: False
def setDataOptions(self, options)
Set the advanced data options:
- csv_delimiter - The CSV data delimiter, the default is ,.
- xml_remove_root - Remove the root XML element from the input data.
- data_root - The name of the root element inserted into the input data without a root node (e.g. CSV), the default is data.
options
Comma separated list of options.
Examples:
-
Use semicolon to separate CSV data.
setDataOptions("csv_delimiter=;")
-
Name the root of data rows and use the name in the template loop {% for row in rows %}...{% endfor %}.
setDataOptions("data_root=rows")
-
Remove the XML root so the HTML template can be simpler.
setDataOptions("xml_remove_root=1")
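Since the options are passed as one comma separated string, it can help to build that string from a dictionary. The helper build_data_options below is a hypothetical convenience, not part of the client. It assumes that several name=value pairs can be combined in a single string, as the "comma separated list" wording suggests.

```python
def build_data_options(options):
    """Join option names and values into a 'name=value,name=value' string."""
    parts = []
    for name, value in options.items():
        # A comma inside a value would be ambiguous in a comma separated list.
        if "," in str(value):
            raise ValueError("option %s: value must not contain a comma" % name)
        parts.append("{}={}".format(name, value))
    return ",".join(parts)


# Semicolon separated CSV whose rows are exposed to the template as "rows".
data_options = build_data_options({"csv_delimiter": ";", "data_root": "rows"})
```

The result can then be passed on with client.setDataOptions(data_options), where client is assumed to be an existing HtmlToPdfClient instance.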
Miscellaneous
def setDebugLog(self, value)
Turn on the debug logging. Details about the conversion are stored in the debug log. The URL of the log can be obtained from the
getDebugLogUrl method or found in the
conversion statistics.
value
Set to True to enable the debug logging.
Default: False
def getDebugLogUrl(self)
Get the URL of the debug log for the last conversion.
Returns
-
string - The link to the debug log.
def getRemainingCreditCount(self)
Get the number of conversion credits available in your
account.
This method can only be called after a call to one of the convertXtoY methods.
The returned value can differ from the actual count if you run parallel conversions.
The special value
999999 is returned if the information is not available.
Returns
-
int - The number of credits.
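Because 999999 is a sentinel rather than a real balance, callers may want to map it to None. A minimal sketch follows; remaining_credits and CREDITS_UNKNOWN are hypothetical names, and client is assumed to be an HtmlToPdfClient instance on which a conversion has already been run.

```python
# Special value documented above for "information not available".
CREDITS_UNKNOWN = 999999


def remaining_credits(client):
    """Return the remaining credit count, or None when it is not available."""
    count = client.getRemainingCreditCount()
    return None if count == CREDITS_UNKNOWN else count
```

With parallel conversions the returned number may still differ from the actual count, so it is best treated as an estimate.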
def getConsumedCreditCount(self)
Get the number of credits consumed by the last conversion.
Returns
-
int - The number of credits.
Get the job id.
Returns
-
string - The unique job identifier.
Get the number of pages in the output document.
def getTotalPageCount(self)
Get the total number of pages in the original output document, including the pages excluded by
setPrintPageRange().
Returns
-
int - The total page count.
Get the size of the output in bytes.
Returns
-
int - The count of bytes.
Get the version details.
Returns
-
string - API version, converter version, and client version.
Tag the conversion with a custom value. The tag is used in
conversion statistics. A value longer than 32 characters is cut off.
tag
A string with the custom tag.
def setHttpProxy(self, proxy)
A proxy server used by the Pdfcrowd conversion process for accessing the source URLs with the HTTP scheme. It can help to circumvent regional restrictions or provide limited access to your intranet.
proxy
The value must have format DOMAIN_OR_IP_ADDRESS:PORT.
def setHttpsProxy(self, proxy)
A proxy server used by the Pdfcrowd conversion process for accessing the source URLs with the HTTPS scheme. It can help to circumvent regional restrictions or provide limited access to your intranet.
proxy
The value must have format DOMAIN_OR_IP_ADDRESS:PORT.
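The proxy value is a single DOMAIN_OR_IP_ADDRESS:PORT string. The helper below is a hypothetical sketch that assembles and sanity-checks such a string locally; the exact validation performed by the API is not described here, so the checks are assumptions.

```python
def format_proxy(host, port):
    """Build a 'DOMAIN_OR_IP_ADDRESS:PORT' string from a host and a port."""
    host = host.strip()
    if not host or any(ch.isspace() for ch in host):
        raise ValueError("proxy host must be a non-empty name or IP address")
    port = int(port)
    # Assumed check: TCP port numbers are in the range 1-65535.
    if not 1 <= port <= 65535:
        raise ValueError("proxy port must be in the range 1-65535")
    return "{}:{}".format(host, port)
```

For example, client.setHttpProxy(format_proxy("proxy.example.com", 8080)) and client.setHttpsProxy(...) can share the same helper; the host name is a placeholder.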
def setClientCertificate(self, certificate)
A client certificate to authenticate the Pdfcrowd converter on your web server. The certificate is used for two-way SSL/TLS authentication and adds extra security.
certificate
The file must be in PKCS12 format.
The file must exist and not be empty.
def setClientCertificatePassword(self, password)
A password for the PKCS12 file with the client certificate, if one is needed.
Tweaks
Expert options for fine-tuning output.
def setLayoutDpi(self, dpi)
Set the internal DPI resolution used for positioning of PDF contents. It can help in situations when there are small inaccuracies in the PDF. It is recommended to use values that are a multiple of 72, such as 288 or 360.
Availability:
API client >= 5.0.0, converter >= 20.10.
See
versioning.
dpi
The DPI value.
The value must be in the range of 72-600.
Default: 300
def setContentAreaX(self, x)
Set the top left X coordinate of the content area. It is relative to the top left X coordinate of the print area.
x
The value must be specified in inches "in", millimeters "mm", centimeters "cm", pixels "px", or points "pt". It may contain a negative value.
Default: 0in
Examples:
-
setContentAreaX("-1in")
-
setContentAreaX("2.5cm")
def setContentAreaY(self, y)
Set the top left Y coordinate of the content area. It is relative to the top left Y coordinate of the print area.
y
The value must be specified in inches "in", millimeters "mm", centimeters "cm", pixels "px", or points "pt". It may contain a negative value.
Default: 0in
Examples:
-
setContentAreaY("-1in")
-
setContentAreaY("2.5cm")
def setContentAreaWidth(self, width)
Set the width of the content area. It should be at least 1 inch.
width
The value must be specified in inches "in", millimeters "mm", centimeters "cm", pixels "px", or points "pt".
Default: The width of the print area.
def setContentAreaHeight(self, height)
Set the height of the content area. It should be at least 1 inch.
height
The value must be specified in inches "in", millimeters "mm", centimeters "cm", pixels "px", or points "pt".
Default: The height of the print area.
def setContentArea(self, x, y, width, height)
Set the content area position and size. The content area lets you specify the area of the web page to be converted.
x
Set the top left X coordinate of the content area. It is relative to the top left X coordinate of the print area.
The value must be specified in inches "in", millimeters "mm", centimeters "cm", pixels "px", or points "pt". It may contain a negative value.
Default: 0in
y
Set the top left Y coordinate of the content area. It is relative to the top left Y coordinate of the print area.
The value must be specified in inches "in", millimeters "mm", centimeters "cm", pixels "px", or points "pt". It may contain a negative value.
Default: 0in
width
Set the width of the content area. It should be at least 1 inch.
The value must be specified in inches "in", millimeters "mm", centimeters "cm", pixels "px", or points "pt".
Default: The width of the print area.
height
Set the height of the content area. It should be at least 1 inch.
The value must be specified in inches "in", millimeters "mm", centimeters "cm", pixels "px", or points "pt".
Default: The height of the print area.
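The content area values are strings made of a number and a unit. The sketch below checks that shape locally before calling setContentArea. The regular expression is an assumption about the accepted syntax (the API may accept more forms), and is_valid_length is a hypothetical helper.

```python
import re

# Assumed syntax: optional minus sign, decimal number, one of the listed units.
_LENGTH = re.compile(r"^-?\d+(\.\d+)?(in|mm|cm|px|pt)$")


def is_valid_length(value, allow_negative=False):
    """Return True if value looks like '2.5cm' or, when allowed, '-1in'."""
    if _LENGTH.match(value) is None:
        return False
    # Only the X and Y coordinates are documented as accepting negative values.
    return allow_negative or not value.startswith("-")
```

A caller could check x and y with allow_negative=True and the width and height without it, then call client.setContentArea(x, y, width, height) on an existing client.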
def setContentsMatrix(self, matrix)
A 2D transformation matrix applied to the main contents on each page. The origin [0,0] is located at the top-left corner of the contents. The resolution is 72 dpi.
Availability:
API client >= 5.0.0, converter >= 20.10.
See
versioning.
matrix
A comma separated string of matrix elements: "scaleX,skewX,transX,skewY,scaleY,transY"
Default: 1,0,0,0,1,0
Examples:
-
Fine tune the contents height.
setContentsMatrix("1,0,0,0,1.001,0")
-
Translate the contents by -10 points in both directions.
setContentsMatrix("1,0,-10,0,1,-10")
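The matrix string follows the order "scaleX,skewX,transX,skewY,scaleY,transY", which is easy to get wrong by hand. A small sketch of a builder with named arguments follows; contents_matrix is a hypothetical helper, not part of the client.

```python
def _fmt(value):
    """Format a number without trailing zeros, e.g. 1 -> '1', 1.001 -> '1.001'."""
    return ("%.6f" % value).rstrip("0").rstrip(".")


def contents_matrix(scale_x=1, scale_y=1, trans_x=0, trans_y=0,
                    skew_x=0, skew_y=0):
    """Return the matrix string in the documented element order."""
    elements = (scale_x, skew_x, trans_x, skew_y, scale_y, trans_y)
    return ",".join(_fmt(v) for v in elements)


# Reproduces the two examples above.
fine_tuned = contents_matrix(scale_y=1.001)
shifted = contents_matrix(trans_x=-10, trans_y=-10)
```

The resulting string can be passed to client.setContentsMatrix(shifted) on an existing client. Values are rounded to six decimal places by this sketch.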
A 2D transformation matrix applied to the page header contents. The origin [0,0] is located at the top-left corner of the header. The resolution is 72 dpi.
Availability:
API client >= 5.0.0, converter >= 20.10.
See
versioning.
Examples:
-
Fine tune the header contents height.
setHeaderMatrix("1,0,0,0,1.001,0")
-
Translate the header contents by -10 points in both directions.
setHeaderMatrix("1,0,-10,0,1,-10")
A 2D transformation matrix applied to the page footer contents. The origin [0,0] is located at the top-left corner of the footer. The resolution is 72 dpi.
Availability:
API client >= 5.0.0, converter >= 20.10.
See
versioning.
Examples:
-
Fine tune the footer contents height.
setFooterMatrix("1,0,0,0,1.001,0")
-
Translate the footer contents by -10 points in both directions.
setFooterMatrix("1,0,-10,0,1,-10")
def setDisablePageHeightOptimization(self, value)
Disable automatic height adjustment that compensates for pixel to point rounding errors.
Availability:
API client >= 5.0.0, converter >= 20.10.
See
versioning.
value
Set to True to disable automatic height scale.
Default: False
def setMainDocumentCssAnnotation(self, value)
Add special CSS classes to the main document's body element. This allows applying custom styling based on these classes:
- pdfcrowd-page-X - where X is the current page number
- pdfcrowd-page-odd - odd page
- pdfcrowd-page-even - even page
Availability:
API client >= 5.0.0, converter >= 20.10.
See
versioning.
value
Set to True to add the special CSS classes.
Default: False
Warning
If your custom styling affects the contents area size (e.g. by using different margins, padding, border width), the resulting PDF may contain duplicate contents or some contents may be missing.
Add special CSS classes to the header/footer's body element. This allows applying custom styling based on these classes:
- pdfcrowd-page-X - where X is the current page number
- pdfcrowd-page-count-X - where X is the total page count
- pdfcrowd-page-first - the first page
- pdfcrowd-page-last - the last page
- pdfcrowd-page-odd - odd page
- pdfcrowd-page-even - even page
Availability:
API client >= 5.0.0, converter >= 20.10.
See
versioning.
def setMaxLoadingTime(self, max_time)
Set the maximum time to load the page and its resources. After this time, all requests will be considered successful. This can be useful to ensure that the conversion does not time out. Use this method if there is no other way to fix page loading.
Availability:
API client >= 5.15.0, converter >= 20.10.
See
versioning.
max_time
The number of seconds to wait.
The value must be in the range 10-30.
def setConversionConfig(self, json_string)
Configure the conversion via JSON. The configuration defines various page settings for individual PDF pages or ranges of pages. It provides flexibility in designing each page of the PDF, giving control over each page's size, header, footer etc. If a page or parameter is not explicitly specified, the system will use the default settings for that page or attribute. If a JSON configuration is provided, the settings in the JSON will take precedence over the global options.
The structure of the JSON must be:
- pageSetup: An array of objects where each object defines the configuration for a specific page or range of pages. The following properties can be set for each page object:
-
pages:
A comma-separated list of page numbers or ranges. For example:
- 1-: from page 1 to the end of the document
- 2: only the 2nd page
- 2, 4, 6: pages 2, 4, and 6
- 2-5: pages 2 through 5
- pageSize: The page size (optional).
Possible values: A0, A1, A2, A3, A4, A5, A6, Letter.
- pageWidth: The width of the page (optional).
- pageHeight: The height of the page (optional).
- marginLeft: Left margin (optional).
- marginRight: Right margin (optional).
- marginTop: Top margin (optional).
- marginBottom: Bottom margin (optional).
-
displayHeader: Header appearance (optional). Possible values:
- none: completely excluded
- space: only the content is excluded, the space is used
- content: the content is printed (default)
-
displayFooter: Footer appearance (optional). Possible values:
- none: completely excluded
- space: only the content is excluded, the space is used
- content: the content is printed (default)
- headerHeight: Height of the header (optional).
- footerHeight: Height of the footer (optional).
- orientation: Page orientation, such as "portrait" or "landscape" (optional).
Dimensions may be empty, 0 or specified in inches "in", millimeters "mm", centimeters "cm", pixels "px", or points "pt".
Availability:
API client >= 6.1.0, converter >= 24.04.
See
versioning.
json_string
The JSON string.
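A sketch of building the JSON string with the standard json module instead of writing it by hand. The helper build_conversion_config and the particular page settings are illustrative assumptions; only the property names come from the structure described above.

```python
import json


def build_conversion_config(page_setups):
    """Wrap a list of page setup dictionaries in the documented structure."""
    return json.dumps({"pageSetup": list(page_setups)})


config = build_conversion_config([
    # First page: landscape A4 without a header.
    {"pages": "1", "pageSize": "A4", "orientation": "landscape",
     "displayHeader": "none"},
    # Pages 2 through 5: larger top margin, footer space kept but left empty.
    {"pages": "2-5", "marginTop": "1in", "displayFooter": "space"},
])
```

The string can then be passed to client.setConversionConfig(config) on an existing client. Pages not covered by any entry fall back to the default settings, as described above.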
def setConversionConfigFile(self, filepath)
Configure the conversion process via a JSON file. See the details of the
JSON string.
Availability:
API client >= 6.1.0, converter >= 24.04.
See
versioning.
filepath
The file path to a local file.
The file must exist and not be empty.
API Client Options
def setConverterVersion(self, version)
Set the converter version. Different versions may produce different output. Choose which one provides the best output for your case.
Availability:
API client >= 5.0.0.
See
versioning.
version
The version identifier.
Allowed values:
-
24.04
Version 24.04.
-
20.10
Version 20.10.
-
18.10
Version 18.10.
Default: 24.04
def setUseHttp(self, value)
Specifies if the client communicates over HTTP or HTTPS with Pdfcrowd API.
value
Set to True to use HTTP.
Default: False
Warning
Using HTTP is insecure as data sent over HTTP is not encrypted. Enable this option only if you know what you are doing.
def setUserAgent(self, agent)
Set a custom user agent HTTP header. It can be useful if you are behind a proxy or a firewall.
agent
The user agent string.
Default: pdfcrowd_python_client/6.1.0 (https://pdfcrowd.com)
def setProxy(self, host, port, user_name, password)
Specifies an HTTP proxy that the API client library will use to connect to the internet.
def setRetryCount(self, count)
Specifies the number of automatic retries when the 502 or 503 HTTP status code is received. The status code indicates a temporary network issue. This feature can be disabled by setting the count to 0.
count
Number of retries.
Default: 1