PDF to HTML Java Reference

Availability: API client version >= 5.4.0

class PdfToHtmlClient

All setter methods return the PdfToHtmlClient object unless specified otherwise.

Constructor

public PdfToHtmlClient(String userName, String apiKey)

Constructor for the Pdfcrowd API client.

userName

Your username at Pdfcrowd.

apiKey

Your API key.

Conversion Input

public byte[] convertUrl(String url)

Convert a PDF.

url

The address of the PDF to convert.

The supported protocols are http:// and https://.

Returns

byte[] - Byte array containing the conversion output.
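Because only http:// and https:// addresses are supported, it can be useful to check the scheme before starting a conversion. The following is a minimal sketch; the class SourceUrlCheck and its method are hypothetical helpers, not part of the Pdfcrowd client.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

// Hypothetical helper, not part of the Pdfcrowd client.
final class SourceUrlCheck {
    // Returns true only for absolute http:// or https:// addresses.
    static boolean hasSupportedScheme(String url) {
        if (url == null) {
            return false;
        }
        try {
            String scheme = new URI(url).getScheme();
            if (scheme == null) {
                return false; // relative reference such as "docs/a.pdf"
            }
            String lower = scheme.toLowerCase(Locale.ROOT);
            return lower.equals("http") || lower.equals("https");
        } catch (URISyntaxException e) {
            return false; // not parseable as a URI at all
        }
    }
}
```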

public void convertUrlToStream(String url, OutputStream outStream)

Convert a PDF and write the result to an output stream.

url

The address of the PDF to convert.

The supported protocols are http:// and https://.

outStream

The output stream that will contain the conversion output.

public void convertUrlToFile(String url, String filePath) throws IOException

Convert a PDF and write the result to a local file.

url

The address of the PDF to convert.

The supported protocols are http:// and https://.

filePath

The output file path.

The converter generates an HTML or ZIP file. If a ZIP file is generated, the file path must have a ZIP or zip extension.

public byte[] convertFile(String file)

Convert a local file.

file

The path to a local file to convert.

The file must exist and not be empty.

Returns

byte[] - Byte array containing the conversion output.
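The requirement that the file must exist and not be empty can be checked locally before calling the client. This is a sketch only; InputFileCheck is a hypothetical helper, and treating unreadable files as not convertible is an assumption.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of the Pdfcrowd client.
final class InputFileCheck {
    // True when the path points to a regular file with at least one byte.
    static boolean isConvertible(String file) {
        Path path = Paths.get(file);
        try {
            return Files.isRegularFile(path) && Files.size(path) > 0;
        } catch (IOException e) {
            return false; // assumption: unreadable files count as not convertible
        }
    }
}
```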

public void convertFileToStream(String file, OutputStream outStream)

Convert a local file and write the result to an output stream.

file

The path to a local file to convert.

The file must exist and not be empty.

outStream

The output stream that will contain the conversion output.

public void convertFileToFile(String file, String filePath) throws IOException

Convert a local file and write the result to a local file.

file

The path to a local file to convert.

The file must exist and not be empty.

filePath

The output file path.

The converter generates an HTML or ZIP file. If a ZIP file is generated, the file path must have a ZIP or zip extension.

public byte[] convertRawData(byte[] data)

Convert raw data.

data

The raw content to be converted.

Returns

byte[] - Byte array with the output.

public void convertRawDataToStream(byte[] data, OutputStream outStream)

Convert raw data and write the result to an output stream.

data

The raw content to be converted.

outStream

The output stream that will contain the conversion output.

public void convertRawDataToFile(byte[] data, String filePath) throws IOException

Convert raw data to a file.

data

The raw content to be converted.

filePath

The output file path.

The converter generates an HTML or ZIP file. If a ZIP file is generated, the file path must have a ZIP or zip extension.

public byte[] convertStream(InputStream inStream) throws IOException

Convert the contents of an input stream.

inStream

The input stream with source data.

Returns

byte[] - Byte array containing the conversion output.

public void convertStreamToStream(InputStream inStream, OutputStream outStream) throws IOException

Convert the contents of an input stream and write the result to an output stream.

inStream

The input stream with source data.

outStream

The output stream that will contain the conversion output.

public void convertStreamToFile(InputStream inStream, String filePath) throws IOException

Convert the contents of an input stream and write the result to a local file.

inStream

The input stream with source data.

filePath

The output file path.

The converter generates an HTML or ZIP file. If a ZIP file is generated, the file path must have a ZIP or zip extension.

General Options

public PdfToHtmlClient setPdfPassword(String password)

Password to open the encrypted PDF file.

password

The input PDF password.

public PdfToHtmlClient setScaleFactor(int factor)

Set the scaling factor (zoom) for the main page area.

factor

The percentage value.

Must be a positive integer number.

Default: 100

public PdfToHtmlClient setPrintPageRange(String pages)

Set the page range to print.

pages

A comma separated list of page numbers or ranges.

Examples:

Just the second page is printed.

setPrintPageRange("2")

The first and the third page are printed.

setPrintPageRange("1,3")

Everything except the first page is printed.

setPrintPageRange("2-")

Just the first 3 pages are printed.

setPrintPageRange("-3")

Pages 3, 6, 7, 8 and 9 are printed.

setPrintPageRange("3,6-9")
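To make the range syntax concrete, the sketch below expands a range string into page numbers for a document with a known page count. PageRangeSketch is a hypothetical helper and is not how the converter parses the value; clamping to the document bounds and keeping duplicates are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper illustrating the range syntax; not the converter's parser.
final class PageRangeSketch {
    // Expands a range string such as "3,6-9" into 1-based page numbers.
    // pageCount is the total number of pages in the input document.
    static List<Integer> expand(String pages, int pageCount) {
        List<Integer> result = new ArrayList<>();
        for (String part : pages.split(",")) {
            String item = part.trim();
            int dash = item.indexOf('-');
            int first;
            int last;
            if (dash < 0) {
                first = Integer.parseInt(item);   // single page, e.g. "2"
                last = first;
            } else {
                String from = item.substring(0, dash).trim();
                String to = item.substring(dash + 1).trim();
                first = from.isEmpty() ? 1 : Integer.parseInt(from);     // open start, e.g. "-3"
                last = to.isEmpty() ? pageCount : Integer.parseInt(to);  // open end, e.g. "2-"
            }
            // Assumption: pages outside the document are silently skipped.
            for (int page = Math.max(first, 1); page <= Math.min(last, pageCount); page++) {
                result.add(page);
            }
        }
        return result;
    }
}
```

For a 10-page document, expand("3,6-9", 10) yields pages 3, 6, 7, 8 and 9, matching the last example above.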

public PdfToHtmlClient setDpi(int dpi)

Set the output graphics DPI.

Availability: API client >= 5.16.0, converter >= 20.10. See versioning.

dpi

The DPI value.

Default: 144
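As a rough guide to what a DPI value means for image size, the sketch below converts a length in PDF points (1/72 inch by default) to pixels. DpiSketch is a hypothetical helper; whether the converter sizes its output graphics exactly this way is an assumption.

```java
// Hypothetical helper; the converter's actual sizing rules may differ.
final class DpiSketch {
    private static final double POINTS_PER_INCH = 72.0; // default PDF user unit

    // Converts a length in PDF points to pixels at the given DPI.
    static long pointsToPixels(double points, int dpi) {
        return Math.round(points * dpi / POINTS_PER_INCH);
    }
}
```

Under this assumption, a US Letter page (612 x 792 points) rendered at the default 144 DPI would be 1224 x 1584 pixels.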

public PdfToHtmlClient setImageMode(String mode)

Specifies where the images are stored.

mode

The image storage mode.

Allowed values:

embed

The images are embedded into the output HTML file.

separate

The images are saved to separate files. In this mode the output of the conversion is a zip file containing the HTML and all image files.

none

The images are ignored and not converted.

Default: embed

public PdfToHtmlClient setImageFormat(String imageFormat)

Specifies the format for the output images.

Availability: API client >= 5.17.0, converter >= 20.10. See versioning.

imageFormat

The image format.

Allowed values:

Default: png

public PdfToHtmlClient setCssMode(String mode)

Specifies where the style sheets are stored.

mode

The style sheet storage mode.

Allowed values:

embed

Style sheets are embedded into the output HTML file.

separate

Style sheets are saved to separate files. In this mode the output of the conversion is a zip file containing the HTML and all style sheets.

Default: embed

public PdfToHtmlClient setFontMode(String mode)

Specifies where the fonts are stored.

mode

The font storage mode.

Allowed values:

embed

The fonts are embedded into the output HTML file.

separate

The fonts are saved to separate files. In this mode the output of the conversion is a zip file containing the HTML and all font files.

Default: embed
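The image, style sheet and font modes all share one rule: the separate value results in zip output. The sketch below restates that rule as a predicate. StorageModeSketch is a hypothetical helper based only on the descriptions above; the client's own isZippedOutput method remains the authoritative check.

```java
// Hypothetical helper restating the documented rule; not the client's logic.
final class StorageModeSketch {
    // True when any asset type is stored outside the HTML file.
    static boolean producesZip(String imageMode, String cssMode, String fontMode) {
        return "separate".equals(imageMode)
            || "separate".equals(cssMode)
            || "separate".equals(fontMode);
    }
}
```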

public PdfToHtmlClient setSplitLigatures(boolean value)

Converts ligatures, two or more letters combined into a single glyph, back into their individual ASCII characters.

value

Set to true to split ligatures.

Default: false
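The effect of splitting can be illustrated on a plain Unicode string. This sketch replaces the Latin ligature code points U+FB00 to U+FB06 with their component letters using compatibility decomposition. LigatureSketch is a hypothetical helper; the converter works on PDF glyphs and its implementation is not described here.

```java
import java.text.Normalizer;

// Hypothetical illustration on Unicode text; not the converter's implementation.
final class LigatureSketch {
    // Replaces the Latin ligature code points U+FB00..U+FB06 with their letters.
    static String splitLigatures(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c >= 0xFB00 && c <= 0xFB06) {
                // Compatibility decomposition maps, for example, U+FB01 to "fi".
                out.append(Normalizer.normalize(String.valueOf(c), Normalizer.Form.NFKD));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```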

public boolean isZippedOutput()

A helper method to determine if the output file is a zip archive. The output of the conversion may be either an HTML file or a zip file containing the HTML and its external assets.

Returns

boolean - true if the conversion output is a zip file, otherwise false.
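A typical use of this flag is choosing the extension of the output file, since the file path must end in zip when a zip archive is produced. In the sketch below, OutputPathSketch is a hypothetical helper and the boolean stands in for the value returned by isZippedOutput.

```java
// Hypothetical helper; zippedOutput stands in for the result of isZippedOutput().
final class OutputPathSketch {
    // Picks a file name whose extension matches the conversion output type.
    static String outputPath(String baseName, boolean zippedOutput) {
        return baseName + (zippedOutput ? ".zip" : ".html");
    }
}
```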

public PdfToHtmlClient setForceZip(boolean value)

Enforces the zip output format.

value

Set to true to get the output as a zip archive.

Default: false

public PdfToHtmlClient setTitle(String title)

Set the HTML title. The title from the input PDF is used by default.

title

The HTML title.

public PdfToHtmlClient setSubject(String subject)

Set the HTML subject. The subject from the input PDF is used by default.

subject

The HTML subject.

public PdfToHtmlClient setAuthor(String author)

Set the HTML author. The author from the input PDF is used by default.

author

The HTML author.

public PdfToHtmlClient setKeywords(String keywords)

Associate keywords with the HTML document. Keywords from the input PDF are used by default.

keywords

The string containing the keywords.

Miscellaneous

public PdfToHtmlClient setDebugLog(boolean value)

Turn on debug logging. Details about the conversion are stored in the debug log. The URL of the log can be obtained from the getDebugLogUrl method and is also available in the conversion statistics.

value

Set to true to enable the debug logging.

Default: false

public String getDebugLogUrl()

Get the URL of the debug log for the last conversion.

Returns

String - The link to the debug log.

public int getRemainingCreditCount()

Get the number of conversion credits available in your account.
This method can only be called after a call to one of the convertXtoY methods.
The returned value can differ from the actual count if you run parallel conversions.
The special value 999999 is returned if the information is not available.

Returns

int - The number of credits.
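Callers may want to treat the special value 999999 as "unknown" rather than as a real balance. A minimal sketch, assuming the value passed in comes from getRemainingCreditCount; CreditCountSketch is a hypothetical helper.

```java
import java.util.OptionalInt;

// Hypothetical helper; not part of the Pdfcrowd client.
final class CreditCountSketch {
    private static final int UNKNOWN = 999999; // special value documented above

    // Maps the special value to an empty result so it is not mistaken for a balance.
    static OptionalInt remainingCredits(int reported) {
        return reported == UNKNOWN ? OptionalInt.empty() : OptionalInt.of(reported);
    }
}
```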

public int getConsumedCreditCount()

Get the number of credits consumed by the last conversion.

Returns

int - The number of credits.

public String getJobId()

Get the job id.

Returns

String - The unique job identifier.

public int getPageCount()

Get the number of pages in the output document.

Returns

int - The page count.

public int getOutputSize()

Get the size of the output in bytes.

Returns

int - The count of bytes.

public String getVersion()

Get the version details.

Returns

String - API version, converter version, and client version.

public PdfToHtmlClient setTag(String tag)

Tag the conversion with a custom value. The tag is used in conversion statistics. A value longer than 32 characters is cut off.

tag

A string with the custom tag.

Example:

setTag("client-1234")
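To preview what a long tag would look like after the 32-character cut, a sketch like the following can be used. TagSketch is a hypothetical helper; it assumes that the leading 32 characters are kept and that characters are counted as Java chars.

```java
// Hypothetical helper; the service performs the actual truncation.
final class TagSketch {
    private static final int MAX_TAG_LENGTH = 32;

    // Assumption: the first 32 characters are kept and the rest is dropped.
    static String effectiveTag(String tag) {
        return tag.length() <= MAX_TAG_LENGTH ? tag : tag.substring(0, MAX_TAG_LENGTH);
    }
}
```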

public PdfToHtmlClient setHttpProxy(String proxy)

A proxy server used by the Pdfcrowd conversion process for accessing source URLs with the HTTP scheme. It can help to circumvent regional restrictions or provide limited access to your intranet.

proxy

The value must have the format DOMAIN_OR_IP_ADDRESS:PORT.

Examples:

setHttpProxy("myproxy.com:8080")
setHttpProxy("113.25.84.10:33333")
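The DOMAIN_OR_IP_ADDRESS:PORT format can be checked before the value is passed to the client. The sketch below is one possible check; ProxyFormatSketch is a hypothetical helper, and the exact validation rules applied by Pdfcrowd are not stated here, so the host pattern and the port range 1 to 65535 are assumptions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; Pdfcrowd's own validation rules may differ.
final class ProxyFormatSketch {
    // Host: letters, digits, dots and hyphens; port: 1 to 5 digits.
    private static final Pattern HOST_PORT =
        Pattern.compile("([A-Za-z0-9.-]+):(\\d{1,5})");

    static boolean isValid(String proxy) {
        Matcher m = HOST_PORT.matcher(proxy);
        if (!m.matches()) {
            return false; // for example a missing port or a leading "http://"
        }
        int port = Integer.parseInt(m.group(2));
        return port >= 1 && port <= 65535;
    }
}
```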

public PdfToHtmlClient setHttpsProxy(String proxy)

A proxy server used by the Pdfcrowd conversion process for accessing source URLs with the HTTPS scheme. It can help to circumvent regional restrictions or provide limited access to your intranet.

proxy

The value must have the format DOMAIN_OR_IP_ADDRESS:PORT.

Examples:

setHttpsProxy("myproxy.com:443")
setHttpsProxy("113.25.84.10:44333")

API Client Options

public PdfToHtmlClient setUseHttp(boolean value)

Specifies whether the client communicates with the Pdfcrowd API over HTTP or HTTPS.

value

Set to true to use HTTP.

Default: false

Warning

Using HTTP is insecure as data sent over HTTP is not encrypted. Enable this option only if you know what you are doing.

public PdfToHtmlClient setUserAgent(String agent)

Set a custom user agent HTTP header. It can be useful if you are behind a proxy or a firewall.

agent

The user agent string.

Default: pdfcrowd_java_client/5.18.1 (https://pdfcrowd.com)

public PdfToHtmlClient setProxy(String host, int port, String userName, String password)

Specifies an HTTP proxy that the API client library will use to connect to the internet.

host

The proxy hostname.

port

The proxy port.

userName

The username.

password

The password.

public PdfToHtmlClient setRetryCount(int count)

Specifies the number of automatic retries when a 502 or 503 HTTP status code is received. These status codes indicate a temporary network issue. Set the count to 0 to disable this feature.

count

Number of retries.

Default: 1

Example:

setRetryCount(3)
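The retry behavior described above can be pictured as a simple loop: one initial attempt, then up to count further attempts while the status is 502 or 503. The sketch below is an illustration only; RetrySketch and its IntSupplier-based request are hypothetical, and the client's internal retry logic (including any delay between attempts) is not documented here.

```java
import java.util.function.IntSupplier;

// Hypothetical illustration of the retry rule; not the client's implementation.
final class RetrySketch {
    // Calls the request until it returns something other than 502/503
    // or the retry budget is used up; returns the last status code.
    static int sendWithRetries(IntSupplier request, int retryCount) {
        int status = request.getAsInt();   // initial attempt
        for (int retry = 0; retry < retryCount && isTemporary(status); retry++) {
            status = request.getAsInt();   // one automatic retry
        }
        return status;
    }

    private static boolean isTemporary(int status) {
        return status == 502 || status == 503;
    }
}
```

With a retry count of 0 the request is sent exactly once, which corresponds to disabling the feature.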