HTML to Image Java Reference
class HtmlToImageClient
All setter methods return the HtmlToImageClient object unless specified otherwise.
Constructor
public HtmlToImageClient(String userName, String apiKey)
Constructor for the Pdfcrowd API client.
userName
Your username at Pdfcrowd.
apiKey
Your API key.
public HtmlToImageClient setOutputFormat(String outputFormat)
The format of the output file.
public byte[] convertUrl(String url)
Convert a web page.
url
The address of the web page to convert.
The supported protocols are http:// and https://.
Returns
-
byte[] - Byte array containing the conversion output.
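Because only http:// and https:// addresses are supported, it can be convenient to reject other input before spending a conversion request. The following is a minimal sketch of such a pre-flight check; the class UrlPrecheck and its method are hypothetical helpers, not part of the Pdfcrowd client.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Hypothetical pre-flight check; not part of the Pdfcrowd client. */
final class UrlPrecheck {
    /** Returns true if the URL parses and uses the http or https scheme. */
    static boolean isSupportedUrl(String url) {
        try {
            String scheme = new URI(url).getScheme();
            return scheme != null
                && (scheme.equalsIgnoreCase("http") || scheme.equalsIgnoreCase("https"));
        } catch (URISyntaxException e) {
            // Unparseable input is treated as unsupported.
            return false;
        }
    }
}
```

A caller might run UrlPrecheck.isSupportedUrl(url) and only then pass the address to convertUrl.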
public void convertUrlToStream(String url, OutputStream outStream)
Convert a web page and write the result to an output stream.
url
The address of the web page to convert.
The supported protocols are http:// and https://.
outStream
The output stream that will contain the conversion output.
public void convertUrlToFile(String url, String filePath) throws IOException
Convert a web page and write the result to a local file.
url
The address of the web page to convert.
The supported protocols are http:// and https://.
filePath
The output file path.
public byte[] convertFile(String file)
Convert a local file.
file
The path to a local file to convert.
The file can be either a single file or an archive (.tar.gz, .tar.bz2, or .zip).
If the HTML document refers to local external assets (images, style sheets, javascript), zip the document together with the assets.
The file must exist and not be empty.
The file name must have a valid extension.
Returns
-
byte[] - Byte array containing the conversion output.
public void convertFileToStream(String file, OutputStream outStream)
Convert a local file and write the result to an output stream.
file
The path to a local file to convert.
The file can be either a single file or an archive (.tar.gz, .tar.bz2, or .zip).
If the HTML document refers to local external assets (images, style sheets, javascript), zip the document together with the assets.
The file must exist and not be empty.
The file name must have a valid extension.
outStream
The output stream that will contain the conversion output.
public void convertFileToFile(String file, String filePath) throws IOException
Convert a local file and write the result to a local file.
file
The path to a local file to convert.
The file can be either a single file or an archive (.tar.gz, .tar.bz2, or .zip).
If the HTML document refers to local external assets (images, style sheets, javascript), zip the document together with the assets.
The file must exist and not be empty.
The file name must have a valid extension.
filePath
The output file path.
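The input file requirements above (the file exists, is not empty, and may be an archive) can be checked locally before calling any of the convertFile methods. This is a sketch under the assumption that a local check is wanted at all; InputFilePrecheck is a hypothetical helper, and it only recognizes the archive extensions listed in this reference.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;

/** Hypothetical local checks; not part of the Pdfcrowd client. */
final class InputFilePrecheck {
    /** Returns true if the path points to a regular file with at least one byte. */
    static boolean existsAndNotEmpty(String file) {
        Path path = Paths.get(file);
        try {
            return Files.isRegularFile(path) && Files.size(path) > 0;
        } catch (IOException e) {
            return false;
        }
    }

    /** Returns true if the file name ends with .zip, .tar.gz or .tar.bz2. */
    static boolean isArchive(String file) {
        String name = file.toLowerCase(Locale.ROOT);
        return name.endsWith(".zip") || name.endsWith(".tar.gz") || name.endsWith(".tar.bz2");
    }
}
```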
public byte[] convertString(String text)
Convert a string.
text
The string content to convert.
Returns
-
byte[] - Byte array containing the conversion output.
public void convertStringToStream(String text, OutputStream outStream)
Convert a string and write the output to an output stream.
text
The string content to convert.
outStream
The output stream that will contain the conversion output.
public void convertStringToFile(String text, String filePath) throws IOException
Convert a string and write the output to a file.
text
The string content to convert.
filePath
The output file path.
public byte[] convertStream(InputStream inStream) throws IOException
Convert the contents of an input stream.
inStream
The input stream with source data.
The stream can contain either HTML code or an archive (.zip, .tar.gz, .tar.bz2).
The archive can contain HTML code and its external assets (images, style sheets, javascript).
Returns
-
byte[] - Byte array containing the conversion output.
public void convertStreamToStream(InputStream inStream, OutputStream outStream) throws IOException
Convert the contents of an input stream and write the result to an output stream.
inStream
The input stream with source data.
The stream can contain either HTML code or an archive (.zip, .tar.gz, .tar.bz2).
The archive can contain HTML code and its external assets (images, style sheets, javascript).
outStream
The output stream that will contain the conversion output.
public void convertStreamToFile(InputStream inStream, String filePath) throws IOException
Convert the contents of an input stream and write the result to a local file.
inStream
The input stream with source data.
The stream can contain either HTML code or an archive (.zip, .tar.gz, .tar.bz2).
The archive can contain HTML code and its external assets (images, style sheets, javascript).
filePath
The output file path.
public HtmlToImageClient setZipMainFilename(String filename)
Set the file name of the main HTML document stored in the input archive. If not specified, the first HTML file in the archive is used for conversion. Use this method if the input archive contains multiple HTML documents.
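When the HTML and its assets are generated at run time, the archive can be assembled in memory with the standard java.util.zip classes. The sketch below is one possible way to do that; ArchiveBuilder and the entry names index.html and style.css are illustrative assumptions, not part of the Pdfcrowd client.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

/** Hypothetical helper that builds a zip archive in memory. */
final class ArchiveBuilder {
    /** Packs entry name to content pairs into a zip archive. */
    static byte[] zip(Map<String, byte[]> entries) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buffer)) {
            for (Map.Entry<String, byte[]> entry : entries.entrySet()) {
                zip.putNextEntry(new ZipEntry(entry.getKey()));
                zip.write(entry.getValue());
                zip.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The zip stream is closed here, so the archive is complete.
        return buffer.toByteArray();
    }

    /** Convenience wrapper: one HTML document plus one style sheet. */
    static byte[] zipPage(String html, String css) {
        Map<String, byte[]> entries = new LinkedHashMap<>();
        entries.put("index.html", html.getBytes(StandardCharsets.UTF_8));
        entries.put("style.css", css.getBytes(StandardCharsets.UTF_8));
        return zip(entries);
    }
}
```

The resulting bytes could be wrapped in a ByteArrayInputStream and passed to convertStream, with setZipMainFilename("index.html") naming the main document.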
Image Output
public HtmlToImageClient setScreenshotWidth(int width)
Set the output image width in pixels.
width
The value must be in the range 96-65000.
Default: 1024
Example:
-
Full HD width.
setScreenshotWidth(1920)
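If the width comes from user input, it may be worth forcing it into the documented 96-65000 range before calling the setter. A minimal sketch, assuming clamping is acceptable (rejecting the value is an equally valid choice); ScreenshotWidth is a hypothetical helper.

```java
/** Hypothetical helper; not part of the Pdfcrowd client. */
final class ScreenshotWidth {
    static final int MIN = 96;
    static final int MAX = 65000;

    /** Returns the requested width limited to the range MIN..MAX. */
    static int clamp(int requested) {
        return Math.max(MIN, Math.min(MAX, requested));
    }
}
```

For example, setScreenshotWidth(ScreenshotWidth.clamp(requestedWidth)).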
public HtmlToImageClient setScreenshotHeight(int height)
Set the output image height in pixels. If it is not specified, the actual document height is used.
height
Must be a positive integer number.
public HtmlToImageClient setScaleFactor(int factor)
Set the scaling factor (zoom) for the output image.
factor
The percentage value.
Must be a positive integer number.
Default: 100
public HtmlToImageClient setBackgroundColor(String color)
The output image background color.
Availability:
API client >= 5.0.0, converter >= 20.10.
See
versioning.
color
The value must be in RRGGBB or RRGGBBAA hexadecimal format.
Examples:
-
red color
setBackgroundColor("FF0000")
-
fully transparent background
setBackgroundColor("00000000")
-
green color with 50% opacity
setBackgroundColor("00ff0080")
-
green color
setBackgroundColor("00ff00")
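The RRGGBBAA form appends an alpha byte to the six color digits. The sketch below shows one way to validate a color string and to derive the alpha digits from an opacity between 0 and 1; BackgroundColors is a hypothetical helper, and rounding the alpha to the nearest integer is an assumption.

```java
import java.util.Locale;

/** Hypothetical helper; not part of the Pdfcrowd client. */
final class BackgroundColors {
    /** Returns true for RRGGBB or RRGGBBAA hexadecimal strings. */
    static boolean isValid(String color) {
        return color != null && color.matches("[0-9a-fA-F]{6}([0-9a-fA-F]{2})?");
    }

    /** Appends an alpha byte computed from an opacity in the range 0.0 to 1.0. */
    static String withOpacity(String rrggbb, double opacity) {
        if (rrggbb == null || !rrggbb.matches("[0-9a-fA-F]{6}")) {
            throw new IllegalArgumentException("Expected RRGGBB: " + rrggbb);
        }
        if (opacity < 0.0 || opacity > 1.0) {
            throw new IllegalArgumentException("Opacity must be in 0..1: " + opacity);
        }
        long alpha = Math.round(opacity * 255.0);
        return rrggbb + String.format(Locale.ROOT, "%02x", alpha);
    }
}
```

With this helper, BackgroundColors.withOpacity("00ff00", 0.5) yields the "00ff0080" value used in the example above.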
General Options
public HtmlToImageClient setUsePrintMedia(boolean value)
Use the print version of the page if available (@media print).
public HtmlToImageClient setNoBackground(boolean value)
Do not print the background graphics.
value
Set to true to disable the background graphics.
Default: false
public HtmlToImageClient setDisableJavascript(boolean value)
Do not execute JavaScript.
value
Set to true to disable JavaScript in web pages.
Default: false
public HtmlToImageClient setDisableImageLoading(boolean value)
Do not load images.
value
Set to true to disable loading of images.
Default: false
public HtmlToImageClient setDisableRemoteFonts(boolean value)
Disable loading fonts from remote sources.
value
Set to true to disable loading remote fonts.
Default: false
public HtmlToImageClient setUseMobileUserAgent(boolean value)
Use a mobile user agent.
Availability:
API client >= 5.3.0, converter >= 20.10.
See
versioning.
value
Set to true to use a mobile user agent.
Default: false
public HtmlToImageClient setLoadIframes(String iframes)
Specifies how iframes are handled.
Availability:
API client >= 5.0.0, converter >= 20.10.
See
versioning.
public HtmlToImageClient setBlockAds(boolean value)
Try to block ads. Enabling this option can produce smaller output and speed up the conversion.
value
Set to true to block ads in web pages.
Default: false
public HtmlToImageClient setDefaultEncoding(String encoding)
Set the default HTML content text encoding.
encoding
The text encoding of the HTML content.
Default: auto detect
Examples:
-
Set to use Latin-2 encoding.
setDefaultEncoding("iso8859-2")
-
Set to use UTF-8 encoding.
setDefaultEncoding("utf-8")
public HtmlToImageClient setLocale(String locale)
Set the locale for the conversion. This may affect the output format of dates, times and numbers.
Availability:
API client >= 5.0.0, converter >= 20.10.
See
versioning.
locale
The locale code according to ISO 639.
Default: en-US
public HtmlToImageClient setHttpAuth(String userName, String password)
Set credentials to access websites protected by HTTP basic authentication.
userName
Set the HTTP authentication user name.
password
Set the HTTP authentication password.
public HtmlToImageClient setCookies(String cookies)
Set cookies that are sent in Pdfcrowd HTTP requests.
cookies
The cookie string.
public HtmlToImageClient setVerifySslCertificates(boolean value)
Do not allow insecure HTTPS connections.
value
Set to true to enable SSL certificate verification.
Default: false
public HtmlToImageClient setFailOnMainUrlError(boolean failOnError)
Abort the conversion if the main URL HTTP status code is greater than or equal to 400.
failOnError
Set to true to abort the conversion.
Default: false
public HtmlToImageClient setFailOnAnyUrlError(boolean failOnError)
Abort the conversion if the HTTP status code of any sub-request is greater than or equal to 400, or if some sub-requests are still pending. See the debug log for details.
failOnError
Set to true to abort the conversion.
Default: false
public HtmlToImageClient setNoXpdfcrowdHeader(boolean value)
Do not send the X-Pdfcrowd HTTP header in Pdfcrowd HTTP requests.
public HtmlToImageClient setCustomCss(String css)
Apply custom CSS to the input HTML document. It allows you to modify the visual appearance and layout of your HTML content dynamically. Tip: Using !important in custom CSS provides a way to prioritize and override conflicting styles.
Availability:
API client >= 5.14.0, converter >= 20.10.
See
versioning.
css
A string containing valid CSS.
Examples:
-
Set the page background color to gray.
setCustomCss("body { background-color: gray; }")
-
Do not show nav HTML elements and the element with the ad-block ID in the output image.
setCustomCss("nav, #ad-block { display: none !important; }")
public HtmlToImageClient setCustomJavascript(String javascript)
Run custom JavaScript after the document is loaded and ready to print. The script is intended for post-load DOM manipulation (add/remove elements, update CSS, ...). In addition to the standard browser APIs, the custom JavaScript code can use helper functions from our JavaScript library.
javascript
A string containing JavaScript code.
Example:
-
Set the page background color to gray.
setCustomJavascript("document.body.style.setProperty('background-color', 'gray', 'important')")
public HtmlToImageClient setOnLoadJavascript(String javascript)
Run custom JavaScript right after the document is loaded. The script is intended for early DOM manipulation (add/remove elements, update CSS, ...). In addition to the standard browser APIs, the custom JavaScript code can use helper functions from our JavaScript library.
javascript
A string containing JavaScript code.
Example:
-
Set the page background color to gray.
setOnLoadJavascript("document.body.style.setProperty('background-color', 'gray', 'important')")
public HtmlToImageClient setCustomHttpHeader(String header)
Set a custom HTTP header that is sent in Pdfcrowd HTTP requests.
public HtmlToImageClient setJavascriptDelay(int delay)
Wait the specified number of milliseconds for all JavaScript to finish after the document is loaded. Your API license defines the maximum wait time by the "Max Delay" parameter.
delay
The number of milliseconds to wait.
Must be a positive integer number or 0.
Default: 200
Example:
-
Wait for 2 seconds.
setJavascriptDelay(2000)
public HtmlToImageClient setElementToConvert(String selectors)
Convert only the specified element from the main document and its children. The element is specified by one or more CSS selectors. If the element is not found, the conversion fails. If multiple elements are found, the first one is used.
Examples:
-
The first element with the id main-content is converted.
setElementToConvert("#main-content")
-
The first element with the class name main-content is converted.
setElementToConvert(".main-content")
-
The first element with the tag name table is converted.
setElementToConvert("table")
-
The first element with the tag name table or with the id main-content is converted.
setElementToConvert("table, #main-content")
-
The first element <p class="article"> within <div class="user-panel main"> is converted.
setElementToConvert("div.user-panel.main p.article")
public HtmlToImageClient setElementToConvertMode(String mode)
Specify the DOM handling when only a part of the document is converted. This can affect the CSS rules used.
mode
Allowed values:
-
cut-out
The element and its children are cut out of the document.
-
remove-siblings
All siblings of the element are removed.
-
hide-siblings
All siblings of the element are hidden.
Default: cut-out
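Since the mode is passed as a plain string, a typo would only surface at conversion time. One way to guard against that is a small enum that carries the documented values; the enum below is a hypothetical wrapper and not part of the Pdfcrowd client.

```java
/** Hypothetical typed wrapper for the documented mode strings. */
enum ElementToConvertMode {
    CUT_OUT("cut-out"),
    REMOVE_SIBLINGS("remove-siblings"),
    HIDE_SIBLINGS("hide-siblings");

    private final String apiValue;

    ElementToConvertMode(String apiValue) {
        this.apiValue = apiValue;
    }

    /** Returns the string expected by setElementToConvertMode. */
    String apiValue() {
        return apiValue;
    }
}
```

A call would then read setElementToConvertMode(ElementToConvertMode.HIDE_SIBLINGS.apiValue()).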
public HtmlToImageClient setWaitForElement(String selectors)
Wait for the specified element in a source document. The element is specified by one or more CSS selectors. The element is searched for in the main document and all iframes. If the element is not found, the conversion fails. Your API license defines the maximum wait time by the "Max Delay" parameter.
Examples:
-
Wait until an element with the id main-content is found.
setWaitForElement("#main-content")
-
Wait until an element with the class name main-content is found.
setWaitForElement(".main-content")
-
Wait until an element with the tag name table is found.
setWaitForElement("table")
-
Wait until an element with the tag name table or with the id main-content is found.
setWaitForElement("table, #main-content")
-
Wait until <p class="article"> is found within <div class="user-panel main">.
setWaitForElement("div.user-panel.main p.article")
public HtmlToImageClient setAutoDetectElementToConvert(boolean value)
The main HTML element for conversion is detected automatically.
Availability:
API client >= 5.5.0, converter >= 20.10.
See
versioning.
value
Set to true to detect the main element.
Default: false
public HtmlToImageClient setReadabilityEnhancements(String enhancements)
The input HTML is automatically enhanced to improve the readability.
Availability:
API client >= 5.5.0, converter >= 20.10.
See
versioning.
enhancements
Allowed values:
-
none
No enhancements are used.
-
readability-v1
Version 1 of the enhancements is used.
-
readability-v2
Version 2 of the enhancements is used.
-
readability-v3
Version 3 of the enhancements is used.
-
readability-v4
Version 4 of the enhancements is used.
Default: none
Data
Methods related to HTML template rendering.
public HtmlToImageClient setDataString(String dataString)
Set the input data for template rendering. The data format can be JSON, XML, YAML or CSV.
dataString
The input data string.
public HtmlToImageClient setDataFile(String dataFile)
Load the input data for template rendering from the specified file. The data format can be JSON, XML, YAML or CSV.
dataFile
The file path to a local file containing the input data.
public HtmlToImageClient setDataFormat(String dataFormat)
Specify the input data format.
public HtmlToImageClient setDataEncoding(String encoding)
encoding
The data file encoding.
Default: utf-8
public HtmlToImageClient setDataIgnoreUndefined(boolean value)
Ignore undefined variables in the HTML template. The default mode is strict so any undefined variable causes the conversion to fail. You can use {% if variable is defined %} to check if the variable is defined.
value
Set to true to ignore undefined variables.
Default: false
public HtmlToImageClient setDataAutoEscape(boolean value)
Auto escape HTML symbols in the input data before placing them into the output.
value
Set to true to turn auto escaping on.
Default: false
public HtmlToImageClient setDataTrimBlocks(boolean value)
Auto trim whitespace around each template command block.
value
Set to true to turn auto trimming on.
Default: false
public HtmlToImageClient setDataOptions(String options)
Set the advanced data options:
- csv_delimiter - The CSV data delimiter, the default is ,.
- xml_remove_root - Remove the root XML element from the input data.
- data_root - The name of the root element inserted into the input data without a root node (e.g. CSV), the default is data.
options
Comma separated list of options.
Examples:
-
Use semicolon to separate CSV data.
setDataOptions("csv_delimiter=;")
-
Name the root of data rows and use the name in the template loop {% for row in rows %}...{% endfor %}.
setDataOptions("data_root=rows")
-
Remove the XML root so that the HTML template can be simpler.
setDataOptions("xml_remove_root=1")
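Several options can be combined in one call because the parameter is a comma separated list. The sketch below assembles such a list from name/value pairs; DataOptions is a hypothetical builder, and joining the pairs as name=value without surrounding whitespace is an assumption based on the examples above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical builder for the setDataOptions argument. */
final class DataOptions {
    // LinkedHashMap keeps the options in insertion order.
    private final Map<String, String> options = new LinkedHashMap<>();

    DataOptions set(String name, String value) {
        options.put(name, value);
        return this;
    }

    /** Returns the options as name=value pairs separated by commas. */
    String build() {
        StringJoiner joiner = new StringJoiner(",");
        options.forEach((name, value) -> joiner.add(name + "=" + value));
        return joiner.toString();
    }
}
```

For example, new DataOptions().set("csv_delimiter", ";").set("data_root", "rows").build() produces a string that could be handed to setDataOptions.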
Miscellaneous
public HtmlToImageClient setDebugLog(boolean value)
Turn on the debug logging. Details about the conversion are stored in the debug log. The URL of the log can be obtained from the getDebugLogUrl method or is available in conversion statistics.
value
Set to true to enable the debug logging.
Default: false
public String getDebugLogUrl()
Get the URL of the debug log for the last conversion.
Returns
-
String - The link to the debug log.
public int getRemainingCreditCount()
Get the number of conversion credits available in your account.
This method can only be called after a call to one of the convertXtoY methods.
The returned value can differ from the actual count if you run parallel conversions.
The special value 999999 is returned if the information is not available.
Returns
-
int - The number of credits.
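The special value 999999 is easy to mistake for a real balance. A caller might wrap the returned number so that "not available" is explicit, as in this sketch; CreditCounts is a hypothetical helper, not part of the Pdfcrowd client.

```java
import java.util.OptionalInt;

/** Hypothetical helper for interpreting getRemainingCreditCount. */
final class CreditCounts {
    /** Documented marker for "information not available". */
    static final int UNKNOWN = 999999;

    /** Returns an empty value when the client reports the special marker. */
    static OptionalInt known(int reported) {
        return reported == UNKNOWN ? OptionalInt.empty() : OptionalInt.of(reported);
    }
}
```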
public int getConsumedCreditCount()
Get the number of credits consumed by the last conversion.
Returns
-
int - The number of credits.
public String getJobId()
Get the job id.
Returns
-
String - The unique job identifier.
public int getOutputSize()
Get the size of the output in bytes.
Returns
-
int - The count of bytes.
public String getVersion()
Get the version details.
Returns
-
String - API version, converter version, and client version.
public HtmlToImageClient setTag(String tag)
Tag the conversion with a custom value. The tag is used in conversion statistics. A value longer than 32 characters is cut off.
tag
A string with the custom tag.
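Because longer tags are cut off, it can help to shorten them on the client side so that the value seen in statistics matches what the application logged. A minimal sketch, assuming the limit counts Java characters (UTF-16 code units); ConversionTags is a hypothetical helper.

```java
/** Hypothetical helper; not part of the Pdfcrowd client. */
final class ConversionTags {
    static final int MAX_LENGTH = 32;

    /** Returns the tag unchanged if it fits, otherwise its first 32 characters. */
    static String fit(String tag) {
        return tag.length() <= MAX_LENGTH ? tag : tag.substring(0, MAX_LENGTH);
    }
}
```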
public HtmlToImageClient setHttpProxy(String proxy)
A proxy server used by the Pdfcrowd conversion process for accessing the source URLs with the HTTP scheme. It can help to circumvent regional restrictions or provide limited access to your intranet.
proxy
The value must have the format DOMAIN_OR_IP_ADDRESS:PORT.
public HtmlToImageClient setHttpsProxy(String proxy)
A proxy server used by the Pdfcrowd conversion process for accessing the source URLs with the HTTPS scheme. It can help to circumvent regional restrictions or provide limited access to your intranet.
proxy
The value must have the format DOMAIN_OR_IP_ADDRESS:PORT.
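The DOMAIN_OR_IP_ADDRESS:PORT format used by setHttpProxy and setHttpsProxy excludes a scheme prefix such as http://. The sketch below checks a value against that shape; ProxyAddress is a hypothetical helper, and the accepted host characters (letters, digits, dots, hyphens) are an assumption that does not cover IPv6 literals.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical format check; not part of the Pdfcrowd client. */
final class ProxyAddress {
    private static final Pattern HOST_PORT = Pattern.compile("[A-Za-z0-9.-]+:(\\d{1,5})");

    /** Returns true for host:port values with a port between 1 and 65535. */
    static boolean isValid(String proxy) {
        Matcher matcher = HOST_PORT.matcher(proxy);
        if (!matcher.matches()) {
            return false;
        }
        int port = Integer.parseInt(matcher.group(1));
        return port >= 1 && port <= 65535;
    }
}
```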
public HtmlToImageClient setClientCertificate(String certificate)
A client certificate to authenticate the Pdfcrowd converter on your web server. The certificate is used for two-way SSL/TLS authentication and adds extra security.
certificate
The file must be in PKCS12 format.
The file must exist and not be empty.
public HtmlToImageClient setClientCertificatePassword(String password)
The password for the PKCS12 file containing the client certificate, if one is needed.
Tweaks
Expert options for fine-tuning output.
public HtmlToImageClient setMaxLoadingTime(int maxTime)
Set the maximum time to load the page and its resources. After this time, all requests will be considered successful. This can be useful to ensure that the conversion does not time out. Use this method if there is no other way to fix page loading.
Availability:
API client >= 5.15.0, converter >= 20.10.
See
versioning.
maxTime
The number of seconds to wait.
The value must be in the range 10-30.
API Client Options
public HtmlToImageClient setConverterVersion(String version)
Set the converter version. Different versions may produce different output. Choose which one provides the best output for your case.
Availability:
API client >= 5.0.0.
See
versioning.
version
The version identifier.
Allowed values:
-
24.04
Version 24.04.
-
20.10
Version 20.10.
-
18.10
Version 18.10.
Default: 24.04
public HtmlToImageClient setUseHttp(boolean value)
Specifies if the client communicates over HTTP or HTTPS with Pdfcrowd API.
value
Set to true to use HTTP.
Default: false
Warning
Using HTTP is insecure as data sent over HTTP is not encrypted. Enable this option only if you know what you are doing.
public HtmlToImageClient setUserAgent(String agent)
Set a custom user agent HTTP header. It can be useful if you are behind a proxy or a firewall.
agent
The user agent string.
Default: pdfcrowd_java_client/6.1.0 (https://pdfcrowd.com)
public HtmlToImageClient setProxy(String host, int port, String userName, String password)
Specifies an HTTP proxy that the API client library will use to connect to the internet.
public HtmlToImageClient setRetryCount(int count)
Specifies the number of automatic retries when a 502 or 503 HTTP status code is received. These status codes indicate a temporary network issue. This feature can be disabled by setting the count to 0.
count
Number of retries.
Default: 1
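To make the meaning of the retry count concrete, the sketch below models a request as a function that returns an HTTP status code and repeats it while the status is 502 or 503. It illustrates the described behavior only and is not the client's actual implementation; RetrySketch and the IntSupplier stand-in for a request are assumptions.

```java
import java.util.function.IntSupplier;

/** Illustration of retry semantics; not the client's implementation. */
final class RetrySketch {
    /**
     * Sends the request once, then repeats it up to retryCount times
     * while the status is 502 or 503. Returns the last status seen.
     */
    static int sendWithRetries(IntSupplier request, int retryCount) {
        int status = request.getAsInt();
        for (int attempt = 0;
             attempt < retryCount && (status == 502 || status == 503);
             attempt++) {
            status = request.getAsInt();
        }
        return status;
    }
}
```

With a count of 1, a single 503 followed by a 200 ends in success after two attempts; with a count of 0 the first response is final.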