Important: This document is for the beta version of the new Pdfcrowd API. Use this documentation for the stable API version.

HTML to Image - Ruby SDK

Installation

You can install the client library from rubygems.org
 $ gem install pdfcrowd

You can learn more about the install options here.

The API client library is common for all Pdfcrowd converters.

Authentication

Authentication is required to use the Pdfcrowd API. The credentials used for accessing the API are your Pdfcrowd username and the API key. You can find the API key on your account page.
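
Rather than hard-coding the credentials in your scripts, you may prefer to read them from the environment. The following is only a minimal sketch: the variable names PDFCROWD_USERNAME and PDFCROWD_API_KEY and the helper pdfcrowd_credentials are assumptions made for this example and are not part of the SDK.

```ruby
# Hypothetical helper: returns [username, api_key] read from an
# environment-like object (ENV by default). The variable names are
# assumptions chosen for this example.
def pdfcrowd_credentials(env = ENV)
    username = env.fetch("PDFCROWD_USERNAME") { raise KeyError, "PDFCROWD_USERNAME is not set" }
    api_key = env.fetch("PDFCROWD_API_KEY") { raise KeyError, "PDFCROWD_API_KEY is not set" }
    [username, api_key]
end

# Usage (requires the pdfcrowd gem):
#   client = Pdfcrowd::HtmlToImageClient.new(*pdfcrowd_credentials)
```

Failing early with a KeyError makes a missing credential visible before any conversion request is sent.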

Getting Started

The API lets you convert a web page, a local HTML file, or a string containing HTML. The result of the conversion can be stored to a local file, to a stream object or to a variable. See the conversion input section for more details.

The best way to start with the API is to choose one of the examples below and get it working first.

You can also use these HTML related features:

  • You can use the following classes in your HTML code which hide/remove elements from the output:
    • pdfcrowd-remove - sets display:none on the element
    • pdfcrowd-hide - sets visibility:hidden on the element
  • You can switch to the print version of the page (if it exists) with setUsePrintMedia.
  • You can force a page break with
    <div style="page-break-before:always"></div>
  • You can avoid a page break inside an element with the following CSS
    img { page-break-inside:avoid }
  • You can use setCustomJavascript to alter the HTML contents with custom JavaScript code.
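
The features listed above can be combined in one document. The sketch below is illustrative only: the constant SAMPLE_HTML, the helper apply_html_features, and the JavaScript snippet are hypothetical names and content chosen for this example. The helper accepts any object that responds to the two setters, so in real code you would pass it an HtmlToImageClient instance.

```ruby
# Sample HTML that uses the helper classes and CSS rules listed above.
SAMPLE_HTML = <<~HTML_DOC
    <html>
      <head>
        <style>img { page-break-inside:avoid }</style>
      </head>
      <body>
        <div class="pdfcrowd-remove">Not rendered (display:none)</div>
        <div class="pdfcrowd-hide">Invisible (visibility:hidden)</div>
        <h1>First part</h1>
        <div style="page-break-before:always"></div>
        <h1>Second part</h1>
      </body>
    </html>
HTML_DOC

# Hypothetical helper: switches to the print version of the page and
# registers a custom script that runs after the document is loaded.
def apply_html_features(client)
    client.setUsePrintMedia(true)
    client.setCustomJavascript("document.body.style.background = 'white';")
    client
end

# Usage (requires the pdfcrowd gem):
#   client = Pdfcrowd::HtmlToImageClient.new("username", "apikey")
#   apply_html_features(client).convertStringToFile(SAMPLE_HTML, "sample.png")
```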

Web Page to PNG

Convert a web page to a PNG file

require 'pdfcrowd'

begin
    # create the API client instance
    client = Pdfcrowd::HtmlToImageClient.new("username", "apikey")

    # configure the conversion
    client.setOutputFormat("png")

    # run the conversion and write the result to a file
    client.convertUrlToFile("http://www.example.com", 'example.png')
rescue Pdfcrowd::Error => why
    # report the error to the standard error stream
    STDERR.puts "Pdfcrowd Error: #{why}"
end

Convert a web page to in-memory PNG

require 'pdfcrowd'

begin
    # create the API client instance
    client = Pdfcrowd::HtmlToImageClient.new("username", "apikey")

    # configure the conversion
    client.setOutputFormat("png")

    # run the conversion and store the result into an image variable
    image = client.convertUrl("http://www.example.com")

    # write the image into the output file; the block form closes the file automatically
    File.open("example.png", "wb") { |output_file| output_file.write(image) }
rescue Pdfcrowd::Error => why
    # report the error to the standard error stream
    STDERR.puts "Pdfcrowd Error: #{why}"
end

Convert a web page and write the resulting PNG to an output stream

require 'pdfcrowd'

begin
    # create the API client instance
    client = Pdfcrowd::HtmlToImageClient.new("username", "apikey")

    # configure the conversion
    client.setOutputFormat("png")

    # open the output stream; the block form closes it even if the conversion fails
    File.open("example.png", "wb") do |output_stream|
        # run the conversion and write the result into the output stream
        client.convertUrlToStream("http://www.example.com", output_stream)
    end
rescue Pdfcrowd::Error => why
    # report the error to the standard error stream
    STDERR.puts "Pdfcrowd Error: #{why}"
end

Local HTML File to PNG

Convert a local HTML file to a PNG file

require 'pdfcrowd'

begin
    # create the API client instance
    client = Pdfcrowd::HtmlToImageClient.new("username", "apikey")

    # configure the conversion
    client.setOutputFormat("png")

    # run the conversion and write the result to a file
    client.convertFileToFile("/path/to/MyLayout.html", 'MyLayout.png')
rescue Pdfcrowd::Error => why
    # report the error to the standard error stream
    STDERR.puts "Pdfcrowd Error: #{why}"
end

Convert a local HTML file to in-memory PNG

require 'pdfcrowd'

begin
    # create the API client instance
    client = Pdfcrowd::HtmlToImageClient.new("username", "apikey")

    # configure the conversion
    client.setOutputFormat("png")

    # run the conversion and store the result into an image variable
    image = client.convertFile("/path/to/MyLayout.html")

    # write the image into the output file; the block form closes the file automatically
    File.open("MyLayout.png", "wb") { |output_file| output_file.write(image) }
rescue Pdfcrowd::Error => why
    # report the error to the standard error stream
    STDERR.puts "Pdfcrowd Error: #{why}"
end

Convert a local HTML file and write the resulting PNG to an output stream

require 'pdfcrowd'

begin
    # create the API client instance
    client = Pdfcrowd::HtmlToImageClient.new("username", "apikey")

    # configure the conversion
    client.setOutputFormat("png")

    # open the output stream; the block form closes it even if the conversion fails
    File.open("MyLayout.png", "wb") do |output_stream|
        # run the conversion and write the result into the output stream
        client.convertFileToStream("/path/to/MyLayout.html", output_stream)
    end
rescue Pdfcrowd::Error => why
    # report the error to the standard error stream
    STDERR.puts "Pdfcrowd Error: #{why}"
end

HTML String to PNG

Convert a string containing HTML to a PNG file

require 'pdfcrowd'

begin
    # create the API client instance
    client = Pdfcrowd::HtmlToImageClient.new("username", "apikey")

    # configure the conversion
    client.setOutputFormat("png")

    # run the conversion and write the result to a file
    client.convertStringToFile("<html><body><h1>Hello World!</h1></body></html>", 'HelloWorld.png')
rescue Pdfcrowd::Error => why
    # report the error to the standard error stream
    STDERR.puts "Pdfcrowd Error: #{why}"
end

Convert a string containing HTML to in-memory PNG

require 'pdfcrowd'

begin
    # create the API client instance
    client = Pdfcrowd::HtmlToImageClient.new("username", "apikey")

    # configure the conversion
    client.setOutputFormat("png")

    # run the conversion and store the result into an image variable
    image = client.convertString("<html><body><h1>Hello World!</h1></body></html>")

    # write the image into the output file; the block form closes the file automatically
    File.open("HelloWorld.png", "wb") { |output_file| output_file.write(image) }
rescue Pdfcrowd::Error => why
    # report the error to the standard error stream
    STDERR.puts "Pdfcrowd Error: #{why}"
end

Convert a string containing HTML and write the resulting PNG to an output stream

require 'pdfcrowd'

begin
    # create the API client instance
    client = Pdfcrowd::HtmlToImageClient.new("username", "apikey")

    # configure the conversion
    client.setOutputFormat("png")

    # open the output stream; the block form closes it even if the conversion fails
    File.open("HelloWorld.png", "wb") do |output_stream|
        # run the conversion and write the result into the output stream
        client.convertStringToStream("<html><body><h1>Hello World!</h1></body></html>", output_stream)
    end
rescue Pdfcrowd::Error => why
    # report the error to the standard error stream
    STDERR.puts "Pdfcrowd Error: #{why}"
end

Conversion Info

Get info about the current conversion

require 'pdfcrowd'

begin
    # create the API client instance
    client = Pdfcrowd::HtmlToImageClient.new("username", "apikey")

    # configure the conversion
    client.setOutputFormat("png")
    client.setDebugLog(true)

    # run the conversion and write the result to a file
    client.convertFileToFile("/path/to/MyLayout.html", 'MyLayout.png')
    
    # print URL to the debug log
    puts "Debug log url: #{client.getDebugLogUrl()}"
    
    # print the number of available conversion credits in your account
    puts "Remaining credit count: #{client.getRemainingCreditCount()}"
    
    # print the number of credits consumed by the conversion
    puts "Consumed credit count: #{client.getConsumedCreditCount()}"
    
    # print the unique ID of the conversion
    puts "Job id: #{client.getJobId()}"
    
    # print the size of the output in bytes
    puts "Output size: #{client.getOutputSize()}"
rescue Pdfcrowd::Error => why
    # report the error to the standard error stream
    STDERR.puts "Pdfcrowd Error: #{why}"
end
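
If you want to log or store these values rather than print them one by one, they can be gathered into a single hash. This is a sketch only; conversion_summary is a hypothetical helper, not an SDK method, and it merely calls the getters shown in the example above.

```ruby
# Hypothetical helper: collects the values reported by the client after a
# conversion into one hash that is easy to log or serialize.
def conversion_summary(client)
    {
        job_id: client.getJobId(),
        output_size: client.getOutputSize(),
        consumed_credits: client.getConsumedCreditCount(),
        remaining_credits: client.getRemainingCreditCount(),
        debug_log_url: client.getDebugLogUrl()
    }
end

# Usage, after a successful conversion:
#   puts conversion_summary(client).inspect
```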

API Reference - class HtmlToImageClient

Conversion from HTML to image.

Constructor

def initialize(user_name, api_key)
Constructor for the Pdfcrowd API client.
Parameter Description Default
user_name
Your username at Pdfcrowd.
api_key
Your API key.

 

Conversion Format

def setOutputFormat(output_format)
The format of the output file.
Parameter Description Default
output_format
Allowed values:
  • png
  • jpg
  • gif
  • tiff
  • bmp
  • ico
  • ppm
  • pgm
  • pbm
  • pnm
  • psb
  • pct
  • ras
  • tga
  • sgi
  • sun
  • webp
png
Returns
  • HtmlToImageClient - The converter object.
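
When the output file name is chosen by the user, it can be convenient to derive the format from the file extension and reject unsupported values before calling the API. The sketch below assumes that the extension matches one of the allowed values exactly; ALLOWED_FORMATS and output_format_for are hypothetical names introduced for this example.

```ruby
# Output formats listed in the setOutputFormat reference above.
ALLOWED_FORMATS = %w[png jpg gif tiff bmp ico ppm pgm pbm pnm psb pct ras tga sgi sun webp].freeze

# Hypothetical helper: returns the output format implied by the file
# extension, or raises ArgumentError if the extension is not in the list.
def output_format_for(path)
    extension = File.extname(path).delete_prefix(".").downcase
    unless ALLOWED_FORMATS.include?(extension)
        raise ArgumentError, "unsupported output format: #{extension.inspect}"
    end
    extension
end

# Usage (requires the pdfcrowd gem):
#   client.setOutputFormat(output_format_for("example.png"))
```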

 

Conversion Input

def convertUrl(url)
Convert a web page.
Parameter Description Default
url
The address of the web page to convert.
The supported protocols are http:// and https://.
Returns
  • byte[] - Byte array containing the conversion output.
def convertUrlToStream(url, out_stream)
Convert a web page and write the result to an output stream.
Parameter Description Default
url
The address of the web page to convert.
The supported protocols are http:// and https://.
out_stream
The output stream that will contain the conversion output.
def convertUrlToFile(url, file_path)
Convert a web page and write the result to a local file.
Parameter Description Default
url
The address of the web page to convert.
The supported protocols are http:// and https://.
file_path
The output file path.
The string must not be empty.
def convertFile(file)
Convert a local file.
Parameter Description Default
file
The path to a local file to convert.
The file can be either a single file or an archive (.tar.gz, .tar.bz2, or .zip).
If the HTML document refers to local external assets (images, style sheets, javascript), zip the document together with the assets.
The file must exist and not be empty.
The file name must have a valid extension.
Returns
  • byte[] - Byte array containing the conversion output.
def convertFileToStream(file, out_stream)
Convert a local file and write the result to an output stream.
Parameter Description Default
file
The path to a local file to convert.
The file can be either a single file or an archive (.tar.gz, .tar.bz2, or .zip).
If the HTML document refers to local external assets (images, style sheets, javascript), zip the document together with the assets.
The file must exist and not be empty.
The file name must have a valid extension.
out_stream
The output stream that will contain the conversion output.
def convertFileToFile(file, file_path)
Convert a local file and write the result to a local file.
Parameter Description Default
file
The path to a local file to convert.
The file can be either a single file or an archive (.tar.gz, .tar.bz2, or .zip).
If the HTML document refers to local external assets (images, style sheets, javascript), zip the document together with the assets.
The file must exist and not be empty.
The file name must have a valid extension.
file_path
The output file path.
The string must not be empty.
def convertString(text)
Convert a string.
Parameter Description Default
text
The string content to convert.
The string must not be empty.
Returns
  • byte[] - Byte array containing the conversion output.
def convertStringToStream(text, out_stream)
Convert a string and write the output to an output stream.
Parameter Description Default
text
The string content to convert.
The string must not be empty.
out_stream
The output stream that will contain the conversion output.
def convertStringToFile(text, file_path)
Convert a string and write the output to a file.
Parameter Description Default
text
The string content to convert.
The string must not be empty.
file_path
The output file path.
The string must not be empty.
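
The three in-memory methods differ only in the kind of input they accept, so a small dispatcher can pick one of them at run time. This is a sketch under stated assumptions: convert_any is a hypothetical helper, and its rule (URL prefix first, then an existing file path, otherwise an HTML string) is a convention chosen for this example. A string that happens to name an existing file is treated as a file path.

```ruby
# Hypothetical helper: chooses the conversion method from the input.
def convert_any(client, source)
    if source.match?(%r{\Ahttps?://}i)
        # the supported protocols are http:// and https://
        client.convertUrl(source)
    elsif File.file?(source)
        client.convertFile(source)
    else
        client.convertString(source)
    end
end

# Usage (requires the pdfcrowd gem):
#   image = convert_any(client, "http://www.example.com")
```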

 

General Options

def setNoBackground(no_background)
Do not print the background graphics.
Parameter Description Default
no_background
Set to true to disable the background graphics.
false
Returns
  • HtmlToImageClient - The converter object.
def setDisableJavascript(disable_javascript)
Do not execute JavaScript.
Parameter Description Default
disable_javascript
Set to true to disable JavaScript in web pages.
false
Returns
  • HtmlToImageClient - The converter object.
def setDisableImageLoading(disable_image_loading)
Do not load images.
Parameter Description Default
disable_image_loading
Set to true to disable loading of images.
false
Returns
  • HtmlToImageClient - The converter object.
def setDisableRemoteFonts(disable_remote_fonts)
Disable loading fonts from remote sources.
Parameter Description Default
disable_remote_fonts
Set to true to disable loading remote fonts.
false
Returns
  • HtmlToImageClient - The converter object.
def setBlockAds(block_ads)
Try to block ads. Enabling this option can produce smaller output and speed up the conversion.
Parameter Description Default
block_ads
Set to true to block ads in web pages.
false
Returns
  • HtmlToImageClient - The converter object.
def setDefaultEncoding(default_encoding)
Set the default HTML content text encoding.
Parameter Description Default
default_encoding
The text encoding of the HTML content.
auto detect
Returns
  • HtmlToImageClient - The converter object.
def setHttpAuth(user_name, password)
Set the HTTP authentication.
Parameter Description Default
user_name
Set the HTTP authentication user name.
password
Set the HTTP authentication password.
Returns
  • HtmlToImageClient - The converter object.
def setUsePrintMedia(use_print_media)
Use the print version of the page if available (@media print).
Parameter Description Default
use_print_media
Set to true to use the print version of the page.
false
Returns
  • HtmlToImageClient - The converter object.
def setNoXpdfcrowdHeader(no_xpdfcrowd_header)
Do not send the X-Pdfcrowd HTTP header in Pdfcrowd HTTP requests.
Parameter Description Default
no_xpdfcrowd_header
Set to true to disable sending X-Pdfcrowd HTTP header.
false
Returns
  • HtmlToImageClient - The converter object.
def setCookies(cookies)
Set cookies that are sent in Pdfcrowd HTTP requests.
Parameter Description Default
cookies
The cookie string.
Returns
  • HtmlToImageClient - The converter object.
Examples:
  • setCookies("session=6d7184b3bf35;token=2710")
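
If your cookies are held in a hash, the cookie string can be assembled as follows. This is a minimal sketch: cookie_string is a hypothetical helper, and it assumes that names and values need no escaping and that pairs are joined with a semicolon, as in the example above.

```ruby
# Hypothetical helper: joins name/value pairs into a cookie string.
def cookie_string(cookies)
    cookies.map { |name, value| "#{name}=#{value}" }.join(";")
end

# Usage (requires the pdfcrowd gem):
#   client.setCookies(cookie_string("session" => "6d7184b3bf35", "token" => "2710"))
```
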
def setVerifySslCertificates(verify_ssl_certificates)
Do not allow insecure HTTPS connections.
Parameter Description Default
verify_ssl_certificates
Set to true to enable SSL certificate verification.
false
Returns
  • HtmlToImageClient - The converter object.
def setFailOnMainUrlError(fail_on_error)
Abort the conversion if the main URL HTTP status code is greater than or equal to 400.
Parameter Description Default
fail_on_error
Set to true to abort the conversion.
false
Returns
  • HtmlToImageClient - The converter object.
def setFailOnAnyUrlError(fail_on_error)
Abort the conversion if the HTTP status code of any sub-request is greater than or equal to 400.
Parameter Description Default
fail_on_error
Set to true to abort the conversion.
false
Returns
  • HtmlToImageClient - The converter object.
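
Because each setter returns the converter object, the options can be chained. The sketch below groups the three failure-related options into one call; apply_strict_options is a hypothetical helper name, and whether you want all three enabled depends on your use case.

```ruby
# Hypothetical helper: enables certificate verification and aborts the
# conversion on HTTP errors. Relies on each setter returning the client.
def apply_strict_options(client)
    client.setVerifySslCertificates(true)
          .setFailOnMainUrlError(true)
          .setFailOnAnyUrlError(true)
end

# Usage (requires the pdfcrowd gem):
#   apply_strict_options(client).convertUrlToFile("http://www.example.com", "example.png")
```
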
def setCustomJavascript(custom_javascript)
Run custom JavaScript code after the document is loaded. The script is intended for post-load DOM manipulation (add/remove elements, update CSS, ...).
Parameter Description Default
custom_javascript
A string containing JavaScript code.
The string must not be empty.
Returns
  • HtmlToImageClient - The converter object.
def setCustomHttpHeader(custom_http_header)
Set a custom HTTP header that is sent in Pdfcrowd HTTP requests.
Parameter Description Default
custom_http_header
A string containing the header name and value separated by a colon.
Returns
  • HtmlToImageClient - The converter object.
Examples:
  • setCustomHttpHeader("X-My-Client-ID:k2017-12345")
def setJavascriptDelay(javascript_delay)
Wait for the specified number of milliseconds after the document is loaded so that all JavaScript can finish. The maximum value is determined by your API license.
Parameter Description Default
javascript_delay
The number of milliseconds to wait.
Must be a positive integer number or 0.
200
Returns
  • HtmlToImageClient - The converter object.
def setElementToConvert(selectors)
Convert only the specified element from the main document and its children. The element is specified by one or more CSS selectors. If the element is not found, the conversion fails. If multiple elements are found, the first one is used.
Parameter Description Default
selectors
One or more CSS selectors separated by commas.
The string must not be empty.
Returns
  • HtmlToImageClient - The converter object.
Examples:
  • The first element with the id main-content is converted.
    setElementToConvert("#main-content")
  • The first element with the class name main-content is converted.
    setElementToConvert(".main-content")
  • The first element with the tag name table is converted.
    setElementToConvert("table")
  • The first element with the tag name table or with the id main-content is converted.
    setElementToConvert("table, #main-content")
  • The first element <p class="article"> within <div class="user-panel main"> is converted.
    setElementToConvert("div.user-panel.main p.article")
def setElementToConvertMode(mode)
Specify the DOM handling when only a part of the document is converted.
Parameter Description Default
mode
Allowed values:
  • cut-out
    The element and its children are cut out of the document.
  • remove-siblings
    All of the element's siblings are removed.
  • hide-siblings
    All of the element's siblings are hidden.
cut-out
Returns
  • HtmlToImageClient - The converter object.
def setWaitForElement(selectors)
Wait for the specified element in a source document. The element is specified by one or more CSS selectors. The element is searched for in the main document and all iframes. If the element is not found, the conversion fails.
Parameter Description Default
selectors
One or more CSS selectors separated by commas.
The string must not be empty.
Returns
  • HtmlToImageClient - The converter object.
Examples:
  • Wait until an element with the id main-content is found.
    setWaitForElement("#main-content")
  • Wait until an element with the class name main-content is found.
    setWaitForElement(".main-content")
  • Wait until an element with the tag name table is found.
    setWaitForElement("table")
  • Wait until an element with the tag name table or with the id main-content is found.
    setWaitForElement("table, #main-content")
  • Wait until <p class="article"> is found within <div class="user-panel main">.
    setWaitForElement("div.user-panel.main p.article")
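
The selector options are often used together: wait for an element, then convert only that element. The following sketch assumes this combination is what you want; selector_list and convert_only are hypothetical helpers introduced for this example.

```ruby
# Hypothetical helper: joins CSS selectors into the comma-separated form
# expected by setElementToConvert and setWaitForElement.
def selector_list(*selectors)
    cleaned = selectors.map(&:strip).reject(&:empty?)
    raise ArgumentError, "at least one selector is required" if cleaned.empty?
    cleaned.join(", ")
end

# Hypothetical helper: waits for the element and converts only that
# element, cutting it out of the document (the default mode).
def convert_only(client, *selectors)
    list = selector_list(*selectors)
    client.setWaitForElement(list)
    client.setElementToConvert(list)
    client.setElementToConvertMode("cut-out")
    client
end

# Usage (requires the pdfcrowd gem):
#   convert_only(client, "table", "#main-content")
```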

 

Image Output

def setScreenshotWidth(screenshot_width)
Set the output image width in pixels.
Parameter Description Default
screenshot_width
The value must be in the range 96-7680.
1024
Returns
  • HtmlToImageClient - The converter object.
def setScreenshotHeight(screenshot_height)
Set the output image height in pixels. If it's not specified, the actual document height is used.
Parameter Description Default
screenshot_height
Must be a positive integer number.
Returns
  • HtmlToImageClient - The converter object.
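
The documented limits can be checked on the client side before any request is sent. This is only a sketch: SCREENSHOT_WIDTH_RANGE and apply_screenshot_size are hypothetical names, and the checks simply mirror the constraints stated above.

```ruby
# Width limits taken from the setScreenshotWidth reference above.
SCREENSHOT_WIDTH_RANGE = (96..7680)

# Hypothetical helper: validates both values first, then applies them.
# The height is optional; when omitted, the actual document height is used.
def apply_screenshot_size(client, width, height = nil)
    unless width.is_a?(Integer) && SCREENSHOT_WIDTH_RANGE.cover?(width)
        raise ArgumentError, "width must be an integer in the range 96-7680"
    end
    unless height.nil? || (height.is_a?(Integer) && height > 0)
        raise ArgumentError, "height must be a positive integer"
    end

    client.setScreenshotWidth(width)
    client.setScreenshotHeight(height) unless height.nil?
    client
end

# Usage (requires the pdfcrowd gem):
#   apply_screenshot_size(client, 1280, 720)
```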

 

Miscellaneous

def setDebugLog(debug_log)
Turn on the debug logging. Details about the conversion are stored in the debug log. The URL of the log can be obtained from the getDebugLogUrl method.
Parameter Description Default
debug_log
Set to true to enable the debug logging.
false
Returns
  • HtmlToImageClient - The converter object.
def getDebugLogUrl()
Get the URL of the debug log for the last conversion.
Returns
  • string - The link to the debug log.
def getRemainingCreditCount()
Get the number of conversion credits available in your account.
The returned value can differ from the actual count if you run parallel conversions.
The special value 999999 is returned if the information is not available.
Returns
  • int - The number of credits.
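
Because 999999 means that the count is not available, a low-credit check should treat it separately from a real balance. A minimal sketch follows; CREDITS_UNKNOWN and credits_low? are hypothetical names, and returning nil for the unknown case is a design choice made for this example.

```ruby
# Special value documented above for "information not available".
CREDITS_UNKNOWN = 999_999

# Hypothetical helper: returns true or false for a known balance and nil
# when the count is not available.
def credits_low?(remaining, threshold)
    return nil if remaining == CREDITS_UNKNOWN
    remaining < threshold
end

# Usage, after a conversion:
#   warn "Pdfcrowd credits are running low" if credits_low?(client.getRemainingCreditCount(), 100)
```
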
def getConsumedCreditCount()
Get the number of credits consumed by the last conversion.
Returns
  • int - The number of credits.
def getJobId()
Get the job id.
Returns
  • string - The unique job identifier.
def getOutputSize()
Get the size of the output in bytes.
Returns
  • int - The count of bytes.

 

API Client Options

def setUseHttp(use_http)
Specifies whether the client communicates with the Pdfcrowd API over HTTP or HTTPS.
Parameter Description Default
use_http
Set to true to use HTTP.
false
Returns
  • HtmlToImageClient - The converter object.
def setUserAgent(user_agent)
Set a custom user agent HTTP header. It can be useful if you are behind a proxy or firewall.
Parameter Description Default
user_agent
The user agent string.
pdfcrowd_ruby_client/4.3 (http://pdfcrowd.com)
Returns
  • HtmlToImageClient - The converter object.
def setProxy(host, port, user_name, password)
Specifies an HTTP proxy that the API client library will use to connect to the internet.
Parameter Description Default
host
The proxy hostname.
port
The proxy port.
user_name
The username.
password
The password.
Returns
  • HtmlToImageClient - The converter object.
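
If the proxy is configured as a single URL, it can be split into the four setProxy arguments with Ruby's standard URI library. This is a sketch; proxy_args is a hypothetical helper and the proxy address in the usage comment is a placeholder.

```ruby
require "uri"

# Hypothetical helper: splits a proxy URL into [host, port, user_name, password].
# The user name and password are nil when the URL carries no credentials.
def proxy_args(proxy_url)
    uri = URI.parse(proxy_url)
    raise ArgumentError, "proxy URL must include a host" if uri.host.nil?
    [uri.host, uri.port, uri.user, uri.password]
end

# Usage (requires the pdfcrowd gem):
#   client.setProxy(*proxy_args("http://user:pass@proxy.example.com:8080"))
```
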
def setRetryCount(retry_count)
Specifies the number of retries when the 502 HTTP status code is received. The 502 status code indicates a temporary network issue. Set the value to 0 to disable retries.
Parameter Description Default
retry_count
The number of retries.
1
Returns
  • HtmlToImageClient - The converter object.

 

Error Handling

begin 
    # call the API 
rescue Pdfcrowd::Error => why 
    # print error
    STDERR.puts "Pdfcrowd Error: #{why}"

    # print just error code
    STDERR.puts "Pdfcrowd Error Code: #{why.getCode()}"

    # print just error message
    STDERR.puts "Pdfcrowd Error Message: #{why.getMessage()}"

    # or handle the error in your own way
end
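
The same pattern can be wrapped in a small helper so that every call site reports errors consistently. This is only a sketch: convert_or_report is a hypothetical helper, and the error class is passed in as an argument (Pdfcrowd::Error in real code) so that the helper itself does not depend on the gem being loaded.

```ruby
# Hypothetical helper: runs the block and returns its result. If the block
# raises error_class, the error code and message are written to log and
# nil is returned instead.
def convert_or_report(error_class, log = $stderr)
    yield
rescue error_class => why
    log.puts "Pdfcrowd Error #{why.getCode()}: #{why.getMessage()}"
    nil
end

# Usage (requires the pdfcrowd gem):
#   image = convert_or_report(Pdfcrowd::Error) do
#       client.convertUrl("http://www.example.com")
#   end
```

Returning nil keeps the calling code short, but it hides the failure from callers that do not check the result; re-raising after logging is a reasonable alternative.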