PDF to HTML Ruby Reference
Availability: API client version >= 5.4.0
class PdfToHtmlClient
All setter methods return the PdfToHtmlClient object unless specified otherwise.
Constructor
def initialize(user_name, api_key)
Constructor for the Pdfcrowd API client.
user_name
Your username at Pdfcrowd.
Convert a PDF.
url
The address of the PDF to convert.
The supported protocols are http:// and https://.
Returns
byte[] - Byte array containing the conversion output.
def convertUrlToStream(url, out_stream)
Convert a PDF and write the result to an output stream.
url
The address of the PDF to convert.
The supported protocols are http:// and https://.
out_stream
The output stream that will contain the conversion output.
def convertUrlToFile(url, file_path)
Convert a PDF and write the result to a local file.
url
The address of the PDF to convert.
The supported protocols are http:// and https://.
file_path
The output file path.
The converter generates an HTML or ZIP file. If a ZIP file is generated, the file path must have a ZIP or zip extension.
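Because only http:// and https:// addresses are supported, it can help to check the URL locally before spending a request. The following is a minimal sketch, not part of the Pdfcrowd client: `supported_pdf_url?` and `convert_url_checked` are hypothetical helper names, and `client` is assumed to be an already constructed PdfToHtmlClient.

```ruby
require 'uri'

# Hypothetical helper: true when +url+ parses as an http:// or https://
# address with a host, the two protocols the reference lists as supported.
def supported_pdf_url?(url)
  uri = URI.parse(url)
  # URI::HTTPS is a subclass of URI::HTTP, so this covers both schemes.
  uri.is_a?(URI::HTTP) && !uri.host.to_s.empty?
rescue URI::InvalidURIError
  false
end

# Hypothetical wrapper: validates the URL, then delegates to the
# documented convertUrlToFile(url, file_path) method.
def convert_url_checked(client, url, file_path)
  unless supported_pdf_url?(url)
    raise ArgumentError, "unsupported URL: #{url.inspect}"
  end
  client.convertUrlToFile(url, file_path)
end
```

The check is purely syntactic; it does not confirm that the address is reachable or that it points to a PDF.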
Convert a local file.
file
The path to a local file to convert.
The file must exist and not be empty.
Returns
byte[] - Byte array containing the conversion output.
def convertFileToStream(file, out_stream)
Convert a local file and write the result to an output stream.
file
The path to a local file to convert.
The file must exist and not be empty.
out_stream
The output stream that will contain the conversion output.
def convertFileToFile(file, file_path)
Convert a local file and write the result to a local file.
file
The path to a local file to convert.
The file must exist and not be empty.
file_path
The output file path.
The converter generates an HTML or ZIP file. If a ZIP file is generated, the file path must have a ZIP or zip extension.
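The requirement that the input file exists and is not empty can be checked before calling the client. This is a sketch under assumptions: `convertible_file?` and `convert_file_checked` are hypothetical names, and `client` stands for an existing PdfToHtmlClient instance.

```ruby
# Hypothetical helper: true when +path+ is a regular file with at least
# one byte, matching "The file must exist and not be empty."
def convertible_file?(path)
  File.file?(path) && !File.zero?(path)
end

# Hypothetical wrapper around the documented
# convertFileToFile(file, file_path) method.
def convert_file_checked(client, file, file_path)
  unless convertible_file?(file)
    raise ArgumentError, "missing or empty input file: #{file}"
  end
  client.convertFileToFile(file, file_path)
end
```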
Convert raw data.
data
The raw content to be converted.
Returns
byte[] - Byte array with the output.
def convertRawDataToStream(data, out_stream)
Convert raw data and write the result to an output stream.
data
The raw content to be converted.
out_stream
The output stream that will contain the conversion output.
def convertRawDataToFile(data, file_path)
Convert raw data to a file.
data
The raw content to be converted.
file_path
The output file path.
The converter generates an HTML or ZIP file. If a ZIP file is generated, the file path must have a ZIP or zip extension.
def convertStream(in_stream)
Convert the contents of an input stream.
in_stream
The input stream with source data.
Returns
byte[] - Byte array containing the conversion output.
def convertStreamToStream(in_stream, out_stream)
Convert the contents of an input stream and write the result to an output stream.
in_stream
The input stream with source data.
out_stream
The output stream that will contain the conversion output.
def convertStreamToFile(in_stream, file_path)
Convert the contents of an input stream and write the result to a local file.
in_stream
The input stream with source data.
file_path
The output file path.
The converter generates an HTML or ZIP file. If a ZIP file is generated, the file path must have a ZIP or zip extension.
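The stream variants can be combined with Ruby's block form of `File.open`, so both handles are closed even if the conversion raises. This sketch assumes that ordinary `File` objects opened in binary mode are acceptable as `in_stream` and `out_stream`; the reference does not say which stream types are supported. `convert_between_files` is a hypothetical helper name.

```ruby
# Hypothetical helper: opens both files in binary mode and hands them to
# the documented convertStreamToStream(in_stream, out_stream) method.
# Assumption: the client accepts File objects as streams.
def convert_between_files(client, input_path, output_path)
  File.open(input_path, 'rb') do |in_stream|
    File.open(output_path, 'wb') do |out_stream|
      client.convertStreamToStream(in_stream, out_stream)
    end
  end
end
```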
General Options
def setPdfPassword(password)
Password to open the encrypted PDF file.
password
The input PDF password.
def setScaleFactor(factor)
Set the scaling factor (zoom) for the main page area.
factor
The percentage value.
Must be a positive integer number.
Default: 100
def setPrintPageRange(pages)
Set the page range to print.
pages
A comma separated list of page numbers or ranges.
Examples:
- Just the second page is printed: setPrintPageRange("2")
- The first and the third page are printed: setPrintPageRange("1,3")
- Everything except the first page is printed: setPrintPageRange("2-")
- Just the first 3 pages are printed: setPrintPageRange("-3")
- Pages 3, 6, 7, 8 and 9 are printed: setPrintPageRange("3,6-9")
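When the pages to print are computed in code, the range string can be built from Ruby integers and ranges rather than assembled by hand. The helper below is a sketch; `page_range_string` is a hypothetical name, and it assumes page numbers start at 1, as the examples suggest. Endless and beginless ranges need Ruby 2.7 or later.

```ruby
# Hypothetical helper: turns integers and ranges into the comma-separated
# string format shown in the setPrintPageRange examples.
# An endless range (2..) becomes "2-", a beginless range (..3) becomes "-3".
def page_range_string(items)
  parts = items.map do |item|
    case item
    when Integer
      # Assumption: pages are numbered from 1.
      raise ArgumentError, "page numbers start at 1: #{item}" if item < 1
      item.to_s
    when Range
      first = item.begin
      last = item.end
      # Convert an exclusive range such as 1...4 to its inclusive end.
      last -= 1 if last && item.exclude_end?
      "#{first}-#{last}"
    else
      raise ArgumentError, "expected Integer or Range, got #{item.inspect}"
    end
  end
  parts.join(",")
end
```

The result would then be passed on, for example `client.setPrintPageRange(page_range_string([3, 6..9]))`, where `client` is an existing PdfToHtmlClient.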
Set the output graphics DPI.
Availability: API client >= 5.16.0, converter >= 20.10. See versioning.
dpi
The DPI value.
Default: 144
Specifies where the images are stored.
mode
The image storage mode.
Allowed values:
- embed: The images are embedded into the output HTML file.
- separate: The images are saved to separate files. In this mode, the output of the conversion is a zip file containing the HTML and all image files.
- none: The images are ignored and not converted.
Default: embed
Specifies the format for the output images.
Availability: API client >= 5.17.0, converter >= 20.10. See versioning.
Specifies where the style sheets are stored.
mode
The style sheet storage mode.
Default: embed
Specifies where the fonts are stored.
mode
The font storage mode.
Default: embed
A helper method to determine if the output file is a zip archive. The output of the conversion may be either an HTML file or a zip file containing the HTML and its external assets.
Returns
bool - true if the conversion output is a zip file, otherwise false.
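The boolean returned by this helper can drive the choice of output file name, since a zip result must be written to a path with a ZIP or zip extension. The sketch below takes that boolean as a plain argument; `output_path_for` and `valid_output_path?` are hypothetical names and are not part of the client.

```ruby
# Extensions the reference accepts for zip output.
ZIP_EXTENSIONS = %w[.zip .ZIP].freeze

# Hypothetical helper: picks a file name for the conversion output.
# +zipped+ is the boolean reported by the helper method described above.
def output_path_for(base_name, zipped)
  zipped ? "#{base_name}.zip" : "#{base_name}.html"
end

# Hypothetical helper: checks an existing path against the rule that zip
# output needs a ZIP or zip extension. Non-zip output is not restricted.
def valid_output_path?(file_path, zipped)
  !zipped || ZIP_EXTENSIONS.include?(File.extname(file_path))
end
```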
Enforces the zip output format.
value
Set to true to get the output as a zip archive.
Default: false
Set the HTML title. The title from the input PDF is used by default.
Set the HTML subject. The subject from the input PDF is used by default.
subject
The HTML subject.
Set the HTML author. The author from the input PDF is used by default.
def setKeywords(keywords)
Associate keywords with the HTML document. Keywords from the input PDF are used by default.
keywords
The string containing the keywords.
Miscellaneous
Turn on the debug logging. Details about the conversion are stored in the debug log. The URL of the log can be obtained from the getDebugLogUrl method or from the conversion statistics.
value
Set to true to enable the debug logging.
Default: false
Get the URL of the debug log for the last conversion.
Returns
string - The link to the debug log.
def getRemainingCreditCount()
Get the number of conversion credits available in your account.
This method can only be called after a call to one of the convertXtoY methods.
The returned value can differ from the actual count if you run parallel conversions.
The special value 999999 is returned if the information is not available.
Returns
int - The number of credits.
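Since 999999 is a sentinel rather than a real balance, callers may want to translate it before logging or alerting on the value. A minimal sketch, assuming `client` is a PdfToHtmlClient on which a conversion has already run; `CREDITS_UNKNOWN` and `remaining_credits` are hypothetical names.

```ruby
# Sentinel documented for getRemainingCreditCount when the information
# is not available.
CREDITS_UNKNOWN = 999_999

# Hypothetical helper: returns the remaining credit count, or nil when
# the client reports the "not available" sentinel.
def remaining_credits(client)
  count = client.getRemainingCreditCount
  count == CREDITS_UNKNOWN ? nil : count
end
```

Returning `nil` keeps the sentinel from being mistaken for a large balance; the caveat about parallel conversions still applies to any non-nil value.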
def getConsumedCreditCount()
Get the number of credits consumed by the last conversion.
Returns
int - The number of credits.
Get the job id.
Returns
string - The unique job identifier.
Get the number of pages in the output document.
Get the size of the output in bytes.
Returns
int - The count of bytes.
Get the version details.
Returns
string - API version, converter version, and client version.
Tag the conversion with a custom value. The tag is used in conversion statistics. A value longer than 32 characters is cut off.
tag
A string with the custom tag.
A proxy server used by the Pdfcrowd conversion process for accessing source URLs with the HTTP scheme. It can help to circumvent regional restrictions or provide limited access to your intranet.
proxy
The value must have the format DOMAIN_OR_IP_ADDRESS:PORT.
A proxy server used by the Pdfcrowd conversion process for accessing source URLs with the HTTPS scheme. It can help to circumvent regional restrictions or provide limited access to your intranet.
proxy
The value must have the format DOMAIN_OR_IP_ADDRESS:PORT.
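The DOMAIN_OR_IP_ADDRESS:PORT format is easy to get wrong, for example by including a scheme prefix. The helper below is a sketch that builds and sanity-checks such a value; `proxy_value` is a hypothetical name, and the checks are assumptions about what a well-formed value looks like rather than rules stated by Pdfcrowd. IPv6 literals are not handled.

```ruby
# Hypothetical helper: builds a "HOST:PORT" string for the proxy options.
# Rejects hosts containing whitespace, ':' or '/', which also rules out
# values such as "http://proxy.example.com".
def proxy_value(host, port)
  port_number = Integer(port) # raises ArgumentError for non-numeric strings
  unless (1..65_535).cover?(port_number)
    raise ArgumentError, "port out of range: #{port_number}"
  end
  host_text = host.to_s
  if host_text.empty? || host_text.match?(%r{[\s:/]})
    raise ArgumentError, "invalid proxy host: #{host.inspect}"
  end
  "#{host_text}:#{port_number}"
end
```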
API Client Options
Specifies whether the client communicates with the Pdfcrowd API over HTTP or HTTPS.
value
Set to true to use HTTP.
Default: false
Warning
Using HTTP is insecure as data sent over HTTP is not encrypted. Enable this option only if you know what you are doing.
Set a custom user agent HTTP header. It can be useful if you are behind a proxy or a firewall.
agent
The user agent string.
Default: pdfcrowd_ruby_client/5.17.0 (https://pdfcrowd.com)
def setProxy(host, port, user_name, password)
Specifies an HTTP proxy that the API client library will use to connect to the internet.
Specifies the number of automatic retries when a 502 or 503 HTTP status code is received. These status codes indicate a temporary network issue. Set the count to 0 to disable this feature.
count
Number of retries.
Default: 1