PDF to HTML / Golang Reference

Availability: API client version >= 5.4.0

class PdfToHtmlClient

All setter methods return the PdfToHtmlClient object (as *PdfToHtmlClient) unless specified otherwise.

Constructor

func NewPdfToHtmlClient(userName string, apiKey string) PdfToHtmlClient

Constructor for the PDFCrowd API client.

Parameters:
  • userName - Your username at PDFCrowd.
  • apiKey - Your API key.

Conversion Input

func (client *PdfToHtmlClient) ConvertUrl(url string) ([]byte, error)

Convert a PDF.

Parameter:
  • url - The address of the PDF to convert.
    Constraint:
    • Supported protocols are http:// and https://.
Returns:
[]byte - Byte array containing the conversion output.
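The protocol constraint can be checked locally before a request is sent. The following is a minimal sketch using only the standard library; checkSourceURL is a hypothetical helper and is not part of the PDFCrowd client.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// checkSourceURL reports an error unless raw is an absolute http:// or
// https:// URL, mirroring the documented constraint on the url parameter.
// Hypothetical helper, not part of the PDFCrowd client.
func checkSourceURL(raw string) error {
	u, err := url.Parse(raw)
	if err != nil {
		return fmt.Errorf("invalid URL %q: %w", raw, err)
	}
	if u.Scheme != "http" && u.Scheme != "https" {
		return errors.New("unsupported protocol: want http:// or https://")
	}
	if u.Host == "" {
		return errors.New("URL has no host")
	}
	return nil
}

func main() {
	for _, raw := range []string{
		"https://example.com/doc.pdf",
		"ftp://example.com/doc.pdf",
	} {
		fmt.Printf("%s -> %v\n", raw, checkSourceURL(raw))
	}
}
```

A nil result only means the URL is well formed; the service still has to be able to reach the address.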

func (client *PdfToHtmlClient) ConvertUrlToStream(url string, outStream io.Writer) error

Convert a PDF and write the result to an output stream.

Parameters:
  • url - The address of the PDF to convert.
    Constraint:
    • Supported protocols are http:// and https://.
  • outStream (io.Writer) - The output stream that will contain the conversion output.

func (client *PdfToHtmlClient) ConvertUrlToFile(url string, filePath string) error

Convert a PDF and write the result to a local file.

Parameters:
  • url - The address of the PDF to convert.
    Constraint:
    • Supported protocols are http:// and https://.
  • filePath - The output file path.
    Constraint:
    • The converter generates an HTML or ZIP file. If ZIP file is generated, the file path must have a ZIP or zip extension.

func (client *PdfToHtmlClient) ConvertFile(file string) ([]byte, error)

Convert a local file.

Parameter:
  • file - The path to a local file to convert.
    Constraint:
    • The file must exist and not be empty.
Returns:
[]byte - Byte array containing the conversion output.
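The constraint on the input file (it must exist and not be empty) can be verified before calling the converter. This sketch assumes a hypothetical helper, checkInputFile, built on os.Stat; tempFileWith exists only so the example has something to check.

```go
package main

import (
	"fmt"
	"os"
)

// checkInputFile returns an error unless path names an existing,
// non-empty regular file. Hypothetical helper, not part of the client.
func checkInputFile(path string) error {
	info, err := os.Stat(path)
	if err != nil {
		return fmt.Errorf("input file %q: %w", path, err)
	}
	if info.IsDir() {
		return fmt.Errorf("input %q is a directory", path)
	}
	if info.Size() == 0 {
		return fmt.Errorf("input file %q is empty", path)
	}
	return nil
}

// tempFileWith creates a temporary file holding data and returns its path.
func tempFileWith(data []byte) (string, error) {
	f, err := os.CreateTemp("", "input-*.pdf")
	if err != nil {
		return "", err
	}
	defer f.Close()
	if _, err := f.Write(data); err != nil {
		return "", err
	}
	return f.Name(), nil
}

func main() {
	empty, err := tempFileWith(nil)
	if err != nil {
		fmt.Println("cannot create temp file:", err)
		return
	}
	defer os.Remove(empty)

	fmt.Println(checkInputFile(empty))              // error: the file is empty
	fmt.Println(checkInputFile(empty + ".missing")) // error: no such file
}
```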

func (client *PdfToHtmlClient) ConvertFileToStream(file string, outStream io.Writer) error

Convert a local file and write the result to an output stream.

Parameters:
  • file - The path to a local file to convert.
    Constraint:
    • The file must exist and not be empty.
  • outStream (io.Writer) - The output stream that will contain the conversion output.

func (client *PdfToHtmlClient) ConvertFileToFile(file string, filePath string) error

Convert a local file and write the result to a local file.

Parameters:
  • file - The path to a local file to convert.
    Constraint:
    • The file must exist and not be empty.
  • filePath - The output file path.
    Constraint:
    • The converter generates an HTML or ZIP file. If ZIP file is generated, the file path must have a ZIP or zip extension.

func (client *PdfToHtmlClient) ConvertRawData(data []byte) ([]byte, error)

Convert raw data.

Parameter:
  • data ([]byte) - The raw content to be converted.
Returns:
[]byte - Byte array with the output.

func (client *PdfToHtmlClient) ConvertRawDataToStream(data []byte, outStream io.Writer) error

Convert raw data and write the result to an output stream.

Parameters:
  • data ([]byte) - The raw content to be converted.
  • outStream (io.Writer) - The output stream that will contain the conversion output.

func (client *PdfToHtmlClient) ConvertRawDataToFile(data []byte, filePath string) error

Convert raw data to a file.

Parameters:
  • data ([]byte) - The raw content to be converted.
  • filePath - The output file path.
    Constraint:
    • The converter generates an HTML or ZIP file. If ZIP file is generated, the file path must have a ZIP or zip extension.

func (client *PdfToHtmlClient) ConvertStream(inStream io.Reader) ([]byte, error)

Convert the contents of an input stream.

Parameter:
  • inStream (io.Reader) - The input stream with source data.
Returns:
[]byte - Byte array containing the conversion output.

func (client *PdfToHtmlClient) ConvertStreamToStream(inStream io.Reader, outStream io.Writer) error

Convert the contents of an input stream and write the result to an output stream.

Parameters:
  • inStream (io.Reader) - The input stream with source data.
  • outStream (io.Writer) - The output stream that will contain the conversion output.
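Because the stream methods take io.Reader and io.Writer, any standard Go stream can be plugged in. The sketch below shows the calling pattern with an in-memory reader and buffer. The streamConverter interface and fakeConverter type are hypothetical: the interface lists only the method used here (a *PdfToHtmlClient should satisfy it, given the signature documented above), and the fake stands in for the real client so the example runs offline.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
)

// streamConverter is a hypothetical interface with the one method used here.
type streamConverter interface {
	ConvertStreamToStream(inStream io.Reader, outStream io.Writer) error
}

// fakeConverter stands in for the real client. It wraps the input in
// <pre> tags instead of converting a PDF.
type fakeConverter struct{}

func (fakeConverter) ConvertStreamToStream(in io.Reader, out io.Writer) error {
	if _, err := io.WriteString(out, "<pre>"); err != nil {
		return err
	}
	if _, err := io.Copy(out, in); err != nil {
		return err
	}
	_, err := io.WriteString(out, "</pre>")
	return err
}

// convertString feeds input to conv as a stream and collects the
// output in memory.
func convertString(conv streamConverter, input string) (string, error) {
	var buf bytes.Buffer
	if err := conv.ConvertStreamToStream(strings.NewReader(input), &buf); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, err := convertString(fakeConverter{}, "demo input")
	if err != nil {
		fmt.Println("conversion failed:", err)
		return
	}
	fmt.Println(out)
}
```

With the real client, the reader would typically be an open PDF file and the writer a file or an HTTP response.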

func (client *PdfToHtmlClient) ConvertStreamToFile(inStream io.Reader, filePath string) error

Convert the contents of an input stream and write the result to a local file.

Parameters:
  • inStream (io.Reader) - The input stream with source data.
  • filePath - The output file path.
    Constraint:
    • The converter generates an HTML or ZIP file. If ZIP file is generated, the file path must have a ZIP or zip extension.

General Options

func (client *PdfToHtmlClient) SetPdfPassword(password string) *PdfToHtmlClient

Password to open the encrypted PDF file.

Parameter:
  • password - The input PDF password.

func (client *PdfToHtmlClient) SetScaleFactor(factor int) *PdfToHtmlClient

Set the scaling factor (zoom) for the main page area.

Parameter:
  • factor (int) - The percentage value.
    Constraint:
    • Must be a positive integer.
    Default:
    100

func (client *PdfToHtmlClient) SetPrintPageRange(pages string) *PdfToHtmlClient

Set the page range to print.

Parameter:
  • pages
    Constraint:
    • A comma separated list of page numbers or ranges.
Examples:
  • Just the second page is printed: SetPrintPageRange("2")
  • The first and the third page are printed: SetPrintPageRange("1,3")
  • Everything except the first page is printed: SetPrintPageRange("2-")
  • Just the first 3 pages are printed: SetPrintPageRange("-3")
  • Pages 3, 6, 7, 8 and 9 are printed: SetPrintPageRange("3,6-9")
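The range syntax in the examples above can be read as follows. This is a sketch of one interpretation, not the converter's own parser: expandPageRange is a hypothetical function, and clamping open or oversized ranges to the page count is an assumption.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// expandPageRange lists the 1-based page numbers selected by spec in a
// document with pageCount pages. Hypothetical illustration of the syntax;
// clamping to pageCount is an assumption.
func expandPageRange(spec string, pageCount int) ([]int, error) {
	var pages []int
	for _, part := range strings.Split(spec, ",") {
		part = strings.TrimSpace(part)
		lo, hi, isRange := strings.Cut(part, "-")
		if !isRange {
			hi = lo // a single page "n" is the range n-n
		}
		if lo == "" && hi == "" {
			return nil, fmt.Errorf("empty item in page range %q", spec)
		}
		first, last := 1, pageCount // defaults for open ranges "-n" and "n-"
		var err error
		if lo != "" {
			if first, err = strconv.Atoi(lo); err != nil || first < 1 {
				return nil, fmt.Errorf("invalid page number %q", lo)
			}
		}
		if hi != "" {
			if last, err = strconv.Atoi(hi); err != nil || last < 1 {
				return nil, fmt.Errorf("invalid page number %q", hi)
			}
		}
		if lo != "" && hi != "" && first > last {
			return nil, fmt.Errorf("descending range %q", part)
		}
		if last > pageCount {
			last = pageCount
		}
		for p := first; p <= last; p++ {
			pages = append(pages, p)
		}
	}
	return pages, nil
}

func main() {
	for _, spec := range []string{"2", "1,3", "2-", "-3", "3,6-9"} {
		pages, err := expandPageRange(spec, 10)
		fmt.Println(spec, "->", pages, err)
	}
}
```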

func (client *PdfToHtmlClient) SetDpi(dpi int) *PdfToHtmlClient

Set the output graphics DPI.

Availability:
API client >= 5.16.0, converter >= 20.10. See versioning.
Parameter:
  • dpi (int) - The DPI value.
    Default:
    144

func (client *PdfToHtmlClient) SetImageMode(mode string) *PdfToHtmlClient

Specifies where the images are stored.

Parameter:
  • mode - The image storage mode.
    Allowed Values:
    • embed — The images are embedded into the output HTML file.
    • separate — The images are saved to separate files. In this mode the output of the conversion is a zip file containing the HTML and all image files.
    • none — The images are ignored and not converted.
    Default:
    embed

func (client *PdfToHtmlClient) SetImageFormat(imageFormat string) *PdfToHtmlClient

Specifies the format for the output images.

Availability:
API client >= 5.17.0, converter >= 20.10. See versioning.
Parameter:
  • imageFormat - The image format.
    Allowed Values:
    • png
    • jpg
    • svg
    Default:
    png

func (client *PdfToHtmlClient) SetCssMode(mode string) *PdfToHtmlClient

Specifies where the style sheets are stored.

Parameter:
  • mode - The style sheet storage mode.
    Allowed Values:
    • embed — Style sheets are embedded into the output HTML file.
    • separate — Style sheets are saved to separate files. In this mode the output of the conversion is a zip file containing the HTML and all style sheets.
    Default:
    embed

func (client *PdfToHtmlClient) SetFontMode(mode string) *PdfToHtmlClient

Specifies where the fonts are stored.

Parameter:
  • mode - The font storage mode.
    Allowed Values:
    • embed — The fonts are embedded into the output HTML file.
    • separate — The fonts are saved to separate files. In this mode the output of the conversion is a zip file containing the HTML and all font files.
    Default:
    embed

func (client *PdfToHtmlClient) SetType3Mode(mode string) *PdfToHtmlClient

Sets the processing mode for handling Type 3 fonts.

Availability:
API client >= 6.2.0, converter >= 24.04. See versioning.
Parameter:
  • mode - The type3 font mode.
    Allowed Values:
    • raster — Rasterizes Type 3 fonts into images, ensuring an exact visual representation in the HTML output.
    • convert — Attempts to convert Type 3 fonts to a web font, resulting in smaller file sizes with some possible visual discrepancies.
    Default:
    raster

func (client *PdfToHtmlClient) SetSplitLigatures(value bool) *PdfToHtmlClient

Converts ligatures, two or more letters combined into a single glyph, back into their individual ASCII characters.

Parameter:
  • value (bool) - Set to true to split ligatures.
    Default:
    false

func (client *PdfToHtmlClient) SetCustomCss(css string) *PdfToHtmlClient

Apply custom CSS to the output HTML document. It allows you to modify the visual appearance and layout. Tip: Using !important in custom CSS provides a way to prioritize and override conflicting styles.

Availability:
API client >= 6.2.0, converter >= 24.04. See versioning.
Parameter:
  • css - A string containing valid CSS.
Example:
  • Set the main background color to azure: SetCustomCss("#page-container { background-color: azure; }")

func (client *PdfToHtmlClient) SetHtmlNamespace(prefix string) *PdfToHtmlClient

Add the specified prefix to all id and class attributes in the HTML content, creating a namespace for safe integration into another HTML document. This ensures unique identifiers, preventing conflicts when merging with other HTML.

Availability:
API client >= 6.3.0, converter >= 24.04. See versioning.
Parameter:
  • prefix - The prefix to add before each id and class attribute name.
    Constraint:
    • Start with a letter or underscore, and use only letters, numbers, hyphens, underscores, or colons.
Examples:
  • Namespace for first PDF embed: SetHtmlNamespace("pdf1_")
  • Custom namespace to avoid conflicts: SetHtmlNamespace("uniqueID123_")
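The prefix constraint maps naturally onto a regular expression. The sketch below assumes that "letters" and "numbers" mean ASCII characters, which the text above does not state; validNamespace is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
)

// namespacePrefix encodes the documented rule: start with a letter or
// underscore, then letters, digits, hyphens, underscores, or colons.
// Restricting letters to ASCII is an assumption.
var namespacePrefix = regexp.MustCompile(`^[A-Za-z_][A-Za-z0-9_:-]*$`)

// validNamespace reports whether prefix satisfies the rule above.
// Hypothetical helper, not part of the PDFCrowd client.
func validNamespace(prefix string) bool {
	return namespacePrefix.MatchString(prefix)
}

func main() {
	for _, p := range []string{"pdf1_", "uniqueID123_", "1pdf", "pdf 1"} {
		fmt.Printf("%q valid: %t\n", p, validNamespace(p))
	}
}
```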

func (client *PdfToHtmlClient) IsZippedOutput() bool

A helper method to determine if the output file is a zip archive. The output of the conversion may be either an HTML file or a zip file containing the HTML and its external assets.

Returns:
bool - true if the conversion output is a zip file, otherwise false.

func (client *PdfToHtmlClient) SetForceZip(value bool) *PdfToHtmlClient

Enforces the zip output format.

Parameter:
  • value (bool) - Set to true to get the output as a zip archive.
    Default:
    false
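Going by the option descriptions above, the output is a zip archive when any storage mode is "separate" or when the zip format is forced, and the output file path must then carry a zip extension. The sketch below restates that rule with hypothetical names (outputOptions, zipped, outputPath); with the real client, IsZippedOutput is the authoritative check.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// outputOptions is a hypothetical record of the values passed to
// SetImageMode, SetCssMode, SetFontMode, and SetForceZip.
type outputOptions struct {
	ImageMode, CssMode, FontMode string
	ForceZip                     bool
}

// zipped restates the documented rule: any "separate" mode or a forced
// zip yields a zip archive. This is a reading of the docs, not the
// client's own code.
func (o outputOptions) zipped() bool {
	return o.ForceZip ||
		o.ImageMode == "separate" ||
		o.CssMode == "separate" ||
		o.FontMode == "separate"
}

// outputPath replaces the extension of base with .zip or .html.
func outputPath(base string, zipped bool) string {
	base = strings.TrimSuffix(base, filepath.Ext(base))
	if zipped {
		return base + ".zip"
	}
	return base + ".html"
}

func main() {
	opts := outputOptions{ImageMode: "separate", CssMode: "embed", FontMode: "embed"}
	fmt.Println(outputPath("out/report.pdf", opts.zipped()))
}
```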

func (client *PdfToHtmlClient) SetTitle(title string) *PdfToHtmlClient

Set the HTML title. The title from the input PDF is used by default.

Parameter:
  • title - The HTML title.

func (client *PdfToHtmlClient) SetSubject(subject string) *PdfToHtmlClient

Set the HTML subject. The subject from the input PDF is used by default.

Parameter:
  • subject - The HTML subject.

func (client *PdfToHtmlClient) SetAuthor(author string) *PdfToHtmlClient

Set the HTML author. The author from the input PDF is used by default.

Parameter:
  • author - The HTML author.

func (client *PdfToHtmlClient) SetKeywords(keywords string) *PdfToHtmlClient

Associate keywords with the HTML document. Keywords from the input PDF are used by default.

Parameter:
  • keywords - The string containing the keywords.

Miscellaneous

func (client *PdfToHtmlClient) SetDebugLog(value bool) *PdfToHtmlClient

Turn on debug logging. Details about the conversion are stored in the debug log. The URL of the log can be obtained from the GetDebugLogUrl method and is also available in conversion statistics.

Parameter:
  • value (bool) - Set to true to enable the debug logging.
    Default:
    false

func (client *PdfToHtmlClient) GetDebugLogUrl() string

Get the URL of the debug log for the last conversion.

Returns:
string - The link to the debug log.

func (client *PdfToHtmlClient) GetRemainingCreditCount() int

Get the number of conversion credits available in your account.
This method can only be called after a call to one of the ConvertXToY methods.
The returned value can differ from the actual count if you run parallel conversions.
The special value 999999 is returned if the information is not available.

Returns:
int - The number of credits.
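Since 999999 is a sentinel rather than a real balance, callers may want to separate the two cases explicitly. A minimal sketch; remainingCredits and creditsUnavailable are hypothetical names.

```go
package main

import "fmt"

// creditsUnavailable is the documented special value returned when the
// credit count is not available.
const creditsUnavailable = 999999

// remainingCredits converts the raw value reported by
// GetRemainingCreditCount into a count plus a flag telling whether the
// count is known. Hypothetical helper.
func remainingCredits(reported int) (credits int, known bool) {
	if reported == creditsUnavailable {
		return 0, false
	}
	return reported, true
}

func main() {
	for _, reported := range []int{120, 999999} {
		if n, ok := remainingCredits(reported); ok {
			fmt.Println("credits left:", n)
		} else {
			fmt.Println("credit count not available")
		}
	}
}
```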

func (client *PdfToHtmlClient) GetConsumedCreditCount() int

Get the number of credits consumed by the last conversion.

Returns:
int - The number of credits.

func (client *PdfToHtmlClient) GetJobId() string

Get the job id.

Returns:
string - The unique job identifier.

func (client *PdfToHtmlClient) GetPageCount() int

Get the number of pages in the output document.

Returns:
int - The page count.

func (client *PdfToHtmlClient) GetOutputSize() int

Get the size of the output in bytes.

Returns:
int - The count of bytes.

func (client *PdfToHtmlClient) GetVersion() string

Get the version details.

Returns:
string - API version, converter version, and client version.

func (client *PdfToHtmlClient) SetTag(tag string) *PdfToHtmlClient

Tag the conversion with a custom value. The tag is used in conversion statistics. A value longer than 32 characters is cut off.

Parameter:
  • tag - A string with the custom tag.
Example:
  • Track job in analytics: SetTag("client-1234")
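To avoid silent truncation, a tag can be checked against the 32-character limit before it is set. The sketch below counts Unicode code points, which is an assumption; the service may count bytes instead. clipTag is a hypothetical helper.

```go
package main

import "fmt"

// maxTagLen is the documented length after which a tag is cut off.
const maxTagLen = 32

// clipTag returns tag shortened to maxTagLen characters and reports
// whether anything was removed. Counting code points rather than bytes
// is an assumption.
func clipTag(tag string) (clipped string, truncated bool) {
	r := []rune(tag)
	if len(r) <= maxTagLen {
		return tag, false
	}
	return string(r[:maxTagLen]), true
}

func main() {
	tag, cut := clipTag("client-1234")
	fmt.Println(tag, cut)
}
```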

func (client *PdfToHtmlClient) SetHttpProxy(proxy string) *PdfToHtmlClient

A proxy server used by the conversion process for accessing source URLs with the HTTP scheme. It can help to circumvent regional restrictions or provide limited access to your intranet.

Parameter:
  • proxy
    Constraint:
    • The value must have format DOMAIN_OR_IP_ADDRESS:PORT.
Examples:
  • Corporate proxy server: SetHttpProxy("myproxy.com:8080")
  • Direct IP proxy connection: SetHttpProxy("113.25.84.10:33333")

func (client *PdfToHtmlClient) SetHttpsProxy(proxy string) *PdfToHtmlClient

A proxy server used by the conversion process for accessing source URLs with the HTTPS scheme. It can help to circumvent regional restrictions or provide limited access to your intranet.

Parameter:
  • proxy
    Constraint:
    • The value must have format DOMAIN_OR_IP_ADDRESS:PORT.
Examples:
  • Secure proxy for HTTPS: SetHttpsProxy("myproxy.com:443")
  • Direct secure proxy IP: SetHttpsProxy("113.25.84.10:44333")
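The DOMAIN_OR_IP_ADDRESS:PORT format required by SetHttpProxy and SetHttpsProxy can be checked with the standard net package. A minimal sketch; checkProxy is a hypothetical helper, and the 1-65535 port range is an assumption based on valid TCP ports.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// checkProxy returns an error unless proxy has the form HOST:PORT with
// a non-empty host and a port between 1 and 65535. Hypothetical helper.
func checkProxy(proxy string) error {
	host, portStr, err := net.SplitHostPort(proxy)
	if err != nil {
		return fmt.Errorf("proxy %q: %w", proxy, err)
	}
	if host == "" {
		return fmt.Errorf("proxy %q: missing host", proxy)
	}
	port, err := strconv.Atoi(portStr)
	if err != nil || port < 1 || port > 65535 {
		return fmt.Errorf("proxy %q: invalid port %q", proxy, portStr)
	}
	return nil
}

func main() {
	for _, p := range []string{"myproxy.com:8080", "myproxy.com", "http://myproxy.com:8080"} {
		fmt.Printf("%s -> %v\n", p, checkProxy(p))
	}
}
```

Note that a scheme prefix such as http:// is rejected: the value is a host and a port only.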

API Client Options

func (client *PdfToHtmlClient) SetConverterVersion(version string) *PdfToHtmlClient

Set the converter version. Different versions may produce different output. Choose which one provides the best output for your case.

Availability:
API client >= 5.0.0. See versioning.
Parameter:
  • version - The version identifier.
    Allowed Values:
    • 24.04 — Version 24.04.
    • 20.10 — Version 20.10.
    • 18.10 — Version 18.10.
    • latest — Version 20.10 is used.
    Default:
    24.04

func (client *PdfToHtmlClient) SetUseHttp(value bool) *PdfToHtmlClient

Specify whether to use HTTP or HTTPS when connecting to the PDFCrowd API.

Parameter:
  • value (bool) - Set to true to use HTTP.
    Default:
    false

func (client *PdfToHtmlClient) SetClientUserAgent(agent string) *PdfToHtmlClient

Specifies the User-Agent HTTP header that the client library will use when interacting with the API.

Availability:
API client >= 6.4.0. See versioning.
Parameter:
  • agent - The user agent string.

func (client *PdfToHtmlClient) SetUserAgent(agent string) *PdfToHtmlClient

Deprecated. Replaced with: SetClientUserAgent

Set a custom user agent HTTP header. It can be useful if you are behind a proxy or a firewall.

Parameter:
  • agent - The user agent string.
    Default:
    pdfcrowd_go_client/6.5.2 (https://pdfcrowd.com)

func (client *PdfToHtmlClient) SetProxy(host string, port int, userName string, password string) *PdfToHtmlClient

Specifies an HTTP proxy that the API client library will use to connect to the internet.

Parameters:
  • host - The proxy hostname.
  • port (int) - The proxy port.
  • userName - The username.
  • password - The password.

func (client *PdfToHtmlClient) SetRetryCount(count int) *PdfToHtmlClient

Specifies the number of automatic retries when a 502 or 503 HTTP status code is received. These status codes indicate a temporary network issue. Set the count to 0 to disable this feature.

Parameter:
  • count (int) - Number of retries.
    Default:
    1
Example:
  • Retry failed requests three times: SetRetryCount(3)
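The retry count can be read as "one initial request plus up to count further attempts, as long as the response is 502 or 503". The sketch below illustrates that reading only; it is not the client's implementation, withRetry is a hypothetical function, and any delay between attempts is left out because none is documented.

```go
package main

import (
	"fmt"
	"net/http"
)

// withRetry calls attempt once and then repeats it up to retryCount
// times while the returned status is 502 or 503. It returns the last
// status and the total number of calls made. Illustration only.
func withRetry(retryCount int, attempt func() int) (status, calls int) {
	for {
		status = attempt()
		calls++
		temporary := status == http.StatusBadGateway ||
			status == http.StatusServiceUnavailable
		if !temporary || calls > retryCount {
			return status, calls
		}
	}
}

func main() {
	// Simulated server: two 503 responses, then success.
	responses := []int{503, 503, 200}
	i := 0
	attempt := func() int {
		s := responses[i]
		i++
		return s
	}
	status, calls := withRetry(3, attempt)
	fmt.Printf("status %d after %d calls\n", status, calls)
}
```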