class PdfToTextClient
All setter methods return the PdfToTextClient object unless specified otherwise.
Constructor
public PdfToTextClient(string userName, string apiKey)
Constructor for the Pdfcrowd API client.
userName
Your username at Pdfcrowd.
apiKey
Your API key.
public byte[] convertUrl(string url)
Convert a PDF.
url
The address of the PDF to convert.
The supported protocols are http:// and https://.
Returns
byte[] - Byte array containing the conversion output.
public void convertUrlToStream(string url, Stream outStream)
Convert a PDF and write the result to an output stream.
url
The address of the PDF to convert.
The supported protocols are http:// and https://.
outStream
The output stream that will contain the conversion output.
public void convertUrlToFile(string url, string filePath)
Convert a PDF and write the result to a local file.
url
The address of the PDF to convert.
The supported protocols are http:// and https://.
filePath
The output file path.
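A minimal sketch of a URL conversion follows. The credentials, the URL, and the output file name are placeholders chosen for illustration, and the example assumes the client library is referenced under the `pdfcrowd` namespace and reports failures as `pdfcrowd.Error`.

```csharp
using System;

namespace PdfToTextExamples
{
    class UrlToFileExample
    {
        static void Main()
        {
            try
            {
                // Placeholder credentials: replace with your Pdfcrowd username and API key.
                var client = new pdfcrowd.PdfToTextClient("your_username", "your_api_key");

                // Placeholder URL and output path; the URL must use http:// or https://.
                client.convertUrlToFile("https://example.com/document.pdf", "document.txt");
            }
            catch (pdfcrowd.Error why)
            {
                // Report the failure and rethrow so the caller can handle it.
                Console.Error.WriteLine("Pdfcrowd Error: " + why);
                throw;
            }
        }
    }
}
```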
public byte[] convertFile(string file)
Convert a local file.
file
The path to a local file to convert.
The file must exist and not be empty.
Returns
byte[] - Byte array containing the conversion output.
public void convertFileToStream(string file, Stream outStream)
Convert a local file and write the result to an output stream.
file
The path to a local file to convert.
The file must exist and not be empty.
outStream
The output stream that will contain the conversion output.
public void convertFileToFile(string file, string filePath)
Convert a local file and write the result to a local file.
file
The path to a local file to convert.
The file must exist and not be empty.
filePath
The output file path.
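The local file variants can be sketched as below, here writing the result to a stream. The file names and credentials are placeholders, and the `pdfcrowd` namespace is assumed.

```csharp
using System.IO;

namespace PdfToTextExamples
{
    class FileToStreamExample
    {
        static void Main()
        {
            // Placeholder credentials.
            var client = new pdfcrowd.PdfToTextClient("your_username", "your_api_key");

            // "report.pdf" is a placeholder; the file must exist and not be empty.
            using (var outStream = new FileStream("report.txt", FileMode.Create))
            {
                client.convertFileToStream("report.pdf", outStream);
            }
        }
    }
}
```

Any writable `Stream` can take the place of the `FileStream`, for example a `MemoryStream` or an HTTP response stream.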
public byte[] convertRawData(byte[] data)
Convert raw data.
data
The raw content to be converted.
Returns
byte[] - Byte array with the output.
public void convertRawDataToStream(byte[] data, Stream outStream)
Convert raw data and write the result to an output stream.
data
The raw content to be converted.
outStream
The output stream that will contain the conversion output.
public void convertRawDataToFile(byte[] data, string filePath)
Convert raw data to a file.
data
The raw content to be converted.
filePath
The output file path.
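When the PDF is already in memory, the raw data variants avoid a temporary file. The sketch below loads the bytes from disk only to keep the example self-contained; file names and credentials are placeholders, and the `pdfcrowd` namespace is assumed.

```csharp
using System;
using System.IO;

namespace PdfToTextExamples
{
    class RawDataExample
    {
        static void Main()
        {
            // Placeholder credentials.
            var client = new pdfcrowd.PdfToTextClient("your_username", "your_api_key");

            // In a real application the bytes might come from a database or an upload.
            byte[] pdfBytes = File.ReadAllBytes("report.pdf");

            // Convert in memory and keep the result as a byte array.
            byte[] textBytes = client.convertRawData(pdfBytes);

            File.WriteAllBytes("report.txt", textBytes);
            Console.WriteLine("Converted " + pdfBytes.Length + " bytes of PDF into "
                + textBytes.Length + " bytes of text.");
        }
    }
}
```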
public byte[] convertStream(Stream inStream)
Convert the contents of an input stream.
inStream
The input stream with source data.
Returns
byte[] - Byte array containing the conversion output.
public void convertStreamToStream(Stream inStream, Stream outStream)
Convert the contents of an input stream and write the result to an output stream.
inStream
The input stream with source data.
outStream
The output stream that will contain the conversion output.
public void convertStreamToFile(Stream inStream, string filePath)
Convert the contents of an input stream and write the result to a local file.
inStream
The input stream with source data.
filePath
The output file path.
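The stream variants can be combined as in the following sketch, which reads the PDF from one stream and collects the text in another. File names and credentials are placeholders, and the `pdfcrowd` namespace is assumed.

```csharp
using System;
using System.IO;

namespace PdfToTextExamples
{
    class StreamToStreamExample
    {
        static void Main()
        {
            // Placeholder credentials.
            var client = new pdfcrowd.PdfToTextClient("your_username", "your_api_key");

            using (var inStream = File.OpenRead("report.pdf"))
            using (var outStream = new MemoryStream())
            {
                client.convertStreamToStream(inStream, outStream);

                // ToArray copies the buffered output regardless of the stream position.
                byte[] textBytes = outStream.ToArray();
                Console.WriteLine("Received " + textBytes.Length + " bytes of text.");
            }
        }
    }
}
```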
General Options
public PdfToTextClient setPdfPassword(string password)
The password to open the encrypted PDF file.
password
The input PDF password.
public PdfToTextClient setPrintPageRange(string pages)
Set the page range to print.
pages
A comma-separated list of page numbers or ranges.
Examples:
- Just the second page is printed: setPrintPageRange("2")
- The first and the third page are printed: setPrintPageRange("1,3")
- Everything except the first page is printed: setPrintPageRange("2-")
- Just the first 3 pages are printed: setPrintPageRange("-3")
- Pages 3, 6, 7, 8 and 9 are printed: setPrintPageRange("3,6-9")
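Because the setters return the client, the password and page range options can be chained before the conversion call. The sketch below assumes an encrypted input file; the password, file names, and credentials are placeholders, and the `pdfcrowd` namespace is assumed.

```csharp
namespace PdfToTextExamples
{
    class PageRangeExample
    {
        static void Main()
        {
            // Placeholder credentials.
            var client = new pdfcrowd.PdfToTextClient("your_username", "your_api_key");

            // Open the encrypted PDF and extract pages 3, 6, 7, 8 and 9 only.
            client.setPdfPassword("pdf_password")
                  .setPrintPageRange("3,6-9");

            client.convertFileToFile("protected.pdf", "pages.txt");
        }
    }
}
```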
public PdfToTextClient setNoLayout(bool value)
Ignore the original PDF layout.
value
Set to true to ignore the layout.
Default: false
public PdfToTextClient setEol(string eol)
The end-of-line convention for the text output.
eol
Allowed values:
- unix: Unix convention "LF" is used.
- dos: DOS convention "CR LF" is used.
- mac: Mac convention "CR" is used.
Default: unix
public PdfToTextClient setPageBreakMode(string mode)
Specify the page break mode for the text output.
mode
Allowed values:
- none: No page breaks are inserted.
- default: The standard page break code "FF" is used.
- custom: A custom page break, set with setCustomPageBreak, is used.
Default: none
public PdfToTextClient setCustomPageBreak(string pageBreak)
Specify the custom page break.
pageBreak
String to insert between the pages.
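The end-of-line and page break options work together to shape the output. The sketch below selects DOS line endings and a visible separator between pages; the separator text, file names, and credentials are illustrative placeholders, and the `pdfcrowd` namespace is assumed.

```csharp
namespace PdfToTextExamples
{
    class PageBreakExample
    {
        static void Main()
        {
            // Placeholder credentials.
            var client = new pdfcrowd.PdfToTextClient("your_username", "your_api_key");

            client.setEol("dos")                 // use "CR LF" line endings
                  .setPageBreakMode("custom")    // enable the custom page break
                  .setCustomPageBreak("\r\n===== page break =====\r\n");

            client.convertFileToFile("report.pdf", "report.txt");
        }
    }
}
```

Setting the mode to "custom" and supplying the string are two separate calls, so both are shown together here.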
public PdfToTextClient setParagraphMode(string mode)
Specify the paragraph detection mode.
public PdfToTextClient setLineSpacingThreshold(string threshold)
Set the maximum line spacing when the paragraph detection mode is enabled.
threshold
The value must be a positive integer percentage.
Default: 10%
public PdfToTextClient setRemoveHyphenation(bool value)
Remove the hyphen character from the end of lines.
value
Set to true to remove hyphens.
Default: false
public PdfToTextClient setRemoveEmptyLines(bool value)
Remove empty lines from the text output.
value
Set to true to remove empty lines.
Default: false
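For text that will be indexed or searched, the two cleanup options above are often enabled together. This is a sketch under the usual assumptions: placeholder credentials and file names, and the `pdfcrowd` namespace.

```csharp
namespace PdfToTextExamples
{
    class CleanupExample
    {
        static void Main()
        {
            // Placeholder credentials.
            var client = new pdfcrowd.PdfToTextClient("your_username", "your_api_key");

            client.setRemoveHyphenation(true)   // drop hyphens at the end of lines
                  .setRemoveEmptyLines(true);   // drop empty lines from the output

            client.convertFileToFile("article.pdf", "article.txt");
        }
    }
}
```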
public PdfToTextClient setCropAreaX(int x)
Set the top left X coordinate of the crop area in points.
x
Must be a positive integer number or 0.
Default: 0
public PdfToTextClient setCropAreaY(int y)
Set the top left Y coordinate of the crop area in points.
y
Must be a positive integer number or 0.
Default: 0
public PdfToTextClient setCropAreaWidth(int width)
Set the width of the crop area in points.
width
Must be a positive integer number or 0.
Default: PDF page width.
public PdfToTextClient setCropAreaHeight(int height)
Set the height of the crop area in points.
height
Must be a positive integer number or 0.
Default: PDF page height.
public PdfToTextClient setCropArea(int x, int y, int width, int height)
Set the crop area. It allows extracting just a part of a PDF page.
x
Set the top left X coordinate of the crop area in points.
Must be a positive integer number or 0.
Default: 0
y
Set the top left Y coordinate of the crop area in points.
Must be a positive integer number or 0.
Default: 0
width
Set the width of the crop area in points.
Must be a positive integer number or 0.
Default: PDF page width.
height
Set the height of the crop area in points.
Must be a positive integer number or 0.
Default: PDF page height.
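A crop area can be sketched as follows. All values are in points (72 points equal one inch); the particular rectangle, file names, and credentials are illustrative placeholders, and the `pdfcrowd` namespace is assumed.

```csharp
namespace PdfToTextExamples
{
    class CropAreaExample
    {
        static void Main()
        {
            // Placeholder credentials.
            var client = new pdfcrowd.PdfToTextClient("your_username", "your_api_key");

            // Top left corner at (36, 36), that is half an inch from each edge;
            // 540 points (7.5 inches) wide and 144 points (2 inches) high.
            client.setCropArea(36, 36, 540, 144);

            client.convertFileToFile("invoice.pdf", "invoice-header.txt");
        }
    }
}
```

The same rectangle could be set with the four individual setters; setCropArea is simply the single-call form.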
Miscellaneous
public PdfToTextClient setDebugLog(bool value)
Turn on the debug logging. Details about the conversion are stored in the debug log. The URL of the log can be obtained from the getDebugLogUrl method or from the conversion statistics.
value
Set to true to enable the debug logging.
Default: false
public string getDebugLogUrl()
Get the URL of the debug log for the last conversion.
Returns
string - The link to the debug log.
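Debug logging is enabled before the conversion and the log URL is read afterwards, as in this sketch. Credentials and file names are placeholders, and the `pdfcrowd` namespace is assumed.

```csharp
using System;

namespace PdfToTextExamples
{
    class DebugLogExample
    {
        static void Main()
        {
            // Placeholder credentials.
            var client = new pdfcrowd.PdfToTextClient("your_username", "your_api_key");

            // Must be set before the conversion runs.
            client.setDebugLog(true);

            client.convertFileToFile("report.pdf", "report.txt");

            // The URL refers to the log of the conversion that just finished.
            Console.WriteLine("Debug log: " + client.getDebugLogUrl());
        }
    }
}
```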
public int getRemainingCreditCount()
Get the number of conversion credits available in your account.
This method can only be called after a call to one of the convertXtoY methods.
The returned value can differ from the actual count if you run parallel conversions.
The special value 999999 is returned if the information is not available.
Returns
int - The number of credits.
public int getConsumedCreditCount()
Get the number of credits consumed by the last conversion.
Returns
int - The number of credits.
public string getJobId()
Get the job id.
Returns
string - The unique job identifier.
public int getPageCount()
Get the number of pages in the output document.
public int getOutputSize()
Get the size of the output in bytes.
Returns
int - The count of bytes.
public string getVersion()
Get the version details.
Returns
string - API version, converter version, and client version.
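The getter methods above report on the most recent conversion, so they are read after a convert call. The sketch below prints a short summary; credentials and file names are placeholders, and the `pdfcrowd` namespace is assumed.

```csharp
using System;

namespace PdfToTextExamples
{
    class ConversionInfoExample
    {
        static void Main()
        {
            // Placeholder credentials.
            var client = new pdfcrowd.PdfToTextClient("your_username", "your_api_key");

            byte[] output = client.convertFile("report.pdf");

            // These values describe the conversion that just completed.
            Console.WriteLine("Consumed credits:  " + client.getConsumedCreditCount());
            Console.WriteLine("Remaining credits: " + client.getRemainingCreditCount());
            Console.WriteLine("Page count:        " + client.getPageCount());
            Console.WriteLine("Output size:       " + client.getOutputSize() + " bytes");
            Console.WriteLine("Version:           " + client.getVersion());
            Console.WriteLine("Bytes received:    " + output.Length);
        }
    }
}
```

A remaining credit count of 999999 means the information was not available, so a caller may want to treat that value as unknown rather than as a real balance.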
public PdfToTextClient setTag(string tag)
Tag the conversion with a custom value. The tag is used in conversion statistics. A value longer than 32 characters is cut off.
tag
A string with the custom tag.
public PdfToTextClient setHttpProxy(string proxy)
A proxy server used by the Pdfcrowd conversion process for accessing the source URLs with the HTTP scheme. It can help to circumvent regional restrictions or provide limited access to your intranet.
proxy
The value must have format DOMAIN_OR_IP_ADDRESS:PORT.
public PdfToTextClient setHttpsProxy(string proxy)
A proxy server used by the Pdfcrowd conversion process for accessing the source URLs with the HTTPS scheme. It can help to circumvent regional restrictions or provide limited access to your intranet.
proxy
The value must have format DOMAIN_OR_IP_ADDRESS:PORT.
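The tag and the conversion-side proxies are set like any other option. In the sketch below the proxy address, tag, URL, and credentials are placeholders, and the `pdfcrowd` namespace is assumed. These proxies are used by the Pdfcrowd conversion process when it fetches the source URL, not by the local client.

```csharp
namespace PdfToTextExamples
{
    class ProxyAndTagExample
    {
        static void Main()
        {
            // Placeholder credentials.
            var client = new pdfcrowd.PdfToTextClient("your_username", "your_api_key");

            client.setTag("monthly-reports")                    // at most 32 characters are kept
                  .setHttpProxy("proxy.example.com:8080")       // for http:// source URLs
                  .setHttpsProxy("proxy.example.com:8443");     // for https:// source URLs

            client.convertUrlToFile("https://example.com/document.pdf", "document.txt");
        }
    }
}
```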
API Client Options
public PdfToTextClient setUseHttp(bool value)
Specifies whether the client communicates with the Pdfcrowd API over HTTP or HTTPS.
value
Set to true to use HTTP.
Default: false
Warning
Using HTTP is insecure as data sent over HTTP is not encrypted. Enable this option only if you know what you are doing.
public PdfToTextClient setClientUserAgent(string agent)
Specifies the User-Agent HTTP header that the client library will use when interacting with the API.
agent
The user agent string.
public PdfToTextClient setUserAgent(string agent)
Set a custom user agent HTTP header. It can be useful if you are behind a proxy or a firewall.
agent
The user agent string.
Default: pdfcrowd_dotnet_client/6.4.0 (https://pdfcrowd.com)
public PdfToTextClient setProxy(string host, int port, string userName, string password)
Specifies an HTTP proxy that the API client library will use to connect to the internet.
public PdfToTextClient setRetryCount(int count)
Specifies the number of automatic retries when a 502 or 503 HTTP status code is received. These status codes indicate a temporary network issue. Set the count to 0 to disable this feature.
count
Number of retries.
Default: 1
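The client-side options can be sketched as follows. Unlike setHttpProxy and setHttpsProxy, setProxy configures how the client library itself reaches the API. The proxy host, port, proxy credentials, and retry count are placeholders chosen for illustration, and the `pdfcrowd` namespace is assumed.

```csharp
namespace PdfToTextExamples
{
    class ClientOptionsExample
    {
        static void Main()
        {
            // Placeholder credentials.
            var client = new pdfcrowd.PdfToTextClient("your_username", "your_api_key");

            // Route the client's own API traffic through a local proxy (placeholder values).
            client.setProxy("proxy.example.com", 8080, "proxy_user", "proxy_password");

            // Retry up to 3 times when the API answers with 502 or 503.
            client.setRetryCount(3);

            client.convertFileToFile("report.pdf", "report.txt");
        }
    }
}
```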