McAPI - HTML to PDF Converter API Overview

An easy-to-use HTML to PDF converter REST API for fast and reliable conversion of HTML to PDF. The API can convert web content referenced by URL, or you can provide inline HTML code with your REST call.

Features include:

  • Reasonable defaults - just submit an HTML string with your call or a URL to your web page and get your PDF in seconds
  • Format selection from a list of standard paper formats (A4, letter etc.)
  • Single or multi-page PDFs
  • Links on the website are preserved and clickable in the generated PDF
  • Customizable headers and footers provided as HTML with your call. Page headers and footers can have separate styling and can include placeholders for title, date, page number and other metadata.
  • PDFs can be returned immediately as a base64 encoded string or as a downloadable URL, with the PDF stored on our cloud servers for up to 30 days at no extra cost.
  • Built-in ad blocker and auto-clicking of cookie consent banners (experimental as of API version V1)

Extensive sample code illustrates HTML and website conversion to PDF in Node / JS, Python, Ruby, Swift, PHP, C# and other languages and environments.

McAPI HTML to PDF conversion - Applications & Benefits

An obvious use case for HTML to PDF conversion is business documents. Usually these will be invoices, receipts, packing lists or delivery notes. Likewise, contracts and agreements, NDAs or resumes are candidates for conversion. Another interesting application for converting HTML to PDF is archiving, for instance to preserve searchable snapshots of websites, tweets or other social media content. Finally, a business may have to create PDFs to fulfill ISO 9001 requirements.

Conversion to PDF vs screenshots

Capturing screenshots can be an alternative to converting HTML (or websites) into PDFs. However, PDFs have many benefits over screenshots (i.e. JPEG or PNG image files). Consider the following screenshot of an invoice (link to HTML):

HTML 2 PDF Conversion API - HTML Invoice Template

This screenshot is around 200kB in size, is not searchable (except after the extra step of OCR) and can't be edited.

The PDF created from the HTML on the other hand is only 21kB (link to PDF) and is searchable and indexable:

Searching in a PDF from a converted HTML Invoice

It is also fully editable, shown here in Adobe Acrobat:

HTML Invoice converted to PDF edited in Adobe Acrobat

The API returns a standard PDF. As such, the converted document or website can afterwards be encrypted, restricted to view-only use, or archived as PDF/A etc.

McAPI HTML to PDF Converter API - Plans and pricing

We provide a generous free plan with this API. See the RapidAPI listing for all plans and pricing. All tiers include free storage of your PDFs for up to 30 days.

McAPI HTML to PDF Converter API - Specifications

Version:

V1.0

Protocol:

https

URL:

https://mcapi-html-2-pdf.p.rapidapi.com

Endpoint:

/

Method:

POST

McAPI HTML to PDF Converter API - Sample cURL call

Shown is a cURL request to convert an invoice from HTML into a PDF. The page format is set to "A4" which will create a PDF with a page size of 210x297mm (see reference below for a list of all formats). With storeExternal set to "true", the API will return a URL to the created PDF (replace YOUR_API_KEY with your RapidAPI key):
curl --request POST \
	--url https://mcapi-html-2-pdf.p.rapidapi.com/ \
	--header 'content-type: application/json' \
	--header 'x-rapidapi-host: mcapi-html-2-pdf.p.rapidapi.com' \
	--header 'x-rapidapi-key: YOUR_API_KEY' \
	--data '{
	  "url": "https://mcapi.io/html2pdf/templates/invoice.html",
	  "format": "A4",
	  "storeExternal": "true"
}'
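
For readers calling the API from Python rather than cURL, the same request can be sketched with the standard library alone. The endpoint, headers and JSON fields are taken from the cURL call above; the helper names build_request and convert and the 60-second client timeout are assumptions made for illustration, not part of the API.

```python
import json
import urllib.request

API_URL = "https://mcapi-html-2-pdf.p.rapidapi.com/"
API_HOST = "mcapi-html-2-pdf.p.rapidapi.com"


def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build the POST request; nothing is sent until the request is opened."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "content-type": "application/json",
            "x-rapidapi-host": API_HOST,
            "x-rapidapi-key": api_key,
        },
        method="POST",
    )


def convert(api_key: str, payload: dict, timeout: float = 60.0) -> dict:
    """Send the request and return the parsed JSON response.

    The timeout value is an assumption, chosen to leave room for slow pages.
    """
    request = build_request(api_key, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (performs a real network call, so it is left commented out):
# result = convert("YOUR_API_KEY", {
#     "url": "https://mcapi.io/html2pdf/templates/invoice.html",
#     "format": "A4",
#     "storeExternal": "true",
# })
# print(result["pdf"])
```

Separating build_request from convert keeps the request construction inspectable without touching the network.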

McAPI HTML to PDF Converter - Sample response

Shown is the response from the cURL request above (the returned URL is shortened):
{
  "service": "McAPI HTML 2 PDF, https://mcapi.io",
  "version": "V1",
  "pdf": "https://...pdf"
}

McAPI HTML to PDF Converter - Sample PDF

Shown is the returned PDF from the cURL call above:

HTML to PDF conversion without background

If you compare this PDF with the introductory screenshot above, you'll notice that the gray background of the headers for Payment Method and Items has been left out. By default, the API ignores background colors and images because those elements are usually undesired and wasteful when printing the PDF. Sometimes, however, the background will be useful or informative. Another situation where keeping the background might be advised is inverted text: light text would be hard or impossible to read on a white background.

To keep the background during the conversion, set the background option to "true", like so:


...

--data '{
  "url": "https://mcapi.io/html2pdf/templates/invoice.html",
  "format": "A4",
  "storeExternal": "true",
  "background": "true"
}'

...

The invoice with background elements:

HTML to PDF conversion with background

Storing PDFs on a cloud server

The storeExternal option controls the PDF delivery and storage. Set it to "true" and the PDFs will be stored on one of our servers for 30 days; the API call will then return the PDF's URL. Set it to "false" and the PDFs are returned immediately as base64 encoded strings, like so:


...

--data '{
  "url": "https://mcapi.io/html2pdf/templates/invoice.html",
  "format": "A4",
  "storeExternal": "false"
}'

...

Returns:

{
  "service": "McAPI HTML 2 PDF, https://mcapi.io",
  "version": "V1",
  "pdf": "data:application/pdf;base64,..."
}

See the sample code for snippets that show how to decode the PDFs and write them to a file.
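
As a minimal sketch of that decoding step in Python: the function names decode_pdf and save_pdf are illustrative, and the code assumes the "pdf" value is either bare base64 or base64 preceded by the data:application/pdf;base64, prefix shown above.

```python
import base64

DATA_URI_PREFIX = "data:application/pdf;base64,"


def decode_pdf(pdf_field: str) -> bytes:
    """Return the raw PDF bytes from the response's "pdf" value."""
    # Strip the MIME-type prefix if the API prepended it.
    if pdf_field.startswith(DATA_URI_PREFIX):
        pdf_field = pdf_field[len(DATA_URI_PREFIX):]
    return base64.b64decode(pdf_field)


def save_pdf(pdf_field: str, path: str) -> int:
    """Decode the PDF, write it to `path`, and return the number of bytes written."""
    data = decode_pdf(pdf_field)
    with open(path, "wb") as handle:
        handle.write(data)
    return len(data)
```

With a parsed response in a variable named response, a call such as save_pdf(response["pdf"], "invoice.pdf") would write the file to the working directory.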

Sending HTML code for conversion

The previous examples all sent URLs with the REST call. You can also post HTML code directly to the API. Use the html parameter for this, e.g.:


...

--data '{
  "html": "<h1>Some Headline</h1><p>Some Text</p><p>An image:</p><img src=\"https://mcapi.io/html2pdf/templates/logo.png\"/>",
  "format": "A4",
  "storeExternal": "true"
}'

...

Result of direct conversion of HTML to PDF:

Inline HTML to PDF conversion

Note that the HTML must not contain relative paths, e.g. something like this will not resolve:

<img src="../templates/logo.png"/>

All references must be absolute and point to valid web locations:

<img src="https://mcapi.io/html2pdf/templates/logo.png"/>

The same goes for included CSS files and scripts: all URLs must be absolute. Alternatively, use inline or embedded elements. For images, you can provide the contents as a base64 string. Styles can be embedded with the <style> element.

Also keep in mind that all HTML must be properly escaped so that it can be transmitted as JSON. Use JSON.stringify (Node.js) or the equivalent in your language.
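
In Python, the standard json module takes care of this escaping. The sketch below reuses the HTML from the example above; the variable names are illustrative.

```python
import json

# Raw HTML as you would write it in a template, with unescaped double quotes.
html = (
    "<h1>Some Headline</h1><p>Some Text</p><p>An image:</p>"
    '<img src="https://mcapi.io/html2pdf/templates/logo.png"/>'
)

payload = {"html": html, "format": "A4", "storeExternal": "true"}

# json.dumps escapes the double quotes inside the HTML as \" so that the
# request body is valid JSON; no manual escaping is needed.
body = json.dumps(payload)
```

Building the payload as a dictionary and serializing it once avoids hand-written escape sequences, which are a common source of malformed requests.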

Specifying headers and footers

A very useful feature of the API is the ability to put headers and footers on the converted PDF's pages. Besides static text or imagery, headers and footers can also include dynamic fields for variables like title, date or page number.

Dynamic fields are specified by using a reserved class name. Here is an example for a header that includes the print date:


<div style="color: #D3D3D3; border-bottom: solid #D3D3D3 1px; 
	text-align: center; font-size: 9px; padding-bottom: 4px; width: 100%;">
  Print date <span class="date"></span>
</div>

And here's a footer that displays the page number:


<div style="color: #D3D3D3; border-top: solid #D3D3D3 1px; 
	text-align: center; font-size: 9px; padding-top: 4px; width: 100%;">
  Page <span class="pageNumber"></span>
</div>

Footers and headers are always submitted as plain HTML to the API. The considerations for the html parameter above apply to these parameters as well.

Specify footers and headers like so:


...

--data '{
  "pageFooter": "<div>...</div>",
  "pageHeader": "<div>...</div>",
  "html": "<br/>",
  "format": "A4",
  "storeExternal": "true"
}'

...

Because we leave the html parameter without text, the resulting PDF displays only the footer and header:

HTML to PDF Conversion Header Footer Example

Headers and footers must not contain <script> tags. Note that classes and styles from the main content (either via url or html) are not visible to the footer and header HTML snippets because they are rendered in a separate browser context. As in the example, it is best practice to include all styles inline. Also note that footers and headers will only be visible with top and bottom margins of at least 70px (the default). See the reference for more on the margin parameter.

See reference section below for a list of all dynamic fields.
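
Putting the pieces of this section together, a request body with a header and footer could be assembled as follows. This is a sketch only: the helper names are hypothetical, and the exact wording of the header and footer text is an arbitrary choice. The class names title, date, pageNumber and totalPages are dynamic fields listed in the reference.

```python
import json

# Inline styles only: header and footer are rendered in a separate browser
# context and cannot see the main document's CSS.
BASE_STYLE = "color: #D3D3D3; text-align: center; font-size: 9px; width: 100%;"


def dynamic_field(name: str) -> str:
    """Return an empty span whose reserved class name marks a dynamic field."""
    return f'<span class="{name}"></span>'


def make_header() -> str:
    return (
        f'<div style="{BASE_STYLE} border-bottom: solid #D3D3D3 1px; padding-bottom: 4px;">'
        f'{dynamic_field("title")} - printed {dynamic_field("date")}</div>'
    )


def make_footer() -> str:
    return (
        f'<div style="{BASE_STYLE} border-top: solid #D3D3D3 1px; padding-top: 4px;">'
        f'Page {dynamic_field("pageNumber")} of {dynamic_field("totalPages")}</div>'
    )


payload = {
    "url": "https://mcapi.io/html2pdf/templates/invoice.html",
    "pageHeader": make_header(),
    "pageFooter": make_footer(),
    "format": "A4",
    "storeExternal": "true",
}

# Serializing with json.dumps escapes the quotes inside the HTML snippets.
body = json.dumps(payload)
```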

Automatically accepting cookie consent banners and notices (experimental)

When converting a website to a PDF, the API can also automatically click the "Accept" button on GDPR cookie consent banners. Note that this feature is experimental as of V1.0 and will only work with sites in English. See the discussion in the reference section.

Set the cookie option to "true" like so:


...

--data '{
  "url": "https://cnbc.com",
  "storeExternal": "true",
  "cookie": "true",
  "background": "true"
}'

...

Blocking ads

The API comes with a built-in ad blocker which can be useful when you want to convert a website to PDF. Activate it like so:


...

--data '{
  "url": "https://cnbc.com",
  "storeExternal": "true",
  "cookie": "true",
  "adblock": "true",
  "background": "true"
}'

...

Note that the cookie and adblock options will increase the time it takes to capture a site and convert it to PDF.

McAPI HTML 2 PDF Converter API - Reference

Describes parameters for PDF creation and delivery. All parameters are case-sensitive.

Parameters to control the PDF creation and format

  • Name: url

    Description: URL of website or page to convert into PDF.

    Type: String

    Required: Yes, when html is not specified

    Default: n/a

    Sample JSON:

    "url" : "https://mcapi.io"

    Must be a fully qualified URL, i.e. including the protocol specifier http:// or https://. The page will always be rendered at full size. If the content doesn't fit on a single page, the resulting PDF will have multiple pages. This parameter is ignored when html is also specified.

  • Name: html

    Description: HTML to convert into PDF.

    Type: String

    Required: Yes, when url is not specified

    Default: n/a

    Sample JSON:

    "html" : "<h1>Text</h1>"

    Must be properly escaped HTML with all external references being absolute and pointing to valid web locations.

  • Name: pageHeader

    Description: HTML for page header.

    Type: String

    Required: No

    Default: n/a

    Sample JSON:

    "pageHeader" : "<span>Header</span>"

    Must be properly escaped HTML with all external references being absolute and pointing to valid web locations. Use the following dynamic fields as class names for variable content:

    • url
    • date
    • title
    • pageNumber
    • totalPages

  • Name: pageFooter

    Description: HTML for page footer.

    Type: String

    Required: No

    Default: n/a

    Sample JSON:

    "pageFooter" : "<span>Footer</span>"

    Must be properly escaped HTML with all external references being absolute and pointing to valid web locations. Use the following dynamic fields as class names for variable content:

    • url
    • date
    • title
    • pageNumber
    • totalPages

  • Name: format

    Description: Page format of the created PDF.

    Type: String

    Required: No

    Values: See listFormats option.

    Default: "A4"

    Sample JSON:

    "format" : "Letter"

  • Name: listFormats

    Description: Returns a list of all pre-defined page formats.

    Type: String

    Required: No

    Values: "true" or "false"

    Default: "false"

    Sample JSON:

    "listFormats" : "true"

    If set to "true", all other parameters from the REST call will be ignored. The API will not create a PDF, instead it returns an array of the format names like so:

    
    ["Letter", "Legal", "Tabloid", "Ledger", "A0", "A2", "A3", "A4", "A5", "A6"]
    	
  • Name: orientation

    Description: Page orientation.

    Type: Number

    Required: No

    Values: 0: portrait, 1: landscape

    Default: 0

    Sample JSON:

    "orientation" : 1

  • Name: background

    Description: Render page background elements (colored tables, images etc.) into the PDF.

    Type: String

    Required: No

    Values: "true" or "false"

    Default: "false"

    Sample JSON:

    "background" : "true"

  • Name: margin

    Description: Specify page margins.

    Type: Dictionary

    Required: No

    Values: Must specify top, right, bottom, left of margin, all numeric

    Default: { "top": 70, "right": 25, "bottom": 70, "left": 25 }

    Sample JSON:

    "margin" : { "top": 100, "right": 10, "bottom": 100, "left": 10 }

    Header and footer content will only be visible with top and bottom margins of at least 70px.

  • Name: cookie

    Description: Try to locate cookie consent banner (if any) and auto-click the "Accept"-button.

    Type: String

    Required: No

    Values: "true" or "false"

    Default: "false"

    Sample JSON:

    "cookie" : "true"

  • Name: adblock

    Description: Block ads and trackers.

    Type: String

    Required: No

    Values: "true" or "false"

    Default: "false"

    Sample JSON:

    "adblock" : "true"

    Based on the Cliqz OSS ad blocker. Creating a PDF takes more time if this feature is activated. Useful only if url is specified.

  • Name: stealth

    Description: Work in stealth mode.

    Type: String

    Required: No

    Values: "true" or "false"

    Default: "false"

    Sample JSON:

    "stealth" : "true"

    Prevents the target site from detecting headless browsing, i.e. browsing without a human. Creating a PDF takes more time if this feature is activated. Useful only if url is specified.
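
The creation parameters above can be collected in a small options object so that the documented defaults live in one place. The following is a sketch under assumptions: the class PdfOptions and the helper flag are hypothetical names, and the defaults are copied from this reference. Note that the on/off options are typed as the strings "true" and "false", so Python booleans are converted before sending.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


def flag(value: bool) -> str:
    """Convert a Python bool to the string form documented for the API."""
    return "true" if value else "false"


@dataclass
class PdfOptions:
    url: Optional[str] = None
    html: Optional[str] = None
    format: str = "A4"
    orientation: int = 0  # 0: portrait, 1: landscape
    background: bool = False
    margin: Dict[str, int] = field(
        default_factory=lambda: {"top": 70, "right": 25, "bottom": 70, "left": 25}
    )
    cookie: bool = False
    adblock: bool = False
    stealth: bool = False

    def to_payload(self) -> dict:
        """Return a dictionary ready to be serialized as the JSON request body."""
        payload = {
            "format": self.format,
            "orientation": self.orientation,
            "background": flag(self.background),
            "margin": dict(self.margin),
            "cookie": flag(self.cookie),
            "adblock": flag(self.adblock),
            "stealth": flag(self.stealth),
        }
        # url is ignored by the API when html is also specified, so send only one.
        if self.html is not None:
            payload["html"] = self.html
        elif self.url is not None:
            payload["url"] = self.url
        return payload
```

For example, PdfOptions(url="https://mcapi.io", orientation=1, adblock=True).to_payload() yields a landscape request with the ad blocker enabled and all other options at their defaults.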

Parameters to control the PDF delivery

  • Name: storeExternal

    Description: Store PDF on server.

    Type: String

    Required: No

    Values: "true" or "false"

    Default: "true"

    Sample JSON:

    "storeExternal" : "true"

    Stores the PDF on cloud storage maintained by McAPI. When this parameter is set to "true", the API will return the PDF's URL in the response instead of the base64-encoded PDF.

    Note: Stored PDFs will be purged automatically after 30 days and cannot be recovered after expiry.

  • Name: header

    Description: Prepend MIME-type header to base64 string.

    Type: String

    Required: No

    Values: "true" or "false"

    Default: "true"

    Sample JSON:

    "header" : "false"

    The following header will be used:

    data:application/pdf;base64,
    

    This parameter is ignored if storeExternal is set to "true" as downloadable PDFs are always binary and directly usable.

McAPI HTML to PDF Conversion API - Error codes and messages

Upon success, the API returns a status code of 200. On any error, the API instead returns a status code of 400, along with a status text with more information. The following example shows the response for a malformed request body: in the cURL call from the introduction, we changed "format": "A4" to "format: "A4" in the JSON block. This causes the JSON parser in the API's event handler to raise an exception; the handler then returns:

Malformed event parameter

This error will also be returned if the html parameter contains unescaped HTML.

Other error messages:

  • No URL or HTML content specified.

    Both url and html property absent or empty string

  • Invalid margin specified.

    margin does not contain all four elements or is malformed

  • Unknown format specified.

    format parameter contains unknown paper format

  • Invalid orientaion specified.

    orientation parameter not numeric or not 0 or 1

  • Can't load page.

    Catch all error for browser-related issues

    This error will be returned if the target server didn't respond or was unreachable, for example if the URL was misspelled or incomplete. Another scenario is related to the cookie option when auto-clicking the "Accept"-button triggers navigation away from the loaded page. This message may contain additional information related to the error.
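
Several of these errors can be caught before a request is sent. The check below is a client-side sketch, not the API's own validation logic: the function name check_payload is hypothetical, it returns the name of the offending parameter rather than the API's message, and KNOWN_FORMATS is copied from the listFormats sample in the reference (query listFormats for the authoritative list).

```python
from typing import Optional

# Copied from the listFormats sample output; an assumption, not a live query.
KNOWN_FORMATS = {"Letter", "Legal", "Tabloid", "Ledger", "A0", "A2", "A3", "A4", "A5", "A6"}
MARGIN_KEYS = {"top", "right", "bottom", "left"}


def _is_number(value: object) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def check_payload(payload: dict) -> Optional[str]:
    """Return the name of the first problematic parameter, or None if none is found."""
    if not payload.get("url") and not payload.get("html"):
        return "url/html"
    if "margin" in payload:
        margin = payload["margin"]
        if (
            not isinstance(margin, dict)
            or not MARGIN_KEYS.issubset(margin)
            or not all(_is_number(margin[key]) for key in MARGIN_KEYS)
        ):
            return "margin"
    if "format" in payload and payload["format"] not in KNOWN_FORMATS:
        return "format"
    if "orientation" in payload:
        orientation = payload["orientation"]
        if not _is_number(orientation) or orientation not in (0, 1):
            return "orientation"
    return None
```

A check like this saves a round trip for obvious mistakes; the "Can't load page." case can only be detected by the API itself.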

Timeout scenarios

V1 of the API runs synchronously, i.e. it only returns after the respective site or content has loaded completely and the PDF has been generated. For heavy sites, a multi-page PDF can take up to 10 seconds, longer if the cookie and adblock options are activated. Also, storing a PDF on our cloud storage takes more time than simply returning the document as base64 because the API waits for the server to confirm the successful upload. If you encounter timeouts, consider increasing the respective timeout parameter for your environment or programming language, or implement a simple retry pattern.

On our side, the API itself may simply time out if a URL takes too long to load (this timeout is currently 30s).

V2 of the API will introduce an asynchronous mode where the API will return immediately, providing a URL for the caller to check the progress of the operation.
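
A simple retry pattern for the synchronous V1 API could look like the sketch below. The function name with_retries, the number of attempts and the linearly growing delay are assumptions chosen for illustration; tune them to your environment.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def with_retries(
    call: Callable[[], T],
    attempts: int = 3,
    delay: float = 2.0,
    retry_on: Tuple[Type[BaseException], ...] = (TimeoutError, OSError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, retrying on the given exceptions with a growing pause.

    The default `retry_on` is broad: it also covers HTTP errors raised by
    urllib, so narrow it if a 400 response should not be retried.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error to the caller
            sleep(delay * attempt)  # wait 1x, 2x, ... the base delay
    raise AssertionError("unreachable")
```

Any function that performs the conversion request can be wrapped, for example with_retries(lambda: send_request(payload)), where send_request stands for your own request function.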

McAPI HTML to PDF Converter API - Sample code and snippets