Document Conversion
The document conversion API endpoint /conversion
allows you to convert an unstructured PDF of payroll data or financial statements (balance sheet, income statement) into structured data suitable for financial spreading and evaluation.
The conversion API operates asynchronously. To be notified when a conversion is complete, set up a webhook listener for the document_conversion_complete event.
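A minimal sketch of such a listener, using only Python's standard library. The body of the webhook notification is not shown in this document, so the JSON field names `event` and `id` below are assumptions for illustration; check them against a real notification before relying on them.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

EVENT_NAME = "document_conversion_complete"


def extract_conversion_id(raw_body: bytes):
    """Return the conversion id if the body is a completion event, else None.

    ASSUMPTION: the payload is JSON with hypothetical "event" and "id" fields.
    """
    try:
        payload = json.loads(raw_body)
    except ValueError:  # covers malformed JSON and undecodable bytes
        return None
    if not isinstance(payload, dict) or payload.get("event") != EVENT_NAME:
        return None
    return payload.get("id")


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        conversion_id = extract_conversion_id(self.rfile.read(length))
        if conversion_id is not None:
            print(f"Conversion {conversion_id} is ready to retrieve")
        # Acknowledge receipt in every case with an empty 204 response.
        self.send_response(204)
        self.end_headers()


def serve(port: int = 8000) -> None:
    """Block forever, handling webhook POSTs on the given port."""
    HTTPServer(("", port), WebhookHandler).serve_forever()
```

Calling `serve()` starts the listener. Keeping the parsing in `extract_conversion_id` separate from the HTTP handler makes it easy to adjust once the real payload shape is known.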
There is a Postman tutorial for document conversion with example responses on this page.
To use the conversion API, perform a multipart/form-data POST to the /conversion endpoint with the following keys:
POST Request Table of Keys:
Keys | Required | Type | Definition |
---|---|---|---|
document | yes | binary | The PDF document to convert |
type | yes | string (text in Postman) | either |
url | no | string (text in Postman) | The URL the PDF was downloaded from. This additional hint helps the conversion process and will result in a higher chance that we can convert the document |
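The request described by this table could be built as in the following sketch, which uses only Python's standard library. `BASE_URL` is a placeholder, the helper names are hypothetical, and authentication is not covered in this section, so any credentials your account requires would need to be added to the headers.

```python
import uuid
import urllib.request

# ASSUMPTION: placeholder host; substitute your account's API base URL.
BASE_URL = "https://example.invalid/api"


def encode_multipart(fields, file_field, filename, file_bytes):
    """Encode text fields plus one PDF file as multipart/form-data.

    Returns a (body, content_type) tuple.
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode("utf-8")
        )
    parts.append(
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"\r\n'
        f'Content-Type: application/pdf\r\n\r\n'.encode("utf-8")
        + file_bytes + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_conversion_request(pdf_bytes, filename, doc_type, source_url=None):
    """Build (but do not send) the POST /conversion request."""
    fields = {"type": doc_type}
    if source_url:  # "url" is optional per the table above
        fields["url"] = source_url
    body, content_type = encode_multipart(fields, "document", filename, pdf_bytes)
    return urllib.request.Request(
        f"{BASE_URL}/conversion",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. The example type value used in the tutorial later on this page is `payroll-report`.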
In response to this request you will receive a document processing ID
which can be used to track the status of this conversion via requests to the /conversion/id
endpoint, where "id" is the numeric id.
When the conversion is completed, a webhook notification will be sent. You may also poll the /conversion/id endpoint, though this is not recommended.
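If a webhook is not an option, a polling fallback might look like the sketch below. `fetch_status` is a hypothetical callable that you supply; it is assumed to perform the GET to /conversion/id and return the parsed JSON response as a dict. The interval and attempt limit are arbitrary illustrative defaults, not values recommended by the API.

```python
import time


def wait_for_conversion(fetch_status, conversion_id, interval=5.0,
                        max_attempts=60, sleep=time.sleep):
    """Poll until the conversion's status is no longer "new".

    Returns the last response dict, or raises TimeoutError if the
    status is still "new" after max_attempts checks.
    """
    for attempt in range(max_attempts):
        response = fetch_status(conversion_id)
        if response.get("status") != "new":
            return response
        # Pause between checks, but not after the final one.
        if attempt < max_attempts - 1:
            sleep(interval)
    raise TimeoutError(
        f"conversion {conversion_id} still 'new' after {max_attempts} checks"
    )
```

The `sleep` parameter is injectable so the loop can be exercised without real delays.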
Once the document has been converted, a GET request can be sent to the /conversion/id endpoint with the following keys:
GET Request Table of Keys:
Keys | Required | Type | Definition |
---|---|---|---|
id | yes | integer | The document processing identifier |
format | yes | string | Either data or pages |
To get Excel output instead of JSON, you also need to send an Accept header with a value of application/vnd.openxmlformats-officedocument.spreadsheetml.sheet.
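As a rough sketch, the retrieval request could be assembled as follows. `BASE_URL` is a placeholder, and sending `format` as a query-string parameter is an assumption; this section lists the key but does not state where it goes in the request.

```python
import urllib.parse
import urllib.request

# ASSUMPTION: placeholder host; substitute your account's API base URL.
BASE_URL = "https://example.invalid/api"
XLSX_MIME = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"


def build_retrieval_request(conversion_id, output_format, excel=False):
    """Build (but do not send) the GET /conversion/id request."""
    # ASSUMPTION: "format" travels in the query string.
    query = urllib.parse.urlencode({"format": output_format})
    # Only ask for Excel explicitly; otherwise the API's JSON output is used.
    headers = {"Accept": XLSX_MIME} if excel else {}
    return urllib.request.Request(
        f"{BASE_URL}/conversion/{int(conversion_id)}?{query}",
        headers=headers,
        method="GET",
    )
```

For example, `build_retrieval_request(9, "data", excel=True)` targets conversion 9 and requests a spreadsheet.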
Status and Rating Table:
The responses to both the GET and POST requests include a status key. Only the GET response includes a rating key.
Key | Description |
---|---|
status (GET and POST) | “new” status means the document has not yet been processed; “done” status means the document has been processed; “not processable” status means the document could not be converted. Please ensure you entered the correct file and file type during the POST request. |
rating (GET) | “A” rating means the document has both basic processing and extract information; “C” rating means document has only basic processing information; “F” rating means there is no processing information available for this document; “NA” means a rating is not applicable to this type of document if the status is “done” or if the document is still being processed. |
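The table above can be turned into a small lookup helper. This is only an illustrative sketch: the function name and the shortened descriptions are my own, while the status and rating values are the ones listed in the table.

```python
STATUS_MEANINGS = {
    "new": "not yet processed",
    "done": "processed",
    "not processable": "could not be converted",
}

RATING_MEANINGS = {
    "A": "basic processing and extract information",
    "C": "basic processing information only",
    "F": "no processing information available",
    "NA": "rating not applicable",
}


def describe(response: dict) -> str:
    """Summarize the status (and rating, if present) of a response dict."""
    status = response.get("status")
    text = STATUS_MEANINGS.get(status, f"unknown status {status!r}")
    # POST responses carry no rating, so only append it when present.
    rating = response.get("rating")
    if rating is not None:
        text += "; " + RATING_MEANINGS.get(rating, f"unknown rating {rating!r}")
    return text
```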
Flowchart of Steps
Using Postman for Document Conversion
Please download the OpenAPI specification found here.
After you have downloaded the provided OpenAPI specification and imported it into Postman, you may start making requests through the API. Follow this tutorial to convert a financial statement from a PDF to an Excel file.
Step 1: Locate POST Request
After clicking the Collections tab, you may click Boss Insights to drop down the available endpoints. There you will see conversion.
Under conversion, there are three requests. The first GET request will list all documents that have already been converted. The POST request is used for file conversion and the last GET request is used to retrieve the converted document. Please select the POST request.
Step 2: Enter Key Configurations (POST)
Next, please ensure that the form-data radio button is selected (1) and the three variables from the chart above have been entered (2). By default the type for these variables will be “text”.
We need to change the type of the document key to file. This can be done by hovering over document, clicking the caret/dropdown symbol that appears (3), and changing the type from text to file (4). When you select file, the Value cell to its right will show a button that says “Select Files”.
Step 3: Upload The File For Conversion (POST)
Next, enter the values, referring to the chart above (1). Sample values are shown below. Please note that in the image, the PDF file is a payroll report and therefore the type is payroll-report. You may then click Send (2).
Below is a sample response you may receive. Please take note of the id number, as this is what will be used to retrieve your converted file:
{
"status": "new",
"type": "payroll-report",
"url": "http://www.gusto.com/",
"file": "payroll-example.pdf",
"id": 9
}
Step 4: Retrieve Your Converted File (GET)
If you have set up a webhook, you will receive a notification when the file is ready to be retrieved. If not, you may poll the /conversion/id endpoint, although this is not recommended. To retrieve the file, go to the third request under the collections dropdown, a GET request (1).
Next, add the format key as shown in the second table above (2). Then, set its value to either data or pages (3). Next, add the ID from the previous step (4). Finally, click Send (5).
Below is a sample response you may receive. For an explanation of the “status” and “rating” keys, please see the third chart above:
{
"status": "done",
"type": "payroll-report",
"url": "http://www.gusto.com/",
"file": "payroll-example.pdf",
"id": "9",
"created": "2023-05-09 21:54:06",
"rating": "A"
}
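Note that in the two sample responses on this page the id appears as a number after the POST and as a string after the GET. A client may want to normalize this, as in the sketch below. The function name is hypothetical, and the timestamp format is inferred from the single sample above; the time zone is not stated, so the result is a naive datetime.

```python
import json
from datetime import datetime


def parse_conversion_response(raw: str) -> dict:
    """Parse a conversion response, normalizing id and created."""
    data = json.loads(raw)
    # The samples show both 9 and "9"; coerce to int either way.
    data["id"] = int(data["id"])
    # "created" only appears in the GET sample, so treat it as optional.
    if "created" in data:
        data["created"] = datetime.strptime(data["created"], "%Y-%m-%d %H:%M:%S")
    return data
```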
Step 5: Download Your File
To download your converted file, click the code block icon in the rightmost navigation bar of the Postman interface (1). Select cURL from the dropdown menu (2). Finally, follow the link provided to download the file; it is outlined in blue in the image below (3).
You’ve successfully converted a PDF file and downloaded it!