Parsel output types explored: JSON

This second post in the series takes a look at another output file type that our clients use to analyse their PDF table data and power their applications: JavaScript Object Notation (JSON).

How Parsel helps clients analyse table data in PDFs

Parsel's core proposition is to turn table data from PDF documents (and image files) into structured data that can be worked with in a variety of formats. Microsoft Excel remains the most common format for manual data analysis by our clients. However, an increasing number of clients are using Parsel's JSON output type, available on our Pro and Enterprise plans, to power automated data analysis and routing in their own applications. In this post we'll look at the JSON output from Parsel and discuss the typical use cases for this file type.

Exporting into JavaScript Object Notation (JSON)

Parsel clients across industries, from financial services to public sector contractors to small retail businesses, choose Parsel's JSON output file type to automate their document processing. As an open data format, JSON is widely adopted across software and web applications. It's therefore only natural that Parsel's JSON output is used by clients who have downstream processes that they want to feed table data into. Financial services clients use JSON to turn unstructured PDFs (annual reports, company analyses, regulatory filings, etc.) into structured database entries. Large manufacturers, wholesalers and retailers use it to automate the processing of messy, unstructured invoices and purchase orders.

Example PDF table: A company's balance sheet

After the document is passed through Parsel's AI, the table is detected and turned into structured data that is output as JSON, including the table title, row categories, row captions, column captions and cell values.

Example JSON output: Table metadata, title and captions

Example JSON output: Row categories and values

Parsel detects row categories to support data analysis

In addition to showing the individual row captions for each line in the table, Parsel also tries to infer any categories applied to the row captions. In this example table (the balance sheet of a company), ASSETS, EQUITY, and LIABILITIES are all categories of data found in the table. Within each of those categories, there are some observable sub-categories (Non-current assets, Current liabilities etc.). Parsel detects these categories and sub-categories and exports them to JSON. This allows for further downstream analysis, meaning clients can evaluate or aggregate data within entire categories of rows.
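To make the idea of aggregating within categories concrete, here is a minimal Python sketch. It assumes a hypothetical JSON layout: the field names (`title`, `columns`, `rows`, `category`, `subcategory`, `caption`, `values`) and all figures are invented for illustration and are not Parsel's actual schema, so they would need to be adapted to the real output.

```python
import json
from collections import defaultdict

# Hypothetical payload: field names and figures are invented for
# illustration and do not reflect Parsel's actual JSON schema.
SAMPLE = """
{
  "title": "Balance sheet",
  "columns": ["2023", "2022"],
  "rows": [
    {"category": "ASSETS", "subcategory": "Non-current assets",
     "caption": "Property, plant and equipment", "values": [120.0, 110.0]},
    {"category": "ASSETS", "subcategory": "Current assets",
     "caption": "Cash and cash equivalents", "values": [30.0, 25.0]},
    {"category": "LIABILITIES", "subcategory": "Current liabilities",
     "caption": "Trade payables", "values": [40.0, 35.0]}
  ]
}
"""


def totals_by_category(table, level="category"):
    """Sum cell values per column for each category (or sub-category)."""
    n_cols = len(table["columns"])
    totals = defaultdict(lambda: [0.0] * n_cols)
    for row in table["rows"]:
        for i, value in enumerate(row["values"]):
            if value is not None:  # skip empty cells
                totals[row[level]][i] += value
    return dict(totals)


table = json.loads(SAMPLE)
category_totals = totals_by_category(table)
subcategory_totals = totals_by_category(table, level="subcategory")
print(category_totals)
```

Passing `level="subcategory"` groups by the finer-grained labels instead, which is the kind of roll-up that detected categories make possible.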

Client types predominantly using Parsel's JSON output

JSON is the output type of choice for clients who want to automate document analysis, either for their own firm or as part of a downstream third-party application. Because JSON is so widely supported, this output type can be integrated with almost any other software application. Parsel has clients in the following industries, among others, who are using this output type for table analysis:

  • Financial services
  • Industrials
  • Large-scale retailers
  • Tech start-ups
  • Data consultants or integrators
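As one illustration of such an integration, the sketch below flattens a table into one database row per cell using Python's built-in `sqlite3` module. The JSON layout, field names and figures are again hypothetical, made up for this example rather than taken from Parsel's actual output, and the `cells` table is just one possible target schema.

```python
import json
import sqlite3

# Hypothetical payload: field names and figures are invented for
# illustration and do not reflect Parsel's actual JSON schema.
SAMPLE = """
{
  "title": "Balance sheet",
  "columns": ["2023", "2022"],
  "rows": [
    {"category": "ASSETS", "caption": "Cash and cash equivalents",
     "values": [30.0, 25.0]},
    {"category": "LIABILITIES", "caption": "Trade payables",
     "values": [40.0, 35.0]}
  ]
}
"""


def to_records(table):
    """Yield one (title, category, caption, column, value) tuple per cell."""
    for row in table["rows"]:
        for column, value in zip(table["columns"], row["values"]):
            yield (table["title"], row["category"], row["caption"], column, value)


# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cells "
    "(title TEXT, category TEXT, caption TEXT, col TEXT, value REAL)"
)
conn.executemany(
    "INSERT INTO cells VALUES (?, ?, ?, ?, ?)", to_records(json.loads(SAMPLE))
)
conn.commit()

row_count = conn.execute("SELECT COUNT(*) FROM cells").fetchone()[0]
print(row_count)
```

Storing one row per cell keeps the category and caption alongside every value, so downstream queries can filter or group without going back to the original document.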

Try Parsel for free today!

To get started, sign up today and convert your PDF into structured data with a few clicks.