Manually extracting data from complex documents at scale is time-consuming, which is why Google Cloud has introduced its new Document AI (DocAI) platform.

The centralized document processing console lets organizations use artificial intelligence and machine learning to automate the conversion of documents into structured information.

With the DocAI platform, which is currently available in preview, organizations can ensure their data is reliable and compliant, make informed business decisions and use their data to better meet consumer needs.

In its announcement about DocAI, Google noted that one customer using the platform improved data capture accuracy by 250 percent and lowered its overall processing costs by up to 60 percent.

DocAI can convert documents to structured data easily.

With the latest DocAI platform from Google Cloud, organizations can access all parsers, tools and solutions, including Lending DocAI and Procurement DocAI, through a single API that makes document processing workflows easy to create and customize.

To get started with the new platform, users first need to create a document processor. You can use general processors such as Form Parser, and for domain-specific documents you can also take advantage of specialized processors such as Google’s W9 Parser. After you create a processor, it can be viewed on a single dashboard and tested by uploading your own document directly to the console.
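Beyond the console, a processor can also be called programmatically. The sketch below only builds the request and does not send it. It assumes the v1 REST endpoint shape (a regional host plus a `:process` method on the processor resource) and a `rawDocument` body with base64 content; the project, location and processor IDs are hypothetical placeholders, and authentication is left out entirely.

```python
import base64
import json


def build_process_request(project_id, location, processor_id, content, mime_type):
    """Return (url, json_body) for a Document AI ':process' call without sending it."""
    # Resource name of a processor created beforehand (IDs here are placeholders).
    name = f"projects/{project_id}/locations/{location}/processors/{processor_id}"
    # Assumed regional endpoint pattern, e.g. "us-documentai.googleapis.com".
    url = f"https://{location}-documentai.googleapis.com/v1/{name}:process"
    body = {
        "rawDocument": {
            # Binary file content is base64-encoded for JSON transport.
            "content": base64.b64encode(content).decode("ascii"),
            "mimeType": mime_type,
        }
    }
    return url, json.dumps(body)


url, body = build_process_request(
    "my-project", "us", "abc123", b"%PDF-1.4 sample bytes", "application/pdf"
)
print(url)
```

In practice the request would be sent with an OAuth access token, or through Google's client library, which wraps the same call.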

Extracting Data from an Invoice with DocAI

In their blog post, Lewis Liu, Document AI’s product manager, and Yang Liang, product marketing manager, give some examples of how the platform can extract data from both a W9 form and an invoice. For the invoice, DocAI automatically extracted the supplier name, invoice date, payment terms and other details from the document.
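To make the idea of extracted invoice fields concrete, here is a minimal sketch that reads them from a processed document. It assumes the response has been saved as JSON with an `entities` list whose items carry `type`, `mentionText` and `confidence`; the entity type names and all values below are illustrative, not taken from the blog post.

```python
def entities_to_fields(document, min_confidence=0.0):
    """Group entity text by entity type, skipping low-confidence entities."""
    fields = {}
    for entity in document.get("entities", []):
        if entity.get("confidence", 0.0) < min_confidence:
            continue  # drop values the parser is unsure about
        fields.setdefault(entity["type"], []).append(entity.get("mentionText", ""))
    return fields


# Hypothetical, hand-written sample shaped like a parsed invoice response.
sample = {
    "entities": [
        {"type": "supplier_name", "mentionText": "Example Supplier Inc.", "confidence": 0.97},
        {"type": "invoice_date", "mentionText": "2020-11-05", "confidence": 0.93},
        {"type": "payment_terms", "mentionText": "Net 30", "confidence": 0.41},
    ]
}

fields = entities_to_fields(sample, min_confidence=0.5)
print(fields)
```

Values are kept in lists because a type such as a line item can appear more than once in the same invoice.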

General parsers such as OCR (Optical Character Recognition), Form Parser and Document Splitter are currently publicly available. Users can also request access to specialized parsers for a variety of documents, including W9, 1040, W2, 1099-MISC, 1003 and other forms, as well as invoices and receipts.

What is Google DocAI?

Vision OCR (optical character recognition) and form parser technology from Google Cloud uses industry-leading deep-learning neural network algorithms to perform text, character, and image recognition with exceptional accuracy in over 200 languages.

Google DocAI Can Understand Data.

Google Cloud’s Natural Language products allow you to extract useful insights from your unstructured documents using the same deep machine learning technology that powers Google Search and Google Assistant.

Mortgage document processing is accelerated with Google DocAI.

Lending DocAI is a specialized solution for the mortgage industry, developed with industry-leading data accuracy. To speed up loan applications, it processes income and asset documents, a notoriously slow and complicated task. Lending DocAI leverages a collection of specialized models based on the document types used in mortgage lending, automating many routine document reviews so that mortgage service providers can focus on more value-added decisions.

Google Document AI Example

We gave Google DocAI a test run. Here is an example of uploading a regular invoice and having Google DocAI convert it into structured data.

The image below is an example of Google DocAI.

Google Document AI Example 1

To show more of an example without having to blur out personal information, we clicked “Try our sample”, which uploaded a template invoice from AACME Plumbing. Here are the results:

Google Document AI Example 2

As you can see from the image above, Google’s DocAI converted the invoice and offers multiple tabs for choosing which data you’d like to extract from it.

By default the “invoice schema” tab was selected.

Below is a screenshot of the “tables” tab from the same sample invoice.

Google Document AI Example 3

As you can see from the screenshot, Google’s DocAI detected the table structure from nothing more than the uploaded invoice.

It detected two different table formats in the invoice.
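For readers who want the detected tables outside the console, the sketch below turns one table into plain rows. It assumes the JSON form of a processed document, where each table has `headerRows` and `bodyRows`, and each cell points into the document's full text through `layout.textAnchor.textSegments` offsets (serialized as strings, with a zero `startIndex` omitted). The sample text and table are made up for illustration.

```python
def anchor_text(full_text, layout):
    """Resolve a layout's text segments against the document's full text."""
    segments = layout.get("textAnchor", {}).get("textSegments", [])
    pieces = [
        full_text[int(seg.get("startIndex", 0)):int(seg["endIndex"])]
        for seg in segments
    ]
    return "".join(pieces).strip()


def table_to_rows(full_text, table):
    """Return header rows followed by body rows as lists of cell strings."""
    rows = []
    for row in table.get("headerRows", []) + table.get("bodyRows", []):
        rows.append([anchor_text(full_text, cell["layout"]) for cell in row.get("cells", [])])
    return rows


def cell(start, end):
    """Build a hypothetical cell; a zero start index is left out, as assumed above."""
    segment = {"endIndex": str(end)}
    if start:
        segment["startIndex"] = str(start)
    return {"layout": {"textAnchor": {"textSegments": [segment]}}}


# Hand-written sample: the full text plus one two-column table.
text = "Item\nQty\nPipe fitting\n2\n"
table = {
    "headerRows": [{"cells": [cell(0, 5), cell(5, 9)]}],
    "bodyRows": [{"cells": [cell(9, 22), cell(22, 24)]}],
}

rows = table_to_rows(text, table)
print(rows)
```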

Here is another screenshot, showing the second table detected by Google’s DocAI:

Google Document AI Example 4

We selected the text from the table section generated by Google DocAI, opened Google Drive, created a testing folder and a Google spreadsheet, and pasted the data into the spreadsheet.

Here are the results.

Data extracted from DocAI and Pasted Into Spreadsheet
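The copy-and-paste step could also be scripted. As a rough sketch, assuming the table has already been reduced to a list of rows, Python's standard `csv` module can produce a file that Google Sheets or any other spreadsheet can import. The rows below are invented sample data, not the values from the screenshot.

```python
import csv
import io


def rows_to_csv(rows):
    """Serialize rows of cell strings to CSV text, quoting cells only when needed."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical table rows; a comma inside a cell is quoted automatically.
rows = [
    ["Description", "Quantity", "Amount"],
    ["Pipe repair, kitchen", "1", "150.00"],
]

csv_text = rows_to_csv(rows)
print(csv_text)
```

Writing `csv_text` to a `.csv` file and importing it avoids the column misalignment that manual pasting can introduce.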

Bottom Line

Google Cloud has introduced its new Document AI (DocAI) platform. The centralized document processing console allows organizations to use the power of Artificial Intelligence and machine learning.

With DocAI, one of its customers improved data capture accuracy by 250 percent and lowered processing costs by up to 60 percent. The platform is currently available in preview, but Google says it will be rolled out in production soon.

DocAI uses industry-leading deep-learning neural network algorithms to perform text, character, and image recognition with exceptional accuracy in over 200 languages.


Lending DocAI is a specialized solution for the mortgage industry, developed with industry-leading data accuracy. To speed up loan applications, it processes income and asset documents, a notoriously slow and complicated task.

It leverages a collection of specialized models based on the document types used in mortgage lending, automating many routine document reviews so that mortgage service providers can focus on more value-added decisions.
