Pythian Blog: Technical Track

Extract Insights from Documents with Azure Document Intelligence

Most organizations that I speak to have tons of documents that contain valuable information but don't have the time or resources to manually process them. They need a way to automate the extraction of data from all these various types of documents, such as invoices, receipts, contracts, forms, and more. This is where modern services that apply ML and AI can come in handy to solve old problems in new ways.

One of these services is Azure Document Intelligence, which enables you to easily and quickly extract, analyze, and understand data from documents. In this blog post, we will cover what Azure Document Intelligence is, how it works, the billing model, and some of the main use cases and scenarios where it can help you.

What is Azure Document Intelligence?

Azure Document Intelligence is a cloud-based service that uses advanced AI and ML models to automatically extract structured and unstructured data from documents. You can use it to process documents in various formats that you would easily find scattered all around an enterprise, like PDF, JPG, PNG, TIFF, DOCX, XLSX, PPTX, and more. You can also use it to process documents in a bunch of different languages, such as English, Spanish, French, Chinese, Japanese, etc.

Azure Document Intelligence offers two main options: prebuilt models and custom models that you train yourself. Both let you extract key-value pairs, tables, and text from forms and other structured documents. You can read the text from image files, extract image or table information, and get a structured JSON output that describes your document's contents and layout.

There are also pre-existing models that cover some of the popular government forms and identifications mainly used in the USA. However, the service also offers the capability to grab a set of example forms from your own data and train a custom model that can do the analysis and extraction over those as well. So, regardless of the use case, either one of the main modes will work for you.

How does Azure Document Intelligence work?

To use Azure Document Intelligence, you need to create a resource in the Azure portal and get an endpoint and a key. Then, you can use one of the following methods to send your documents to the service:

  • Use the REST API to send HTTP requests with your documents as binary data or URLs.
  • Use the SDKs for .NET, Java, Python, or Node.js to programmatically interact with the service.
  • Use the Power Automate connector to integrate the service with other Microsoft products and services.
  • Use the web interface to upload your documents and see the results in a graphical way.
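As a rough illustration of the REST option, the sketch below builds (but does not send) an analyze request using only Python's standard library. The route, the `api-version` value, and the `Ocp-Apim-Subscription-Key` header are assumptions based on the 2023-07-31 REST API, so check them against the current reference. The endpoint, key, and document URL are placeholders, and `build_analyze_request` is a hypothetical helper name.

```python
import json
import urllib.request

# Assumption: API version string; confirm against the current REST reference.
API_VERSION = "2023-07-31"


def build_analyze_request(endpoint, key, model_id, document_url):
    """Build a POST request asking the service to analyze a document by URL."""
    url = (
        f"{endpoint.rstrip('/')}/formrecognizer/documentModels/"
        f"{model_id}:analyze?api-version={API_VERSION}"
    )
    body = json.dumps({"urlSource": document_url}).encode("utf-8")
    headers = {
        "Ocp-Apim-Subscription-Key": key,
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Placeholder values; nothing is sent until urllib.request.urlopen(request) is called.
request = build_analyze_request(
    endpoint="https://my-resource.cognitiveservices.azure.com",
    key="<your-key>",
    model_id="prebuilt-layout",
    document_url="https://example.com/sample.pdf",
)
print(request.get_method(), request.full_url)
```

The analysis itself runs asynchronously: a successful call returns `202 Accepted` with an `Operation-Location` header, and you poll that URL until the reported status is `succeeded`.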

Once you send your documents to the service, it will analyze them and return the extracted data in JSON format. You can then use the data for various purposes, such as storing it in a database, displaying it in a dashboard, or feeding it into another service or application.
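To give a feel for working with that JSON, here is a minimal sketch that pulls key-value pairs out of a response. The sample is hand-written and heavily trimmed. Its shape (`analyzeResult`, `keyValuePairs`, `content`, `confidence`) is an assumption based on the 2023-07-31 response format, and `key_value_pairs` is a hypothetical helper.

```python
# Hand-written sample, trimmed to the parts used here; real responses carry much more.
sample_response = {
    "status": "succeeded",
    "analyzeResult": {
        "keyValuePairs": [
            {"key": {"content": "Invoice No."}, "value": {"content": "INV-100"}, "confidence": 0.97},
            {"key": {"content": "Due date"}, "value": {"content": "2023-10-01"}, "confidence": 0.91},
            {"key": {"content": "PO number"}, "confidence": 0.55},  # key detected without a value
        ]
    },
}


def key_value_pairs(response):
    """Return {key text: value text or None} from an analyze response."""
    result = response.get("analyzeResult", {})
    pairs = {}
    for pair in result.get("keyValuePairs", []):
        key_text = pair["key"]["content"].strip()
        value = pair.get("value")  # may be absent when only the key was found
        pairs[key_text] = value["content"].strip() if value else None
    return pairs


print(key_value_pairs(sample_response))
```

A flat dictionary like this is easy to write to a database row or pass to another service, which is usually the next step.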

The web experience is particularly useful for experimenting and prototyping your approach to solving your particular problem by allowing you to easily upload and test the input documents as well as sample the output JSON without having to code anything. Once you have validated your approach, you can then easily move from the web portal to running the same process through code.

The billing model

Azure Document Intelligence follows a consumption-based pricing model. This means that you only pay for what you use. The price depends on the type and amount of documents that you process. 

You pay a small fixed amount per page for standard forms (such as invoices or receipts) and a different amount per page for custom forms (that you might have trained on your own). The current price (September 2023) is $0.01 per page on a standard form and $0.05 per page for a custom model. Prices do change all the time; you can get the latest pricing here: https://azure.microsoft.com/en-us/pricing/details/form-recognizer/ 

You can also use the free tier of Azure Document Intelligence to process up to 500 pages of standard forms or five pages of custom forms per month to get started and just get your feet wet with the service.
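For a quick budget check, the per-page prices quoted above can be turned into a small estimator. This is only a sketch: the prices are the September 2023 figures from this post, the function name is hypothetical, and the free tier is deliberately ignored.

```python
from decimal import Decimal

# Per-page prices quoted in this post (September 2023); check the pricing page for current values.
PRICE_PER_PAGE = {
    "standard": Decimal("0.01"),
    "custom": Decimal("0.05"),
}


def estimate_cost(pages, model_type):
    """Estimate the charge in USD for a number of pages on one model type."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return PRICE_PER_PAGE[model_type] * pages


print(estimate_cost(10_000, "standard"))  # 100.00
print(estimate_cost(10_000, "custom"))    # 500.00
```

`Decimal` is used instead of floats so that per-page cents add up exactly.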

Use cases for document intelligence

This type of technology has many different use cases since we continue to create all sorts of printable documents and records (like receipts and invoices). I’ll give you some examples:

  • Invoice processing: you can extract information such as vendor name, invoice number, date, amount due, line items, taxes, etc., from invoices. You can then use this information to automate your accounts payable process and reduce manual errors and costs.
  • Receipt processing: you can extract information such as merchant name, date, total amount, items purchased, etc., from receipts. You can then use this information to automate your expense reporting process and simplify your tax filing.
  • Contract analysis: you can extract information such as parties involved, contract terms, clauses, dates, signatures, etc., from contracts. You can then use this information to review your contracts more quickly and easily and ensure compliance with your policies and regulations.
  • Health record analysis: you can extract information such as patient name, diagnosis, medications, procedures, allergies, etc., from health records. You can then use this information to improve your patient care quality and outcomes and comply with privacy laws.
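Whichever use case you pick, it is common to auto-accept high-confidence fields and send the rest to a person for review. The sketch below shows one way to do that for an invoice. The sample document is hand-written, its shape is an assumption based on the prebuilt invoice output, and both the 0.8 threshold and the `split_by_confidence` helper are hypothetical choices.

```python
# Hand-written sample shaped like one entry of the "documents" list in an analyze result.
sample_document = {
    "docType": "invoice",
    "fields": {
        "VendorName": {"type": "string", "content": "Contoso Ltd.", "confidence": 0.95},
        "InvoiceId": {"type": "string", "content": "INV-100", "confidence": 0.98},
        "InvoiceDate": {"type": "date", "content": "9/15/2023", "confidence": 0.62},
    },
}


def split_by_confidence(document, threshold=0.8):
    """Split fields into auto-accepted values and ones needing manual review."""
    accepted, review = {}, {}
    for name, field in document.get("fields", {}).items():
        # Fields without a confidence score are treated as needing review.
        target = accepted if field.get("confidence", 0.0) >= threshold else review
        target[name] = field.get("content")
    return accepted, review


accepted, review = split_by_confidence(sample_document)
print("accepted:", accepted)
print("needs review:", review)
```

The right threshold depends on how costly a wrong value is in your process, so treat it as something to tune against your own documents.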

Breaking down a scenario

Let's say that you work for the IT services team of your organization, servicing multiple different business units across their IT infrastructure stack. You generate incident reports as part of the problem management process, describing the issues they encounter and the solutions you provide. You want to use Azure Document Intelligence to analyze these incident reports and extract useful information from them.

Something like this, for example:

Here are the steps that you can follow:

  1. Create a custom model using a few sample incident reports as training data; the service needs a minimum of 5, but more is better as long as they are a good representative set. You can label the fields that you want to extract, such as client name, incident number, issue description, resolution description, etc.
  2. Use the custom model to process new incident reports and get the extracted data in JSON format.
  3. Use the extracted data to generate insights and reports, such as the number of incidents per client, the average resolution time, the most common issues, the most effective solutions, etc.
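Step 3 can be sketched with nothing more than the standard library. The records below are made up for illustration, and their field names (`client`, `issue`, `opened`, `resolved`) are hypothetical labels that you would define when training your own model.

```python
from collections import Counter
from datetime import datetime
from statistics import mean

# Hypothetical records, as they might look after extraction and light cleanup.
records = [
    {"client": "Finance", "issue": "VPN outage", "opened": "2023-09-01T09:00", "resolved": "2023-09-01T11:00"},
    {"client": "Finance", "issue": "Disk full", "opened": "2023-09-02T10:00", "resolved": "2023-09-02T14:00"},
    {"client": "Sales", "issue": "VPN outage", "opened": "2023-09-03T08:00", "resolved": "2023-09-03T09:00"},
]


def hours_to_resolve(record):
    """Return the elapsed hours between the opened and resolved timestamps."""
    opened = datetime.fromisoformat(record["opened"])
    resolved = datetime.fromisoformat(record["resolved"])
    return (resolved - opened).total_seconds() / 3600


incidents_per_client = Counter(r["client"] for r in records)
most_common_issue, issue_count = Counter(r["issue"] for r in records).most_common(1)[0]
average_hours = mean(hours_to_resolve(r) for r in records)

print(dict(incidents_per_client))
print(most_common_issue, issue_count)
print(round(average_hours, 2))
```

In practice you would run the same aggregations in your database or BI tool; the point is that once the reports are structured data, these questions become simple queries.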

For example, in the screenshot below, you can see how the service takes a PDF file and correctly identifies the section heading and the individual paragraphs underneath:

This screenshot shows the output from the web portal of running one of the PDFs through the layout recognizer. It shows the different elements recognized and also a friendly display, the actual JSON output, and even a code snippet to get you going. 
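To work with that layout output in code, one option is to group body paragraphs under the section heading that precedes them. The sketch below assumes each paragraph has a `content` string and an optional `role` (such as `title` or `sectionHeading`), which matches the layout model's output as I understand it. The sample data is hand-written and `group_by_heading` is a hypothetical helper.

```python
# Hand-written sample shaped like the "paragraphs" list in a layout result.
sample_paragraphs = [
    {"role": "title", "content": "Incident Report"},
    {"role": "sectionHeading", "content": "Issue description"},
    {"content": "Users could not reach the VPN gateway."},
    {"role": "sectionHeading", "content": "Resolution"},
    {"content": "Restarted the gateway service."},
    {"content": "Added a monitoring alert."},
]


def group_by_heading(paragraphs):
    """Map each section heading to the body paragraphs that follow it."""
    sections = {}
    current = None
    for paragraph in paragraphs:
        role = paragraph.get("role")
        if role == "sectionHeading":
            current = paragraph["content"]
            sections[current] = []
        elif role is None and current is not None:
            # Paragraphs with no role are treated as body text of the current section.
            sections[current].append(paragraph["content"])
    return sections


for heading, body in group_by_heading(sample_paragraphs).items():
    print(heading, "->", body)
```

Paragraphs with other roles, such as the title or page headers, are skipped here; whether to keep them depends on what you want to store.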

If this experiment was successful, you could then, for example, use this code in an automated process that would run this model whenever a new incident report is placed on a cloud storage container. You can extract the different sections, save them on a data lake or a data warehouse, automate the analysis of your incident reports, and gain valuable insights that can help you improve your service quality and customer satisfaction.

Conclusion

Azure Document Intelligence is a powerful service that can help you extract data from documents and gain insights from them. You can use it to process various types of documents in different formats and languages. You can also use it to handle various use cases and scenarios where you need to analyze documents. You can try it for free or pay as you go, depending on your needs. To learn more about Azure Document Intelligence, visit https://azure.microsoft.com/en-us/services/cognitive-services/document-intelligence/
