Member-only story

Create a Receipt Parsing Using OCR and a Large Language Model

10 min readSep 29, 2023

In this tutorial, I will go through how I leverage an OCR to capture data from receipts and then leverages a Large Language Model (LLM) to extract pertinent details such as the total amount, date and time of the receipt, and additional relevant information.

To perform OCR, I will utilize the docTR tool from Mindee as outlined below.

GitHub - mindee/doctr: docTR (Document Text Recognition) - a seamless, high-performing & accessible…

docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by…

github.com

To retrieve the information from the receipt, I will use Azure’s OpenAI capabilities.

Construct the OCR Output Data

Let’s begin the installation process for docTR and the necessary libraries on your machine. I will not going through the detail of the installation process as you can find comprehensive instructions in the provided Git repository

GitHub - mindee/doctr: docTR (Document Text Recognition) - a seamless, high-performing & accessible…

docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by…

github.com

Let’s test the installation if is successful without error by executing this below code with the provided receipt image in Jpeg.

Create a Receipt Parsing Using OCR and a Large Language Model

GitHub - mindee/doctr: docTR (Document Text Recognition) - a seamless, high-performing & accessible…

docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by…

Construct the OCR Output Data

GitHub - mindee/doctr: docTR (Document Text Recognition) - a seamless, high-performing & accessible…

docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by…

Written by Ferry Djaja

Responses (3)