Extracting Line Items from a Document with GPT-4o: It’s not a straightforward task

Ferry Djaja
9 min readAug 31, 2024

In this tutorial, I’ll guide you through the process of accurately extracting line items from documents. Although I initially thought this would be a straightforward task, it ended up taking me half a day of trial and error with various prompts. I’ll share my experience using the GPT-4o model with Python, as well as my experiments with commercial parsers and the results I obtained.

The sample image I’ve used is straightforward and uncomplicated, making it a good test case for extracting line items.

My goal is to accurately extract the relevant generic fields and line items from the document content. That sounds simple, doesn’t it?

Okay now let’s try with a simple prompt in ChatGPT with GPT-4o model.

This is the prompt I entered:

Parse the image data to identify and extract line items, and return them in a structured JSON format.

--

--