Powerful PDF+Image Parsing — Mistral OCR

Mistral AI has recently released a powerful OCR model — Mistral OCR — Their tagline for the model is 1000 pages can be parsed per dollar. Mistral OCR model is said to be multilingual and multimodal. Complex documents in the format of PDF and images can be parsed with the model. Example Scenarios Let us look into some of the example scenarios with different documents as inputs to the model and see how it works. For the first case, I’ve taken a sample image from the internet, with some handwritten notes in English language. Below is the image that I used. When this image has been passed as an input to the Mistral’s OCR model, it parsed very perfectly and described what’s in the image. Below is the output from the model. This shows how awesome is the model with respect to handwritten images (handwriting in it is not very awesome :D ) But let’s not stop there, and we shall try another scenario with a different language. As my mother tongue is Tamizh, I chose to try a document in that (Tamil Language). The document is basically about Indian Constitution with some amendments and welfare of the Indian government. It is a 21-page long document of around 4Mb file size. Below is the link to the document for reference. Link : https://raw.githubusercontent.com/amrs-tech/storepdf/refs/heads/main/part5_compressed.pdf The OCR model has truly spoken from the bottom of its heart

Mar 12, 2025 - 16:14

Powerful PDF+Image Parsing — Mistral OCR

Mistral AI has recently released a powerful OCR model — Mistral OCR — Their tagline for the model is 1000 pages can be parsed per dollar. Mistral OCR model is said to be multilingual and multimodal. Complex documents in the format of PDF and images can be parsed with the model.

Example Scenarios

Let us look into some of the example scenarios with different documents as inputs to the model and see how it works.

For the first case, I’ve taken a sample image from the internet, with some handwritten notes in English language. Below is the image that I used.

When this image has been passed as an input to the Mistral’s OCR model, it parsed very perfectly and described what’s in the image. Below is the output from the model.

This shows how awesome is the model with respect to handwritten images (handwriting in it is not very awesome :D ) But let’s not stop there, and we shall try another scenario with a different language.

As my mother tongue is Tamizh, I chose to try a document in that (Tamil Language). The document is basically about Indian Constitution with some amendments and welfare of the Indian government. It is a 21-page long document of around 4Mb file size. Below is the link to the document for reference.

Link : https://raw.githubusercontent.com/amrs-tech/storepdf/refs/heads/main/part5_compressed.pdf

The OCR model has truly spoken from the bottom of its heart