Open Source Python API to Integrate OCR Capabilities

An open-source Python library that lets software developers easily integrate optical character recognition (OCR) capabilities into their applications.

What is PaddleOCR?

PaddleOCR is a powerful, open-source Python library that allows software developers to seamlessly integrate optical character recognition into their applications. Built on the PaddlePaddle deep learning platform, it utilizes state-of-the-art models to deliver high accuracy and performance. The library simplifies the entire OCR process through a high-level API that handles complex, low-level details, making it exceptionally accessible for developers.

The library offers complete support for over 80 different languages and scripts, including Arabic, Chinese, English, French, German, Japanese, Korean, Russian, and Spanish, making it invaluable for multilingual projects. Beyond core recognition, PaddleOCR provides specialized models for Text Detection and Text Recognition, which can be combined using its Model Ensemble feature for superior accuracy. It also includes essential utilities for image preprocessing, like deskewing and binarization, alongside post-processing tools to refine output. This combination of a versatile language range, customizable models, and practical tools establishes PaddleOCR as a comprehensive and developer-friendly solution for adding robust OCR capabilities.


Getting Started with PaddleOCR

The recommended way to install PaddleOCR is with pip. Use the following command:

Install PaddleOCR via pip

 pip install paddleocr 

You can also install it manually by downloading the latest release files directly from the GitHub repository.
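After either installation route, a quick check can confirm that the packages are importable. The following is a minimal sketch using only the Python standard library; the helper names `is_installed` and `installed_version` are illustrative and not part of PaddleOCR. It also checks for `paddle`, the import name of the PaddlePaddle framework on which PaddleOCR is built.

```python
import importlib.util
from importlib import metadata


def is_installed(module_name: str) -> bool:
    """Return True if the top-level module can be found in this environment."""
    return importlib.util.find_spec(module_name) is not None


def installed_version(distribution: str):
    """Return the installed version of a pip distribution, or None if absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# Report the status of PaddleOCR and of the PaddlePaddle framework it runs on.
for module_name, distribution in [("paddleocr", "paddleocr"), ("paddle", "paddlepaddle")]:
    if is_installed(module_name):
        print(f"{distribution}: installed, version {installed_version(distribution)}")
    else:
        print(f"{distribution}: not found, try 'pip install {distribution}'")
```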

Image Text Recognition via PaddleOCR API

Image text recognition is the process of extracting text from images. It is useful for applications such as document scanning and digitization. PaddleOCR provides a set of state-of-the-art OCR models that can recognize text in many kinds of images, including scanned documents, screenshots, and photographs. The library covers the main steps of image text recognition: loading an image, initializing an OCR model, identifying text regions in the image, recognizing the text in those regions, and extracting the text from the result. The following example shows how to recognize text from an image inside Python applications.

Perform Image Text Recognition inside Python Projects

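import numpy as np
from PIL import Image
from paddleocr import PaddleOCR

# create the OCR engine (model files are downloaded on first use)
ocr = PaddleOCR(lang='en')

# load an image using PIL and convert it to a NumPy array,
# because ocr() accepts a file path, a NumPy array, or raw bytes
image = np.array(Image.open('example.jpg').convert('RGB'))
result = ocr.ocr(image)

# access the recognized text
# (assumes a 2.x release from 2.6 onward, which groups results per image; older releases return the lines directly)
for line in result[0]:
    print(line[1][0])  # recognized text
    print(line[1][1])  # confidence score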
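Recognized lines often need filtering before use, since low-confidence entries tend to contain misread characters. The sketch below assumes the per-image result layout of PaddleOCR 2.x, where each entry is `[box, (text, score)]`; the sample data is made up for illustration, and `extract_text` is a hypothetical helper, not a PaddleOCR function.

```python
from typing import List, Tuple

# Made-up sample shaped like the per-image output of ocr.ocr() in PaddleOCR 2.x:
# each entry is [box, (text, score)], where box holds four [x, y] corner points.
sample_lines = [
    [[[10, 10], [120, 10], [120, 40], [10, 40]], ("Invoice", 0.98)],
    [[[10, 60], [200, 60], [200, 90], [10, 90]], ("Tota1 due", 0.41)],
    [[[10, 110], [90, 110], [90, 140], [10, 140]], ("$42.00", 0.93)],
]


def extract_text(lines, min_score: float = 0.5) -> List[Tuple[str, float]]:
    """Return (text, score) pairs whose confidence is at least min_score."""
    return [(text, score) for _box, (text, score) in lines if score >= min_score]


# Keep only the confident entries and print them with their scores.
for text, score in extract_text(sample_lines, min_score=0.5):
    print(f"{score:.2f}  {text}")
```

The threshold of 0.5 is an arbitrary starting point; a suitable value depends on the image quality and on how costly a misread is for the application.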

OCR Document Recognition using Python API

Document recognition is one of the prominent research areas in OCR, since documents are part of everyday life. Applying OCR to a document makes it possible to retrieve important information, extract form fields, analyze layout, store the content digitally, and read old manuscripts. The open-source PaddleOCR library allows software developers to load various types of documents, run OCR on them, and extract the recognized text using Python code. Recognition accuracy is high, and the library can detect special characters and spaces.

Perform OCR Document Recognition using Python API

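from paddleocr import PaddleOCR

ocr = PaddleOCR(lang='en')
img_path = './input_images/11-document-1.jpg'
result = ocr.ocr(img_path)
# Display the output (assumes a 2.x release from 2.6 onward, which groups results per image)
for line in result[0]:
    print(line)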
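Retrieving form fields, one of the document use cases mentioned above, is typically a post-processing step on the recognized text. The following is a rough sketch that assumes fields appear as "Label: value" on a single recognized line; `parse_form_fields` is a hypothetical helper and the sample strings are invented. Real forms often need layout-aware matching instead.

```python
def parse_form_fields(texts):
    """Build a dict from recognized lines of the form 'Label: value'.

    Lines without a colon, or with an empty label, are skipped.
    """
    fields = {}
    for text in texts:
        label, separator, value = text.partition(":")
        if separator and label.strip():
            fields[label.strip()] = value.strip()
    return fields


# Invented sample of recognized text lines, e.g. the text part of each OCR entry.
recognized = ["Name: Jane Doe", "Date: 2023-01-15", "Signature", "Total : 42.00"]

for label, value in parse_form_fields(recognized).items():
    print(f"{label} -> {value}")
```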

Table Recognition Support inside Python Apps

The open-source PaddleOCR library enables software developers to recognize table data inside their Python applications. Table recognition relies mainly on three models: single-line text detection (DB), single-line text recognition (CRNN), and table structure and cell coordinate prediction (SLANet). The following example runs OCR on an image that contains a table and then uses the draw_ocr method, which takes the image, the bounding boxes, the texts, the scores, and the path to a font file. It returns the image annotated with the bounding boxes and the detected text, which you can then display using the show method.

How to Load an Image and Detect Text inside It via Python API?

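from PIL import Image
from paddleocr import PaddleOCR, draw_ocr  # draw_ocr assumes a PaddleOCR 2.x release

# Path to the image that contains the table
img_path = 'table_image.png'

# Create an instance of the PaddleOCR object and run OCR on the image
ocr = PaddleOCR(lang='en')
result = ocr.ocr(img_path)[0]  # results for the first (and only) image

# Split each [box, (text, score)] entry into the lists draw_ocr expects
boxes = [line[0] for line in result]
texts = [line[1][0] for line in result]
scores = [line[1][1] for line in result]

# Draw the bounding boxes and the detected text on the image
image = Image.open(img_path).convert('RGB')
im_show = draw_ocr(image, boxes, texts, scores, font_path='arial.ttf')

# draw_ocr returns a NumPy array, so wrap it in a PIL image to display it
Image.fromarray(im_show).show()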
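When only plain text boxes are available, table rows can be approximated from box geometry. The sketch below is a naive illustration and not the SLANet structure model described above: it groups entries whose vertical centres lie within a tolerance, orders each row left to right, and writes the result as CSV. The entry layout `[box, (text, score)]` follows PaddleOCR 2.x; the helper names and sample data are hypothetical.

```python
import csv
import io


def group_into_rows(lines, row_tolerance=15):
    """Group OCR entries into table rows by the vertical centre of each box."""

    def centre_y(box):
        return sum(point[1] for point in box) / len(box)

    def left_x(box):
        return min(point[0] for point in box)

    rows = []  # each row: {"y": centre of its first cell, "cells": [(x, text), ...]}
    for box, (text, _score) in sorted(lines, key=lambda entry: centre_y(entry[0])):
        y = centre_y(box)
        if rows and abs(y - rows[-1]["y"]) <= row_tolerance:
            rows[-1]["cells"].append((left_x(box), text))
        else:
            rows.append({"y": y, "cells": [(left_x(box), text)]})
    # Order the cells of each row left to right and keep only the text.
    return [[text for _x, text in sorted(row["cells"])] for row in rows]


def rows_to_csv(rows):
    """Serialize a list of rows to CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Made-up entries for a 2x2 table, deliberately out of order.
sample_cells = [
    [[[200, 10], [300, 10], [300, 40], [200, 40]], ("Price", 0.99)],
    [[[10, 12], [100, 12], [100, 42], [10, 42]], ("Item", 0.98)],
    [[[10, 60], [100, 60], [100, 90], [10, 90]], ("Apple", 0.97)],
    [[[200, 62], [300, 62], [300, 92], [200, 92]], ("1.20", 0.96)],
]

print(rows_to_csv(group_into_rows(sample_cells)))
```

This approach breaks down for merged cells, skewed scans, or rows with uneven heights, which is where a dedicated structure model is the better fit.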
