Open Source Python API to Integrate OCR Capabilities

Open Source Python library that allows software developers to easily integrate optical character recognition (OCR) capabilities into their applications.

PaddleOCR is a powerful open source Python library that enables software developers to easily integrate optical character recognition (OCR) capabilities into their Python applications. It is built on top of PaddlePaddle, an open-source deep learning platform, and uses state-of-the-art deep learning models to achieve high accuracy and performance. PaddleOCR simplifies the OCR process by providing a high-level API that abstracts away many of the low-level details, making it easy for developers to add OCR capabilities to their applications.

PaddleOCR supports a wide range of languages and scripts. It currently covers more than 80 languages, including Arabic, Chinese, English, French, German, Japanese, Korean, Russian, and Spanish, which makes it a valuable tool for developers who work with multilingual content. In addition to its OCR capabilities, the library includes a number of utilities for working with images and text: image preprocessing tools such as deskewing and binarization, and post-processing tools for improving the accuracy of the OCR output.

PaddleOCR provides several different OCR models, each optimized for a different task. For example, the Text Detection model locates text regions in an image, while the Text Recognition model recognizes the actual text within those regions. There is also a Model Ensemble feature that allows developers to combine multiple models to achieve even higher accuracy. Overall, PaddleOCR is a powerful and easy-to-use library for adding OCR capabilities to Python applications. Its support for a wide range of languages and scripts, together with its customizable models and post-processing tools, makes it a valuable tool for developers working with OCR.

At A Glance

An overview of PaddleOCR features.

Features Overview

Perform OCR
Add OCR Capabilities
Recognize Image text
Convert images of text
Recognize Font text
Search PDF
Other Languages
Create OCR apps
Save to browser
Extract Text
Multi-threading Support

PaddleOCR

PaddleOCR supports popular image file formats listed below.

Reader

PNG, JPEG, BMP, TIFF, TGA, DICOM

Writer

PNG, JPEG, BMP, TIFF


Platform Independence

PaddleOCR is a Python package and runs on any platform where PaddlePaddle is available, including Windows, Linux and macOS. It requires Python 3.

Python 3


Getting Started with PaddleOCR

The recommended way to install PaddleOCR is with pip. PaddleOCR runs on top of PaddlePaddle, so install the paddlepaddle package first if it is not already present, then use the following command.

Install PaddleOCR via pip

 pip install paddleocr

You can also install it manually by downloading the latest release files directly from the GitHub repository.
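After installation, it can help to confirm that both packages are importable before running any OCR code. The following is a minimal sketch using only the standard library; the helper name missing_packages is hypothetical, and it assumes PaddlePaddle is imported under the module name paddle.

```python
import importlib.util


def missing_packages(names=("paddle", "paddleocr")):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("PaddlePaddle and PaddleOCR are importable.")
```

The check only looks the modules up and does not import them, so it returns quickly even though PaddlePaddle itself is slow to load.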

Image Text Recognition via PaddleOCR API

Image text recognition is the process of extracting text from images. It is a useful technique for applications such as document scanning and digitization. The open-source PaddleOCR API provides a set of state-of-the-art OCR models that can recognize text in many kinds of images, including scanned documents, screenshots, and photographs. The library supports the main steps of image text recognition: loading an image, initializing an OCR model, identifying text regions in the image, recognizing the text in those regions, and extracting the text from the result. The following example shows how to recognize text from an image inside Python applications.

Perform Image Text Recognition inside Python Projects

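from paddleocr import PaddleOCR

# create the OCR engine; lang='en' selects the English models
ocr = PaddleOCR(lang='en')

# run text detection and recognition on an image file
# (ocr() also accepts a NumPy array, but not a PIL Image object)
result = ocr.ocr('example.jpg')

# access the recognized text; in PaddleOCR 2.x, result[0] holds the entries
# for the first image, each shaped as [bounding_box, (text, confidence)]
for line in result[0]:
    print(line[1][0])  # recognized text
    print(line[1][1])  # confidence score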
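Low-confidence entries are often misreads, so a common follow-up step is to keep only the text above a chosen threshold. The sketch below assumes the PaddleOCR 2.x result layout for one image, a list of [bounding_box, (text, confidence)] entries; the function name extract_text, the 0.5 default threshold, and the sample data are illustrative assumptions, not part of the library.

```python
def extract_text(page_result, min_confidence=0.5):
    """Return recognized strings whose confidence is at least min_confidence.

    page_result is assumed to be the entry list for one image:
    [[bounding_box, (text, confidence)], ...]
    """
    if not page_result:  # None or empty list: nothing was detected
        return []
    return [text for _box, (text, score) in page_result if score >= min_confidence]


# Hand-written sample in the assumed layout (not real OCR output)
sample = [
    [[[10, 10], [120, 10], [120, 40], [10, 40]], ("Invoice", 0.98)],
    [[[10, 60], [90, 60], [90, 90], [10, 90]], ("T0tal", 0.41)],
]
print(extract_text(sample))  # prints ['Invoice']
```

With a real result, the call would be extract_text(result[0]) for the first image.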

OCR Document Recognition using Python API

Document recognition is one of the most prominent application areas for OCR, since documents are part of almost every daily workflow. Applying OCR to a document makes it possible to retrieve important information, extract form fields, analyze layout, store the content digitally, and read old manuscripts. The open-source PaddleOCR library allows software developers to load various types of documents, run OCR on them, and extract the recognized text using Python code. It also detects special characters and spaces in the recognized text.

Perform OCR Document Recognition using Python API

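from paddleocr import PaddleOCR
ocr = PaddleOCR(lang='en')
img_path = './input_images/11-document-1.jpg'
result = ocr.ocr(img_path)
# Display the output: one [bounding_box, (text, confidence)] entry per text line (2.x format)
for line in result[0]:
    print(line)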
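Retrieving form fields from a recognized document usually needs a small amount of post-processing on top of the OCR output. The sketch below is one simple approach, assuming the PaddleOCR 2.x entry layout and assuming each field appears on a single line as "Label: value"; the function name extract_form_fields and the sample data are hypothetical.

```python
def extract_form_fields(page_result):
    """Build a dict from recognized lines that look like 'Label: value'."""
    fields = {}
    for _box, (text, _score) in page_result or []:
        # Split on the first colon only, so values may contain colons
        label, separator, value = text.partition(":")
        if separator and label.strip() and value.strip():
            fields[label.strip()] = value.strip()
    return fields


# Hand-written sample in the assumed layout (not real OCR output)
sample = [
    [[[10, 10], [200, 10], [200, 40], [10, 40]], ("Name: Jane Doe", 0.99)],
    [[[10, 50], [200, 50], [200, 80], [10, 80]], ("Invoice No: 1042", 0.97)],
    [[[10, 90], [200, 90], [200, 120], [10, 120]], ("Thank you for your order", 0.95)],
]
print(extract_form_fields(sample))
```

Lines without a colon are ignored, so free text such as the last sample entry does not end up in the result. Real forms with labels and values in separate boxes would need position-based matching instead.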

Table Recognition Support inside Python Apps

The open source PaddleOCR library enables software developers to recognize table data inside their Python applications. Table recognition mainly relies on three models: DB for single-line text detection, CRNN for single-line text recognition, and SLANet for predicting the table structure and cell coordinates. The following example runs OCR on an image that contains a table and then uses the draw_ocr function, which takes the image, the bounding boxes, the texts, the scores, and the path to a font file. It returns the image with the bounding boxes and the detected text drawn on it, which can then be converted to a PIL image and displayed with the show method.

Load an Image and Detect Text inside It via Python API

from paddleocr import PaddleOCR, draw_ocr  # draw_ocr ships with PaddleOCR 2.x
from PIL import Image

# Load the image that contains the table
img_path = 'table_image.png'
image = Image.open(img_path).convert('RGB')

# Create an instance of the PaddleOCR object and run OCR on the image
ocr = PaddleOCR()
result = ocr.ocr(img_path)[0]  # entries for the first (and only) image

# Draw the bounding boxes around the detected text regions
boxes = [line[0] for line in result]
texts = [line[1][0] for line in result]
scores = [line[1][1] for line in result]
im_show = draw_ocr(image, boxes, texts, scores, font_path='arial.ttf')

# draw_ocr returns a NumPy array, so convert it to a PIL image before displaying it
Image.fromarray(im_show).show()
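When only plain detection and recognition output is available, a rough table layout can still be approximated from the box positions. The sketch below is a simple geometric heuristic and not the SLANet structure model: it groups entries into rows by vertical centre and orders each row left to right, then writes the rows as CSV. The function names, the 15-pixel tolerance, and the sample data are illustrative assumptions, and the input is assumed to follow the PaddleOCR 2.x entry layout.

```python
import csv
import io


def rows_from_boxes(page_result, row_tolerance=15):
    """Group OCR entries into table rows using the vertical centre of each box."""
    cells = []
    for box, (text, _score) in page_result or []:
        center_y = sum(point[1] for point in box) / len(box)
        left_x = min(point[0] for point in box)
        cells.append((center_y, left_x, text))
    cells.sort(key=lambda cell: cell[0])  # top to bottom

    rows, row_y = [], None
    for center_y, left_x, text in cells:
        # Start a new row when the box is too far below the row's first box
        if row_y is None or abs(center_y - row_y) > row_tolerance:
            rows.append([])
            row_y = center_y
        rows[-1].append((left_x, text))
    # Order the cells of each row from left to right
    return [[text for _x, text in sorted(row, key=lambda c: c[0])] for row in rows]


def rows_to_csv(rows):
    """Serialize rows of strings as CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


def make_box(x, y, width=80, height=20):
    """Build a four-corner box for the hand-written sample below."""
    return [[x, y], [x + width, y], [x + width, y + height], [x, y + height]]


# Hand-written sample in the assumed layout (not real OCR output)
sample = [
    [make_box(120, 12), ("Qty", 0.98)],
    [make_box(10, 10), ("Item", 0.99)],
    [make_box(10, 50), ("Pen", 0.97)],
    [make_box(120, 51), ("3", 0.96)],
]
print(rows_to_csv(rows_from_boxes(sample)))
```

This heuristic works for simple grids with evenly aligned rows. Tables with merged cells or multi-line cells need a structure model such as the one described above.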