Aspose.OCR for Python via .NET

Read & Extract Text from Images via Python API

A powerful Python OCR API that lets developers read and extract text from images, photos, screenshots, scanned documents, and PDF files.

Converting images into editable text has become a vital task for many businesses and developers. Aspose.OCR for Python via .NET provides a robust Optical Character Recognition (OCR) solution that enables software developers to extract text from images with little effort. It is part of the Aspose suite of products, which is known for its document processing tools. The library lets developers use the capabilities of the .NET framework while coding in Python, enabling cross-platform applications that perform OCR operations. It supports recognition of text in over 100 languages, including English, Spanish, French, German, Italian, Chinese, and Japanese.

Aspose.OCR for Python via .NET is a .NET-based OCR library that recognizes and extracts text from various image formats, including JPEG, PNG, GIF, BMP, and TIFF. The API uses advanced algorithms to achieve high recognition accuracy and supports text in different fonts and styles. To further improve accuracy, Aspose.OCR offers preprocessing capabilities such as binarization, deskewing, and noise removal, which prepare images for better OCR results. It can handle multiple images in a single process, enabling batch processing and saving time when large volumes of images need to be processed. With features including multi-language support, image preprocessing, layout analysis, and error handling, Aspose.OCR is a strong choice for software developers working on OCR-based projects.
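
As a rough illustration of the batch scenario described above, the sketch below gathers images in the listed formats (JPEG, PNG, GIF, BMP, TIFF) from a folder and passes them to one recognition call. The helper names collect_images and recognize_batch are hypothetical and not part of the library. The OCR calls mirror the AsposeOcr, OcrInput, and recognize usage shown elsewhere on this page, and indexing the result once per input image is an assumption to verify against your installed version.

```python
from pathlib import Path

# Extensions for the image formats named above (JPEG, PNG, GIF, BMP, TIFF)
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".bmp", ".tif", ".tiff"}


def collect_images(folder):
    """Return sorted paths of supported image files directly inside folder."""
    return sorted(
        str(p) for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
    )


def recognize_batch(paths):
    """Recognize all images in one call; requires aspose-ocr-python-net."""
    import aspose.ocr as ocr  # imported lazily so collect_images works without it

    api = ocr.AsposeOcr()
    ocr_input = ocr.OcrInput(ocr.InputType.SINGLE_IMAGE)
    for path in paths:
        ocr_input.add(path)
    results = api.recognize(ocr_input)
    # Assumption: one result per added image, indexable as in result[0] on this page
    return [results[i].recognition_text for i in range(len(paths))]
```

A call such as recognize_batch(collect_images("scans")) would then return one text string per image, in the same sorted order as the paths.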

At A Glance

An overview of Aspose.OCR for Python via .NET features.

Features Overview

Perform OCR
Add OCR Capabilities
Recognize Image text
Convert images of text
Recognize Font text
Search PDF
27 Recognition Languages
Create OCR apps
Save to browser
Extract Text
Multi-threading Support

Recognize rotated Image
Pre-processing filters
PDF to Images
Recognizes Chinese Characters
Detects Popular typefaces
Processes whole image
Rotated images Support
Batch Recognition
Built-in Spell Checker
Split PDF
PDF to Excel
PDF to SVG

Aspose.OCR for Python via .NET

The API mainly supports the PDF format but can export PDF documents to a number of other formats.

Reader

PDF, PDF/A, TEX, XPS, SVG

Writer

PDF, TXT, PNG, JPEG, PDF/A, DOC, DOCX, TEX, XPS, SVG, XLSX, PPTX

Platform Independence

Aspose.OCR for Python via .NET can be used in any Python-based application.

Python 3.6 and above.
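
A script that depends on the library can guard against older interpreters before importing it. The following is a minimal sketch of such a check; meets_minimum is a hypothetical helper name, and the only fact taken from this page is the 3.6 minimum.

```python
import sys

MINIMUM_PYTHON = (3, 6)  # minimum version stated above


def meets_minimum(version_info=None):
    """Return True if the given (or running) interpreter is at least 3.6."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only the major and minor components
    return tuple(version_info[:2]) >= MINIMUM_PYTHON


if not meets_minimum():
    raise RuntimeError("Python 3.6 or later is required")
```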

Getting Started with Aspose.OCR for Python via .NET

The recommended way to install Aspose.OCR for Python via .NET is with pip. Use the following command for a smooth installation.

Install Aspose.OCR for Python via .NET via pip

 pip install aspose-ocr-python-net

You can also download the SDK directly from the Aspose.OCR Python Cloud SDK product page.

OCR Operations with High Accuracy via Python

Aspose.OCR for Python via .NET is engineered for high precision and accuracy. The library incorporates advanced machine learning models that improve text extraction accuracy, even with skewed or low-resolution images. This feature makes it suitable for applications that require reliable text recognition, such as automated data extraction from scanned forms or documents. The following code snippet showcases a simple implementation where an image is loaded, processed, and its recognized text is displayed.

How to Load Images, Perform OCR and Extract Text via Python API?

from aspose.ocr import AsposeOcr, OcrInput, InputType

# Initialize OCR engine
recognition_engine = AsposeOcr()

# Add image to batch
ocr_input = OcrInput(InputType.SINGLE_IMAGE)
ocr_input.add("sample.png")

# Extract text from image
result = recognition_engine.recognize(ocr_input)
# Display the recognition result
print(result[0].recognition_text)
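
In practice it can help to wrap these steps in a function that fails early when the image file is missing. The sketch below assumes the same AsposeOcr, OcrInput, and InputType names used in the snippet above, accessed through the aspose.ocr package; extract_text is a hypothetical helper name, not part of the library.

```python
import os


def extract_text(image_path):
    """Return the text recognized in image_path (hypothetical helper)."""
    if not os.path.isfile(image_path):
        # Fail before the OCR engine is created
        raise FileNotFoundError(f"Image not found: {image_path}")

    # Imported here so the file check runs even if the package is missing
    import aspose.ocr as ocr

    api = ocr.AsposeOcr()
    ocr_input = ocr.OcrInput(ocr.InputType.SINGLE_IMAGE)
    ocr_input.add(image_path)
    result = api.recognize(ocr_input)
    return result[0].recognition_text
```

Checking the path first keeps the error message specific to the missing file instead of surfacing as a failure inside the recognition call.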

Image Preprocessing Capabilities

The Aspose.OCR for Python via .NET library provides image preprocessing features for Python applications that enhance OCR accuracy, such as skew correction, noise removal, and image normalization. These preprocessing steps are crucial when working with images that were scanned under suboptimal conditions. The following example demonstrates how developers can apply skew correction, which helps text to be detected accurately even if the image is slightly tilted or not perfectly aligned.

How to Perform OCR Operation with Skew Correction inside Python Apps?

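import aspose.ocr as ocr

api = ocr.AsposeOcr()

# Enable skew correction as a preprocessing filter.
# Assumption: auto_skew() is the filter name; verify it against your installed version.
filters = ocr.models.preprocessingfilters.PreprocessingFilter()
filters.add(ocr.models.preprocessingfilters.PreprocessingFilter.auto_skew())

# Attach the filters to the input and add the tilted image
ocr_input = ocr.OcrInput(ocr.InputType.SINGLE_IMAGE, filters)
ocr_input.add("skewed_image.png")

result = api.recognize(ocr_input)
print("Corrected and Recognized Text:", result[0].recognition_text)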

Handwritten Text Recognition via Python

Aspose.OCR for Python via .NET can recognize both printed and handwritten text with just a couple of lines of Python code. This feature is useful for applications such as digitizing handwritten notes or signatures. The library's ability to interpret various styles of text enhances its utility in sectors like education and legal documentation. The following example shows how to perform handwritten text recognition using Python code.

How to Perform Handwritten Text Recognition via Python Library?

import aspose.ocr as ocr

api = ocr.AsposeOcr()

# Add preprocessing filters if needed
filters = ocr.models.preprocessingfilters.PreprocessingFilter()
# filters.add(ocr.models.preprocessingfilters.PreprocessingFilter.contrast_correction_filter())

# Initialize the image collection and add images to it
ocr_input = ocr.OcrInput(ocr.InputType.SINGLE_IMAGE, filters)
ocr_input.add("Data\\OCR\\handwritten.jpg")

# Change recognition options if needed
# (shown for reference; they are not passed to the call below)
settings = ocr.RecognitionSettings()
settings.detect_areas_mode = ocr.DetectAreasMode.PHOTO

# Run recognition
res = api.recognize_handwritten_text(ocr_input)
print(res[0].recognition_text)

OCR Custom Image Regions in Python Apps

Aspose.OCR for Python via .NET supports recognizing text from a particular area of an image inside Python applications. Software developers can specify regions within an image for OCR, which is helpful when only a portion of the image contains relevant text. They can also set recognition modes and adjust other parameters to optimize the OCR process for specific application requirements. The following example shows how to recognize a single line of text with just a couple of lines of Python code.

How to Recognize a Single Line of Text on Image via Python Apps?

from aspose.ocr import AsposeOcr, OcrInput, InputType, RecognitionSettings

api = AsposeOcr()

# Create OcrInput and add images
ocr_input = OcrInput(InputType.SINGLE_IMAGE)
ocr_input.add("sample_line.png")

# Recognize without region detection: treat the image as a single line of text
settings = RecognitionSettings()
settings.recognize_single_line = True

result = api.recognize(ocr_input, settings)

print(result[0].recognition_text)