Aspose.OCR for .NET

.NET API for Optical Character Recognition

Add Optical Character Recognition (OCR) functionality to your C# applications and convert printed or handwritten text to digital format.

Optical Character Recognition (OCR) transforms images of text and scanned printed documents into machine-readable text. Aspose.OCR for .NET is an OCR library that lets software developers convert printed and handwritten documents into digital text that is easier to search, edit, and share. The library includes an image reader that handles popular formats such as JPEG, PNG, TIFF, GIF, and BMP images, PDF documents, DjVu, and more. Developers can also save recognition results in popular document and data exchange formats.

Aspose.OCR for .NET enables software developers to add OCR functionality to their .NET applications without any external dependencies. The library recognizes text in scanned documents, images, handwritten text, smartphone photos, screenshots, specific areas of images, and other sources, and converts it into editable text. It supports more than 26 languages, including English, Chinese, Korean, Spanish, French, German, Italian, Bulgarian, Kazakh, Russian, Japanese, and Arabic.

Aspose.OCR for .NET also supports barcode recognition, allowing software developers to read popular barcode formats such as QR codes and UPC codes. Its pre-processing filters let programmers recognize rotated, skewed, and noisy images with just a couple of lines of C# code. The library can be integrated with other Aspose libraries, such as Aspose.PDF and Aspose.Words, to build document processing workflows. It can also recognize images provided as web links and perform batch recognition of all images in a folder or archive.

At A Glance

An overview of Aspose.OCR for .NET features.

Features Overview

Perform OCR
Add OCR Capabilities
Recognize Image text
Convert images of text
Recognized Font text
Search PDF
27 Recognition Languages
Create OCR apps
Save to browser
Extract Text
Multi-threading Support

Recognize rotated Image
Pre-processing filters
PDF to Images
Recognizes Chinese Chars
Detects Popular typefaces
Processes whole image
Rotated images Support
Batch Recognition
Built-in Spell Checker
Split PDF
PDF to Excel
PDF to SVG

API mainly supports PDF format but can export PDF documents to a number of other formats.

Reader

PDF, PDF/A, TEX, XPS, SVG

Writer

PDF, TXT, PNG, JPEG, PDF/A, DOC, DOCX, TEX, XPS, SVG, XLSX, PPTX

Platform Independence

Aspose.OCR for .NET works with any .NET-based programming language.

.NET

Getting Started with Aspose.OCR for .NET

The recommended way to install Aspose.OCR for .NET is via NuGet. Use the following command:

Install Aspose.OCR via NuGet Command

 Install-Package Aspose.OCR

You can also download the library directly from the Aspose.OCR product page.

Detect Particular Area of an Image via C#

Aspose.OCR for .NET can detect particular areas of an image inside .NET applications. A scanned image or photograph may contain text paragraphs, tables, illustrations, formulas, and more. Detecting, ordering, and classifying areas of interest on a page is the cornerstone of successful and accurate OCR. The library includes several document area detection algorithms, each suited to a particular type of content. The following example shows how to load an image and select an area detection mode for text recognition in C#.

How to Load Image & Detect A Particular Image Area via C# API?

using System;
using System.Collections.Generic;

Aspose.OCR.AsposeOcr recognitionEngine = new Aspose.OCR.AsposeOcr();
// Add an image to OcrInput object
Aspose.OCR.OcrInput input = new Aspose.OCR.OcrInput(Aspose.OCR.InputType.SingleImage);
input.Add("source.png");
// Set document areas detection mode
Aspose.OCR.RecognitionSettings recognitionSettings = new Aspose.OCR.RecognitionSettings();
recognitionSettings.DetectAreasMode = Aspose.OCR.DetectAreasMode.DOCUMENT;
// Recognize image; one result is returned per input image
List<Aspose.OCR.RecognitionResult> results = recognitionEngine.Recognize(input, recognitionSettings);
foreach(Aspose.OCR.RecognitionResult result in results)
{
	// Print the text recognized in this image
	Console.WriteLine(result.RecognitionText);
}

Process Images via .NET API

Aspose.OCR for .NET lets software developers process images inside their own .NET applications before recognition. The library includes several fully automated and manual image processing filters for enhancing images before OCR, such as skew correction, rotation, noise removal, contrast correction, resizing, binarization, conversion to grayscale, color inversion, dilation, and median filtering. To improve recognition quality, developers can chain multiple filters and can restrict a filter to a specific region of an image. Many other options and settings are available for customizing the OCR process.

Apply Filters on Images using C# .NET API

Aspose.Drawing.Rectangle blackRectangle = new Aspose.Drawing.Rectangle(5, 161, 340, 113);
Aspose.OCR.Models.PreprocessingFilters.PreprocessingFilter filters = new Aspose.OCR.Models.PreprocessingFilters.PreprocessingFilter();
// (1) Invert black region
filters.Add(Aspose.OCR.Models.PreprocessingFilters.PreprocessingFilter.Invert(blackRectangle));
// (2) Denoise entire image
filters.Add(Aspose.OCR.Models.PreprocessingFilters.PreprocessingFilter.AutoDenoising());
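
To make concrete what a region-restricted filter such as color inversion does, here is a minimal sketch in plain C#, independent of Aspose.OCR. The RegionInvert class, its Invert signature, and the 8-bit grayscale row-major buffer are all assumptions made for illustration; the sketch does not show how the library implements its filters.

```csharp
using System;

static class RegionInvert
{
    // Hypothetical helper: inverts 8-bit grayscale pixels inside a rectangle, in place.
    // The buffer is row-major, so the pixel at (col, row) is pixels[row * width + col].
    // The rectangle is clipped to the image bounds.
    public static void Invert(byte[] pixels, int width, int height,
                              int x, int y, int w, int h)
    {
        int xStart = Math.Max(0, x);
        int yStart = Math.Max(0, y);
        int xEnd = Math.Min(width, x + w);
        int yEnd = Math.Min(height, y + h);

        for (int row = yStart; row < yEnd; row++)
        {
            for (int col = xStart; col < xEnd; col++)
            {
                int i = row * width + col;
                // Dark pixels become light and vice versa
                pixels[i] = (byte)(255 - pixels[i]);
            }
        }
    }
}
```

Inverting only the dark region turns white-on-black text into black-on-white, while pixels outside the rectangle are left unchanged.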

Text Comparison in Images via .NET API

Aspose.OCR for .NET lets software developers compare the text on two images inside their own C# applications, regardless of font, text size, case, styles, and colors. One way to do this is to extract the text from each image and then apply any text comparison algorithm to the results. The simplest option is the standard .NET String.Equals method, which can also ignore case. The following example demonstrates how to compare the text in two images using C# code.

How to Compare Text in Two Images using .NET API?

 
using System;
using Aspose.OCR;

class Program
{
    // Recognizes a single image and returns its text,
    // using the same calls as the area detection example above
    static string RecognizeText(AsposeOcr engine, string path)
    {
        var input = new OcrInput(InputType.SingleImage);
        input.Add(path);
        var results = engine.Recognize(input, new RecognitionSettings());
        return results[0].RecognitionText;
    }

    static void Main()
    {
        // Extract text from both images with the same engine
        var engine = new AsposeOcr();
        var text1 = RecognizeText(engine, "image1.png");
        var text2 = RecognizeText(engine, "image2.png");

        // Compare the extracted text, ignoring case
        var areEqual = string.Equals(text1, text2, StringComparison.OrdinalIgnoreCase);
        Console.WriteLine("Are the texts equal? " + areEqual);
    }
}
// Note: apart from letter case, the two texts must match exactly for this comparison to succeed.
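
Recognized text from two images can differ in spacing or line breaks even when the visible words are the same, so a strict string comparison may be stricter than intended. The following is a minimal sketch of a more tolerant comparison using only standard .NET calls. The OcrTextComparer class and its Normalize and AreEquivalent methods are hypothetical names introduced for illustration and are not part of Aspose.OCR.

```csharp
using System;
using System.Text.RegularExpressions;

static class OcrTextComparer
{
    // Hypothetical helper: collapses every run of whitespace
    // (spaces, tabs, line breaks) into one space and trims the ends.
    public static string Normalize(string text) =>
        Regex.Replace(text ?? string.Empty, @"\s+", " ").Trim();

    // Compares two recognized texts, ignoring case and whitespace differences.
    public static bool AreEquivalent(string a, string b) =>
        string.Equals(Normalize(a), Normalize(b), StringComparison.OrdinalIgnoreCase);
}
```

For example, `OcrTextComparer.AreEquivalent("Hello  World\r\n", "hello world")` returns true, while texts that differ in any word still compare as different.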

Search Text in An Image using C# API

Aspose.OCR for .NET makes it easy to find text in an image inside .NET applications. Searching an image is as simple as finding a fragment in a string. The library supports case-sensitive and case-insensitive searches, and can also validate image text against a pattern. Developers can use the ImageHasText method to search an image with just a couple of lines of C# code. The following example shows how to load an image and search it for a particular piece of text.

How to Find Text in an Image via .NET?

Aspose.OCR.AsposeOcr recognitionEngine = new Aspose.OCR.AsposeOcr();
Aspose.OCR.RecognitionSettings recognitionSettings = new Aspose.OCR.RecognitionSettings();
recognitionSettings.Language = Aspose.OCR.Language.Ukr;
if(recognitionEngine.ImageHasText("source.png", "Aspose", recognitionSettings))
{
	Console.WriteLine(@"The image contains the word ""Aspose""");
}
else
{
	Console.WriteLine(@"The image doesn't contain the word ""Aspose""");
}
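
The same kinds of checks (case-sensitive search, case-insensitive search, and pattern validation) can also be run on text that has already been recognized. The following is a minimal sketch that uses only standard .NET string and regular expression calls. The RecognizedTextSearch class and its methods are hypothetical names for illustration, and the invoice number pattern in the usage note is an invented example.

```csharp
using System;
using System.Text.RegularExpressions;

static class RecognizedTextSearch
{
    // Hypothetical helper: checks whether the recognized text contains a fragment.
    public static bool Contains(string recognizedText, string fragment, bool ignoreCase = true)
    {
        var comparison = ignoreCase
            ? StringComparison.OrdinalIgnoreCase
            : StringComparison.Ordinal;
        return recognizedText.IndexOf(fragment, comparison) >= 0;
    }

    // Hypothetical helper: checks whether any part of the text matches a regular expression.
    public static bool MatchesPattern(string recognizedText, string pattern) =>
        Regex.IsMatch(recognizedText, pattern);
}
```

With the text "Invoice INV-004217 from Aspose", `Contains(text, "aspose")` succeeds, the case-sensitive `Contains(text, "aspose", ignoreCase: false)` does not, and `MatchesPattern(text, @"INV-\d{6}")` confirms that an invoice number in the expected format is present.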