Aspose.OCR for C++

C++ OCR API for Adding OCR Capabilities

Integrate OCR functionality into C and C++ apps using a free OCR API. It can recognize and extract text from scanned images and PDFs, smartphone photos, screenshots, and selected areas of images.

What is Aspose.OCR for C++?

As more information is digitized, extracting text efficiently from images, scanned documents, and other sources has become crucial. Optical character recognition (OCR) converts visual information into text that you can edit and search. Aspose.OCR for C++ gives developers a full set of features for adding OCR functions to their C++ applications, whether the text comes from scanned documents, pictures, or screenshots. Developers can preprocess images by applying filters, adjusting contrast and brightness, deskewing, and removing noise, among other operations. The library offers a simple API for integrating OCR features into C++ projects and can be combined with other Aspose tools.

Aspose.OCR for C++ includes a set of image tools that make OCR more accurate and efficient. Key features include correcting rotated and blurry images, reading text in many languages, batch recognition of multiple images, recognizing a whole image or only selected areas, identifying characters, words, or paragraphs, recognizing text in various fonts, image preprocessing, and saving recognition results to disk. With a few lines of code, you can set up the OCR engine, load an image or document, and get the text. It is made to work on different platforms, allowing you to create applications for Windows, Linux, and the web.

At A Glance

An overview of Aspose.OCR for C++ features.

Features Overview

Perform OCR
Add OCR Capabilities
Recognize Image text
Convert images of text
Recognized Font text
Search PDF
27 Recognition Languages
Create OCR apps
Save to browser
Extract Text
Multi-threading Support

Recognize rotated Image
Pre-processing filters
PDF to Images
Recognizes Chinese Chars
Detects Popular typefaces
Processes whole image
Rotated images Support
Batch Recognition
Built-in Spell Checker
Split PDF
PDF to Excel
PDF to SVG

Aspose.OCR for C++

API mainly supports PDF format but can export PDF documents to a number of other formats.

Reader

PDF, PDF/A, TEX, XPS, SVG

Writer

PDF, TXT, PNG, JPEG, PDF/A, DOC, DOCX, TEX, XPS, SVG, XLSX, PPTX

Platform Independence

Aspose.OCR for C++ can be used in any C++-based application.

C++ runtime.

Getting Started with Aspose.OCR for C++

The recommended way to install Aspose.OCR for C++ is through NuGet. Use the following command to install it.

Install Aspose.OCR for C++ via NuGet Command

 NuGet\Install-Package Aspose.Ocr.Cpp -Version 23.4.0

You can also download the library directly from the Aspose.OCR product page.

Efficient Text Extraction in C++ Apps

Aspose.OCR for C++ provides a reliable and efficient way to extract text from a wide variety of sources, including scanned documents, images, PDF files, multi-page TIFF files, pixel arrays, and receipts. It uses OCR algorithms to recognize and extract text with high accuracy while preserving the original formatting and structure. The library supports a wide range of languages, which makes it suitable for multilingual applications. The following example shows how to extract text from a TIFF image via the C++ API.

Extracting Text from a TIFF Image via C++ API

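// Path to the source TIFF image
std::string image_path = "source.tiff";
// Buffer that receives the recognized text
const size_t len = 4096;
wchar_t buffer[len] = { 0 };
// Recognition settings: recognize Ukrainian text
RecognitionSettings settings;
settings.language_alphabet = language::ukr;
// Recognize the TIFF image and write the text into the buffer
size_t res_len = aspose::ocr::page_tiff(image_path.c_str(), buffer, len, settings);
std::wcout << buffer;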
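
The fixed-size buffer pattern used in this example can be wrapped in a small helper that returns a std::wstring. The sketch below is not part of Aspose.OCR: recognize_fn, recognize_to_string, and fake_recognize are hypothetical names introduced for illustration. It assumes that the recognition call returns the number of characters written to the buffer, which should be checked against the library documentation. A stand-in recognizer replaces the real call so that the snippet compiles on its own.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical signature: fills `buffer` (capacity `len`) and returns the
// number of characters written. The meaning of the return value is an
// assumption, not a documented guarantee of any library.
using recognize_fn = std::function<std::size_t(wchar_t* buffer, std::size_t len)>;

// Runs a recognizer into a heap buffer and returns the text as std::wstring.
std::wstring recognize_to_string(const recognize_fn& recognize,
                                 std::size_t capacity = 4096) {
    assert(capacity > 0);
    std::vector<wchar_t> buffer(capacity, L'\0');
    std::size_t written = recognize(buffer.data(), buffer.size());
    // Never trust a reported length larger than the buffer that was handed out.
    written = std::min(written, buffer.size());
    return std::wstring(buffer.data(), written);
}

// Stand-in for a real recognition call, used only to make the sketch testable.
std::size_t fake_recognize(wchar_t* buffer, std::size_t len) {
    const std::wstring text = L"sample text";
    const std::size_t n = std::min(text.size(), len);
    std::copy_n(text.begin(), n, buffer);
    return n;
}
```

In an application, the callable passed to recognize_to_string would be a lambda that captures the image path and settings and forwards to the actual recognition function.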

Read Certain Areas of Images via C++

Aspose.OCR for C++ makes it easy for software developers to read a particular area of an image and extract text from that region inside C++ applications. This feature is particularly useful when you only need text from specific sections of an image and want to exclude irrelevant content. Below is an example code snippet demonstrating how to read certain areas of images using Aspose.OCR for C++.

Extract Text from Specific Regions within an Image via C++ API

// Load the image
System::SharedPtr<System::IO::FileStream> imageStream = System::MakeObject<System::IO::FileStream>(u"image.jpg", System::IO::FileMode::Open);
// Initialize OCR engine
System::SharedPtr ocrEngine = System::MakeObject();
// Set the image for OCR
ocrEngine->Image = imageStream;
// Set the rectangle coordinates for the specific area to read
System::SharedPtr areaRect = System::MakeObject(10, 10, 200, 100);
ocrEngine->Config->SetArea(areaRect);
// Perform OCR on the specified area
ocrEngine->Process();
// Retrieve the extracted text from the specific area
System::String extractedText = ocrEngine->Text;
// Display the extracted text
std::cout << "Extracted Text: " << extractedText.ToUtf8String() << std::endl;
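
When a recognition area comes from user input or a layout template, it can extend past the edge of the image. One possible safeguard is to clamp the rectangle to the image bounds before passing it to the OCR engine. The sketch below is a standalone illustration under that assumption: Region and clamp_to_image are hypothetical names, not Aspose.OCR types, and the rectangle is assumed to be given as a top-left corner plus width and height in pixels.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical rectangle type: top-left corner plus width and height, in pixels.
struct Region {
    int x;
    int y;
    int width;
    int height;
};

// Clamps `r` so that it lies entirely inside an image of the given size.
// Returns a region with zero width or height if there is no overlap.
Region clamp_to_image(Region r, int image_width, int image_height) {
    assert(image_width >= 0 && image_height >= 0);
    const int left   = std::max(0, std::min(r.x, image_width));
    const int top    = std::max(0, std::min(r.y, image_height));
    const int right  = std::max(0, std::min(r.x + r.width, image_width));
    const int bottom = std::max(0, std::min(r.y + r.height, image_height));
    return Region{left, top, std::max(0, right - left), std::max(0, bottom - top)};
}
```

For a 640x480 image, the area (10, 10, 200, 100) passes through unchanged, while an area that starts at (600, 450) with the same size is trimmed to 40x30 pixels.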

Image Preprocessing via C++ API

Aspose.OCR for C++ provides a standardized way to prepare your content for OCR and achieve accurate results. The library offers a range of image preprocessing techniques that enhance image quality, correct perspective distortion, remove noise, and optimize the text recognition process. Preprocessing can significantly improve OCR accuracy, especially for challenging images or documents with complex layouts. Multiple preprocessing filters can be applied to the same image to further improve recognition quality.

Remove Noise from Image Automatically before Recognition via C++ API

// Recognition settings with automatic noise removal enabled
std::string image_path = "source.png";
const size_t len = 4096;
wchar_t buffer[len] = { 0 };
RecognitionSettings settings;
settings.auto_denoising = true;
size_t res_len = aspose::ocr::page_settings(image_path.c_str(), buffer, len, settings);
std::wcout << buffer;

// Apply the denoising preprocessing filter and save the processed image
std::string image_path = "source.png";
custom_preprocessing_filters filters_;
filters_.filter_1 = OCR_IMG_PREPROCESS_AUTODENOISING;
asposeocr_preprocess_page_and_save(image_path.c_str(), "result.png", filters_);
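
To give a sense of what noise removal does to an image before recognition, the sketch below implements a generic 3x3 median filter on an 8-bit grayscale image. This is a textbook technique chosen for illustration, not the algorithm that Aspose.OCR uses internally. The function name median_filter_3x3 and the row-major pixel layout are assumptions made for this example.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Applies a 3x3 median filter to a row-major 8-bit grayscale image.
// Border pixels are handled by clamping neighbor coordinates to the image.
std::vector<std::uint8_t> median_filter_3x3(const std::vector<std::uint8_t>& src,
                                            std::size_t width, std::size_t height) {
    assert(src.size() == width * height);
    std::vector<std::uint8_t> dst(src.size());
    const long max_x = static_cast<long>(width) - 1;
    const long max_y = static_cast<long>(height) - 1;
    for (std::size_t y = 0; y < height; ++y) {
        for (std::size_t x = 0; x < width; ++x) {
            std::array<std::uint8_t, 9> window{};
            std::size_t k = 0;
            for (long dy = -1; dy <= 1; ++dy) {
                for (long dx = -1; dx <= 1; ++dx) {
                    // Clamp the neighbor position so edges reuse the nearest pixel.
                    const long ny = std::min(std::max(static_cast<long>(y) + dy, 0L), max_y);
                    const long nx = std::min(std::max(static_cast<long>(x) + dx, 0L), max_x);
                    window[k++] = src[static_cast<std::size_t>(ny) * width +
                                      static_cast<std::size_t>(nx)];
                }
            }
            // The median is the 5th smallest of the 9 samples.
            std::nth_element(window.begin(), window.begin() + 4, window.end());
            dst[y * width + x] = window[4];
        }
    }
    return dst;
}
```

A single bright speck on a uniform background is replaced by the background value, which is the kind of salt-and-pepper noise that tends to confuse character recognition.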

Save Recognition Results in Other Formats

Aspose.OCR for C++ enables software developers to recognize text in numerous popular file formats, such as PDF, JPEG, PNG, TIFF, BMP and more. The API can save recognition results in multiple formats so they can be shared, stored in a database, displayed, or analyzed. Results can be saved as a file, text, JSON, or XML. The library also allows setting recognition confidence thresholds, so developers can filter out text with low confidence levels. This is valuable when dealing with large volumes of text, because only reliable results are kept. The following example shows how to save recognition results to a file using C++.

Save Recognition Results as a Multi-page Document via C++ API

directory dir(".");
const string current_dir = dir.full_name();
const string image = current_dir + "p.png";
const size_t len = 6000;
wchar_t buffer[len] = { 0 };
RecognitionSettings settings;
settings.save_format = file_format::docx;
aspose::ocr::page_save(image.c_str(), "result.docx", settings);
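
The confidence-threshold filtering mentioned in this section can be sketched independently of any OCR engine. In the example below, RecognizedWord and filter_by_confidence are hypothetical names, and the confidence is assumed to be a value between 0 and 1. The actual result structures and confidence scale of Aspose.OCR may differ.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical result record: a recognized word and its confidence in [0, 1].
struct RecognizedWord {
    std::wstring text;
    double confidence;
};

// Returns the words whose confidence is at least `threshold`, in original order.
std::vector<RecognizedWord> filter_by_confidence(const std::vector<RecognizedWord>& words,
                                                 double threshold) {
    assert(threshold >= 0.0 && threshold <= 1.0);
    std::vector<RecognizedWord> kept;
    for (const RecognizedWord& word : words) {
        if (word.confidence >= threshold) {
            kept.push_back(word);
        }
    }
    return kept;
}
```

A higher threshold keeps fewer but more reliable words. The right value depends on the image quality and on how costly a misread word is for the application.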