Aspose.OCR for C++
C++ OCR API for Adding OCR Capabilities
Integrate OCR functionality into C and C++ apps using a free OCR API. It can recognize and extract text from scanned images and PDFs, smartphone photos, screenshots, and areas of images.
As more content moves into digital form, applications increasingly need to extract text from images, scanned documents, and other sources. Optical character recognition (OCR) technology fills this role by converting visual data into editable and searchable text. Aspose.OCR for C++ offers developers a comprehensive toolkit for integrating OCR capabilities into their C++ applications, whether the text comes from scanned documents, images, or screenshots.
Aspose.OCR for C++ offers a rich set of image processing features that improve OCR accuracy and the recognition process. Key features of the library include processing rotated and noisy images, recognizing text in a large number of languages, batch recognition of images, recognizing the whole image or only selected areas, identifying characters, words, or paragraphs, saving recognition results to disk, and image preprocessing support. Software developers can preprocess images by applying filters, adjusting contrast and brightness, deskewing, and removing noise, among other operations.
Aspose.OCR for C++ can be easily integrated into users' applications as well as with other Aspose products. The library provides a straightforward API for incorporating OCR capabilities into C++ projects. With just a few lines of code, developers can initialize the OCR engine, load the image or document, and extract the text. It is designed to be cross-platform and can be used to develop applications for Windows, Linux, and the web. By integrating Aspose.OCR into their projects, software developers can enhance productivity, improve data accessibility, and open up new possibilities for text processing and analysis in their applications.
Getting Started with Aspose.OCR for C++
The recommended way to install Aspose.OCR for C++ is via NuGet. Use the following command to install it.
Install Aspose.OCR for C++ via NuGet Command
NuGet\Install-Package Aspose.Ocr.Cpp -Version 23.4.0
You can also download the library directly from the Aspose.OCR product page.
Efficient Text Extraction in C++ Apps
Aspose.OCR for C++ provides a reliable and efficient approach for extracting text from a wide variety of sources, including scanned documents, images, PDF files, multi-page TIFF files, pixel arrays, receipts, and so on. It uses sophisticated OCR algorithms to recognize and extract text with high accuracy, preserving the original formatting and structure. The library supports a wide range of languages, which makes it suitable for multilingual applications. The following example shows how to extract text from a TIFF image via the C++ API.
Extracting Text from TIFF Image via C++ API
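// Path to the TIFF image to recognize
std::string image_path = "source.tiff";
// Buffer that receives the recognized text
const size_t len = 4096;
wchar_t buffer[len] = { 0 };
// Recognize Ukrainian text
RecognitionSettings settings;
settings.language_alphabet = language::ukr;
// Run recognition and print the result
size_t res_len = aspose::ocr::page_tiff(image_path.c_str(), buffer, len, settings);
std::wcout << buffer;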
Read Certain Areas of Images via C++
Aspose.OCR for C++ makes it easy for software developers to read a particular area of an image and extract text from that area or region inside C++ applications. This feature is particularly useful when you only need text from specific sections of an image and want to exclude irrelevant content. Below is an example code snippet demonstrating how to read certain areas of images using Aspose.OCR for C++.
Extract Text from Specific Regions within an Image via C++ API
// Load the image
System::SharedPtr<System::IO::FileStream> imageStream = System::MakeObject<System::IO::FileStream>(u"image.jpg", System::IO::FileMode::Open);
// Initialize OCR engine
System::SharedPtr ocrEngine = System::MakeObject();
// Set the image for OCR
ocrEngine->Image = imageStream;
// Set the rectangle coordinates for the specific area to read
System::SharedPtr areaRect = System::MakeObject(10, 10, 200, 100);
ocrEngine->Config->SetArea(areaRect);
// Perform OCR on the specified area
ocrEngine->Process();
// Retrieve the extracted text from the specific area
System::String extractedText = ocrEngine->Text;
// Display the extracted text
std::cout << "Extracted Text: " << extractedText.ToUtf8String() << std::endl;
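To illustrate what reading a selected area means at the pixel level, here is a minimal sketch in plain C++ that crops a rectangular region out of a grayscale pixel array. The GrayImage and Region types and the crop function are hypothetical and are not part of Aspose.OCR; the sketch assumes the rectangle is given as x, y, width, height and clamps it to the image bounds.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical types for illustration only; not part of Aspose.OCR.
struct Region {
    std::size_t x, y, width, height;  // assumed order: left, top, width, height
};

struct GrayImage {
    std::size_t width, height;
    std::vector<std::uint8_t> pixels;  // row-major, one byte per pixel
};

// Returns the part of src covered by r, clamped to the image bounds.
// An empty image is returned if r starts outside src or src is malformed.
GrayImage crop(const GrayImage& src, const Region& r) {
    GrayImage out{0, 0, {}};
    if (src.pixels.size() != src.width * src.height) return out;
    if (r.x >= src.width || r.y >= src.height) return out;

    out.width = std::min(r.width, src.width - r.x);
    out.height = std::min(r.height, src.height - r.y);
    out.pixels.reserve(out.width * out.height);

    for (std::size_t row = 0; row < out.height; ++row) {
        // Start of the requested span inside the current source row
        auto begin = src.pixels.begin() +
                     static_cast<std::ptrdiff_t>((r.y + row) * src.width + r.x);
        out.pixels.insert(out.pixels.end(), begin,
                          begin + static_cast<std::ptrdiff_t>(out.width));
    }
    return out;
}
```

Passing only the cropped pixels to a recognition step keeps content outside the region from influencing the result, which is the idea behind area-based recognition.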
Image Preprocessing via C++ API
Aspose.OCR for C++ provides a standardized way to prepare your content for OCR and achieve accurate results. The library offers a range of image preprocessing techniques that enhance image quality, correct perspective distortion, and remove noise before recognition. Preprocessing can significantly improve OCR accuracy, especially for challenging images or documents with complex layouts. Multiple preprocessing filters can be applied to the same image to further improve recognition quality.
Remove Noise from Image Automatically before Recognition via C++ API
// Enable automatic denoising through the recognition settings
std::string image_path = "source.png";
const size_t len = 4096;
wchar_t buffer[len] = { 0 };
RecognitionSettings settings;
settings.auto_denoising = true;
size_t res_len = aspose::ocr::page_settings(image_path.c_str(), buffer, len, settings);
std::wcout << buffer;
// Apply the denoising preprocessing filter and save the result
// (reuses image_path declared above)
custom_preprocessing_filters filters_;
filters_.filter_1 = OCR_IMG_PREPROCESS_AUTODENOISING;
asposeocr_preprocess_page_and_save(image_path.c_str(), "result.png", filters_);
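For readers curious about what a noise-removal filter does, the following is a minimal sketch of a 3x3 median filter in plain C++. It is an illustration of the general technique only and is not how Aspose.OCR implements its denoising; the median_denoise function and its row-major grayscale input are assumptions made for this example.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of Aspose.OCR.
// Applies a 3x3 median filter to a row-major 8-bit grayscale image.
// Border pixels are copied unchanged; malformed input is returned as-is.
std::vector<std::uint8_t> median_denoise(const std::vector<std::uint8_t>& pixels,
                                         std::size_t width, std::size_t height) {
    std::vector<std::uint8_t> out(pixels);
    if (pixels.size() != width * height || width < 3 || height < 3) return out;

    for (std::size_t y = 1; y + 1 < height; ++y) {
        for (std::size_t x = 1; x + 1 < width; ++x) {
            // Collect the 3x3 neighbourhood around (x, y)
            std::array<std::uint8_t, 9> window{};
            std::size_t k = 0;
            for (std::size_t dy = 0; dy < 3; ++dy)
                for (std::size_t dx = 0; dx < 3; ++dx)
                    window[k++] = pixels[(y + dy - 1) * width + (x + dx - 1)];

            // The median of 9 values is the 5th smallest (index 4)
            std::nth_element(window.begin(), window.begin() + 4, window.end());
            out[y * width + x] = window[4];
        }
    }
    return out;
}
```

A median filter replaces isolated dark or bright specks with the typical value of their surroundings while keeping edges reasonably sharp, which is why it is a common choice for cleaning scanned pages.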
Save Recognition Results in Other Formats
Aspose.OCR for C++ enables software developers to recognize text from numerous popular file formats, such as PDF, JPEG, PNG, TIFF, BMP and more. The API allows developers to save recognition results in multiple formats so they can be shared, stored in a database, displayed, or analyzed. Results can be saved as a file, plain text, JSON or XML. The library also allows setting recognition confidence thresholds, so that text with low confidence levels can be filtered out. This is particularly valuable when dealing with large volumes of text, where only reliable results should be kept. The following example shows how to save recognition results as a file using C++.
Save Recognition Results as a Multi-page Document via C++ API
directory dir(".");
const string current_dir = dir.full_name();
const string image = current_dir + "p.png";
const size_t len = 6000;
wchar_t buffer[len] = { 0 };
RecognitionSettings settings;
settings.save_format = file_format::docx;
aspose::ocr::page_save(image.c_str(), "result.docx", settings);
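The confidence threshold idea mentioned above can be sketched in plain C++ as follows. This is an illustration under stated assumptions, not the Aspose.OCR API: the RecognizedWord struct, the 0.0 to 1.0 confidence range, and the filter_by_confidence and to_json helpers are all hypothetical names introduced for this example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical result type; Aspose.OCR's own structures may differ.
struct RecognizedWord {
    std::string text;
    double confidence;  // assumed range: 0.0 (no confidence) to 1.0 (certain)
};

// Keeps only the words whose confidence is at or above the threshold.
std::vector<RecognizedWord> filter_by_confidence(
        const std::vector<RecognizedWord>& words, double threshold) {
    std::vector<RecognizedWord> kept;
    for (const auto& w : words)
        if (w.confidence >= threshold) kept.push_back(w);
    return kept;
}

// Escapes double quotes and backslashes only; control characters are not handled.
std::string escape_json(const std::string& s) {
    std::string out;
    for (char c : s) {
        if (c == '"' || c == '\\') out += '\\';
        out += c;
    }
    return out;
}

// Serializes the word texts as a compact JSON array of objects.
std::string to_json(const std::vector<RecognizedWord>& words) {
    std::string json = "[";
    for (std::size_t i = 0; i < words.size(); ++i) {
        if (i > 0) json += ",";
        json += "{\"text\":\"" + escape_json(words[i].text) + "\"}";
    }
    json += "]";
    return json;
}
```

Filtering before serialization keeps low-confidence fragments out of whatever is stored or indexed downstream; the right threshold depends on the image quality and on how costly a wrong word is for the application.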