Aspose.OCR for Java

 
 

Java OCR API for Optical Character Recognition

Develop applications with Optical Character Recognition (OCR) capabilities using a Java API. Recognize text from scanned documents, images, and other sources.

Aspose.OCR for Java is a leading OCR (Optical Character Recognition) engine that lets software developers integrate text recognition into their Java applications. It is designed to be simple to use: developers can recognize text from scanned documents, images, and other sources, and extract text from various file formats, including JPEG, PNG, BMP, TIFF, HTML, PDF and many others.

Aspose.OCR for Java uses advanced OCR algorithms to accurately recognize text from images, even in low-quality scans or photographs. It supports more than 50 languages, including English, Spanish, French, German, Chinese and more. Notable features of the library include handwritten text recognition, text extraction from screenshots, text extraction from specific areas of an image, searchable PDF creation, support for smartphone photos, noise removal, image binarization, improved accessibility, and many more.

The library can enhance accessibility for users by converting scanned documents into searchable digital documents with ease. With its advanced OCR algorithms, multiple language support, and easy integration, Aspose.OCR for Java is a practical choice for developers looking to add OCR functionality to their Java applications.
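
Image binarization, one of the preprocessing features listed above, converts every pixel to pure black or pure white so that text stands out from the background. The following is a minimal sketch of the general idea using only the standard Java library. It is not the algorithm Aspose.OCR uses internally; the class name BinarizeSketch, the method binarize, and the fixed threshold of 128 are assumptions made for illustration.

```java
import java.awt.image.BufferedImage;

/** Illustrative fixed-threshold binarization; NOT Aspose.OCR's internal algorithm. */
class BinarizeSketch {

    /** Returns a new image in which every pixel is either pure black or pure white. */
    static BufferedImage binarize(BufferedImage src, int threshold) {
        BufferedImage out = new BufferedImage(
                src.getWidth(), src.getHeight(), BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < src.getHeight(); y++) {
            for (int x = 0; x < src.getWidth(); x++) {
                int rgb = src.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF;
                int g = (rgb >> 8) & 0xFF;
                int b = rgb & 0xFF;
                // Weighted brightness (integer approximation of the common luma weights)
                int luma = (299 * r + 587 * g + 114 * b) / 1000;
                // Pixels at or above the threshold become white, the rest black
                out.setRGB(x, y, luma >= threshold ? 0xFFFFFF : 0x000000);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Tiny hypothetical input: one dark gray and one light gray pixel
        BufferedImage img = new BufferedImage(2, 1, BufferedImage.TYPE_INT_RGB);
        img.setRGB(0, 0, 0x303030);
        img.setRGB(1, 0, 0xD0D0D0);
        BufferedImage bw = binarize(img, 128);
        System.out.printf("%06X %06X%n",
                bw.getRGB(0, 0) & 0xFFFFFF, bw.getRGB(1, 0) & 0xFFFFFF);
    }
}
```

A fixed threshold is the simplest possible choice. Production OCR engines typically pick the threshold adaptively, so treat this only as a way to see what the term means.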

Getting Started with Aspose.OCR for Java

The recommended way to install Aspose.OCR for Java is through the Aspose Maven repository. Add the following repository configuration to your pom.xml for a smooth installation.

Install Aspose.OCR for Java via Maven Repository

<repositories>
	<repository>
		<id>AsposeJavaAPI</id>
		<name>Aspose Java API</name>
		<url>http://repository.aspose.com/repo/</url>
	</repository>
</repositories>

You can also download the library directly from the Aspose.OCR product page.

Extract Text from Images via Java API

Aspose.OCR for Java includes features that let software developers extract text from various types of images inside Java applications. The library can read text from raster images such as JPEG, PNG, WBMP, BMP, GIF and many more. Other text extraction features include reading text from multi-page TIFF images, extracting text from a pixel array, reading images in the fastest recognition mode, recognizing a single line, extracting text from receipts and many more. The following example shows how to extract text from an image using Java code.

How to Extract Text from Image using Java API?

AsposeOCR api = new AsposeOCR();
// Customize recognition
RecognitionSettings recognitionSettings = new RecognitionSettings();
recognitionSettings.setLanguage(Language.Ukr);
// Extract text from image
RecognitionResult result = api.RecognizePage("source.png", recognitionSettings);
// Show non-critical recognition problems
result.warnings.forEach((w) -> {
	System.out.println(w);
});
// Get recognition results as JSON
String resultJson = result.GetJson();

Read Specific Areas of an Image via Java API

Aspose.OCR for Java lets software developers find and read only particular areas of an image, rather than all of its text, with a couple of lines of Java code. This is especially helpful when batch processing uniform documents such as visas, driver’s licenses, ID cards, and so on. Supported features include extracting text inside an image rectangle, extracting lines with coordinates, automatic search for word and line bounding boxes, and many more. The following example shows how to extract text inside a rectangle using Java code.

How to Extract Text inside a Rectangle via Java API?

import java.awt.Rectangle;
import java.util.ArrayList;

AsposeOCR api = new AsposeOCR();
// Define image regions as (x, y, width, height) in pixels
ArrayList<Rectangle> regions = new ArrayList<>();
regions.add(new Rectangle(231,101,430,42));
regions.add(new Rectangle(546,224,123,26));
// Specify recognition settings
RecognitionSettings recognitionSettings = new RecognitionSettings();
recognitionSettings.setAutoDenoising(true);
recognitionSettings.setRecognitionAreas(regions);
// Extract text from selected regions
RecognitionResult result = api.RecognizePage("source.png", recognitionSettings);
// Results are returned in the same order as the regions were added
System.out.println("Name: " + result.recognitionAreasText.get(0));
System.out.println("Expiry: " + result.recognitionAreasText.get(1));
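
When the same region template is reused across a batch of scans, the images may not all have the same dimensions. The sketch below uses only the standard java.awt.Rectangle class to clip each region to the image bounds and drop regions that fall entirely outside the image before they are passed to the OCR engine. The class name RegionSketch, the method clipToImage, and the 600x400 image size are hypothetical, and whether Aspose.OCR needs such a check at all is an assumption rather than documented behavior.

```java
import java.awt.Rectangle;
import java.util.ArrayList;
import java.util.List;

/** Illustrative helper; not part of the Aspose.OCR API. */
class RegionSketch {

    /** Clips each region to the image bounds and drops regions entirely outside it. */
    static List<Rectangle> clipToImage(List<Rectangle> regions, int imageWidth, int imageHeight) {
        Rectangle bounds = new Rectangle(0, 0, imageWidth, imageHeight);
        List<Rectangle> clipped = new ArrayList<>();
        for (Rectangle r : regions) {
            // intersection() yields a non-positive width or height when nothing overlaps
            Rectangle visible = r.intersection(bounds);
            if (!visible.isEmpty()) {
                clipped.add(visible);
            }
        }
        return clipped;
    }

    public static void main(String[] args) {
        List<Rectangle> regions = new ArrayList<>();
        regions.add(new Rectangle(231, 101, 430, 42)); // x, y, width, height
        regions.add(new Rectangle(546, 224, 123, 26));
        regions.add(new Rectangle(700, 10, 50, 50));   // lies outside a 600x400 image

        // Hypothetical image size; in practice read it from the loaded image
        for (Rectangle r : clipToImage(regions, 600, 400)) {
            System.out.println(r.x + "," + r.y + " " + r.width + "x" + r.height);
        }
    }
}
```

Clipping changes which pixels a region covers, so for fixed-layout documents it may be safer to reject an out-of-bounds region outright than to silently shrink it.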

Select Document Specific Areas via Java API

A document image may contain numerous blocks of various content such as text paragraphs, drawings, diagrams, formulas, tables, maps and so on. Aspose.OCR for Java enables software developers to detect particular areas of interest on a page and perform OCR on them. The library supports automatic area detection, which you can override by manually selecting areas of interest. The following example demonstrates how software developers can enable automatic document area detection inside their Java applications.

Automatic Document Areas Detection via Java API

// Create instance of OCR API
AsposeOCR api = new AsposeOCR();
// Create recognition settings
RecognitionSettings recognitionSettings = new RecognitionSettings();
// Limit the allowed characters to the Latin alphabet
recognitionSettings.setAllowedCharacters(CharactersAllowedType.LATIN_ALPHABET);
// Enable automatic document areas detection
recognitionSettings.setDetectAreas(true);
// Extract text from image
RecognitionResult result = api.RecognizePage("source.png", recognitionSettings);
System.out.println("Recognition result:\n" + result.recognitionText + "\n\n");
