Open Source Ruby Library for OCR operations on Images

Free Ruby Optical Character Recognition (OCR) Library for Loading, Reading and Converting Images, PDFs, or Scanned documents to text

Optical Character Recognition (OCR) technology has advanced significantly in recent years, making it easier to automate data extraction, improve data accessibility, and raise productivity across many domains. One such tool is the OcrSpace Ruby library, which gives developers a straightforward way to integrate OCR capabilities into their Ruby applications. It simplifies extracting text from images, scanned documents, and PDF files, making it a convenient choice for developers who need OCR.

The library's interface is simple and intuitive. Its key features include loading various types of images, extracting text from images, converting scanned documents to text, extracting text from documents written in multiple languages, recognizing text in low-resolution images, working with handwritten content, retrieving the coordinates of recognized text, detecting page numbers, and identifying specific areas of interest.

The OcrSpace Ruby library is a wrapper around the Ocr.Space OCR API, which offers reliable and accurate text extraction from images and PDF documents. Built specifically for Ruby developers, this library simplifies the integration process, allowing programmers to quickly incorporate OCR functionality into their applications without dealing with the complexities of the underlying API. Whether you are building a document management system, automating data extraction, or enhancing accessibility, the OcrSpace Ruby library is an invaluable tool that streamlines the OCR process and empowers your applications with accurate text extraction capabilities.

At A Glance

An overview of OcrSpace features.

Features Overview

Perform OCR
Add OCR Capabilities
Recognize Image text
Load Images via URL
Convert PDF to text
Recognized Font text
Search PDF
Other Languages
Create OCR apps
Save to browser
Extract Text
Multi-threading Support

OcrSpace

OcrSpace supports the popular image file formats listed below.

Reader

PNG, JPEG, BMP, TIFF, TGA, DICOM

Writer

PNG, JPEG, BMP, TIFF

Platform Independence

OcrSpace requires only a Ruby runtime.

Ruby 5.1 and above.

Getting Started with OcrSpace

The recommended way to install OcrSpace is through RubyGems. Use the following command to install it.

Install OcrSpace via RubyGems

$ gem install ocr_space

Alternatively, you can download the library from its GitHub repository.

Convert Images from URL to Text via Ruby API

The open source OcrSpace library can load various types of images and convert them to text with a couple of lines of Ruby code. It supports extracting text from images, scanned documents, and PDF files, so developers can use it to process invoices, receipts, and other kinds of documents. The following example shows how to convert an image at a URL to text: convert returns the raw API result, while clean_convert returns only the recognized text.

Convert Images from URL via Ruby API

require 'ocr_space'

# Assumption: the client is created as in the gem's README; use your own OCR.space API key.
resource = OcrSpace::Resource.new(apikey: "YOUR API KEY")

# Raw result
result = resource.convert url: "http://bit.ly/2ih9aXt"

puts result
# => [{"TextOverlay"=>{"Lines"=>[], "HasOverlay"=>false, "Message"=>"Text overlay is not provided as it is not requested"}, "FileParseExitCode"=>1, "ParsedText"=>"If you want to find the secrets of the \r\nuniverse, think in terms of energy, \r\nfrequency and vibration. \r\nAZ QUOTES \r\n", "ErrorMessage"=>"", "ErrorDetails"=>""}]

# Recognized text only
result = resource.clean_convert url: "http://bit.ly/2ih9aXt"

puts result
# => If you want to find the secrets of the universe, think in terms of energy, frequency and vibration. AZ QUOTES
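
The raw result shown above is a list of hashes, each carrying a ParsedText field. As a minimal sketch of how that structure could be post-processed without calling the API again, the snippet below works on a copy of the sample output. The helper name extract_text is hypothetical and not part of the gem; it only approximates what clean_convert appears to return.

```ruby
# Sample raw response, copied from the convert output shown above.
raw_result = [
  {
    "TextOverlay" => {
      "Lines" => [],
      "HasOverlay" => false,
      "Message" => "Text overlay is not provided as it is not requested"
    },
    "FileParseExitCode" => 1,
    "ParsedText" => "If you want to find the secrets of the \r\nuniverse, think in terms of energy, \r\nfrequency and vibration. \r\nAZ QUOTES \r\n",
    "ErrorMessage" => "",
    "ErrorDetails" => ""
  }
]

# Hypothetical helper (not part of the gem): joins the ParsedText of every
# entry and collapses whitespace runs, including "\r\n", into single spaces.
def extract_text(pages)
  pages
    .map { |page| page.fetch("ParsedText", "") }
    .join(" ")
    .gsub(/\s+/, " ")
    .strip
end

puts extract_text(raw_result)
```

Keeping the raw result and deriving the clean text locally can be useful when the other fields, such as FileParseExitCode, are also needed.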

Advanced OCR Capabilities via Ruby API

The open source OcrSpace library also includes advanced features for handling OCR operations inside Ruby applications. It can recognize text in low-resolution images, distorted text, and even handwritten content. Apart from text extraction, the library lets developers extract other information from documents, such as the coordinates of the recognized text, page numbers, and specific areas of interest.
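
To illustrate what working with text coordinates might look like, here is a rough sketch that walks a TextOverlay structure and collects one bounding box per word. The field names (Lines, Words, WordText, Left, Top, Width, Height) are assumed from the OCR.space API response format, the sample values are made up, and word_boxes and WordBox are hypothetical names. How to request overlay data through the gem is not shown here.

```ruby
# Hypothetical overlay data, shaped like the "TextOverlay" field of a
# response when overlay output is available. All values are made up.
overlay = {
  "HasOverlay" => true,
  "Lines" => [
    {
      "Words" => [
        { "WordText" => "Invoice", "Left" => 10, "Top" => 12, "Height" => 14, "Width" => 60 },
        { "WordText" => "#42", "Left" => 76, "Top" => 12, "Height" => 14, "Width" => 28 }
      ],
      "MaxHeight" => 14,
      "MinTop" => 12
    }
  ]
}

# Simple value object for one recognized word and its bounding box.
WordBox = Struct.new(:text, :left, :top, :width, :height)

# Hypothetical helper (not part of the gem): flattens every line's words
# into a list of WordBox entries. Returns [] when no overlay is present.
def word_boxes(overlay)
  return [] unless overlay && overlay["HasOverlay"]

  overlay.fetch("Lines", []).flat_map do |line|
    line.fetch("Words", []).map do |word|
      WordBox.new(word["WordText"], word["Left"], word["Top"], word["Width"], word["Height"])
    end
  end
end

word_boxes(overlay).each do |box|
  puts "#{box.text}: x=#{box.left}, y=#{box.top}, #{box.width}x#{box.height}"
end
```

A flat list of boxes like this can then be filtered by position to pick out an area of interest on the page.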

How to Extract Text from an Uploaded File via Ruby API?

# resource is an initialized OcrSpace client
result = resource.convert file: "/Users/suyesh/Desktop/nicola_tesla.jpg"

puts result # Raw result
# => {"TextOverlay"=>{"Lines"=>[], "HasOverlay"=>false, "Message"=>"Text overlay is not provided as it is not requested"}, "FileParseExitCode"=>1, "ParsedText"=>"If you want to find the secrets of the \r\nuniverse, think in terms of energy, \r\nfrequency and vibration. \r\nAZ QUOTES \r\n", "ErrorMessage"=>"", "ErrorDetails"=>""}

result = resource.clean_convert file: "/Users/suyesh/Desktop/nicola_tesla.jpg"

puts result
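
The raw results shown above carry FileParseExitCode, ErrorMessage, and ErrorDetails fields alongside the text. The following sketch shows one way an application might check them before trusting the output. It assumes that an exit code of 1 signals success, as in the sample output and in the OCR.space API documentation; failed_pages is a hypothetical helper, and the failing entry in the sample data is made up.

```ruby
# Hypothetical helper (not part of the gem): returns the entries that did
# not parse cleanly. Accepts a single hash or an array of hashes, since both
# shapes appear in the sample output. Works whether ErrorMessage is a
# string, an array, or nil.
def failed_pages(result)
  pages = result.is_a?(Array) ? result : [result]
  pages.reject do |page|
    page["FileParseExitCode"] == 1 && Array(page["ErrorMessage"]).join.empty?
  end
end

# Made-up sample data for illustration only.
sample = [
  { "FileParseExitCode" => 1, "ParsedText" => "AZ QUOTES", "ErrorMessage" => "", "ErrorDetails" => "" },
  { "FileParseExitCode" => -10, "ParsedText" => "", "ErrorMessage" => "Parse error", "ErrorDetails" => "" }
]

failures = failed_pages(sample)
if failures.empty?
  puts "All pages parsed successfully"
else
  failures.each do |page|
    warn "OCR failed (code #{page['FileParseExitCode']}): #{page['ErrorMessage']}"
  end
end
```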