Open Source Ruby Library for OCR operations on Images

Free Ruby Optical Character Recognition (OCR) Library for Loading, Reading and Converting Images, PDFs, or Scanned documents to text

What is OcrSpace?

Optical Character Recognition (OCR) technology makes the text locked inside images and scans accessible, allowing businesses to automate data extraction across many domains. For Ruby developers, the OcrSpace Ruby library is a wrapper around the OCR.space OCR API that extracts text from images, scanned documents, and PDF files. Its simple interface hides the details of raw API integration, which makes it a good fit for document management systems or automated data entry pipelines. Whether you are converting static files into searchable data or improving digital accessibility, the library offers a convenient path to text recognition.

The OcrSpace Ruby library goes beyond basic transcription. It lets applications convert scanned documents to text in multiple languages and can recognize text in low-resolution images or handwritten content. Developers can also retrieve the coordinates of recognized text, identify page numbers, and isolate custom areas of interest for targeted extraction. By simplifying text extraction from documents, the library lets programmers focus on core application logic and add OCR to financial or administrative workflows with little development overhead.

At A Glance

An overview of OcrSpace features.

Features Overview

Add OCR Operations
Add OCR Capabilities
Recognize Image text
Load Images via URL
Convert PDF to text
Recognized Font text
Search PDF
Other Languages
Create OCR apps
Save to browser
Extract Text
Multi-threading Support

OcrSpace

OcrSpace supports the popular image file formats listed below.

Reader

PNG, JPEG, BMP, TIFF, TGA, DICOM

Writer

PNG, JPEG, BMP, TIFF


Platform Independence

OcrSpace requires only the Ruby runtime.

Ruby 5.1 and above.


Getting Started with OcrSpace

The recommended way to install OcrSpace is through RubyGems. Use the following command for a smooth installation.

Install OcrSpace via RubyGems

$ gem install ocr_space

You can also download the library directly from the GitHub repository.

Convert Images from URL to Text via Ruby API

The open source OcrSpace library includes powerful features for loading various types of images and converting them to text with a couple of lines of Ruby code. The library supports various OCR options, including extracting text from images, scanned documents, and PDF files. Whether software developers need to process invoices, receipts, or any other type of document, the OcrSpace Ruby library can handle it efficiently. The following example shows how software developers can convert an image at a URL to text using the Ruby API.

How to Convert Images from URL via Ruby API?

require 'ocr_space'

# Assumes the gem's OcrSpace::Resource class; replace the placeholder with your own OCR.space API key.
resource = OcrSpace::Resource.new(apikey: "YOUR API KEY")

# Raw result: the parsed response returned by the API
result = resource.convert url: "http://bit.ly/2ih9aXt"

puts result
# => [{"TextOverlay"=>{"Lines"=>[], "HasOverlay"=>false, "Message"=>"Text overlay is not provided as it is not requested"}, "FileParseExitCode"=>1, "ParsedText"=>"If you want to find the secrets of the \r\nuniverse, think in terms of energy, \r\nfrequency and vibration. \r\nAZ QUOTES \r\n", "ErrorMessage"=>"", "ErrorDetails"=>""}]

# Clean result: only the recognized text
result = resource.clean_convert url: "http://bit.ly/2ih9aXt"

puts result
# => If you want to find the secrets of the universe, think in terms of energy, frequency and vibration. AZ QUOTES
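
The raw result above nests the recognized text under the "ParsedText" key, with line breaks preserved. As a minimal sketch of post-processing that structure yourself, the snippet below works on a copy of the sample output, so it needs no API key or network access. The helper name flatten_parsed_text is hypothetical and not part of the gem; it only approximates the cleaned-up output shown above and is not the library's own implementation.

```ruby
# Sample raw result, copied from the output shown above (no network call needed).
raw_result = [
  {
    "TextOverlay" => { "Lines" => [], "HasOverlay" => false,
                       "Message" => "Text overlay is not provided as it is not requested" },
    "FileParseExitCode" => 1,
    "ParsedText" => "If you want to find the secrets of the \r\nuniverse, think in terms of energy, \r\nfrequency and vibration. \r\nAZ QUOTES \r\n",
    "ErrorMessage" => "",
    "ErrorDetails" => ""
  }
]

# Hypothetical helper (not part of the gem): joins the ParsedText of every
# page and collapses runs of whitespace into single spaces.
def flatten_parsed_text(pages)
  pages.map { |page| page["ParsedText"].to_s }
       .join(" ")
       .gsub(/\s+/, " ")
       .strip
end

clean_text = flatten_parsed_text(raw_result)
puts clean_text
```

Keeping the raw result and flattening it locally can be useful when you also need the other fields, such as "ErrorMessage", from the same response.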

Advanced OCR Capabilities via Ruby API

The open source OcrSpace library includes several useful advanced features for handling OCR operations inside Ruby applications. It can recognize text from low-resolution images, distorted text, and even handwritten content. Apart from text extraction, the library also allows developers to extract other important information from documents, including the coordinates of the recognized text, page numbers, and specific areas of interest within the document.
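
To illustrate what working with text coordinates might look like, the sketch below walks a "TextOverlay" structure and collects one bounding box per word. The sample data is invented for illustration, and the key names ("Words", "WordText", "Left", "Top", "Width", "Height") are an assumption based on the overlay layout of the OCR.space API; check the response you actually receive before relying on them. The helper word_boxes is hypothetical and not part of the gem.

```ruby
# Illustrative overlay data. The values are made up, and the key names are
# assumed to follow the overlay layout of the OCR.space API response.
sample_page = {
  "TextOverlay" => {
    "HasOverlay" => true,
    "Lines" => [
      { "Words" => [
        { "WordText" => "AZ", "Left" => 10, "Top" => 200, "Height" => 12, "Width" => 20 },
        { "WordText" => "QUOTES", "Left" => 35, "Top" => 200, "Height" => 12, "Width" => 60 }
      ] }
    ]
  }
}

# Hypothetical helper (not part of the gem): returns one bounding box per
# recognized word, or an empty array when no overlay lines are present.
def word_boxes(page)
  lines = page.dig("TextOverlay", "Lines") || []
  lines.flat_map do |line|
    (line["Words"] || []).map do |word|
      { text: word["WordText"], x: word["Left"], y: word["Top"],
        width: word["Width"], height: word["Height"] }
    end
  end
end

boxes = word_boxes(sample_page)
boxes.each do |box|
  puts "#{box[:text]} at (#{box[:x]}, #{box[:y]}) size #{box[:width]}x#{box[:height]}"
end
```

Note that the sample outputs on this page show an empty "Lines" array because the overlay was not requested, in which case the helper simply returns an empty list.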

How to Extract Text from an Uploaded File via Ruby API?

result = resource.convert file: "/Users/suyesh/Desktop/nicola_tesla.jpg"

puts result #Raw result

# => {"TextOverlay"=>{"Lines"=>[], "HasOverlay"=>false, "Message"=>"Text overlay is not provided as it is not requested"}, "FileParseExitCode"=>1, "ParsedText"=>"If you want to find the secrets of the \r\nuniverse, think in terms of energy, \r\nfrequency and vibration. \r\nAZ QUOTES \r\n", "ErrorMessage"=>"", "ErrorDetails"=>""}

result = resource.clean_convert file: "/Users/suyesh/Desktop/nicola_tesla.jpg"

puts result
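
The raw results shown earlier also carry "FileParseExitCode" and "ErrorMessage" fields, which suggests a simple way to check a page before using its text. The following is a minimal sketch under the assumption that an exit code of 1 with an empty error message means success, as in the sample output; the helper parsed_text_or_raise and the failing sample values are hypothetical and only for illustration.

```ruby
# Hypothetical helper (not part of the gem). It assumes a page succeeded when
# FileParseExitCode is 1 and ErrorMessage is empty, as in the sample output.
def parsed_text_or_raise(page)
  code = page["FileParseExitCode"]
  # ErrorMessage may be a string or a list; normalize it to one string.
  message = Array(page["ErrorMessage"]).join("; ")
  return page["ParsedText"].to_s if code == 1 && message.empty?

  raise "OCR failed (exit code #{code.inspect}): #{message}"
end

good_page = { "FileParseExitCode" => 1, "ParsedText" => "AZ QUOTES \r\n",
              "ErrorMessage" => "", "ErrorDetails" => "" }
# Made-up failure values, for illustration only.
bad_page = { "FileParseExitCode" => -10, "ParsedText" => "",
             "ErrorMessage" => "Parse error", "ErrorDetails" => "" }

puts parsed_text_or_raise(good_page)

begin
  parsed_text_or_raise(bad_page)
rescue RuntimeError => e
  puts "Skipping page: #{e.message}"
end
```

Raising on failure keeps bad pages from silently producing empty text in a batch job; returning nil instead would be an equally reasonable design choice.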