
Aspose.OCR Cloud SDK for Python
Python OCR API to Read & Extract Text from Images
Read and Extract text from Images, Photos, Screenshots, Scanned documents, and PDF Files via Leading Python OCR Library.
What is Aspose.OCR Cloud SDK for Python?
The Aspose.OCR Cloud SDK for Python empowers software developers with a flexible optical character recognition solution, free from external dependencies. This advanced OCR tool seamlessly extracts text from images, photos, screenshots, scanned documents, and PDFs across numerous European, Cyrillic, and Eastern scripts. Delivering results in popular document formats, the API simplifies integrating robust OCR functionality into nearly any device or platform—from netbooks and mini PCs to entry-level smartphones.
Ideal for Python developers, this straightforward SDK offers extensive features, including reading entire images or scanned PDFs, extracting text from specific regions, processing receipts and tables, and even converting results to natural speech. Built upon the Aspose.OCR Cloud API, the cloud-based engine supports 45 recognition languages, such as English, French, German, Spanish, Chinese, Japanese, and Arabic. With an intuitive interface, Python programmers can embed OCR capabilities with just a few lines of code: upload an image, perform recognition, and retrieve the text. This makes it a strong choice for adding OCR to Python applications.
Getting Started with Aspose.OCR Cloud SDK for Python
The recommended way to install Aspose.OCR Cloud SDK for Python is with pip. Use the following command to install it.
Install Aspose.OCR Cloud SDK for Python via pip
pip install aspose-ocr-cloud
You can also download the SDK directly from the Aspose.OCR Python Cloud SDK product page.
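After installing, a quick check can confirm that the package is importable. The following is a minimal sketch using only the Python standard library. The module name asposeocrcloud is taken from the import statements in the examples on this page, and the helper name is_installed is hypothetical.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module with this name can be located."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "asposeocrcloud" is the module name used by the examples on this page.
    if is_installed("asposeocrcloud"):
        print("asposeocrcloud is available")
    else:
        print("asposeocrcloud not found; try: pip install aspose-ocr-cloud")
```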
Image Recognition using Python SDK
Aspose.OCR Cloud SDK for Python allows software developers to perform OCR operations for image recognition inside their own Python applications. The API is easy to use, and image recognition can be performed from any platform with Internet access. You can use the OCR REST API to select and send images for recognition, fetch the results, and store them in any supported file format with just a couple of lines of code. The following example shows how to perform an OCR operation on an image using Python code.
How to Perform OCR Operations on an Image inside Python Apps?
import asposeocrcloud
# create an instance of the OCR client
client = asposeocrcloud.OcrApi(api_key='your_api_key', app_sid='your_app_sid')
# read the image file
with open('image.jpg', 'rb') as image_file:
    image_data = image_file.read()
# call the OCR API to extract text from the image
result = client.post_ocr(image_data=image_data, language='eng', use_default_dictionaries=True)
# print the extracted text
print(result.text)
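Before sending a file for recognition, it can help to confirm locally that the bytes really look like an image. The sketch below is an illustration only and is not part of the SDK: it checks the leading "magic" bytes of a few common formats. The helper name sniff_image_format and the choice of formats are assumptions, and the list of formats the service actually accepts is not stated here.

```python
from typing import Optional

# Leading "magic" bytes of a few common image formats (illustrative subset).
_SIGNATURES = {
    b"\xff\xd8\xff": "jpeg",
    b"\x89PNG\r\n\x1a\n": "png",
    b"GIF87a": "gif",
    b"GIF89a": "gif",
    b"BM": "bmp",
}


def sniff_image_format(data: bytes) -> Optional[str]:
    """Return a format name if the data starts with a known signature, else None."""
    for signature, name in _SIGNATURES.items():
        if data.startswith(signature):
            return name
    return None
```

A caller could run sniff_image_format(image_data) right after reading the file and skip the upload when it returns None.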
Extract Text from PDF Files via Python API
Portable Document Format (PDF), developed by Adobe in 1992, is one of the world's most popular file formats for presenting business documents. Aspose.OCR Cloud SDK for Python includes a powerful feature for extracting text from PDF files inside Python applications. To do this, upload the PDF file to Aspose cloud storage and then run OCR recognition on the uploaded file. The following example shows how software developers can extract text from a PDF file using Python code.
How to Extract Text from a PDF File via Python API?
import asposeocrcloud
from asposeocrcloud.apis.ocr_api import OcrApi
from asposeocrcloud.configuration import Configuration
configuration = Configuration(api_key='your_api_key', app_sid='your_app_sid')
api = OcrApi(asposeocrcloud.ApiClient(configuration))
# Upload the PDF file to the Aspose cloud storage
with open('your_pdf_file.pdf', 'rb') as file:
    api.upload_file(path='your_pdf_file.pdf', file=file)
# Perform the OCR recognition on the uploaded PDF file
result = api.post_recognize_ocr_from_url_or_content(file_path='your_pdf_file.pdf')
# Store the recognized text
recognized_text = result['text']
print(recognized_text)
Convert Text to Speech via Python API
Aspose.OCR Cloud SDK for Python enables software developers to convert text from images to speech without installing any third-party software. Using the API, programmers can convert recognition results into a natural human voice that can be played in the background or downloaded. First, send the image to the Aspose OCR Cloud server and extract its text; then convert that text to speech using the Aspose OCR Cloud Text-to-Speech API. After a successful conversion, you can save the speech file to disk.
How to Convert Text to Speech using Python API?
import os
from asposeocrcloud import OcrApi, OcrClient, SpeechApi
client_id = os.environ['CLIENT_ID']
client_secret = os.environ['CLIENT_SECRET']
ocr_api = OcrApi(OcrClient(client_id, client_secret))
speech_api = SpeechApi(OcrClient(client_id, client_secret))
# Upload the image containing the text
filename = 'image.png'
with open(filename, 'rb') as file:
    response = ocr_api.post_recognize_from_content(file.read(), language='English', use_default_dictionaries=True)
# Extract the recognized text
text = ''
for result in response.parts:
    for line in result.lines:
        for word in line.words:
            text += word.text + ' '
# Convert the text to speech
response = speech_api.post_recognize_from_text(text, language='en-US', voice_name='Ben')
# Save the speech file to disk
with open('output.wav', 'wb') as file:
    file.write(response.content)
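The word-collection step in the example above can also be written as a single join. The sketch below assumes the same parts, lines, and words layout that the example iterates over; the helper name collect_text is hypothetical, and the stand-in object exists only to make the snippet runnable without the service. Unlike the loop above, this version leaves no trailing space.

```python
from types import SimpleNamespace


def collect_text(response) -> str:
    """Join every word in response.parts[*].lines[*].words[*] with single spaces."""
    words = [
        word.text
        for part in response.parts
        for line in part.lines
        for word in line.words
    ]
    return " ".join(words)


# Stand-in object that mimics the parts -> lines -> words layout (not a real API response).
fake_response = SimpleNamespace(
    parts=[
        SimpleNamespace(
            lines=[
                SimpleNamespace(
                    words=[SimpleNamespace(text="Hello"), SimpleNamespace(text="world")]
                )
            ]
        )
    ]
)
print(collect_text(fake_response))  # Hello world
```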
