Aspose.OCR Cloud SDK for Python
Python OCR API to Read & Extract Text from Images
Read and Extract text from Images, Photos, Screenshots, Scanned documents, and PDF Files via Python OCR Library.
Aspose.OCR Cloud SDK for Python is an advanced and flexible optical character recognition (OCR) solution that helps software developers to create OCR applications without any external dependencies. It allows software developers to read and extract text from images, photos, screenshots, scanned documents, and PDFs in a large number of European, Cyrillic, and Eastern scripts, returning results in the most popular document formats. The API makes it easy for developers to add OCR functionality to almost any device or platform, including netbooks, mini PCs, or even entry-level smartphones.
The Aspose.OCR Cloud SDK for Python is straightforward and easy to handle. It provides a wide range of features that make it an ideal OCR solution for developers working with Python, such as reading an entire image, reading a scanned PDF document, extracting text from a specific region of the image, extracting data from a scanned or photographed receipt, fetching PDF recognition results, extracting text from scanned or photographed tables, converting the recognition results into a natural human voice, and many more.
Aspose.OCR Cloud SDK for Python is built on top of the Aspose.OCR Cloud API, a cloud-based OCR engine that supports 45 recognition languages including English, French, German, Spanish, Chinese, Japanese, Arabic, and many more. Using the OCR SDK, Python programmers can integrate OCR functionality into their Python applications without having to worry about the complexities of OCR technology. The SDK provides a simple interface that allows users to upload images, perform OCR, and retrieve text in just a few lines of code. If you need to add OCR functionality to your Python applications, the Aspose.OCR Cloud SDK for Python is worth checking out.
At A Glance
An overview of Aspose.OCR Cloud SDK for Python features.
- Perform OCR
- Add OCR Capabilities
- Recognize Image text
- Convert images of text
- Recognize Font text
- Search PDF
- 27 Recognition Languages
- Create OCR apps
- Save to browser
- Extract Text
- Multi-threading Support
- Recognize rotated Image
- Pre-processing filters
- PDF to Images
- Recognizes Chinese Characters
- Detects Popular typefaces
- Processes whole image
- Rotated images Support
- Batch Recognition
- Built-in Spell Checker
- Split PDF
- PDF to Excel
- PDF to SVG
API mainly supports PDF format but can export PDF documents to a number of other formats.
Aspose.OCR Cloud SDK for Python can be used in any Python-based application.
- Python 3.x
Getting Started with Aspose.OCR Cloud SDK for Python
The recommended way to install Aspose.OCR Cloud SDK for Python is with pip. Use the following command for a smooth installation.
Install Aspose.OCR Cloud SDK for Python via pip
pip install aspose-ocr-cloud
You can also download the SDK directly from the Aspose.OCR Python Cloud SDK product page.
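After installation, one way to confirm that the package is visible to your interpreter is to look it up on the import path. The sketch below uses only the standard library; the helper name `sdk_installed` is hypothetical, and the module name `asposeocrcloud` is taken from the import statements used in the examples on this page.

```python
import importlib.util


def sdk_installed(module_name: str = "asposeocrcloud") -> bool:
    """Return True if a top-level module with this name can be found."""
    # find_spec returns None when the module is not on the import path.
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if sdk_installed():
        print("Aspose.OCR Cloud SDK is importable.")
    else:
        print("SDK not found; try: pip install aspose-ocr-cloud")
```

The check does not import the package, so it is safe to run before any credentials are configured.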
Image Recognition using Python Apps
Aspose.OCR Cloud SDK for Python allows software developers to perform OCR operations and achieve image recognition inside their own Python applications. The API is easy to use, and image recognition can be performed from any platform with Internet access. You can use the OCR REST API to select and send images for recognition, fetch the results, and store them in any supported file format with just a couple of lines of code. The following example shows how to perform an OCR operation on an image using Python code.
Perform OCR on an image inside Python Apps
import asposeocrcloud

# create an instance of the OCR client
client = asposeocrcloud.OcrApi(api_key='your_api_key', app_sid='your_app_sid')

# read the image file
with open('image.jpg', 'rb') as image_file:
    image_data = image_file.read()

# call the OCR API to extract text from the image
result = client.post_ocr(image_data=image_data, language='eng', use_default_dictionaries=True)

# print the extracted text
print(result.text)
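The single-image call can be extended to several images at once. The following is a minimal sketch of batch recognition with a thread pool, not part of the SDK: `recognize_batch` is a hypothetical helper, and `recognize` is a placeholder for whatever function wraps your actual SDK call (it receives the image bytes and returns the recognized text).

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, Dict, Iterable


def recognize_batch(
    paths: Iterable[str],
    recognize: Callable[[bytes], str],
    max_workers: int = 4,
) -> Dict[str, str]:
    """Run `recognize` on each file's bytes concurrently; map path -> text."""
    path_list = list(paths)

    def _recognize_file(path: str) -> str:
        # Read the file and hand its raw bytes to the caller-supplied function.
        return recognize(Path(path).read_bytes())

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with path_list.
        texts = list(pool.map(_recognize_file, path_list))
    return dict(zip(path_list, texts))
```

Threads suit this workload because each request spends most of its time waiting on the network. Any exception raised by `recognize` propagates to the caller when the results are collected.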
Extract Text from PDF Files via Python API
Portable Document Format (PDF) is one of the world's most popular business document formats, developed by Adobe in 1992 to present documents. Aspose.OCR Cloud SDK for Python includes a powerful feature for extracting text from PDF files inside Python applications. To do this, upload the PDF file to the Aspose cloud storage and then perform OCR recognition on the uploaded file. The following example shows how software developers can extract text from a PDF file using Python code.
How to Extract Text from a PDF File via Python API?
import asposeocrcloud
from asposeocrcloud.apis.ocr_api import OcrApi
from asposeocrcloud.configuration import Configuration

configuration = Configuration(api_key='your_api_key', app_sid='your_app_sid')
api = OcrApi(asposeocrcloud.ApiClient(configuration))

# Upload the PDF file to the Aspose cloud storage
with open('your_pdf_file.pdf', 'rb') as file:
    api.upload_file(path='your_pdf_file.pdf', file=file)

# Perform the OCR recognition on the uploaded PDF file
result = api.post_recognize_ocr_from_url_or_content(file_path='your_pdf_file.pdf')

# Store the recognized text
recognized_text = result['text']
print(recognized_text)
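Before uploading, it can help to confirm locally that the file really is a PDF, so that a mislabeled file does not cost a round trip to the cloud. The sketch below relies on the fact that PDF files begin with the marker `%PDF-`; the helper names `looks_like_pdf` and `is_pdf_file` are hypothetical and not part of the SDK. It is a quick sanity check, not a full validation of the file.

```python
PDF_MAGIC = b"%PDF-"


def looks_like_pdf(data: bytes) -> bool:
    """Return True if the byte string starts with the PDF header marker."""
    return data.startswith(PDF_MAGIC)


def is_pdf_file(path: str) -> bool:
    """Read only the first few bytes of the file and check the header."""
    with open(path, "rb") as handle:
        return looks_like_pdf(handle.read(len(PDF_MAGIC)))
```

Calling `is_pdf_file('your_pdf_file.pdf')` before the upload step lets the application report a clear error for non-PDF input.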
Convert Text to Speech via Python API
Aspose.OCR Cloud SDK for Python enables software developers to convert text from images to speech without installing any third-party software. Using the API, programmers can convert the recognition results into a natural human voice that can be played in the background or downloaded. First, send the image to the Aspose OCR Cloud server and extract the text from it; then convert the text to speech using the Aspose OCR Cloud Text-to-Speech API. After a successful conversion, you can save the speech file to disk.
How to Convert Text to Speech using Python API?
import os
from asposeocrcloud import OcrApi, OcrClient, SpeechApi

client_id = os.environ['CLIENT_ID']
client_secret = os.environ['CLIENT_SECRET']

ocr_api = OcrApi(OcrClient(client_id, client_secret))
speech_api = SpeechApi(OcrClient(client_id, client_secret))

# Upload the image containing the text
filename = 'image.png'
with open(filename, 'rb') as file:
    response = ocr_api.post_recognize_from_content(file.read(), language='English', use_default_dictionaries=True)

# Extract the recognized text
text = ''
for result in response.parts:
    for line in result.lines:
        for word in line.words:
            text += word.text + ' '

# Convert the text to speech
response = speech_api.post_recognize_from_text(text, language='en-US', voice_name='Ben')

# Save the speech file to disk
with open('output.wav', 'wb') as file:
    file.write(response.content)
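The text-collection step in the example walks a nested parts, lines, and words structure. The sketch below isolates that step so it can be tried without a network call. The `Word`, `Line`, and `Part` classes are hypothetical stand-ins that only mirror the attribute names used in the example above; they are not SDK types, and the real response objects may differ.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Word:
    text: str


@dataclass
class Line:
    words: List[Word] = field(default_factory=list)


@dataclass
class Part:
    lines: List[Line] = field(default_factory=list)


def collect_text(parts: List[Part]) -> str:
    """Join every word in reading order, separated by single spaces."""
    return " ".join(
        word.text
        for part in parts
        for line in part.lines
        for word in line.words
    )


if __name__ == "__main__":
    sample = [Part([Line([Word("Hello"), Word("world")])])]
    print(collect_text(sample))
```

Using `str.join` avoids the trailing space that repeated string concatenation leaves behind, which keeps the input to the speech step clean.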