Open Source Python Library for Speech Recognition
A Python API that supports speech processing and recognition operations. It also supports MFCCs and filter-bank energies, along with the log-energy of filter-banks.
The SpeechPy library provides a set of useful techniques for speech processing and recognition, along with important post-processing operations, all available from Python. Common speech features such as MFCCs, filter-bank energies, and the log-energy of filter-banks are supported by the SpeechPy library.
The library also aims to provide the functionality needed for deep learning applications such as automatic speech recognition (ASR). It provides several functions for calculating the main speech features: computing MFCC (mel-frequency cepstral coefficient) features from an audio signal, computing mel filter-bank energies, computing log mel filter-bank energy features, extracting temporal derivative features, and more.
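To make "temporal derivative features" concrete, here is a minimal sketch of the widely used regression-based delta formula, written in plain Python. It is an illustration only, not SpeechPy's own implementation: the function name `delta_features`, the `window` parameter, and the edge handling (repeating the first and last frame) are assumptions chosen for this example.

```python
def delta_features(frames, window=2):
    """Regression-based temporal derivative of per-frame feature vectors.

    frames: list of equal-length lists (one feature vector per frame).
    window: number of neighbouring frames used on each side (assumed value).
    Edge frames are handled by repeating the first and last frame.
    """
    num_frames = len(frames)
    denominator = 2.0 * sum(n * n for n in range(1, window + 1))
    deltas = []
    for t in range(num_frames):
        row = []
        for k in range(len(frames[t])):
            total = 0.0
            for n in range(1, window + 1):
                ahead = frames[min(t + n, num_frames - 1)][k]
                behind = frames[max(t - n, 0)][k]
                total += n * (ahead - behind)
            row.append(total / denominator)
        deltas.append(row)
    return deltas


# Toy input: 6 frames with 2 coefficients each. The first coefficient
# rises by 1 per frame, the second stays constant.
features = [[float(t), 5.0] for t in range(6)]
deltas = delta_features(features, window=2)
print(deltas[2])
```

Away from the edges, the delta of the rising coefficient equals its per-frame slope and the delta of the constant coefficient is zero. Such deltas are typically appended to the static features before they are passed to a model.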
At A Glance
An overview of SpeechPy features.
- Speech Processing
- Speech Recognition
- Compute MFCCs
- Filter-bank Energies
- MP3 Support
- Post Processing
- Extract Audio
- Audio to Text
SpeechPy only requires a Python runtime.
- Python 2.6 and above.
Getting Started with SpeechPy
The easiest way to install the SpeechPy library is from the Python Package Index (PyPI). Use the following command for a complete installation.
Install SpeechPy using PyPI
pip install speechpy
Speech Recognition via Python
Speech recognition is mainly concerned with the recognition and translation of spoken language into text by computers. The open source Python library SpeechPy enables software developers to create applications that support speech recognition features. Speaking instead of typing saves users time, lets them communicate with their devices with less effort, and makes technological devices more accessible and easier to use.
Compute MFCC from Audio Signal
The Python library SpeechPy provides complete support for computing MFCC features from an audio signal inside your own applications. The computation can be configured through several parameters, such as the sampling frequency of the signal, the length of each frame in seconds, the step between successive frames in seconds, the number of filters in the filter-bank, the number of FFT points, the lowest and highest band edges of the mel filters, the number of cepstral coefficients, and more.
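The framing and band-edge parameters listed above are easier to reason about with numbers. The sketch below uses only the Python standard library to show how a frame length and step in seconds translate into sample counts, and how the lowest and highest band edges define mel-spaced filter edges. The helper names (`frame_layout`, `mel_band_edges`), the `2595 * log10(1 + f / 700)` mel formula, and the no-padding frame count are assumptions made for illustration; SpeechPy's internals may differ.

```python
import math


def hz_to_mel(hz):
    """Convert Hz to mel using the common 2595 * log10(1 + f / 700) form."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def frame_layout(num_samples, sampling_frequency, frame_length, frame_stride):
    """Return (samples per frame, samples per step, number of full frames)."""
    frame_size = int(round(sampling_frequency * frame_length))
    step_size = int(round(sampling_frequency * frame_stride))
    if num_samples < frame_size:
        return frame_size, step_size, 0
    # Count only frames that fit entirely inside the signal (no padding).
    num_frames = 1 + (num_samples - frame_size) // step_size
    return frame_size, step_size, num_frames


def mel_band_edges(num_filters, low_frequency, high_frequency):
    """Return num_filters + 2 edge frequencies in Hz, equally spaced in mel."""
    low_mel = hz_to_mel(low_frequency)
    high_mel = hz_to_mel(high_frequency)
    step = (high_mel - low_mel) / (num_filters + 1)
    return [mel_to_hz(low_mel + i * step) for i in range(num_filters + 2)]


# Illustrative values: one second of 16 kHz audio, 20 ms frames, 10 ms step.
frame_size, step_size, num_frames = frame_layout(16000, 16000, 0.020, 0.010)
edges = mel_band_edges(num_filters=40, low_frequency=0.0, high_frequency=8000.0)
print(frame_size, step_size, num_frames)
print(len(edges), round(edges[0], 1), round(edges[-1], 1))
```

In SpeechPy itself, these settings appear to be passed as keyword arguments to `speechpy.feature.mfcc` (for example `sampling_frequency`, `frame_length`, `frame_stride`, `num_filters`, `fft_length`, `low_frequency`, `high_frequency`, and `num_cepstral`). Check the documentation of the installed version for exact names and defaults.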
Extract Audio using Autoencoders
The open source Python library SpeechPy enables computer programmers to extract audio data using Python code. An autoencoder is an effective learning technique for neural networks that learns efficient data representations. It learns how to compress data from the input layer into a shorter code, and then decompress that code into the output that best matches the original input.
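The compress-then-reconstruct idea can be shown with a deliberately tiny example. The sketch below trains a linear autoencoder that squeezes 2-dimensional inputs into a 1-dimensional code and back, using plain Python and full-batch gradient descent. It does not use SpeechPy or any of its APIs; the function name `train_linear_autoencoder`, the toy data, the initial weights, and the learning rate are all assumptions chosen for illustration.

```python
def train_linear_autoencoder(data, epochs=500, learning_rate=0.01):
    """Train a 2 -> 1 -> 2 linear autoencoder with full-batch gradient descent.

    Returns (encoder_weights, decoder_weights, loss_history).
    """
    w = [0.1, 0.1]  # encoder: code = w . x
    v = [0.1, 0.1]  # decoder: reconstruction = v * code
    history = []
    n = len(data)
    for _ in range(epochs):
        grad_w = [0.0, 0.0]
        grad_v = [0.0, 0.0]
        loss = 0.0
        for x in data:
            code = w[0] * x[0] + w[1] * x[1]
            err = [v[0] * code - x[0], v[1] * code - x[1]]
            loss += err[0] ** 2 + err[1] ** 2
            # Gradients of the squared reconstruction error.
            err_dot_v = err[0] * v[0] + err[1] * v[1]
            for i in range(2):
                grad_v[i] += 2.0 * err[i] * code
                grad_w[i] += 2.0 * err_dot_v * x[i]
        history.append(loss / n)  # mean loss before this epoch's update
        for i in range(2):
            w[i] -= learning_rate * grad_w[i] / n
            v[i] -= learning_rate * grad_v[i] / n
    return w, v, history


# Toy data lying on a line, so a single number is enough to describe it.
data = [[s, 2.0 * s] for s in (-1.0, -0.5, 0.5, 1.0)]
w, v, history = train_linear_autoencoder(data)

# Encode an unseen point on the same line, then decode it again.
code = w[0] * 0.8 + w[1] * 1.6
reconstruction = [v[0] * code, v[1] * code]
print(history[0], history[-1])
print(reconstruction)
```

Because the toy data has only one degree of freedom, the 1-dimensional code loses nothing and the reconstruction error falls close to zero. Real audio autoencoders use nonlinear layers and far larger inputs, but the encode and decode steps follow the same pattern.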