Audio Toolbox™ provides MATLAB® and Simulink® support for pretrained audio deep learning networks.
Locate and classify sounds with YAMNet and estimate pitch with CREPE.
Extract VGGish or OpenL3 feature embeddings to input to machine learning
and deep learning systems. Use i-vector systems to produce compact
representations of audio signals for applications such as speaker
recognition, verification, identification, and diarization. Use
detectspeechnn to perform voice activity detection.
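As a minimal sketch of the sound-classification workflow described above, the Audio Toolbox function classifySound applies the pretrained YAMNet network to an audio signal (the model is downloaded on first use; "speech.wav" is a placeholder file name, not a shipped example):

```matlab
% Read an audio file (placeholder name) and classify the sounds it
% contains with the pretrained YAMNet network.
[audioIn,fs] = audioread("speech.wav");

% sounds is a list of detected sound class labels; timeStamps gives the
% region of the signal associated with each detection.
[sounds,timeStamps] = classifySound(audioIn,fs);
```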
Using pretrained deep learning networks requires Deep Learning Toolbox™. The Audio Toolbox pretrained networks are available in Deep Network Designer (Deep Learning Toolbox).
detectspeechnn | Detect boundaries of speech in audio signal using AI (Since R2023a)
vadnet | Voice activity detection (VAD) neural network (Since R2023a)
vadnetPreprocess | Preprocess audio for voice activity detection (VAD) network (Since R2023a)
vadnetPostprocess | Postprocess frame-based VAD probabilities (Since R2023a)
Deep Pitch Estimator | Estimate pitch with CREPE deep learning neural network (Since R2023a)
crepe | CREPE deep pitch estimation neural network (Since R2023a)
crepePreprocess | Preprocess audio for CREPE deep pitch estimation (Since R2023a)
crepePostprocess | Postprocess output of CREPE pitch estimation network (Since R2023a)
Deep Network Designer | Design, visualize, and train deep learning networks
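The voice activity detection functions listed above compose into a preprocess-predict-postprocess pipeline. The following is a sketch of that workflow, assuming a placeholder file name ("speech.wav"); detectspeechnn wraps the same steps into a single call:

```matlab
% Read an audio file (placeholder name).
[audioIn,fs] = audioread("speech.wav");

% One-step approach: detectspeechnn returns the boundaries of speech
% regions directly.
boundaries = detectspeechnn(audioIn,fs);

% Step-by-step approach with the vadnet family: extract features,
% predict frame-based speech probabilities, then convert the
% probabilities to region boundaries.
net = vadnet;                              % pretrained VAD network
features = vadnetPreprocess(audioIn,fs);   % features expected by the network
probs = predict(net,features);             % per-frame speech probabilities
roi = vadnetPostprocess(audioIn,fs,probs); % speech region boundaries
```

The step-by-step form is useful when you want to inspect or threshold the frame probabilities yourself before postprocessing.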
- Audio Transfer Learning Using Experiment Manager
Configure an experiment that compares the performance of multiple pretrained networks applied to a speech command recognition task using transfer learning.
- Classify Human Voice Using YAMNet on Android Device (Simulink Support Package for Android Devices)
This example shows how to use the Simulink® Support Package for Android® Devices and a pretrained YAMNet network to classify human voices.
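For pitch estimation, the CREPE functions listed earlier follow the same preprocess-predict-postprocess pattern; the Audio Toolbox convenience function pitchnn wraps them into one call. A minimal sketch, again assuming a placeholder file name ("singing.wav"):

```matlab
% Read an audio file (placeholder name) and estimate its pitch contour
% using the pretrained CREPE network via the pitchnn convenience function.
[audioIn,fs] = audioread("singing.wav");
f0 = pitchnn(audioIn,fs);   % fundamental frequency estimates, in Hz
```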