Extract OpenL3 features
embeddings = openl3Features(audioIn,fs,Name,Value) specifies options using one or more Name,Value arguments. For example, embeddings = openl3Features(audioIn,fs,'OverlapPercentage',75) applies a 75% overlap between consecutive frames used to create the audio embeddings.
This function requires both Audio Toolbox™ and Deep Learning Toolbox™.
Download and unzip the Audio Toolbox™ model for OpenL3.
Type openl3Features at the command line. If the Audio Toolbox model for OpenL3 is not installed, the function provides a link to the location of the network weights. To download the model, click the link. Unzip the file to a location on the MATLAB path.
Alternatively, execute the following commands to download and unzip the OpenL3 model to your temporary directory.
downloadFolder = fullfile(tempdir,'OpenL3Download');
loc = websave(downloadFolder,'https://ssd.mathworks.com/supportfiles/audio/openl3.zip');
OpenL3Location = tempdir;
unzip(loc,OpenL3Location)
addpath(fullfile(OpenL3Location,'openl3'))
Read in an audio file.
[audioIn,fs] = audioread('MainStreetOne-16-16-mono-12secs.wav');
Call the openl3Features function with the audio and sample rate to extract OpenL3 feature embeddings from the audio.
featureVectors = openl3Features(audioIn,fs);
The openl3Features function returns a matrix of 512-element feature vectors over time.
[numHops,numElementsPerHop,numChannels] = size(featureVectors)
numHops = 111
numElementsPerHop = 512
numChannels = 1
Create a 10-second pink noise signal and then extract OpenL3 features. The
openl3Features function extracts features from mel spectrograms with 90% overlap.
fs = 16e3;
dur = 10;
audioIn = pinknoise(dur*fs,1,'single');
features = openl3Features(audioIn,fs);
Plot the OpenL3 features over time.
surf(features,'EdgeColor','none')
view([30 65])
axis tight
xlabel('Feature Index')
ylabel('Frame')
zlabel('Feature Value')
title('OpenL3 Features')
To decrease the resolution of OpenL3 features over time, specify a lower percent overlap between mel spectrograms. Plot the results.
overlapPercentage = 10;
features = openl3Features(audioIn,fs,'OverlapPercentage',overlapPercentage);
surf(features,'EdgeColor','none')
view([30 65])
axis tight
xlabel('Feature Index')
ylabel('Frame')
zlabel('Feature Value')
title('OpenL3 Features')
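The effect of the overlap on time resolution can be estimated with simple frame arithmetic. The sketch below assumes that each mel spectrogram spans a 1-second window and that no padding is applied; neither detail is stated on this page, and winSeconds and estFrames are illustrative names. Treat the counts as rough estimates, not as the function's guaranteed output sizes.

```matlab
fs = 16e3;                       % sample rate (Hz)
durSeconds = 10;                 % signal duration (s)
winSeconds = 1;                  % ASSUMED window covered by one spectrogram (s)
overlapPercentages = [90 10];    % default overlap and the reduced overlap

numSamples = durSeconds*fs;
winLength = winSeconds*fs;
estFrames = zeros(size(overlapPercentages));

for k = 1:numel(overlapPercentages)
    op = overlapPercentages(k);
    % Hop between consecutive windows, in samples.
    hopLength = round(winLength*(100 - op)/100);
    % Number of full windows that fit in the signal.
    estFrames(k) = floor((numSamples - winLength)/hopLength) + 1;
    fprintf('Overlap %d%%: hop = %d samples, about %d frames\n', ...
        op,hopLength,estFrames(k));
end
```

Under these assumptions, lowering the overlap from 90% to 10% lengthens the hop ninefold, so far fewer feature vectors cover the same 10 seconds.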
audioIn— Input signal
Input signal, specified as a column vector or matrix. If you specify a matrix, openl3Features treats the columns of the matrix as individual audio channels.
fs— Sample rate (Hz)
Sample rate of the input signal in Hz, specified as a positive scalar.
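As a minimal sketch of how the two inputs fit together, the following passes a two-channel matrix and inspects the size of the result. It assumes the OpenL3 model is already installed and that the third dimension of the output indexes channels, as in the size call shown earlier. The variable name stereoIn is illustrative.

```matlab
fs = 16e3;                                  % sample rate (Hz)
dur = 5;                                    % duration (s)
stereoIn = pinknoise(dur*fs,2,'single');    % two columns = two channels

features = openl3Features(stereoIn,fs);

% Each column is treated as its own channel, so the third
% dimension is expected to match the number of input columns.
[numHops,numElementsPerHop,numChannels] = size(features)
```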
Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside quotes. You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.
OverlapPercentage— Percentage overlap between consecutive spectrograms
90 (default) | scalar in the range [0,100)
Percentage overlap between consecutive spectrograms, specified as a scalar in the range [0,100).
SpectrumType— Spectrum type
EmbeddingLength— Embedding length
Length of the output audio embedding, specified as
ContentType— Audio content type
Audio content type the neural network is trained on, specified as 'env' or 'music'. Set ContentType to:
'env' when you want to use a model trained on environmental data.
'music' when you want to use a model trained on musical data.
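To illustrate how these name-value arguments combine, the sketch below requests a model trained on music and a lower overlap in a single call. It assumes the OpenL3 model is installed; the 50% overlap is an arbitrary illustrative value.

```matlab
[audioIn,fs] = audioread('MainStreetOne-16-16-mono-12secs.wav');

% Name-value pairs can be given in any order.
features = openl3Features(audioIn,fs, ...
    'ContentType','music', ...
    'OverlapPercentage',50);

% Rows are frames over time; columns are embedding elements.
[numHops,numElementsPerHop] = size(features)
```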
embeddings— Compact representation of audio data
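Compact representation of audio data, returned as an N-by-L-by-C array, where:
N is the number of frames (hops) over time, which depends on the length of audioIn and on OverlapPercentage.
L is the embedding length, set by EmbeddingLength.
C is the number of input channels.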
Cramer, Jason, et al. "Look, Listen, and Learn More: Design Choices for Deep Audio Embeddings." In ICASSP 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, 2019, pp. 3852-56. doi:10.1109/ICASSP.2019.8682475.
This function fully supports GPU arrays. For more information, see Run MATLAB Functions on a GPU (Parallel Computing Toolbox).
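A minimal sketch of GPU usage, assuming Parallel Computing Toolbox, a supported GPU, and an installed OpenL3 model: move the audio to the GPU with gpuArray, extract the features, and bring the result back with gather. The canUseGPU guard keeps the script runnable on machines without a GPU.

```matlab
fs = 16e3;
audioIn = pinknoise(10*fs,1,'single');

if canUseGPU
    audioIn = gpuArray(audioIn);   % move the signal to GPU memory
end

features = openl3Features(audioIn,fs);

% gather copies a gpuArray back to host memory and leaves
% ordinary arrays unchanged.
features = gather(features);
```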