File Exchange

Heart Sound Classifier

version (7.88 MB) by Bernhard Suhm
Heart Sound Classification demo as explained in the Machine Learning eBook, but now expanded to demonstrate Wavelet scattering


Updated 16 Oct 2019

This submission provides the code explained by the (upcoming) eBook on the complete machine learning workflow. Based on the heart sound recordings of the PhysioNet 2016 challenge, a model is developed that classifies heart sounds into normal vs abnormal, and deployed in a prototype (heart) screening application. The workflow demonstrates:
1) using datastore for efficiently reading a large number of data files from several folders
2) using tools from signal processing, wavelets and statistics for feature extraction
3) using the Classification Learner app to interactively train, compare and optimize classifiers without writing any code
4) programmatically training an ensemble classifier with misclassification costs
5) applying an automated feature selection to select a smaller subset of relevant features
6) performing C code generation for deployment to an embedded system
7) applying Wavelet scattering to automatically extract features that outperform manually engineered ones
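
Step 4 can be sketched as follows. This is only a minimal illustration on synthetic data, not the code shipped with the submission: the feature matrix, the cost values, the bagging method and the number of learners are all assumptions chosen for demonstration.

```matlab
% Minimal sketch (assumed synthetic data, not the PhysioNet features):
% train a bagged tree ensemble that penalizes missed abnormal cases.
rng(0);                                   % reproducible random numbers

% Two synthetic classes with 3 features each (hypothetical stand-ins)
X = [randn(100,3); randn(100,3) + 1.5];
y = [repmat({'Normal'},100,1); repmat({'Abnormal'},100,1)];

% Cost(i,j) = cost of predicting class j when the true class is i.
% Rows and columns follow ClassNames: {'Normal','Abnormal'}.
% Assumed values: missing an abnormal recording costs 20x a false alarm.
C = [0 1; 20 0];

mdl = fitcensemble(X, y, ...
    'Method', 'Bag', ...
    'NumLearningCycles', 50, ...
    'ClassNames', {'Normal','Abnormal'}, ...
    'Cost', C);

% Resubstitution predictions and confusion matrix (rows = true class)
pred = predict(mdl, X);
cm = confusionmat(y, pred, 'Order', {'Normal','Abnormal'});
disp(cm)
```

Raising the cost in the second row should push the ensemble toward fewer abnormal recordings classified as normal, at the price of more false alarms.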

Cite As

Bernhard Suhm (2019). Heart Sound Classifier (, MATLAB Central File Exchange. Retrieved .

Comments and Ratings (35)

Bernhard Suhm

OK, everyone, this latest version (1.6) has a bunch of new things:
• Moved hyperparameter tuning and cost matrices into the Classification Learner (requires R2019b)
• Added links on applying Deep Learning at the end, and a bonus section applying Wavelet scattering
• Minor tweaks to ensure the script also runs on macOS
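
For readers who have not used Wavelet scattering before, the idea can be sketched on a synthetic signal. This is not the code from the bonus section: the sampling rate, the test signal, the invariance scale and the time-averaging step are all assumptions, and the Wavelet Toolbox is required.

```matlab
% Sketch: wavelet scattering features for one synthetic 5-second signal.
fs = 2000;                                  % assumed sampling rate in Hz
t  = (0:5*fs-1)' / fs;                      % time axis, column vector
x  = sin(2*pi*40*t) .* exp(-5*mod(t,1));    % hypothetical pulsed tone

% Build the scattering network for this signal length
sf = waveletScattering('SignalLength', numel(x), ...
                       'SamplingFrequency', fs, ...
                       'InvarianceScale', 0.5);   % assumed 0.5 s scale

S = featureMatrix(sf, x);     % rows: scattering paths, columns: time windows

% Average over time to get one feature vector for the whole recording
features = mean(S, 2)';
```

No feature is designed by hand here; the scattering paths themselves serve as the features passed to a classifier.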

Bernhard Suhm

That's right, there was a backward code incompatibility in the code generation section. I'm about to release a significantly updated version, leveraging cool new features available in R2019b, and demonstrating an automated feature extraction technique (Wavelet scattering). Stay tuned.

@Arun Pradhan
Instead of loading the "TrainedEnsembleModel_FeatSel.mat", try to regenerate the mat file.

Lucas Holtz

@Jacob: the challenge website, including the data, seems to have been relocated. I found the data in their archive by following this link:

The respective links for the .zip files can be found at the bottom of that page:

add these into your live script and you are good to go!


@Arun Pradhan
I encountered the same problem. Did you find a solution for it?

While running this script, at the code generation stage (the "Integrate Analytics with system" section), I am getting these errors:

Error using rmfield (line 65)
A field named 'BinnedX' doesn't exist.

Error in classreg.learning.coderutils.classifToStruct (line 37)
dataSummary = rmfield(dataSummary,'BinnedX');

Error in classreg.learning.classif.CompactClassificationEnsemble/toStruct (line 397)
s = classreg.learning.coderutils.classifToStruct(this);

Error in saveCompactModel (line 23)
compactStruct = toStruct(compactObj); %#ok<NASGU>

Bernhard Suhm

@Jacob: the training data is indeed expected to be located in Data/training/training-a, training-b, ... I'm not sure whether that solves the problems you hit later in the script.
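
One way to read that layout without merging the folders is sketched below. It assumes the data sits under Data/training in the current folder and that the submission's importAudioFile helper is on the path; the live script's own call may differ.

```matlab
% Sketch: read all .wav files under Data/training, including the
% training-a, training-b, ... subfolders, without copying them together.
% importAudioFile is the submission's helper from HelperFunctions.
training_dir = fullfile(pwd, 'Data', 'training');   % assumed location

training_fds = fileDatastore(training_dir, ...
    'ReadFcn', @importAudioFile, ...
    'FileExtensions', '.wav', ...
    'IncludeSubfolders', true);

fprintf('Found %d recordings\n', numel(training_fds.Files));
```

Because each file keeps its full path, recordings with the same file name in different subfolders do not overwrite each other.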

I'm getting an error when the fileDatastore tries to read from the "training" folder. The download, when unzipped, contains several folders called training-a, training-b, etc. Should the contents of these folders be combined into a single folder called training? I tried this and it got past this step, but then I got errors later, I think because some of the files in the individual training folders had the same names and therefore overwrote each other when copied to the same folder.

Hi, I want to predict heart attacks using the PhysioNet PTB Diagnostic database. Should I download the whole database for this? I want a two-class classification: normal vs. heart attack.

kumia kuma

Learning machine learning with it makes things easier. Great!

jiao zhang

it is very convenient!


Bernhard Suhm

@Muhammad: you must not have run the section of the live script that generates or loads the model after selecting specific features. You can simply run load('TrainedEnsembleModel_FeatSel.mat') to create that variable.
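
A small guard along these lines avoids the error whether or not that section has been run. It is a sketch that assumes the MAT-file is on the path and contains the variable named in the error message below.

```matlab
% Sketch: load the saved model only if it is not already in the workspace.
if ~exist('trained_model_featsel', 'var')
    % Assumes this MAT-file defines trained_model_featsel
    load('TrainedEnsembleModel_FeatSel.mat');
end
```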

Undefined function or variable 'trained_model_featsel'.

Error in code_genration (line 2)


@Youjie Ye
I had the same problem. The function importAudioFile is in the folder HelperFunctions, so you need to add that folder to the path. If the code didn't do it for you, right-click the folder in the left panel in MATLAB and select 'Add to Path', then 'Selected Folders'.
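
The same can be done from the command line. This sketch assumes HelperFunctions sits directly inside the current folder.

```matlab
% Sketch: add HelperFunctions (assumed to be in the current folder) to the path.
helper_dir = fullfile(pwd, 'HelperFunctions');
if exist(helper_dir, 'dir')
    addpath(helper_dir);
else
    warning('HelperFunctions folder not found in %s', pwd);
end

% Shows where MATLAB finds importAudioFile, or reports that it is not found
which importAudioFile
```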

Youjie Ye

Hi Mr. Suhm,
I'm reading your eBook; it is really a great tutorial. However, I ran into an error that stops the program from continuing when I did the hands-on exercise and ran the 'Access and Explore Data' section of the classifier. Here is the error information:

Error using fileDatastore (line 64)
Function importAudioFile does not exist.


Best regards,
Youjie

Bernhard Suhm

Re: JJ's problems.
1) If you plan to run the actual feature extraction, which accesses the data via a datastore, make sure you have the training data downloaded. The first code section will do that, but by default it's disabled (getTrainingData = 0)
2) gcp is a MATLAB command that returns the current parallel pool of computation resources (including multiple cores of your CPU if available), starting one if none is running. You need the Parallel Computing Toolbox installed, or you'll get an error.
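
If the toolbox is not available, the partitioning line can be guarded as in the sketch below. It assumes training_fds was created earlier in the live script, and the fallback branch is an illustration rather than part of the submission; other parts of the script may need similar adjustments.

```matlab
% Sketch: choose the number of partitions with or without a parallel pool.
% training_fds is the datastore created earlier in the live script.
has_pct = license('test', 'Distrib_Computing_Toolbox') && ...
          ~isempty(which('gcp'));

if has_pct
    n_parts = numpartitions(training_fds, gcp);  % gcp starts a pool if needed
else
    n_parts = numpartitions(training_fds);       % default partition count
end
```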



I tried to extract features for the feature_table by running the code, but I encountered two difficulties: 1) fileDatastore creates an error in line 105 -> Error using fileDatastore (line 105) Cannot find files or folders matching ...
2) I need to know what gcp is in order to run this line of the code: n_parts = numpartitions(training_fds, gcp);


It is perfect and easy to use. However, there is still a problem with the link below, which MathWorks offers as "Mastering Machine Learning: A Step-by-Step Guide with MATLAB":
it has not been updated to the latest version of your function.

Bernhard Suhm

With today's update, the code matches the example from our advanced machine learning eBook without further changes, and the code generation issue that several ran into should be resolved (now, the reduced model will use 15 features). Keep the feedback coming!


Very good job!

yiran duan

Shengwen Li

I get an error on line 102, at fileDatastore. It says "can't find files or folders matching" and then lists a path. In the left pane there is a data folder with a validation subfolder with lots of .wav files in it. I tried to delete the data folder so it could be reloaded, but I didn't have permission. The script runs up through the FFT plots.

Bernhard Suhm

My apologies, for some reason the last update didn't pick up those corrections. You can fix it by modifying a line in extractCodegenFeatures to "number_of_features = 14;"

Islam Alam

This example is awesome.
However, there is an error with the HelperFunctions/trifbank.m where the first line is vfunction instead of function.

Then, when I corrected this typo, I got the following error, which I cannot solve at the moment:

??? X data must have 14 columns.

Error in ==> classifyHeartSounds Line: 20 Column: 20
Code generation failed: View Error Report

Could you please provide some help to solve the second error?

Thanks so much for such a great tutorial.

Error using mfcc
Error using Extract functions

How do I correct these errors? Could you help me out, sir?


Good day Mr. Suhm,

Could you please reupload/show me the code of the wavelet_features helper function?
I am doing a similar machine learning project with sEMG, have learned a lot from this lecture already and seeing/understanding the wavelet_features function would be of great help to me.

Best regards,
Riad


Would be cool, but unfortunately it does not work. One error is in HelperFunctions/trifbank.m, where the first line is 'vfunction' instead of 'function'.

In the live script, one line reads save('FeatureTable1', 'feature_table'); but should be save('FeatureTable', 'feature_table');

Codegen also fails with

??? X data must have 14 columns.

Error in ==> classifyHeartSounds Line: 20 Column: 20
Code generation failed: View Error Report

sunny shah

xiaojuan ni


- Moved hyperparameter tuning and cost matrices into the Classification Learner
- Added "bonus" section applying Wavelet scattering
- Fixed problem caused by 'BinnedX' field introduced in R2019a
- Converted paths to be compatible with macOS and Linux

Updated version to exactly match the example used for the "Advanced Machine Learning" eBook after obtaining permission to use code authored by a third party.

Fixed bugs uncovered by hanspeter

Actually use version without the signal_entropy feature

Removed reference to signal_entropy.m, which was owned by someone outside MathWorks.

MATLAB Release Compatibility
Created with R2019b
Compatible with R2016b to any release
Platform Compatibility
Windows macOS Linux