Identify Defects in Air Compressors Using Spectrogram Images

Since R2025a

This example uses:

This example shows how to detect and localize defects in acoustic recordings of air compressors using Mel spectrogram images and an EfficientAD anomaly detector.

EfficientAD is an accurate one-class anomaly detector that can detect and localize defects in various industrial applications [1]. The EfficientAD detector has low-latency and high throughput, and it can distinguish between normal and anomalous images in real-time.

The EfficientAD detector consists of two components.

A student-teacher model detects local (or structural) anomalies. In the student-teacher model, the teacher is a pretrained model that calculates structural features of images, such as scratches, dents, or discoloration. During training, the student network learns to predict the structural features of only normal images. During classification, the EfficientAD detector classifies an image as anomalous when the student network fails to predict the anomaly features that the teacher model calculates.
An autoencoder detects global (or logical) anomalies, such as an incorrect orientation, size, or position of an object.

This example trains the EfficientAD detector on spectrogram images of normal audio signals only. At inference, the model identifies defects on spectrogram images of normal and anomalous audio data. The example evaluates the inference results using quantitative metrics. The example also facilitates qualitative explanation of misclassifications using visualisation of localized defects.

Download Data

The data set consists of acoustic recordings collected on a single-stage reciprocating-type air compressor [2]. Each recording represents one of eight states, which includes the healthy state and these seven faulty states:

Leakage inlet valve (LIV) fault
Leakage outlet valve (LOV) fault
Non-return valve (NRV) fault
Piston ring fault
Flywheel fault
Rider belt fault
Bearing fault

Download the data set and unzip the data into a folder where you have write permission. The size of the data is 175 MB. This example assumes you are downloading the data in the temporary directory designated as tempdir in MATLAB®. If you choose to use a different folder, substitute that folder for tempdir in the following code.

airCompressorDataURL = "https://www.mathworks.com/supportfiles/" + ...
    "audio/AirCompressorDataset/AirCompressorDataset.zip";
dataDir = fullfile(tempdir,"AirCompressorDataset");
downloadAirCompressorData(airCompressorDataURL,dataDir);

Prepare Data for Training

Manage access of the audio files by using an audioDatastore (Audio Toolbox). The recordings are stored as WAV files in folders named for their respective state. Use the folder names as the class labels.

ads = audioDatastore(dataDir,IncludeSubfolders=true,...
    LabelSource="foldernames");

Get the class labels.

uniqueLabels = categories(ads.Labels,OutputType="string");

Preview Training Data

Extract Mel spectrograms for a sample recording of each class by using the yamnetPreprocess (Audio Toolbox) function. Each audio signal generates multiple Mel spectrograms. Display each sample recording and its first Mel spectrogram.

figure(Name="Imported Data",Units="Normalized",Position=[0.05, 0.25, 0.80, 1]);
for n = 1:numel(uniqueLabels)
    label = uniqueLabels(n);
    idx = find(ads.Labels==label,1);
    [x,fs] = audioread(ads.Files{idx});

    t = (0:size(x,1)-1)/fs;

    subplot(4,4,n);
    plot(t,x);
    title(label);

    melSpectYam = yamnetPreprocess(x,fs);
    melSpect = squeeze(melSpectYam(:,:,:,1));

    subplot(4,4,n+8);
    imagesc(CData=melSpect')
    axis([1 size(melSpect,2) 1 size(melSpect,1)])
    axis tight
    title(uniqueLabels(n));
    xlabel("Mel Band")
    ylabel("Frame")
end

Create Datastore of Spectrograms

Create a folder for each class label. These folders will store the Mel spectrograms that are used to train, calibrate, and test the anomaly detector.

spectDir = fullfile(pwd,"AirCompressorMelSpectrograms");
if ~exist(spectDir,"dir")
    mkdir(spectDir)
end
for n = 1:numel(uniqueLabels)
    labelDir = fullfile(spectDir,uniqueLabels(n));
    if ~exist(labelDir,"dir")
        mkdir(labelDir)
    end
end

Save the Mel spectrograms as JPG image files.

reset(ads)
k = 0;
while hasdata(ads)

    k = k+1;

    [audioIn,fileInfo] = read(ads);
    x = audioIn(:);
    fs = fileInfo.SampleRate;
    label = string(fileInfo.Label);

    melSpectYam = yamnetPreprocess(x,fs);
    melSpect = squeeze(melSpectYam(:,:,:,1));

    % Convert the spectrogram from a 2-D matrix to an RGB image with the
    % parula colormap
    im = ind2rgb(im2uint8(rescale(melSpect)),parula(256));
    % Resize as the network expects images to be at least 256-by-256 pixels
    im = imresize(im,4);
    imFilename = label+"_"+num2str(k,"%04d")+".jpg";
    imwrite(im,fullfile(spectDir,label,imFilename));
end

Manage access of the spectrogram images by using an imageDatastore. Specify the class labels as the folder names.

imds = imageDatastore(spectDir,IncludeSubfolders=true,LabelSource="foldernames");

Partition Data

Define the labels for the normal class and the anomaly class. Normal images have the label "Healthy". Anomaly images have any other class label.

C = uniqueLabels(imds.Labels);
normalClass = "Healthy";
anomalyClasses = C(~ismember(C,normalClass));

This example trains the network using only normal images. Allocate 50% of the normal images for training. The example calibrates and tests the network using both normal and anomalous images. Allocate 20% each of normal and anomaly images for calibration, and allocate the rest of the data set for testing. To partition the data according to the target ratios, use the splitAnomalyData function.

normalTrainRatio  = 0.5;
anomalyTrainRatio = 0;
normalCalRatio  = 0.2;
anomalyCalRatio = 0.2;
normalTestRatio  = 1 - (normalTrainRatio + normalCalRatio);
anomalyTestRatio = 1 - (anomalyTrainRatio + anomalyCalRatio);

anomalyClasses = anomalyClasses(:)';  % Format anomalyClasses as a row vector
[imdsTrain,imdsCal,imdsTest] = splitAnomalyData(imds,anomalyClasses, ...
    NormalLabelsRatio=[normalTrainRatio normalCalRatio normalTestRatio], ...
    AnomalyLabelsRatio=[anomalyTrainRatio anomalyCalRatio anomalyTestRatio]);

Splitting anomaly dataset
-------------------------
* Finalizing... Done.
* Number of files and proportions per class in all the datasets:

                       Input                Train             Validation               Test        
                 _________________    _________________    _________________    ___________________

                 NumFiles    Ratio    NumFiles    Ratio    NumFiles    Ratio    NumFiles     Ratio 
                 ________    _____    ________    _____    ________    _____    ________    _______
                                                                                                   
    Bearing        225       0.125        0         0         45       0.125      180       0.13564
    Flywheel       225       0.125        0         0         45       0.125      180       0.13564
    Healthy        225       0.125      113         1         45       0.125       67       0.05049
    LIV            225       0.125        0         0         45       0.125      180       0.13564
    LOV            225       0.125        0         0         45       0.125      180       0.13564
    NRV            225       0.125        0         0         45       0.125      180       0.13564
    Piston         225       0.125        0         0         45       0.125      180       0.13564
    Riderbelt      225       0.125        0         0         45       0.125      180       0.13564

Read a subset of the training images and display them as a montage.

exampleData = readall(subset(imdsTrain,1:9));
figure
montage(exampleData(:,1),Size=[3 3], ...
    BorderSize=5,BackgroundColor="w");

Train Network or Download Pretrained Network

By default, this example downloads a pretrained version of the EfficientAD anomaly detector by using the helper function downloadTrainedNetwork. The function is attached to this example as a supporting file. You can use the pretrained network to calibrate and evaluate the model without waiting for training to complete.

To train the detector, set the doTraining variable in the following code to true. Create an untrained model by using the efficientADAnomalyDetector function. Train the model by using the trainEfficientADAnomalyDetector function.

Train on one or more GPUs, if they are available. Using a GPU requires a Parallel Computing Toolbox™ license and a CUDA®-enabled NVIDIA® GPU. For more information, see GPU Computing Requirements (Parallel Computing Toolbox).

doTraining = false;
if doTraining
    % Define EfficientAD anomaly detector network architecture
    net = efficientADAnomalyDetector(Network="pdn-small");

    % Configure training options
    maxEpochs = 60;
    lr = 1.5e-3;
    miniBatchSize = 16;
    l2reg = 1e-4;
    options = trainingOptions("adam",...
        InitialLearnRate = lr,...
        L2Regularization=l2reg,...
        LearnRateSchedule="piecewise",...
        LearnRateDropPeriod=floor(0.95*maxEpochs),...
        LearnRateDropFactor=0.1,...
        MaxEpochs=maxEpochs,...
        VerboseFrequency=2,...
        MiniBatchSize=miniBatchSize,...
        Shuffle="every-epoch",...
        ValidationData=dsCal,...
        ValidationPatience=Inf,...
        OutputNetwork="best-validation",...
        Metrics = aucMetric(Name="auc"),...
        ObjectiveMetricName = "auc",...
        ResetInputNormalization=true,...
        Plots="training-progress",...
        PreprocessingEnvironment="parallel");
    detector = trainEfficientADAnomalyDetector(imdsTrain,net,options);
    modelDateTime = string(datetime("now",Format="yyyy-MM-dd-HH-mm-ss"));
    save(fullfile(tempdir,"trainedAirCompressorDefectDetectorEfficientAD_"+modelDateTime+".mat"), ...
        "detector");
else 
    trainedAirCompressorDefectDetectorNetURL = "https://ssd.mathworks.com/supportfiles/"+ ...
        "vision/data/trainedAirCompressorDefectDetectorEfficientAD.zip";
    downloadTrainedNetwork(trainedAirCompressorDefectDetectorNetURL,pwd);
    load("trainedAirCompressorDefectDetectorEfficientAD.mat");
end

Set Anomaly Threshold

After you train the EfficientAD detector or load a pretrained detector, you must set the anomaly score threshold for the detector. The detector classifies images based on whether their scores are above or below the threshold value. Calculate the anomaly threshold using the calibration images.

Predict the maximum anomaly score for each calibration image using the trained EfficientAD detector.

calScores = predict(detector,imdsCal);

Convert the ground truth labels of the calibration images into a binary classification. The label value 0 (false) corresponds to the normal ("Healthy") class. The label value 1 (true) corresponds to the anomaly class, which consists of all of the anomaly subclasses.

calLabels = imdsCal.Labels ~= "Healthy";

Separate the anomaly scores into two variables. One vector stores the scores of normal images and the other vector stores the scores of the anomalous images.

calScoresNormal = calScores(calLabels==0);
calScoresAnomaly = calScores(calLabels==1);

Plot a histogram of the anomaly scores for the normal and anomalous classes. The distributions are well separated by the model-predicted anomaly score.

numBins = 20;
[~,edges] = histcounts(calScores,numBins);
figure
hold on
hNormal = histogram(calScoresNormal,edges);
hAnomaly = histogram(calScoresAnomaly,edges);
hold off
legend([hNormal hAnomaly],"Normal","Anomaly")
xlabel("Max Anomaly Score")
ylabel("Number of Images")

Calculate the optimal anomaly threshold by using the anomalyThreshold function. Specify the first two input arguments as the binary ground truth labels and the predicted anomaly scores for the calibration images. Specify the third input argument as true because anomaly images have a label value of true. The anomalyThreshold function returns the optimal threshold value as a numeric scalar and the receiver operating characteristic (ROC) curve for the detector as an rocmetrics (Deep Learning Toolbox) object.

[thresh,roc] = anomalyThreshold(calLabels,calScores,true);

Set the Threshold property of the anomaly detector to the optimal value.

detector.Threshold = thresh;

Plot the ROC curve by using the plot (Deep Learning Toolbox) object function of rocmetrics. The ROC curve illustrates the performance of the classifier for a range of possible threshold values. Each point on the ROC curve represents the false positive rate (x-coordinate) and true positive rate (y-coordinate) when the calibration set images are classified using a different threshold value. The solid blue line represents the ROC curve. The area under the ROC curve (AUC) metric indicates classifier performance, and the maximum ROC AUC corresponding to a perfect classifier is 1.0.

plot(roc)
title("ROC AUC: "+ roc.AUC)

Evaluate Classification Model

Classify each test image as either normal or anomalous. The label value 0 (false) corresponds to the normal class. The label value 1 (true) corresponds to the anomaly class. The test images are full-size, which can cause out of memory errors on a GPU. To prevent out of memory errors, you can decrease the mini-batch size.

testLabelsPred = classify(detector,imdsTest);
testLabelsPred = testLabelsPred';

Get the ground truth labels for the test images. These ground truth labels are the subclass labels.

testSubclassLabelsGT = imdsTest.Labels;

To match the format of the predicted labels, convert the ground truth labels into a binary classification. Assign the value 0 (false) corresponds to the normal class and the value 1 (true) to any of the anomaly subclasses.

testLabelsGT = testSubclassLabelsGT ~= "Healthy";

Calculate anomaly detection metrics by using the evaluateAnomalyDetection function. This function supports calculating the metrics of subclasses, so you can specify the ground truth labels as the anomaly subclass labels.

metrics = evaluateAnomalyDetection(testLabelsPred,testSubclassLabelsGT,anomalyClasses);

Evaluating anomaly detection results
------------------------------------
* Finalizing... Done.
* Data set metrics:

    GlobalAccuracy    MeanAccuracy    Precision    Recall     Specificity    F1Score    FalsePositiveRate    FalseNegativeRate
    ______________    ____________    _________    _______    ___________    _______    _________________    _________________

       0.97965          0.96809        0.99758     0.98095      0.95522      0.9892         0.044776             0.019048

Display the confusion matrix.

M = metrics.ConfusionMatrix{:,:};
confusionchart(M,["Normal" "Anomaly"])
acc = sum(diag(M)) / sum(M,"all");
title("Accuracy: "+acc)

Display the accuracy of the normal and anomaly classes.

metrics.ClassMetrics

ans=2×2 table
               Accuracy    AccuracyPerSubClass
               ________    ___________________

    Normal     0.95522         {1×1 table}    
    Anomaly    0.98095         {7×1 table}

Display the accuracy of subclasses within the anomaly class.

metrics.ClassMetrics(2,"AccuracyPerSubClass").AccuracyPerSubClass{1}

ans=7×1 table
                 AccuracyPerSubClass
                 ___________________

    Bearing            0.98333      
    Flywheel           0.97222      
    LIV                      1      
    LOV                0.93889      
    NRV                      1      
    Piston             0.98333      
    Riderbelt          0.98889

Explain Classification Decisions

You can use the anomaly heatmap that the anomaly detector predicts to explain why the detector classifies an image as normal or anomalous. This approach is useful for identifying patterns in false negatives and false positives. You can use these patterns to identify strategies for increasing class balancing of the training data or improving the network performance.

Calculate a display range that reflects the range of anomaly scores observed across the entire calibration set, including normal and anomalous images. By using the same display range across images, you can compare images more easily than if you scale each image to its own minimum and maximum. Apply the display range to all heatmaps in this example.

minMapVal = inf;
maxMapVal = -inf;
reset(imdsCal)
while hasdata(imdsCal)
    img = read(imdsCal);
    map = anomalyMap(detector,img);
    minMapVal = min(min(map,[],"all"),minMapVal);
    maxMapVal = max(max(map,[],"all"),maxMapVal);
end
displayRange = [minMapVal 0.7*maxMapVal];

View Heatmap of Anomalous Image

A correctly classified anomaly, or true positive, is an image that is labeled as anomalous (1) by both the trained detector and the ground truth. Select a true positive image.

idxTruePositive = find(testLabelsGT & testLabelsPred);
img = readimage(imdsTest,idxTruePositive(1));

Calculate the heatmap of the image. Then, display an overlay of the heatmap on the image by using the anomalyMapOverlay function.

map = anomalyMap(detector,img);
imshow(anomalyMapOverlay(img,map,MapRange=displayRange,Blend="equal"))

View Heatmap of Normal Image

A correctly classified normal image, or true negative, is an image that is labeled as normal (0) by both the trained detector and the ground truth. Select a true negative image.

idxTrueNegative = find(~testLabelsGT & ~testLabelsPred);
img = readimage(imdsTest,idxTrueNegative(1));

Calculate the heatmap of the image. Then, display an overlay of the heatmap on the image.

map = anomalyMap(detector,img);
imshow(anomalyMapOverlay(img,map,MapRange=displayRange,Blend="equal"))

View Heatmap of False Positive Image

False positives are normal images that the network classifies as anomalous. For false positives, the ground truth label is 0 and the predicted label is 1. Select a false positive image, then calculate the heatmap of the image and display an overlay of the heatmap on the image.

Overlay the heatmap on the false positive image to gain insight into the misclassification. For this test image, anomalous scores are localized to image areas with uneven lightning. To decrease false positive results, you could adjust the image contrast during preprocessing, increase the number of images used for training, or choose a different threshold at the calibration step.

idxFalsePositive = find(~testLabelsGT & testLabelsPred);
if ~isempty(idxFalsePositive)
    img = readimage(imdsTest,idxFalsePositive(1));
    map = anomalyMap(detector,img);
    imshow(anomalyMapOverlay(img,map,MapRange=displayRange,Blend="equal"));
end

View Heatmap of False Negative Image

False negatives are images with defect anomalies that the network classifies as normal. For false negatives, the ground truth label is 1 and the predicted label is 0. Select a false negative image, then calculate the heatmap of the image.

Overlay the heatmap on the false negative image to gain insight into the misclassification. To decrease false negative results, consider adjusting the anomaly threshold of the detector.

idxFalseNegative = find(testLabelsGT & ~testLabelsPred);
if ~isempty(idxFalseNegative)
    img = readimage(imdsTest,idxFalseNegative(1));
    map = anomalyMap(detector,img);
    imshow(anomalyMapOverlay(img,map,MapRange=displayRange,Blend="equal"))
end

References

[1] Batzner, Kilian, Lars Heckler, and Rebecca König. "EfficientAD: Accurate Visual Anomaly Detection at Millisecond-Level Latencies." In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 128–138. 2024. https://arxiv.org/abs/2303.14535.

[2] Verma, Nishchal K., Rahul Kumar Sevakula, Sonal Dixit, and Al Salour. “Intelligent Condition Based Monitoring Using Acoustic Signals for Air Compressors.” IEEE Transactions on Reliability 65, no. 1 (March 2016): 291–309. https://doi.org/10.1109/TR.2015.2459684.