Train cascade object detector model
trainCascadeObjectDetector(outputXMLFilename,'resume') resumes an interrupted training session. The outputXMLFilename input
must match the output file name from the interrupted session. All
arguments saved from the earlier session are reused automatically.
Load the positive samples data from a MAT file. The file contains a table specifying bounding boxes for several object categories. The table was exported from the
Training Image Labeler app.
Load positive samples.
load('stopSignsAndCars.mat');
Select the bounding boxes for stop signs from the table.
positiveInstances = stopSignsAndCars(:,1:2);
Add the image folder to the MATLAB path.
imDir = fullfile(matlabroot,'toolbox','vision','visiondata',...
    'stopSignImages');
addpath(imDir);
Specify the folder for negative images.
negativeFolder = fullfile(matlabroot,'toolbox','vision','visiondata',...
    'nonStopSigns');
Create an imageDatastore object containing negative images.
negativeImages = imageDatastore(negativeFolder);
Train a cascade object detector called 'stopSignDetector.xml' using HOG features. NOTE: The command can take several minutes to run.
trainCascadeObjectDetector('stopSignDetector.xml',positiveInstances, ...
    negativeFolder,'FalseAlarmRate',0.1,'NumCascadeStages',5);
Automatically setting ObjectTrainingSize to [35, 32]
Using at most 42 of 42 positive samples per stage
Using at most 84 negative samples per stage

--cascadeParams--
Training stage 1 of 5
[........................................................................]
Used 42 positive and 84 negative samples
Time to train stage 1: 0 seconds

Training stage 2 of 5
[........................................................................]
Used 42 positive and 84 negative samples
Time to train stage 2: 0 seconds

Training stage 3 of 5
[........................................................................]
Used 42 positive and 84 negative samples
Time to train stage 3: 2 seconds

Training stage 4 of 5
[........................................................................]
Used 42 positive and 84 negative samples
Time to train stage 4: 6 seconds

Training stage 5 of 5
[........................................................................]
Used 42 positive and 17 negative samples
Time to train stage 5: 9 seconds

Training complete
Use the newly trained classifier to detect a stop sign in an image.
detector = vision.CascadeObjectDetector('stopSignDetector.xml');
Read the test image.
img = imread('stopSignTest.jpg');
Detect a stop sign.
bbox = step(detector,img);
Insert bounding box rectangles and return the marked image.
detectedImg = insertObjectAnnotation(img,'rectangle',bbox,'stop sign');
Display the detected stop sign.
figure; imshow(detectedImg);
Remove the image directory from the path.
rmpath(imDir);
positiveInstances— Positive samples
Positive samples, specified as a two-column table or two-field structure.
The first table column or structure field contains image file names, specified as character
vectors or string scalars. Each image can be true color, grayscale, or
indexed, in any of the formats supported by imread.
The second table column or structure field contains an M-by-4 matrix of M bounding boxes. Each bounding box is in the format [x y width height] and specifies an object location in the corresponding image.
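The two-field structure form described above can be sketched as follows. This is only an illustration: the file names and box coordinates are hypothetical, and the field names imageFilename and objectBoundingBoxes are an assumption, since the text here says only that the structure has two fields.

```matlab
% Minimal sketch of a two-field structure array for positive samples.
% File names and coordinates are made up; field names are assumed.
positiveInstances = struct( ...
    'imageFilename', {'stopSign001.jpg', 'stopSign002.jpg'}, ...
    'objectBoundingBoxes', {[10 20 50 50], [30 40 60 60; 100 120 40 40]});

% Each element pairs one image with an M-by-4 matrix of [x y width height] boxes.
for k = 1:numel(positiveInstances)
    numBoxes = size(positiveInstances(k).objectBoundingBoxes, 1);
    fprintf('%s: %d bounding box(es)\n', ...
        positiveInstances(k).imageFilename, numBoxes);
end
```

The second element shows an image with two labeled objects, which is why its matrix has two rows.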
You can use the Image Labeler or Video Labeler app to label objects of interest
with bounding boxes. The app returns a groundTruth object. Use the
objectDetectorTrainingData function to obtain a table from the object to use for
positiveInstances. The function automatically
determines the number of positive samples to use at each of the cascade
stages. This value is based on the number of stages and the true positive
rate. The true positive rate specifies how many positive samples can be
misclassified.
negativeImages— Negative images
ImageDatastore object | cell array | character vector | string scalar
Negative images, specified as an
ImageDatastore object, a path
to a folder containing images, or as a cell array of image file names.
Because the images are used to generate negative samples, they must not
contain any objects of interest. Instead, they should contain backgrounds
associated with the object.
outputXMLFilename— Trained cascade detector file name
Trained cascade detector file name, specified as a character vector or a string scalar with an
XML extension. For example, 'stopSignDetector.xml'.
Specify optional comma-separated pairs of Name,Value arguments. Name is
the argument name and
Value is the corresponding value.
Name must appear inside quotes. You can specify several name and value
pair arguments in any order as Name1,Value1,...,NameN,ValueN.
Example: 'FeatureType','Haar' specifies Haar for the type of features to use.
'ObjectTrainingSize'— Object size for training
'Auto' (default) | two-element vector
Training object size, specified as the comma-separated pair
consisting of 'ObjectTrainingSize' and either
a two-element [height, width] vector or
'Auto'. Before training, the function
resizes the positive and negative samples to
ObjectTrainingSize pixels. If you select
'Auto', the function determines
the size automatically based on the median width-to-height ratio of
the positive instances. For optimal detection accuracy, specify an
object training size close to the expected size of the object in the
image. However, for faster training and detection, set the object
training size to be smaller than the expected size of the object in
the image.
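To make the 'Auto' setting more concrete, the sketch below computes the median width-to-height ratio for a few hypothetical bounding boxes. It illustrates only the ratio the text refers to; how the function turns that ratio into a final [height, width] size is not stated here, so no attempt is made to reproduce it.

```matlab
% Hypothetical bounding boxes in [x y width height] format.
boxes = [10 20 50 40;
         30 40 60 60;
          5  5 36 32];

% Width-to-height ratio of each box, then the median across boxes.
ratios = boxes(:,3) ./ boxes(:,4);
medianRatio = median(ratios);
fprintf('Median width-to-height ratio: %.3f\n', medianRatio);
```

A ratio above 1 suggests a training size that is wider than it is tall, which can guide a manual choice of ObjectTrainingSize.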
'NegativeSamplesFactor'— Negative sample factor
2 (default) | real-valued scalar
Negative sample factor, specified as the comma-separated pair
consisting of 'NegativeSamplesFactor' and a real-valued
scalar. The number of negative samples to use at each stage is equal
to NegativeSamplesFactor × [the number of positive samples used at each stage].
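As a quick check of this relationship, the sketch below multiplies the default factor of 2 by the 42 positive samples reported in the training log from the example. The variable names are illustrative, and any rounding the function might apply to non-integer results is not covered.

```matlab
% Values taken from the example training log and the documented default.
numPositive = 42;           % positive samples used per stage
negativeSamplesFactor = 2;  % default NegativeSamplesFactor

% Upper bound on negative samples per stage.
numNegative = negativeSamplesFactor * numPositive;
fprintf('Using at most %d negative samples per stage\n', numNegative);
```

The result agrees with the "Using at most 84 negative samples per stage" line in the log.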
'NumCascadeStages'— Number of cascade stages
20 (default) | positive integer
Number of cascade stages to train, specified as the comma-separated pair consisting of
'NumCascadeStages' and a positive integer.
Increasing the number of stages may result in a more accurate detector
but also increases training time. More stages can require more training
images, because at each stage, some number of positive and negative
samples are eliminated. This value depends on the values of
FalseAlarmRate and TruePositiveRate. More stages can also enable
you to increase the
FalseAlarmRate. See the Train a Cascade Object Detector tutorial for more
information.
'FalseAlarmRate'— Acceptable false alarm rate
0.5 (default) | value in the range (0,1]
Acceptable false alarm rate at each stage, specified as the
comma-separated pair consisting of 'FalseAlarmRate'
and a value in the range (0,1]. The false alarm rate is the fraction
of negative training samples incorrectly classified as positive samples.
The overall false alarm rate is calculated using the FalseAlarmRate per
stage and the number of cascade stages, NumCascadeStages:
FalseAlarmRate^NumCascadeStages.
Lower values for FalseAlarmRate increase complexity of each stage. Increased complexity can achieve fewer false detections but can result in longer training and detection times. Higher values for
FalseAlarmRate can require a greater number of cascade stages to achieve reasonable detection accuracy.
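The effect of stacking stages can be worked through numerically. The sketch below assumes the per-stage rates compound multiplicatively, so the overall rate is the per-stage rate raised to the number of stages, and it uses the values from the training example (0.1 per stage, 5 stages). Variable names are illustrative.

```matlab
% Per-stage settings from the training example.
falseAlarmRate = 0.1;
numCascadeStages = 5;

% Overall false alarm rate, assuming multiplicative compounding across stages.
overallFalseAlarmRate = falseAlarmRate ^ numCascadeStages;
fprintf('Overall false alarm rate: %g\n', overallFalseAlarmRate);
```

Under that assumption the overall rate is about 1e-5, which shows why a fairly permissive per-stage rate can still yield few false detections once several stages are chained.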
'TruePositiveRate'— Minimum true positive rate
0.995 (default) | value in the range (0,1]
Minimum true positive rate required at each stage, specified
as the comma-separated pair consisting of 'TruePositiveRate'
and a value in the range (0,1]. The true positive rate is the fraction
of correctly classified positive training samples.
The overall resulting target positive rate is calculated using
TruePositiveRate per stage and the number
of cascade stages, NumCascadeStages:
TruePositiveRate^NumCascadeStages.
Higher values for TruePositiveRate increase complexity of each stage. Increased complexity can achieve a greater number of correct detections but can result in longer training and detection times.
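The same compounding applies to the true positive rate, which is why the per-stage default sits so close to 1. The sketch below assumes the overall target is the per-stage rate raised to the number of stages and uses the documented defaults (0.995 per stage, 20 stages). Variable names are illustrative.

```matlab
% Documented default settings.
truePositiveRate = 0.995;
numCascadeStages = 20;

% Overall target true positive rate, assuming multiplicative compounding.
overallTruePositiveRate = truePositiveRate ^ numCascadeStages;
fprintf('Overall true positive rate: %.4f\n', overallTruePositiveRate);
```

With the defaults, the overall target comes out at roughly 0.905, so even a 0.5% loss per stage adds up to nearly a 10% loss across 20 stages.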
'FeatureType'— Feature type
Feature type, specified as the comma-separated pair consisting
of 'FeatureType' and one of the following: 'Haar', 'LBP', or 'HOG'.
The function allocates a large amount of memory, especially for Haar features. To avoid running out of memory, use this function on a 64-bit operating system with a sufficient amount of RAM.
Training a good detector requires thousands of training samples. Processing time for a large amount of data varies, but it is likely to take hours or even days. During training, the function displays the time it took to train each stage in the MATLAB® command window.
The OpenCV HOG parameters used in this function are:
Viola, P., and M. J. Jones. "Rapid Object Detection using a Boosted Cascade of Simple Features." Proceedings of the 2001 IEEE Computer Society Conference. Volume 1, 15 April 2001, pp. I-511–I-518.
Ojala, T., M. Pietikainen, and T. Maenpaa. "Multiresolution Gray-scale and Rotation Invariant Texture Classification With Local Binary Patterns." IEEE Transactions on Pattern Analysis and Machine Intelligence. Volume 24, No. 7, July 2002, pp. 971–987.
Dalal, N., and B. Triggs. "Histograms of Oriented Gradients for Human Detection." IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Volume 1, 2005, pp. 886–893.