Image Retrieval Using Customized Bag of Features
This example shows how to create a Content Based Image Retrieval (CBIR) system using a customized bag-of-features workflow.
Introduction
Content Based Image Retrieval (CBIR) systems are used to find images that are visually similar to a query image. Applications of CBIR systems can be found in many areas, such as web-based product search, surveillance, and visual place identification. A common technique used to implement a CBIR system is bag of visual words, also known as bag of features [1,2]. Bag of features is a technique adapted to image retrieval from the world of document retrieval. Instead of using actual words as in document retrieval, bag of features uses image features as the visual words that describe an image.
Image features are an important part of CBIR systems. These image features are used to gauge similarity between images and can include global image features such as color, texture, and shape. Image features can also be local image features such as speeded up robust features (SURF), histogram of oriented gradients (HOG), or local binary patterns (LBP). The benefit of the bag-of-features approach is that the type of features used to create the visual word vocabulary can be customized to fit the application.
The speed and efficiency of image search is also important in CBIR systems. For example, it may be acceptable to perform a brute-force search in a small collection of fewer than 100 images, where features from the query image are compared to features from each image in the collection. For larger collections, a brute-force search is not feasible and more efficient search techniques must be used. The bag-of-features approach provides a concise encoding scheme to represent a large collection of images using a sparse set of visual word histograms. This enables compact storage and efficient search through an inverted index data structure.
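To make the inverted-index idea concrete, consider the toy sketch below. It is purely illustrative (the Map-based structure and the word lists are invented for this example; indexImages builds and manages the real index for you): each visual word maps to the IDs of the images that contain it, so a query only needs to score the images that share its words.

% Toy inverted index: each visual word maps to the IDs of the images
% that contain it. (Illustrative only; indexImages does this for you.)
invertedIndex = containers.Map('KeyType','double','ValueType','any');
wordsPerImage = {[3 7], [3 9], [7 9]}; % hypothetical words per image
for imageID = 1:numel(wordsPerImage)
    for w = wordsPerImage{imageID}
        if isKey(invertedIndex, w)
            invertedIndex(w) = [invertedIndex(w) imageID];
        else
            invertedIndex(w) = imageID;
        end
    end
end
% A query containing visual word 3 only needs to score images 1 and 2,
% not the entire collection.
candidateImages = invertedIndex(3)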
The Computer Vision Toolbox™ provides a customizable bag-of-features framework to implement an image retrieval system. The following steps outline the procedure:
Select the Image Features for Retrieval
Create a Bag Of Features
Index the Images
Search for Similar Images
In this example, you will go through these steps to create an image retrieval system for searching a flower dataset [3]. This dataset contains 3670 images of 5 different types of flowers.
Download this dataset for use in the rest of this example.
% Location of the compressed data set
url = 'http://download.tensorflow.org/example_images/flower_photos.tgz';

% Store the output in a temporary folder
downloadFolder = tempdir;
filename = fullfile(downloadFolder,'flower_dataset.tgz');
Note that downloading the dataset from the web can take a very long time depending on your Internet connection. The commands below will block MATLAB for that period of time. Alternatively, you can use your web browser to first download the set to your local disk. If you choose that route, re-point the 'url' variable above to the file that you downloaded.
% Uncompressed data set
imageFolder = fullfile(downloadFolder,'flower_photos');
if ~exist(imageFolder,'dir') % download only once
    disp('Downloading Flower Dataset (218 MB)...');
    websave(filename,url);
    untar(filename,downloadFolder)
end

flowerImageSet = imageDatastore(imageFolder,'LabelSource','foldernames','IncludeSubfolders',true);

% Total number of images in the data set
numel(flowerImageSet.Files)
ans = 3670
Step 1 - Select the Image Features for Retrieval
The type of feature used for retrieval depends on the type of images within the collection. For example, if searching an image collection made up of scenes (beaches, cities, highways), it is preferable to use a global image feature, such as a color histogram that captures the color content of the entire scene. However, if the goal is to find specific objects within the image collections, then local image features extracted around object keypoints are a better choice.
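The short sketch below illustrates that distinction using a built-in test image (this is a side illustration, separate from the retrieval pipeline built in this example): imhist produces a global per-channel color histogram, while detectSURFFeatures and extractFeatures produce local descriptors around keypoints.

% Sketch: global vs. local features (peppers.png ships with MATLAB).
I = imread('peppers.png');

% Global feature: a per-channel color histogram of the entire image.
redHist = imhist(I(:,:,1));

% Local features: SURF descriptors extracted around detected keypoints.
Igray = rgb2gray(I);
points = detectSURFFeatures(Igray);
[surfFeatures, validPoints] = extractFeatures(Igray, points);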
Let's start by viewing one of the images to get an idea of how to approach the problem.
% Display one of the flower images
figure
I = imread(flowerImageSet.Files{1});
imshow(I);
The displayed image is by Mario.
In this example, the goal is to search for similar flowers in the dataset using the color information in the query image. A simple image feature based on the spatial layout of color is a good place to start.
The following function describes the algorithm used to extract color features from a given image. This function is used as an extractorFcn within bagOfFeatures to extract color features.
type exampleBagOfFeaturesColorExtractor.m
function [features, metrics] = exampleBagOfFeaturesColorExtractor(I)
% Example color layout feature extractor. Designed for use with bagOfFeatures.
%
% Local color layout features are extracted from the truecolor image, I,
% and returned in features. The strength of the features is returned in
% metrics.

% Copyright 2014-2020 The MathWorks, Inc.

[~,~,P] = size(I);

isColorImage = P == 3;

if isColorImage

    % Convert RGB images to the L*a*b* colorspace. The L*a*b* colorspace
    % enables you to easily quantify the visual differences between colors.
    % Visually similar colors in the L*a*b* colorspace will have small
    % differences in their L*a*b* values.
    Ilab = rgb2lab(I);

    % Compute the "average" L*a*b* color within 16-by-16 pixel blocks. The
    % average value is used as the color portion of the image feature. An
    % efficient method to approximate this averaging procedure over
    % 16-by-16 pixel blocks is to reduce the size of the image by a factor
    % of 16 using IMRESIZE.
    Ilab = imresize(Ilab, 1/16);

    % Note: the average pixel value in a block can also be computed using
    % standard block processing or integral images.

    % Reshape L*a*b* image into "number of features"-by-3 matrix.
    [Mr,Nr,~] = size(Ilab);
    colorFeatures = reshape(Ilab, Mr*Nr, []);

    % L2 normalize color features
    rowNorm = sqrt(sum(colorFeatures.^2,2));
    colorFeatures = bsxfun(@rdivide, colorFeatures, rowNorm + eps);

    % Augment the color feature by appending the [x y] location within the
    % image from which the color feature was extracted. This technique is
    % known as spatial augmentation. Spatial augmentation incorporates the
    % spatial layout of the features within an image as part of the
    % extracted feature vectors. Therefore, for two images to have similar
    % color features, the color and spatial distribution of color must be
    % similar.

    % Normalize pixel coordinates to handle different image sizes.
    xnorm = linspace(-0.5, 0.5, Nr);
    ynorm = linspace(-0.5, 0.5, Mr);
    [x, y] = meshgrid(xnorm, ynorm);

    % Concatenate the spatial locations and color features.
    features = [colorFeatures y(:) x(:)];

    % Use color variance as feature metric.
    metrics = var(colorFeatures(:,1:3),0,2);
else

    % Return empty features for non-color images. These features are
    % ignored by bagOfFeatures.
    features = zeros(0,5);
    metrics = zeros(0,1);
end
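Before training a vocabulary, it can be helpful to run the extractor directly on one image as a sanity check. This step is optional and not part of the documented workflow; it simply confirms the output shape: each row of features is one 5-element [L* a* b* y x] vector.

% Optional sanity check: run the custom extractor on a single flower
% image and inspect the dimensions of the returned feature matrix.
I = imread(flowerImageSet.Files{1});
[features, metrics] = exampleBagOfFeaturesColorExtractor(I);
size(features)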
Step 2 - Create a Bag Of Features
With the feature type defined, the next step is to learn the visual vocabulary within the bagOfFeatures using a set of training images. The code shown below picks a random subset of images from the dataset for training and then trains bagOfFeatures using the 'CustomExtractor' option.
Set doTraining to false to load a pretrained bagOfFeatures. The rest of the example uses this pretrained bagOfFeatures because the training process takes several minutes. If you wish to recreate colorBag locally, set doTraining to true and consider adjusting Computer Vision Toolbox Preferences to reduce processing time.
doTraining = false;

if doTraining
    % Pick a random subset of the flower images.
    trainingSet = splitEachLabel(flowerImageSet, 0.6, 'randomized');

    % Specify the number of levels and branching factor of the vocabulary
    % tree used within bagOfFeatures. Empirical analysis is required to
    % choose optimal values.
    numLevels = 1;
    numBranches = 5000;

    % Create a custom bag of features using the 'CustomExtractor' option.
    colorBag = bagOfFeatures(trainingSet, ...
        'CustomExtractor', @exampleBagOfFeaturesColorExtractor, ...
        'TreeProperties', [numLevels numBranches]);
else
    % Load a pretrained bagOfFeatures.
    load('savedColorBagOfFeatures.mat','colorBag');
end
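Once the vocabulary exists, any image can be encoded as a sparse visual word histogram with the encode method. The sketch below is optional and only meant to show the step that indexImages performs internally for every image in the set.

% Sketch: encode one image into a visual word histogram using the
% learned vocabulary. The histogram length equals the vocabulary size.
img = readimage(flowerImageSet, 1);
wordHistogram = encode(colorBag, img);
size(wordHistogram)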
Step 3 - Index the Images
Now that the bagOfFeatures is created, the entire flower image set can be indexed for search. The indexing procedure extracts features from each image using the custom extractor function from Step 1. The extracted features are encoded into a visual word histogram and added into the image index.
if doTraining
    % Create a search index.
    flowerImageIndex = indexImages(flowerImageSet,colorBag,'SaveFeatureLocations',false);
else
    % Load a saved index
    load('savedColorBagOfFeatures.mat','flowerImageIndex');
end
Because the indexing step processes thousands of images, the rest of this example uses a saved index to save time. You may recreate the index locally by setting doTraining to true.
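If the collection grows later, addImages can extend an existing index without rebuilding it. In the sketch below, a small subset of the current datastore merely stands in for newly acquired photos (in practice, pass an imageDatastore of genuinely new images; re-adding existing ones duplicates their index entries).

% Sketch: extend an existing index with additional images. Here a small
% subset of the current datastore stands in for newly acquired photos.
newImds = subset(flowerImageSet, 1:5); % placeholder for new images
addImages(flowerImageIndex, newImds);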
Step 4 - Search for Similar Images
The final step is to use the retrieveImages function to search for similar images.
% Define a query image
queryImage = readimage(flowerImageSet,200);
figure
imshow(queryImage)
The displayed image is by RetinaFunk.
% Search for the top 5 images with similar color content
[imageIDs, scores] = retrieveImages(queryImage, flowerImageIndex,'NumResults',5);
retrieveImages returns the image IDs and the scores of each result. The scores are sorted from best to worst.
scores
scores = 5×1
0.4776
0.2138
0.1386
0.1382
0.1317
The imageIDs correspond to the images within the image set that are similar to the query image.
% Display results using montage.
figure
montage(flowerImageSet.Files(imageIDs),'ThumbnailSize',[200 200])
The displayed images are by RetinaFunk, Jenny Downing, Mayeesherr, daBinsi, and Steve Snodgrass.
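To go beyond visual inspection, retrieval quality can be quantified with evaluateImageRetrieval. The sketch below assumes that every image sharing the query image's folder label is relevant, which is one plausible, but not the only, definition of ground truth for this dataset.

% Sketch: measure average precision for the query, treating all images
% with the same label as the query as the relevant ground-truth set.
queryLabel = flowerImageSet.Labels(200);
expectedIDs = find(flowerImageSet.Labels == queryLabel);
averagePrecision = evaluateImageRetrieval(queryImage, flowerImageIndex, expectedIDs);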
Conclusion
This example showed you how to customize bagOfFeatures and how to use indexImages and retrieveImages to create an image retrieval system based on color features. The techniques shown here can be extended to other feature types by further customizing the features used within bagOfFeatures.
References
[1] Sivic, J., Zisserman, A.: Video Google: A text retrieval approach to object matching in videos. In: ICCV. (2003) 1470-1477
[2] Philbin, J., Chum, O., Isard, M., Sivic, J., Zisserman, A.: Object retrieval with large vocabularies and fast spatial matching. In: CVPR. (2007)
[3] TensorFlow: How to Retrain an Image Classifier for New Categories.