Using histogram to find the frequency of combinations within the data set
Show older comments
I am new to Matlab and just started getting into generating histograms. I understand how to create a histogram that determines the frequency of values..
But how can I create a histogram that is looking for specific combinations of data?
For example I have the following data set:
someData = [1,3,4,4,2,1,1,0]
How would I be able to create a histogram that determines the frequency of one value in the data set is exactly equal to the next value within the same dataset??
In the above example the answer should be 2. (4 repeat and are adjacent to each other. 1 repeats and is adjacent to each other)
Help is MOST appreciated!
1 Comment
Clifford Shelton
on 1 Aug 2012
Edited: Clifford Shelton
on 1 Aug 2012
Answers (1)
Image Analyst
on 1 Aug 2012
0 votes
You can do this with the gray level co-occurrence matrix (GLCM). It's done by the function graycomatrix() in the Image Processing Toolbox. It's like a 2D histogram that counts how many times each gray level (numerical value) occurs next to every other gray level (value).
12 Comments
Clifford Shelton
on 1 Aug 2012
Edited: Clifford Shelton
on 1 Aug 2012
Image Analyst
on 1 Aug 2012
Edited: Image Analyst
on 1 Aug 2012
You can definitely do that but with different functions, and you might have to check for each particular pairing one at a time. You can get the number of connected groups of a particular number with bwlabel like this:
[labeledArray numberOfRegions] = bwlabel(yourArray == theNumber);
numberOfRegions would be the number of times that 5's are connected to each other, and labeledArray says where in the array they occur. If you then want to find their area, like how many of groups of three 5's are there, how many groups of six 5's are there, etc. then you need to get the area of the regions like this:
measurements = regionprops(labeledArray, 'Area');
allAreas = [measurements.Area];
Or you can do it all in one line (without using bwlabel):
measurements = regionprops(yourArray == theNumber, 'Area');
allAreas = [measurements.Area];
That will give you the area (number of elements) of each connected group of 5's. This is called "connected components labeling" and it very common and straightforward. If you wanted to do something like how many combinations of [2 3 1 3] exist, then you'd first have to use a filter to find those locations, and give a logical matrix of [true true true true] when that pattern occurs and false where it doesn't occur, then you'd call regionprops on the logical matrix.
Clifford Shelton
on 1 Aug 2012
Edited: Clifford Shelton
on 1 Aug 2012
Image Analyst
on 1 Aug 2012
Wow, you keep making it more and more specialized and esoteric. Each of those would require different operations because each request is so different, but they can all be done. See my BlobsDemo example for labelling in my File Exchange.
For your example [1,4,5,0,1,2,2,5,7,10] would have two non-zero regions: [1 4 5] and [1 2 2 5 7 10]. If you wanted to find the 2's you'd do bwlabel(a==2) and you'd get 1 region with area 2 because there are two pixels with the value 2. Trying it with all the other values (which you can get with unique()) will get you one region for 7 and 10 with area of 1 each, and two regions for 1, 4, and 5 - each of those regions would have an area of 1 because those values are by themselves, not connected to any other elements of those values. Just type these into the command line and see what pops out:
bwlabel(yourMatrix == 1)
bwlabel(yourMatrix == 2)
bwlabel(yourMatrix == 4)
bwlabel(yourMatrix == 5)
bwlabel(yourMatrix == 7)
bwlabel(yourMatrix == 10)
Again, if you use unique() you could get all the unique values and you could put that in a loop with only one call to bwlabel inside the loop.
For your (2) I suggest you use diff() first, then regionprops(), or paly around with graycomatrix().
For (3) you'd need to run that through conv() or imfilter() first to get the average in a sliding window, then use regionprops().
Clifford Shelton
on 1 Aug 2012
Clifford Shelton
on 1 Aug 2012
Image Analyst
on 2 Aug 2012
7 lines of code should do it. Try something like this (untested)
labeledImage = myData == 1;
blobMeasurements= regionprops(labeledImage, 'Area');
% Now I'll demonstrate how to select certain blobs based using the ismember function.
% Let's say that we wanted to find only those blobs
% with an area of 2.
allBlobAreas = [blobMeasurements.Area];
% Get a list of the blobs that meet our criteria and we need to keep.
allowableAreaIndexes = allBlobAreas == 2; % Take only area=2 objects.
keeperIndexes = find(allowableAreaIndexes);
% Extract only those blobs that meet our criteria, and
% eliminate those blobs that don't meet our criteria.
% Note how we use ismember() to do this.
keeperBlobsImage = ismember(labeledImage, keeperIndexes);
% Now we have an array with only the area=2 blobs.
% Count them with the bwlabel function
[labeled2 numberOfRegions] = bwlabel(keeperBlobsImage );
% numberOfRegions will be the number of times
% that there are 2 1's in a row.
Clifford Shelton
on 3 Aug 2012
Image Analyst
on 3 Aug 2012
If you give a longer dataset with multiple regions, rather than just one, and say which number of adjacent pairs of numbers you want then I might be able to figure out what's going wrong. You might want to look at BlobsDemo in my File Exchange: http://www.mathworks.com/matlabcentral/fileexchange/?term=authorid%3A31862. That's the basis for everything I've done in this thread. It has a 2D example with coins on a uniform background - essentially the same as your situation except your situation is 1D. Though, and I'm not sure many people know this, all the image processing functions work on 1D arrays. A 1D array (single row image) is just as valid an image as one with many rows.
Clifford Shelton
on 3 Aug 2012
Image Analyst
on 3 Aug 2012
For an average that moves over by 1 every time, use conv().
dataSet = [1,4,6,2,2,7,8,4,4]
slidingAverage = conv(dataSet, [1 1 1]/3, 'valid')
If you want to move in "jumps" of 3, then you'll have to use blockproc(). Here's a 2D demo for blockproc that you can probably easily adapt to 1D signals if you need to:
function blockproc_demo()
try
clc; % Clear the command window.
close all; % Close all figures (except those of imtool.)
workspace; % Make sure the workspace panel is showing.
fontSize = 20;
% Change the current folder to the folder of this m-file.
if(~isdeployed)
cd(fileparts(which(mfilename)));
end
% Read in standard MATLAB demo image.
grayImage = imread('cameraman.tif');
[rows columns numberOfColorChannels] = size(grayImage);
% Display the original image.
subplot(2, 2, 1);
imshow(grayImage, []);
caption = sprintf('Original Image\n%d by %d pixels', ...
rows, columns);
title(caption, 'FontSize', fontSize);
% Enlarge figure to full screen.
set(gcf, 'Position', get(0,'Screensize'));
set(gcf, 'name','Image Analysis Demo', 'numbertitle','off')
% Block process the image.
windowSize = 3;
% Each 3x3 block will get replaced by one value.
% Output image will be smaller by a factor of windowSize.
myFilterHandle = @myFilter;
blockyImage = blockproc(grayImage,[windowSize windowSize], myFilterHandle);
[rowsP columnsP numberOfColorChannelsP] = size(blockyImage);
% Display the processed image.
% It is smaller, but the display routine imshow() replicates
% the image so that it looks bigger than it really is.
subplot(2, 2, 2);
imshow(blockyImage, []);
caption = sprintf('Image Processed in %d by %d Blocks\n%d by %d pixels\nCustom Box Filter', ...
windowSize, windowSize, rowsP, columnsP);
title(caption, 'FontSize', fontSize);
% Now let's do it an alternate way where we use an anonymous function.
% We'll take the standard deviation in the blocks.
windowSize = 8;
myFilterHandle2 = @(block_struct) ...
std2(block_struct.data) * ones(size(block_struct.data));
blockyImageSD = blockproc(grayImage, [windowSize windowSize], myFilterHandle2);
[rowsSD columnsSD numberOfColorChannelsSD] = size(blockyImageSD);
subplot(2, 2, 4);
imshow(blockyImageSD, []);
caption = sprintf('Image Processed in %d by %d Blocks\n%d by %d pixels\nAnonymous Standard Deviation Filter', ...
windowSize, windowSize, rowsSD, columnsSD);
title(caption, 'FontSize', fontSize);
% Note: the image size of blockyImageSD is 256x256, NOT smaller.
% That's because we're returning an 8x8 array instead of a scalar.
uiwait(msgbox('Done with demo'));
catch ME
errorMessage = sprintf('Error in blockproc_demo():\n\nError Message:\n%s', ME.message);
uiwait(warndlg(errorMessage));
end
return;
% Takes one 3x3 block of image data and multiplies it
% element by element by the kernel and
% returns a single value.
function singleValue = myFilter(blockStruct)
try
% Assign default value.
% Will be used near sides of image (due to boundary effects),
% or in the case of errors, etc.
singleValue = 0;
% Create a 2D filter.
kernel = [0 0.2 0; 0.2 0.2 0.2; 0 0.2 0];
% kernel = ones(blockStruct.blockSize); % Box filter.
% Make sure filter size matches image block size.
if any(blockStruct.blockSize ~= size(kernel))
% If any of the dimensions don't match.
% You'll get here near the edges,
% if the image is not a multiple of the block size.
% warndlg('block size does not match kernel size');
return;
end
% Size matches if we get here, so we're okay.
% Extract our block out of the structure.
array3x3 = blockStruct.data;
% Do the filtering. Multiply by kernel and sum.
singleValue = sum(sum(double(array3x3) .* kernel));
catch ME
% Some kind of problem...
errorMessage = sprintf('Error in myFilter():\n\nError Message:\n%s', ME.message);
% uiwait(warndlg(errorMessage));
fprintf(1, '%s\n', errorMessage);
end
return;
Clifford Shelton
on 3 Aug 2012
Categories
Find more on Neighborhood and Block Processing in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!