How to Extract Data from Contour Image?

Hi,
I am new to extracting data from images in MATLAB. I have used other tools to extract data from 2D plots, but never from contour plots. Could someone help me extract the contour map from the attached image?
The image represents a Turbocharger Turbine Map.
I want to extract the data from the image (in color) and recreate it. I have already extracted the lines in the image; I just need to extract the contours. Any help is highly appreciated.
Thanks & regards
Raghu

5 Comments

What are the contour lines and what lines have you already extracted?
I see a green line, a blue line, black lines, gray lines, and "lines" that separate each color band from its neighboring color bands. Please clarify exactly where the "contour lines" are. And specify whether this RGB image is what you're stuck starting with, or whether you have the underlying data and plotted this figure yourself.
I really like how whoever created this PDF was so helpful to put images and unrelated line patterns behind the graphs. How dare substance ever stand in the way of style!
@Image Analyst, To start with
  1. I am stuck with only this image and have no relevant data with respect to the image
  2. Using webplotdigitizer (an external website), I was successful in extracting the data from the blue, green and black lines. Those lines are of no concern to me. If there is any way the lines can be removed from the actual image, that would be great.
  3. That leaves the image with only the colorful curves, which are called the contours. If you notice, the color starts somewhere left of center in dark red, passes through different colors, and ends with dark blue. The value of each curve is defined in the color palette on the left side of the image. I can manually print the image, add points to each color curve, and draw horizontal & vertical lines for each point to find the x & y coordinates, but that is a tedious process. I would like to solve this using MATLAB or, for that matter, any tool.
I deal with such images in my domain on a day-to-day basis. It would be more helpful if I could build code to extract the data from such colorful curves.
I hope I have made my query clear. Please let me know if you have any solution for processing such an image.
I have used pixlr to remove the background noise from the image. Please refer to the new image.
Thanks & Regards
Raghu
Are the curves given in @DGM's answer below what you want?
@Image Analyst, Yes it does. Thank you for taking time to solve the issue and help me out.
Raghu


 Accepted Answer

As far as I'm concerned, you'd do it the same way you'd extract the lines. Manually.
It depends whether you want the isolines or whether you're trying to reconstruct the z-data. Since it's a contour map, the isolines might be useful, but you are getting grossly quantized zdata and you can easily be misled as to which z-values are actually represented by each color.
Why can't we just use the colors to find the regions? Because it's a 4:2:0 JPG. This is all that's left of your hue data.
inpict = imread('https://www.mathworks.com/matlabcentral/answers/uploaded_files/1657171/TurbineMap.JPG');
[H S V] = rgb2hsv(inpict);
imshow(H,'border','tight')
The segmentation results you get are going to be inaccurate and require so much oversight and cleanup that I find it extremely hard to believe that it would be worth the hassle. You have 22 isolines. You can draw those faster than it would take to get a clean segmentation of the destroyed image.
There might be some things you could do to at least avoid the H discontinuity, but that's just an extra layer of complication. Note also that the bin edges are not uniformly spaced.
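A minimal sketch of those two points, on made-up hue values rather than the image itself: reds sit on both sides of the H wraparound at 0/1, and a circular shift with mod() moves the discontinuity somewhere harmless. The 0.05 offset and the sample values are assumptions for illustration only. The bin edges are the ones used in the code further down.

```matlab
% made-up hue samples from a "red" region; they straddle the 0/1 wraparound
H = [0.98 0.99 0.00 0.01];
spanbefore = max(H) - min(H);    % looks like a huge spread

% circularly shift hue so the reds no longer straddle the discontinuity
% (0.05 is an arbitrary offset chosen for this example)
Hs = mod(H + 0.05,1);
spanafter = max(Hs) - min(Hs);   % now a small spread

% the colorbar bin edges are not uniformly spaced
binedges = [0.4:0.02:0.66 0.67:0.01:0.70 0.72:0.01:0.75];
numbins = numel(binedges) - 1;          % number of color bands
binwidths = round(diff(binedges),2);    % a mix of 0.01 and 0.02 steps
```

Any thresholding done directly on H would have to account for both the shift and the uneven widths.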
I might expand on this, but I have deliveries to do atm.
EDIT: Here's a thing. I still don't think it's productive, but it gets a not-quite useful set of masks for each color region. There would still need to be a bunch of cleanup work as well as the calibration required to map the pixel coordinates to data coordinates.
inpict = imread('https://www.mathworks.com/matlabcentral/answers/uploaded_files/1657171/TurbineMap.JPG');
% try to stretch H and V
% in a vain attempt to increase the distinctness of the map
% and to avoid splitting the histogram in a severely low-contrast region
[H S V] = rgb2hsv(inpict);
H = mod(H+0.015,1);
H = mod(H*1.35,1);
V = max(((V-0.37)*1.55),0).^1.5;
inpict = hsv2rgb(cat(3,H,S,V));
% defines the quantization
% bin edges are _not_ uniformly spaced
numbins = 21;
binedges = [0.4:0.02:0.66 0.67:0.01:0.70 0.72:0.01:0.75];
% for sampling the colorbar
sampsize = [37 17]; % [x y]
xcenters = 1402;
ycenters = round(linspace(190,1081,numbins));
% sample the colorbar to get a crude approximation of the colormap
CT0 = zeros(numbins,3);
sampr = floor(sampsize/2);
xrange = -sampr(1):sampr(1);
yrange = -sampr(2):sampr(2);
for k = 1:numbins
    thissample = im2double(inpict(yrange+ycenters(k),xrange+xcenters,:));
    CT0(k,:) = mean(thissample,1:2);
end
% get a mask of the entire contour region
% i'm also including the colorbar for sake of visualizing
[H,S,~] = rgb2hsv(inpict);
roimk = S > 0.5;
roimk = imclose(roimk,ones(5));
roimk = bwareafilt(roimk,2); % the two largest blobs
roimk = imfill(roimk,'holes');
% try to do some inpainting to get rid of grid lines, etc
% due to the JPG compression, this won't be a complete mask
% COMMENTED BECAUSE THIS IS TOO SLOW FOR THE FORUM
%gridmask = S<0.9; % pick some threshold
%gridmask = imdilate(gridmask,ones(5));
% regionfill() only operates on grayscale images, so this needs a loop.
%for k = 1:size(inpict,3)
% inpict(:,:,k) = regionfill(inpict(:,:,k),gridmask);
%end
% try to generate masks for each colored region
% i'm just going to use rgb2ind() instead of messing in HSV
indpict = rgb2ind(inpict,CT0,'nodither');
maskstack = false([size(roimk) 1 numbins]);
for k = 1:numbins
    thismask = indpict == k-1; % rgb2ind() indices are zero-based here
    thismask = bwareaopen(thismask,100);
    maskstack(:,:,:,k) = thismask;
end
maskstack = maskstack & roimk;
% visualize
montage(maskstack,'border',5,'background','m')
That's just a set of crude masks. Cleaning them up programmatically would be a new layer of difficulty. Ignoring how you'd do that without again resorting to manually doing the work, let's pretend it happened.
You can find the contour lines from the masks:
% once you've cleaned up the masks as best as you can,
% you can try to find the regions where the masks are adjacent
edgestack = false([size(roimk) 1 numbins-1]);
for k = 1:numbins-1
    lomask = maskstack(:,:,:,k);
    himask = maskstack(:,:,:,k+1);
    thismask = imdilate(lomask,ones(3)) & imdilate(himask,ones(3));
    edgestack(:,:,:,k) = thismask;
end
% visualize
imshow(any(edgestack,4),'border','tight')
Of course that's basically unreadable on the forum, but there are a bunch of jagged lines there. Note that these are only the interior contour levels, not the ones corresponding to the map extrema.
Are those jagged lines even adequate? I wouldn't think so.
Will this code work on other images? Not without being adapted to the geometry and flaws in each one.
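If the edge masks were ever clean enough to be worth using, the remaining calibration step could be sketched roughly like this. It is only a sketch on a synthetic mask: the plot-box rectangle pbrect is a hypothetical value that would have to be measured from the actual image, and the data ranges are the ones read from the axis labels.

```matlab
% hypothetical plot-box geometry in pixels: [left top width height]
% (these numbers are made up; measure them on the real image)
pbrect = [100 50 1200 1000];

% data range taken from the axis labels
xrange = [1 4];
yrange = [0.35 4];

% synthetic stand-in for one frame of edgestack:
% a horizontal line halfway up the plot box
edgemask = false(1100,1400);
edgemask(550,100:1300) = true;

% pixel coordinates of the edge pixels
[row,col] = find(edgemask);

% linear rescale to data coordinates
% image rows increase downward, so the y axis is flipped
x = xrange(1) + diff(xrange)*(col - pbrect(1))/pbrect(3);
y = yrange(1) + diff(yrange)*(pbrect(4) - (row - pbrect(2)))/pbrect(4);
```

The points returned by find() are unordered, so they would still need to be sorted or traced into curves before they're much use.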

3 Comments

Thank you for the suggestion. I think I am beginning to understand the method you suggest for extracting the XYZ data from the figure.
I am not much of a coder, and I only vaguely understand the code you have shared. I tried to work through it step by step, but I am lost in some sections. If you could take some time to help me understand, that would be great.
  1. Under the section for sampling the colorbar, what are 37 and 17? How did you choose an xcenters of 1402 and ycenters of 190 and 1081?
  2. Under the section "sample the colorbar to get a crude approximation of the colormap", why was CT0 given 3 columns, and why was sampsize divided by 2?
  3. In the next section, I understand that you tried to clean the image by removing unnecessary lines or dark spots to trace the exact border of the image. But why and how did you choose S > 0.5?
  4. What is maskstack doing in the code, and what is its impact? I tried imshow(maskstack), but it gave an error.
Thank you again for sharing the code and the idea.
Regards
Raghu
Alternatively, this took me 45 minutes to do, and it's already correctly scaled.
% using the following FEX tools:
% https://www.mathworks.com/matlabcentral/fileexchange/72225-load-svg-into-your-matlab-code
% filename of manually-fit svg file
fname = 'turbinemap.svg.fakeextension.txt';
% data range from original image axis labels
% this is where the rectangle is drawn in the SVG
xrange = [1 4];
yrange = [0.35 4];
% spline discretization parameter [0 1]
% when set to 1, spline is treated as a polyline
coarseness = 1;
% get plot box geometry
str = fileread(fname);
str = regexp(str,'((?<=<rect)(.*?)(?=\/>))','match');
pbx = regexp(str,'((?<=x=")(.*?)(?="))','match');
pby = regexp(str,'((?<=y=")(.*?)(?="))','match');
pbw = regexp(str,'((?<=width=")(.*?)(?="))','match');
pbh = regexp(str,'((?<=height=")(.*?)(?="))','match');
pbrect = [str2double(pbx{1}{1}) str2double(pby{1}{1}) ...
str2double(pbw{1}{1}) str2double(pbh{1}{1})];
% get coordinates representing the curve
S = loadsvg(fname,coarseness,false);
% if there are multiple paths you want to extract
% you'll need to do the rescaling, etc for each element of S
for k = 1:numel(S) % there are multiple curves
    x = S{k}(:,1);
    y = S{k}(:,2);
    % rescale to fit data range
    % (SVG y increases downward, so the y axis is flipped)
    x = xrange(1) + diff(xrange)*(x-pbrect(1))/pbrect(3);
    y = yrange(1) + diff(yrange)*(pbrect(4) - (y-pbrect(2)))/pbrect(4);
    % shove the prepared data back into S for later
    S{k} = [x y];
end
% plot
for k = 1:numel(S)
    x = S{k}(:,1);
    y = S{k}(:,2);
    plot(x,y); hold on
end
grid on;
xlim(xrange)
ylim(yrange)
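To get from the rescaled curves to XYZ triplets, each curve also needs a z-level. The SVG doesn't carry that information, so the pairing below is an assumption: the levels vector is hypothetical and would have to be read off the colorbar and put in the same order as the curves in S. A minimal sketch with stand-in data:

```matlab
% stand-in for the rescaled curves: a cell array of [x y] point lists
S = {[1 1; 2 1.5; 3 2], [1 2; 2 2.5]};

% hypothetical z-level for each curve, in the same order as S
% (not stored in the SVG; these must be read off the colorbar)
levels = [0.40 0.42];

% append a constant z column to each curve
xyz = cell(numel(S),1);
for k = 1:numel(S)
    npts = size(S{k},1);
    xyz{k} = [S{k} repmat(levels(k),npts,1)];
end

% stack everything into one array, one [x y z] row per point
xyz = vertcat(xyz{:});
```

The resulting array can be written out with something like writematrix(xyz,'contours.csv'), where the filename is just a placeholder.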
The center positions were chosen by using the datatip tool and just manually picking two approximate locations.
The [37 17] is a rectangular window size [width height] that will be used to sample the interior region of each patch in the colorbar. I just picked a value that was large enough to get a fair average while staying away from the artifacts around the edges. Similarly, the particular widths were approximated by using the datatip tool.
The colortable CT0 is three columns, since it's an RGB color table. One row for each sample, one column for each color channel. The samplesize is divided by two simply because I chose to define the window in terms of its width (which I think is more intuitive), while it's easier to address the locations based on their radius from the sample center (their half-width). Note that xrange,yrange are integer vectors centered on zero, and they're incrementally offset by xcenters,ycenters. That's just the way I chose to do the addressing.
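The addressing can be tried out on a small synthetic image. This is only an illustration: the image, the patch location, and the center point are made up, and only the window size matches the code above.

```matlab
% synthetic RGB image with one solid-colored patch (like a colorbar cell)
img = zeros(100,120,3);
img(40:60,50:90,1) = 1;     % red channel
img(40:60,50:90,2) = 0.5;   % green channel

% window size [x y] and a made-up center point [x y] inside the patch
sampsize = [37 17];
center = [70 50];

% half-widths, and offset vectors centered on zero
sampr = floor(sampsize/2);
xoffsets = -sampr(1):sampr(1);
yoffsets = -sampr(2):sampr(2);

% rows are addressed by y, columns by x
win = img(yoffsets+center(2),xoffsets+center(1),:);

% average over rows and columns to get one RGB triplet
avgcolor = reshape(mean(win,[1 2]),1,3);
```

Because the window stays inside the patch, avgcolor comes out as the patch color; a window that overlapped the patch edges would pull in the artifacts there.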
Again, the particular value of threshold for S was chosen by viewing S with imshow() and just inspecting transition regions with the datatip tool. It's actually not very sensitive. If you look at the histogram of S, you'll see that you could split the histogram just about anywhere in the middle.
inpict = imread('https://www.mathworks.com/matlabcentral/answers/uploaded_files/1657171/TurbineMap.JPG');
[H S V] = rgb2hsv(inpict);
imhist(S)
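That insensitivity can be illustrated on synthetic data (not the actual image): when S is strongly bimodal, any threshold between the two modes gives the same mask. The value ranges here are assumptions chosen to mimic a low-saturation background next to saturated color bands.

```matlab
rng(0) % make the example repeatable

% synthetic saturation map: left half nearly gray, right half saturated
lowS = 0.05 + 0.05*rand(50,50);   % values in [0.05 0.10]
highS = 0.90 + 0.10*rand(50,50);  % values in [0.90 1.00]
S = [lowS highS];

% thresholds anywhere between the two modes
mk3 = S > 0.3;
mk5 = S > 0.5;
mk7 = S > 0.7;

% all three masks select exactly the same pixels
samemask = isequal(mk3,mk5,mk7);
```

On the real image the modes aren't that cleanly separated, so there'd be small differences near the transitions.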
The variables maskstack and edgestack are 4D logical images. Many tools don't know what to do with 4D images, but montage() does. In order to view the frames with imshow(), you'd address them one at a time.
imshow(maskstack(:,:,:,4)) % view the fourth frame
It's just a bunch of regular 2D logical images concatenated on dim4 for convenience. Note here how it lets us use implicit array expansion to work on all frames at once
maskstack = maskstack & roimk;
... and how it lets us visualize the union of the masks.
imshow(any(edgestack,4),'border','tight')
Some people prefer to use cell arrays instead of monolithic numeric/logical arrays. There are reasons to use one over the other. As to why I'd use something that's clumsy to view, I normally don't use imshow() or montage(). I use MIMT imshow2(), which does support browsing 4D images. I'm pretty sure there are other third party viewers as well, but montage() would work in a pinch.
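A minimal sketch of those operations on a tiny made-up stack, including a conversion to the cell array form for anyone who prefers it. The array sizes and contents are arbitrary.

```matlab
% tiny stack of three 4x4 logical frames, concatenated on dim 4
stack = false(4,4,1,3);
stack(1:2,:,1,1) = true;
stack(3,:,1,2) = true;
stack(4,:,1,3) = true;

% a single 2D ROI mask
roi = false(4,4);
roi(:,2:3) = true;

% implicit expansion applies the 2D mask to every frame at once
masked = stack & roi;         % still 4x4x1x3

% union of all frames, collapsed along dim 4
unionmask = any(masked,4);    % 4x4

% the same frames as a cell array of 2D masks
frames = reshape(num2cell(masked,[1 2]),[],1);
```

Each element of frames can be passed straight to imshow(), at the cost of needing a loop or cellfun() for anything that operates on all frames.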
As to what maskstack is doing, well ... not much of anything yet. We're trying to find the contour lines between each solid-colored region. Rather than trying to directly find edge features in the image, which would be very heavily influenced by the noise, the best use of the color information is to find the colored regions -- not the boundaries. That's my thought at least. If we can find decent masks representing the colored regions, then the region where masks are adjacent should be the contour line. As you can see from the frames of maskstack and edgestack, there are still plenty of problems with the process.


More Answers (1)

@DGM, Thank you. Your support is highly appreciated. I am glad to see a solution, and it means future plots such as these can be processed with a single piece of code. Thanks again.
Raghu

Release

R2021b
