Calculate number appearance within a range

I need help calculating the number of times a number appears within a range.

 Accepted Answer

The histcounts and histogram functions are perfect for this:
LD = load(websave('TT','https://www.mathworks.com/matlabcentral/answers/uploaded_files/1268555/TT.mat'));
TT = LD.TT
TT = 31×1
         0
    1.0000
   14.9000
    8.7000
    1.0000
         0
         0
         0
   40.2000
   10.3000
Edges = [1 2 10 20 50 100];
[BinCounts,Edges,Bin] = histcounts(TT, Edges);
BinCounts
BinCounts = 1×5
2 4 5 4 0
figure
histogram(TT, Edges)
xticks(Edges)
grid
xlim([min(Edges) max(Edges)])
xlabel('Bin Limits')
ylabel('Counts')
The histogram plot gives a better depiction of the bin limits than a bar plot of the histcounts results would.
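As a small illustration of how histcounts treats values that sit on the bin limits, here is a sketch using made-up data (the vector x is hypothetical, not the TT values from the file). Each bin includes its left edge and excludes its right edge, except the last bin, which includes both. Values outside the edges get bin index 0.

```matlab
% Hypothetical sample data, chosen so that several values sit exactly on an edge
x = [0 0.5 1 1.5 2 9.99 10 100];
Edges = [1 2 10 20 50 100];

% BinCounts: number of values per bin; Bin: bin index of each value (0 = outside all bins)
[BinCounts, ~, Bin] = histcounts(x, Edges);

disp(BinCounts)   % 2 2 1 0 1
disp(Bin)         % 0 0 1 1 2 2 3 5
```

Note that 2 and 10 land in the bin that starts at them, while 100 is kept in the last bin because that bin is closed on both sides.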

11 Comments

Hello @Star Strider, here is something a bit strange. For example, the TT array contains 1.000 four times: TT(2), TT(5), TT(18), TT(28), but the comparison doesn't recognize them. That is why I defined and used the variable tol.
I think the reason is a rounding error.
It is straightforward to see how the histcounts (and histogram) functions allocate the various ‘TT’ values:
LD = load(websave('TT','https://www.mathworks.com/matlabcentral/answers/uploaded_files/1268555/TT.mat'));
TT = LD.TT
TT = 31×1
         0
    1.0000
   14.9000
    8.7000
    1.0000
         0
         0
         0
   40.2000
   10.3000
Edges = [1 2 10 20 50 100];
[BinCounts,Edges,Bin] = histcounts(TT, Edges);
BinAssignments = table(TT, Bin)
BinAssignments = 31×2 table
     TT     Bin
    ____    ___
       0     0
       1     0
    14.9     3
     8.7     2
       1     0
       0     0
       0     0
       0     0
    40.2     4
    10.3     3
       0     0
       0     0
     0.1     0
     3.7     2
     1.3     1
       0     0
In the absence of specific preferences to allocate them differently, that is how I would allocate them as well.
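The third output of histcounts can also be used to pull out the values that were, or were not, assigned to a bin. This is only a sketch with hypothetical data (x is made up for the example); the variable names Unassigned and InBin2 are not from the thread.

```matlab
% Hypothetical data resembling the kind of values in the thread
x = [0 0.1 1.3 3.7 8.7 14.9 40.2];
Edges = [1 2 10 20 50 100];

[BinCounts, ~, Bin] = histcounts(x, Edges);

Unassigned  = x(Bin == 0);     % values outside [1, 100]: here 0 and 0.1
InBin2      = x(Bin == 2);     % values in the second bin [2, 10): here 3.7 and 8.7
nUnassigned = nnz(Bin == 0);   % how many values no bin claimed

% Every value is either counted in a bin or left unassigned
assert(sum(BinCounts) + nUnassigned == numel(x))
```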
"For example TT array contains 1.000 four times"
It contains something that is displayed as 1.000 four times. Does that mean those values are exactly one? No. Let's look at three values in the default display format.
x = 1 - 1e-8
x = 1.0000
y = 1 + 1e-8
y = 1.0000
z = 1
z = 1
Neither x nor y is exactly, down to the last bit, equal to z. They are also not equal to each other.
x == z % false
ans = logical
0
y == z % also false
ans = logical
0
x == y % false as well
ans = logical
0
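If the intent is to treat such nearly-equal values as equal, one option is to compare against a tolerance instead of using ==. The sketch below assumes a tolerance of 1e-6, which is an arbitrary choice for illustration and should be picked to suit the data. ismembertol scales its tolerance by the largest absolute value in its inputs, which here is about 1.

```matlab
x = 1 - 1e-8;
z = 1;

% Print more digits than the default display format shows
fprintf('%.17g\n', x)

tol = 1e-6;                              % assumed tolerance, adjust for your data
isExactlyEqual = (x == z);               % false: the values differ in the last bits
isCloseToOne   = abs(x - z) < tol;       % true: equal within the tolerance
isMemberTol    = ismembertol(x, z, tol); % true: same idea using the built-in function
```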
As always, my pleasure!
One approach would be to define the upper limit of ‘Edges’ as Inf (or any other suitably large number, such as realmax); however, Inf is essentially limitless:
LD = load(websave('TT','https://www.mathworks.com/matlabcentral/answers/uploaded_files/1268555/TT.mat'));
TT = LD.TT
TT = 31×1
         0
    1.0000
   14.9000
    8.7000
    1.0000
         0
         0
         0
   40.2000
   10.3000
Edges = [1 2 10 20 50 100 Inf];
[BinCounts,Edges,Bin] = histcounts(TT, Edges);
BinCounts
BinCounts = 1×6
2 4 5 4 0 0
BinAssignments = table(TT, Bin)
BinAssignments = 31×2 table
     TT     Bin
    ____    ___
       0     0
       1     0
    14.9     3
     8.7     2
       1     0
       0     0
       0     0
       0     0
    40.2     4
    10.3     3
       0     0
       0     0
     0.1     0
     3.7     2
     1.3     1
       0     0
figure
histogram(TT, Edges)
xticks(Edges)
grid
xlim([min(Edges) max(Edges)])
xlabel('Bin Limits')
Ax = gca;
Ax.XAxis.TickLabelRotation = 45;
Ax.XAxis.FontSize = 7;
ylabel('Counts')
Setting the upper limit to Inf makes the tick labels difficult to see if they are close together.
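One possible way around the crowded tick labels is to plot the counts against evenly spaced categorical labels instead of the numeric edges. This is a sketch with hypothetical data (x is made up, and the Labels variable is not part of the original code); it gives up the proportional bin widths that histogram shows.

```matlab
% Hypothetical data, including one value above 100
x = [1.3 3.7 8.7 14.9 40.2 250];
Edges = [1 2 10 20 50 100 Inf];
BinCounts = histcounts(x, Edges);

% Build one text label per bin; bins are closed on the left and open on the right
Labels = arrayfun(@(a,b) sprintf('[%g, %g)', a, b), ...
    Edges(1:end-1), Edges(2:end), 'UniformOutput', false);
% The last bin includes both of its edges
Labels{end} = sprintf('[%g, %g]', Edges(end-1), Edges(end));

figure
% Passing Labels twice keeps the categories in bin order rather than alphabetical order
bar(categorical(Labels, Labels), BinCounts)
xlabel('Bin Limits')
ylabel('Counts')
```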
Should it not also work with the code in my Comment to your earlier post?
It works when I run it, so I do not understand the reason it does not work when you run it. Obviously something changed; however, I have no idea what that change is, so I cannot change my code to accommodate it.
It should be straightforward to simply substitute the ‘Edges’ vector here for the one in this assignment earlier:
[N,~,Bin] = histcounts(tMonth{k}.Rain, Edges, 'Normalization','probability');
It should simply give a different result, although the plots might not be as easy to interpret, and the ‘Cntrs’ vector would have to be calculated differently since it currently assumes that the bin limits are all the same.
I would still like to understand what the problem is with my earlier code and your current inability to run it successfully. If I know what that problem is, I should be able to solve it.
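On the point about the ‘Cntrs’ vector: for unequal bin widths, the midpoint of each bin can be taken as the left edge plus half the bin width. The sketch below shows that this gives Inf for a bin whose upper edge is Inf, so a plotting position such as the bin number may be a more practical choice there. The variable Pos is hypothetical and not part of the original code.

```matlab
Edges = [1 2 10 20 50 100 Inf];

% Midpoint of each bin: left edge plus half of that bin's width
Cntrs = Edges(1:end-1) + diff(Edges)/2;
disp(Cntrs)   % 1.5 6 15 35 75 Inf

% The last center is Inf because the last bin is unbounded
hasUnboundedBin = any(~isfinite(Cntrs));

% Alternative x positions: simply number the bins 1, 2, ...
Pos = 1:numel(Edges)-1;
```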
Using my previous Comment code with the current ‘Edges’ vector:
LD = load(websave('t','https://www.mathworks.com/matlabcentral/answers/uploaded_files/1267645/t.mat'));
t = LD.t
t = 513077×1 timetable
        date_time         Rain
    __________________    ____
    01-Jan-19 00-00-00     0
    01-Jan-19 00-03-00     0
    01-Jan-19 00-04-00     0
    01-Jan-19 00-05-00     0
    01-Jan-19 00-06-00     0
    01-Jan-19 00-07-00     0
    01-Jan-19 00-08-00     0
    01-Jan-19 00-09-00     0
    01-Jan-19 00-10-00     0
    01-Jan-19 00-11-00     0
    01-Jan-19 00-12-00     0
    01-Jan-19 00-13-00     0
    01-Jan-19 00-14-00     0
    01-Jan-19 00-15-00     0
    01-Jan-19 00-16-00     0
    01-Jan-19 00-17-00     0
Rainmax = max(t.Rain)
Rainmax = 4.3000
Rainmin = min(t.Rain(t.Rain>0))
Rainmin = 0.1000
%Monthly timetables from t
for a = 1:12
MMidx = month(t.date_time) == a;
tMonth{a,:} = t(MMidx,:);
end
tMonth
tMonth = 12×1 cell array
    {42923×1 timetable}
    {40027×1 timetable}
    {44449×1 timetable}
    {43114×1 timetable}
    {44542×1 timetable}
    {41875×1 timetable}
    {43988×1 timetable}
    {44233×1 timetable}
    {42688×1 timetable}
    {44173×1 timetable}
    {40673×1 timetable}
    {40392×1 timetable}
% tMonth{1}
Edges = [1 2 10 20 50 100 Inf]; % Change As Necessary To Produce The Desired Results
Cntrs = Edges(1:end-1)+diff(Edges)/2;
for k = 1:numel(tMonth)
[BinCounts,~,Bin] = histcounts(tMonth{k}.Rain, Edges, 'Normalization','probability');
Nv{k,:} = BinCounts;
Binv{:,k} = Bin;
MMM{k} = month(tMonth{k}.date_time(1,:),'shortname');
end
Nv
Nv = 12×1 cell array
    {[2.5627e-04          0 0 0 0 0]}
    {[         0          0 0 0 0 0]}
    {[         0          0 0 0 0 0]}
    {[         0          0 0 0 0 0]}
    {[2.2451e-05          0 0 0 0 0]}
    {[         0          0 0 0 0 0]}
    {[         0          0 0 0 0 0]}
    {[         0          0 0 0 0 0]}
    {[4.6852e-05          0 0 0 0 0]}
    {[         0          0 0 0 0 0]}
    {[9.8345e-04 9.8345e-05 0 0 0 0]}
    {[         0          0 0 0 0 0]}
% Binmin = min(cellfun(@(x)min(x(x>0)),Nv, 'Unif',0))
% Binmax = max(cellfun(@max,Nv, 'Unif',0))
figure
tiledlayout(6,2, 'TileSpacing','compact')
for k = 1:numel(tMonth)
nexttile
bar(Cntrs, Nv{k}, 'FaceColor','flat', 'EdgeColor','flat')
grid
ylabel(MMM{k})
ylim([1E-5 1]) % Optional, Change To Produce The Desired Result
set(gca,'YScale','log') % Optional (Shows Detail), Change To Produce The Desired Result
end
xlabel('Bin Centers')
The lower limit of the ‘Edges’ vector is now set to 1, and several months have only values less than 1, so all the bins for those months are empty. (I commented-out ‘Binmin’ and ‘Binmax’ because they only calculate minima greater than 0, and since several months have essentially empty bins, this would require returning empty values for those months. These variables are simply for my benefit in understanding the data anyway, and are not necessary for, or used in, the code otherwise.)
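To see why the probabilities above are so small, it may help to note that with 'Normalization','probability' each count is divided by the total number of input values, including those that fall outside every bin. The sketch below uses made-up data (x is hypothetical) in which most values lie below the first edge, as in the rain data.

```matlab
% Hypothetical data: 8 values, 4 of them below the first edge
x = [0 0 0 0.5 1.2 1.8 3 12];
Edges = [1 2 10 20];

C = histcounts(x, Edges);                                  % raw counts: 2 1 1
P = histcounts(x, Edges, 'Normalization','probability');   % counts divided by numel(x)

disp(P)        % 0.25 0.125 0.125
disp(sum(P))   % 0.5, less than 1 because half of the values are outside the bins
```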
Yes!
Try this:
LD = load(websave('t','https://www.mathworks.com/matlabcentral/answers/uploaded_files/1267645/t.mat'));
t = LD.t
t = 513077×1 timetable
        date_time         Rain
    __________________    ____
    01-Jan-19 00-00-00     0
    01-Jan-19 00-03-00     0
    01-Jan-19 00-04-00     0
    01-Jan-19 00-05-00     0
    01-Jan-19 00-06-00     0
    01-Jan-19 00-07-00     0
    01-Jan-19 00-08-00     0
    01-Jan-19 00-09-00     0
    01-Jan-19 00-10-00     0
    01-Jan-19 00-11-00     0
    01-Jan-19 00-12-00     0
    01-Jan-19 00-13-00     0
    01-Jan-19 00-14-00     0
    01-Jan-19 00-15-00     0
    01-Jan-19 00-16-00     0
    01-Jan-19 00-17-00     0
Rainmax = max(t.Rain)
Rainmax = 4.3000
Rainmin = min(t.Rain(t.Rain>0))
Rainmin = 0.1000
%Monthly timetables from t
for a = 1:12
MMidx = month(t.date_time) == a;
tMonth{a,:} = t(MMidx,:);
end
tMonth
tMonth = 12×1 cell array
    {42923×1 timetable}
    {40027×1 timetable}
    {44449×1 timetable}
    {43114×1 timetable}
    {44542×1 timetable}
    {41875×1 timetable}
    {43988×1 timetable}
    {44233×1 timetable}
    {42688×1 timetable}
    {44173×1 timetable}
    {40673×1 timetable}
    {40392×1 timetable}
% tMonth{1}
Edges = [1 2 10 20 50 100 Inf]; % Change As Necessary To Produce The Desired Results
Cntrs = Edges(1:end-1)+diff(Edges)/2;
for k = 1:numel(tMonth)
[BinCounts,~,Bin] = histcounts(tMonth{k}.Rain, Edges, 'Normalization','probability');
Nv{k,:} = BinCounts;
Binv{:,k} = Bin;
MMM{k,:} = month(tMonth{k}.date_time(1,:),'shortname');
end
% MMM
% TQ = cell2table(MMM)
Nv
Nv = 12×1 cell array
    {[2.5627e-04          0 0 0 0 0]}
    {[         0          0 0 0 0 0]}
    {[         0          0 0 0 0 0]}
    {[         0          0 0 0 0 0]}
    {[2.2451e-05          0 0 0 0 0]}
    {[         0          0 0 0 0 0]}
    {[         0          0 0 0 0 0]}
    {[         0          0 0 0 0 0]}
    {[4.6852e-05          0 0 0 0 0]}
    {[         0          0 0 0 0 0]}
    {[9.8345e-04 9.8345e-05 0 0 0 0]}
    {[         0          0 0 0 0 0]}
% Binmin = min(cellfun(@(x)min(x(x>0)),Nv, 'Unif',0))
% Binmax = max(cellfun(@max,Nv, 'Unif',0))
figure
tiledlayout(6,2, 'TileSpacing','compact')
for k = 1:numel(tMonth)
nexttile
bar(Cntrs, Nv{k}, 'FaceColor','flat', 'EdgeColor','flat')
grid
ylabel(MMM{k})
ylim([1E-5 1]) % Optional, Change To Produce The Desired Result
set(gca,'YScale','log') % Optional (Shows Detail), Change To Produce The Desired Result
end
xlabel('Bin Centers')
% Labels follow the histcounts convention: left edge included, right edge excluded, except in the last bin
Frequencies = array2table(cell2mat(Nv), 'RowNames',[MMM{:}], 'VariableNames',{'[1, 2)','[2, 10)','[10, 20)','[20, 50)','[50, 100)','[100, Inf]'})
Frequencies = 12×6 table
             [1, 2)       [2, 10)      [10, 20)    [20, 50)    [50, 100)    [100, Inf]
           __________    __________    ________    ________    _________    __________
    Jan    0.00025627             0       0           0            0            0
    Feb             0             0       0           0            0            0
    Mar             0             0       0           0            0            0
    Apr             0             0       0           0            0            0
    May    2.2451e-05             0       0           0            0            0
    Jun             0             0       0           0            0            0
    Jul             0             0       0           0            0            0
    Aug             0             0       0           0            0            0
    Sep    4.6852e-05             0       0           0            0            0
    Oct             0             0       0           0            0            0
    Nov    0.00098345    9.8345e-05       0           0            0            0
    Dec             0             0       0           0            0            0
Since I remember that you recently upgraded to R2022b, this should work.
The number of variable names has to be the same as the number of variables, so I added the last class.
I added the months as 'RowNames' although you can also add them as a separate variable (or not include them at all) if you prefer, since I converted them to a column vector in this version of my code. If you want to use full month names instead, replace 'shortname' with 'name' in the ‘MMM’ assignment in the first loop.
You should be able to write the ‘Frequencies’ table to the Excel file. (I did not do that experiment.)
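For the Excel step, a minimal sketch using writetable might look like the following. The small table, the file name Frequencies.xlsx and the use of tempdir are assumptions for illustration only. The 'WriteRowNames' option is what carries the month names into the file, and bracketed variable names like these need a release that allows arbitrary table variable names (R2019b or later).

```matlab
% Hypothetical two-month table standing in for the real 'Frequencies' table
Frequencies = array2table([0.25 0; 0 0.5], ...
    'RowNames', {'Jan','Feb'}, ...
    'VariableNames', {'[1, 2)','[2, 10)'});

% Assumed output location; replace with the desired path
outFile = fullfile(tempdir, 'Frequencies.xlsx');

% Row names are not written unless WriteRowNames is set to true
writetable(Frequencies, outFile, 'WriteRowNames', true);
```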
It works in R2022b. I am not certain what that error specifically refers to. Did you change my code in any way?
The ‘MMM’ vector is a cell array in my code, and the ‘[MMM{:}]’ concatenation has worked everywhere else I’ve used it, for the past several years. Troubleshoot it by omitting the 'RowNames' name-value pair and see if the error persists.
This is likely a remediable problem; however, I first have to know what the problem is.
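One thing that may be worth checking while troubleshooting (offered as a guess, not a diagnosis of the reported error) is what each element of ‘MMM’ contains. The ‘[MMM{:}]’ concatenation yields a cell array of names only if every element is itself a cell; if the elements are plain character vectors, the same expression joins them into a single character vector, which would not be usable as 'RowNames'. The variables A and B below are hypothetical.

```matlab
% Each element is a 1x1 cell holding a name
A = {{'Jan'}, {'Feb'}};
namesA = [A{:}];    % 1x2 cell array: {'Jan','Feb'}

% Each element is a plain character vector
B = {'Jan', 'Feb'};
namesB = [B{:}];    % single character vector: 'JanFeb'

disp(class(namesA))   % cell
disp(class(namesB))   % char
```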
As always, my pleasure!
@Star Strider, can you please explain how you upload the file and use the load command in the code, just like in the example:
load(websave('TT','https://www.mathworks.com/matlabcentral/answers/uploaded_files/1268555/TT.mat'));
@Askic V, that is exactly how I do it, although I always load into an output variable so that I have some control over what loads, and so I can change the name of the variable used in the subsequent script if necessary. (Since I rarely use websave, credit for discovering this goes to @Karim, who was appropriately rewarded for discovering it.)
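The pattern of loading into an output variable can be tried without a network connection. In this sketch a temporary MAT-file stands in for the downloaded one (the file and its contents are made up). websave returns the path of the file it saved, which is why its result can be passed straight to load.

```matlab
% Create a stand-in MAT-file containing a variable named TT (hypothetical data)
tmpFile = [tempname '.mat'];
TT = [0; 1; 14.9];
save(tmpFile, 'TT');
clear TT

% Loading into an output variable returns a struct instead of creating workspace variables
LD = load(tmpFile);
disp(fieldnames(LD))   % shows which variables the file contains

% Pick the variable explicitly, under whatever name suits the script
Data = LD.TT;
```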

More Answers (1)

Perhaps this code snippet will help you:
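load TT   % assumes TT.mat is on the MATLAB path and contains the vector TT
classes = [1 2 10 20 50 100];   % class (bin) limits
tol = 1e-10;                    % tolerance for values that miss a limit only by rounding error
res = zeros(size(diff(classes)));
for i = 1:numel(classes)-1
    % Shift both limits down by tol so that a value on a shared limit is counted once, in the upper class
    res(i) = sum(TT >= classes(i)-tol & TT < classes(i+1)-tol);
end
res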
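A possible alternative to the explicit tolerance, if the data are known to carry only a limited number of meaningful decimals, is to round the values before passing them to histcounts. This is a sketch under that assumption; the data x and the choice of 10 decimal places are hypothetical.

```matlab
% Hypothetical data: the first value is just below 1 because of rounding error
x = [1 - 1e-12, 1, 2, 5];
classes = [1 2 10];

% Without rounding, the first value falls outside the first class
resRaw = histcounts(x, classes);

% Rounding to 10 decimals snaps it back to exactly 1 before counting
resRounded = histcounts(round(x, 10), classes);

disp(resRaw)       % 1 2
disp(resRounded)   % 2 2
```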

Asked: 19 Jan 2023

Edited: 6 Jan 2025
