how to use Discretize syntax

kav
kav on 1 Nov 2019
Edited: John D'Errico on 1 Nov 2019
Hello all,
I have a data set in which one column contains numbers in the range 1-29. I want to group those numbers into classes, like
1-4 = class1
5-9 = class2, and so on up to class7.
How can I do this with the discretize syntax?

Answers (3)

John D'Errico
John D'Errico on 1 Nov 2019
Edited: John D'Errico on 1 Nov 2019
Funny. Actually, both of the other answers have missed an important point. While they properly describe how to use discretize, both Star and Adam got the class boundaries wrong. :) What they did not notice is that the classes described as 1:4, 5:9, etc. are not equal in width (the first holds four values, the second five). So using colon with a fixed step of 4, as they did, will create the wrong class bounds.
Assume you want 1-4 in class 1, 5-9 in class 2, 10-14 in class 3, 15-19 in class 4, etc. You only told us what the first two classes were, so I cannot be sure. But both Adam and Star said to use bin edges of 1:4:29.
1:4:29
ans =
1 5 9 13 17 21 25 29
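The bin edges should be a list of the LOWER bounds for each class, followed by the upper bound of the last class. discretize puts x in bin k when edges(k) <= x < edges(k+1); only the last bin also includes its right edge.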
So 1:4:29 as the bin edges will put 5 and 9 in DIFFERENT classes.
Instead, the correct set of bin edges would have been something like 1 5 10 ...
I'll suggest that the bin edges might properly have been 0:5:30.
g = discretize(data, 0:5:30);
That makes the first bin go down to 0, but who cares? Since you never have a point at 0, that it starts at 0 is irrelevant. Otherwise, if your boundaries are unequal in width all the way, you might do something like this:
g = discretize(data, [1 4 10 17 20 23 30]);
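If you also want the class labels themselves rather than just a bin number, here is a minimal sketch using the 'categorical' option of discretize. The label names and the test vector 1:29 are my assumptions, not something from the question. Note that 0:5:30 yields six classes, not seven, so the edges would need adjusting if seven classes are really required.

```matlab
% Sketch only: edges 0:5:30 and the label names are assumptions.
data  = (1:29)';                  % every possible value, for checking
edges = 0:5:30;                   % bins [0,5), [5,10), ..., [25,30]
names = {'class1','class2','class3','class4','class5','class6'};

g = discretize(data, edges);                        % numeric bin index
c = discretize(data, edges, 'categorical', names);  % labeled bins

% Show a few boundary values to confirm where each one lands
T = table(data, g, c, 'VariableNames', {'Data','Group','Class'});
disp(T([1 4 5 9 10 29], :))
```

Here 4 lands in class1 while 5 and 9 both land in class2, which is what the question asked for.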
  3 Comments
Star Strider
Star Strider on 1 Nov 2019
We do not currently know how Archana Katageri wants to define the bins, since a complete description was not provided.
John D'Errico
John D'Errico on 1 Nov 2019
Edited: John D'Errico on 1 Nov 2019
Yes, but my point was that 1:4:29 does not work. It puts 5 and 9 in different bins.
[4:9;discretize(4:9,1:4:29)]
ans =
4 5 6 7 8 9
1 2 2 2 2 3
And the request was 5:9 should be in bin 2. I think the problem was the bin bounds indicated by the question were non-uniform in width, even for only 2 bins. Odds are that was just a mistake by the OP. But, you never know...



Adam Danz
Adam Danz on 1 Nov 2019
data = randi(29,100,1); % demo data (100 random integers between 1 and 29)
g = discretize(data, 1:4:29) % g is the group number, groups defined by 1:4:29
% Check results
T = table(data, g, 'VariableNames',{'Data','Group'})
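With random data it can be hard to see exactly where the group boundaries fall. One possible check, sketched below, is to run every integer from 1 to 29 through discretize and tabulate the smallest and largest value in each group. The use of accumarray and the variable names lo and hi are illustrative choices, not part of the code above.

```matlab
% Sketch: list the value range that each group actually covers.
data  = (1:29)';                     % deterministic input, one of each value
edges = 1:4:29;                      % same edges as above
g     = discretize(data, edges);     % group number for each value

lo = accumarray(g, data, [], @min);  % smallest value in each group
hi = accumarray(g, data, [], @max);  % largest value in each group

disp(table((1:numel(lo))', lo, hi, 'VariableNames', {'Group','Min','Max'}))
```

With these edges there are seven groups; group 1 covers 1-4, group 2 covers 5-8, and the last group covers 25-29 because the final bin includes its right edge.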

Star Strider
Star Strider on 1 Nov 2019
Edited: Star Strider on 1 Nov 2019
Try this:
Data = [randi(29, 20, 1) rand(20, 5)]; % Create Data, ID Numbers 1-29 In First Column
ClassVct = 1:4:29; % Define Bins
ClassMtx = discretize(Data(:,1), ClassVct); % Discretize Data By Bins
Experiment to get the result you want.
EDIT — (01 Nov 2019 at 13:27)
To separate the identified rows in ‘Data’ into cell arrays for each group identified in ‘ClassVct’:
Groups = accumarray(ClassMtx, (1:numel(ClassMtx))', [], @(x){Data(x,:)});
To test that, check with:
Groups{1}
Groups{end}
or similar.
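One caveat: discretize returns NaN for values that fall outside the bin edges, and accumarray will not accept NaN subscripts. If your real data might contain such values, a sketch like the following (with made-up data that deliberately includes out-of-range values) drops them before grouping.

```matlab
% Sketch: the data here are hypothetical; -2..0 and 30..31 lie outside the edges.
Data     = [(-2:31)' rand(34, 2)];
ClassVct = 1:4:29;                               % same bin edges as above
ClassIdx = discretize(Data(:,1), ClassVct);      % NaN where a value is out of range

valid = ~isnan(ClassIdx);                        % rows that received a bin
rows  = find(valid);                             % their row numbers in Data

% One cell per bin, each holding the matching rows of Data
nBins  = numel(ClassVct) - 1;
Groups = accumarray(ClassIdx(valid), rows, [nBins 1], @(x){Data(x,:)});
```

Passing the output size explicitly keeps one cell per bin even when a bin happens to be empty.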
