How to store data when a string condition of another column is met?

1 view (last 30 days)
I have imported an excel file, which has three columns: EVENT_ID, BEGIN_DATE, STATE_ABBR.
EVENT_ID BEGIN_DATE STATE_ABBR
1 01-Apr-2000 CT
2 02-Jun-2000 CT
3 13-Aug-2000 CT
4 29-Oct-2000 CT
5 31-Apr-2002 CT
6 06-Jun-2009 CT
7 08-Jan-2011 DE
8 09-Dec-2016 DE
9 11-Feb-2019 MA
10 20-Aug-2020 MD
11 19-Nov-2020 MD
12 22-Apr-2021 MD
. . .
. . .
. . .
I want to store the number of times (COUNT) a year (YEAR) has repeated in BEGIN_DATE corresponding to STATE_ABBR. In this case, the output should look like:
YEAR COUNT STATE_ABBR
2000 4 CT
2002 1 CT
2009 1 CT
2011 1 DE
2016 1 DE
2019 1 MA
2020 2 MD
2021 1 MD
. . .
. . .
. . .
Any idea on how to do this?

Accepted Answer

Stephen23
Stephen23 on 15 Oct 2021
T = readtable('SED_FULL_NE.xlsx')
T = 1300×34 table
EVENT_ID CZ_NAME_STR BEGIN_LOCATION BEGIN_DATE EVENT_TYPE MAGNITUDE TOR_F_SCALE DEATHS_DIRECT INJURIES_DIRECT DAMAGE_PROPERTY_NUM DAMAGE_CROPS_NUM STATE_ABBR CZ_TIMEZONE MAGNITUDE_TYPE EPISODE_ID CZ_TYPE CZ_FIPS WFO INJURIES_INDIRECT DEATHS_INDIRECT SOURCE FLOOD_CAUSE TOR_LENGTH TOR_WIDTH BEGIN_RANGE BEGIN_AZIMUTH END_RANGE END_AZIMUTH END_LOCATION END_DATE END_TIME BEGIN_LAT BEGIN_LON END_LAT __________ ______________________________ ______________ ___________ ___________ __________ ___________ _____________ _______________ ___________________ ________________ __________ ___________ ______________ __________ _______ _______ _______ _________________ _______________ ________________________ ___________ __________ __________ ___________ _____________ __________ ___________ ____________ ___________ ________ __________ __________ __________ 5.2897e+06 {'NORTHERN FAIRFIELD (ZONE)' } {0×0 char} 01-Apr-2002 {'Drought'} {0×0 char} {0×0 char} 0 0 0 0 {'CT'} {'EST'} {0×0 char} 1.1327e+06 {'Z'} 5 {'OKX'} 0 0 {'OTHER FEDERAL AGENCY'} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} 30-Apr-2002 2359 {0×0 char} {0×0 char} {0×0 char} 5.2897e+06 {'SOUTHERN NEW LONDON (ZONE)'} {0×0 char} 01-Apr-2002 {'Drought'} {0×0 char} {0×0 char} 0 0 0 0 {'CT'} {'EST'} {0×0 char} 1.1327e+06 {'Z'} 12 {'OKX'} 0 0 {'OTHER FEDERAL AGENCY'} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} 30-Apr-2002 2359 {0×0 char} {0×0 char} {0×0 char} 5.2897e+06 {'SOUTHERN MIDDLESEX (ZONE)' } {0×0 char} 01-Apr-2002 {'Drought'} {0×0 char} {0×0 char} 0 0 0 0 {'CT'} {'EST'} {0×0 char} 1.1327e+06 {'Z'} 11 {'OKX'} 0 0 {'OTHER FEDERAL AGENCY'} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} 30-Apr-2002 2359 {0×0 char} {0×0 char} {0×0 char} 5.2897e+06 {'SOUTHERN NEW HAVEN (ZONE)' } {0×0 char} 01-Apr-2002 {'Drought'} {0×0 char} {0×0 char} 0 0 0 0 {'CT'} {'EST'} {0×0 char} 1.1327e+06 {'Z'} 10 {'OKX'} 0 0 {'OTHER FEDERAL AGENCY'} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} 30-Apr-2002 2359 {0×0 char} {0×0 char} {0×0 char} 5.2897e+06 {'SOUTHERN FAIRFIELD (ZONE)' } {0×0 char} 01-Apr-2002 {'Drought'} {0×0 char} {0×0 char} 0 0 0 0 {'CT'} {'EST'} {0×0 char} 1.1327e+06 {'Z'} 9 {'OKX'} 0 0 {'OTHER FEDERAL AGENCY'} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} 30-Apr-2002 2359 {0×0 char} {0×0 char} {0×0 char} 5.2897e+06 {'NORTHERN NEW LONDON (ZONE)'} {0×0 char} 01-Apr-2002 {'Drought'} {0×0 char} {0×0 char} 0 0 0 0 {'CT'} {'EST'} {0×0 char} 1.1327e+06 {'Z'} 8 {'OKX'} 0 0 {'OTHER FEDERAL AGENCY'} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} 30-Apr-2002 2359 {0×0 char} {0×0 char} {0×0 char} 5.2897e+06 {'NORTHERN MIDDLESEX (ZONE)' } {0×0 char} 01-Apr-2002 {'Drought'} {0×0 char} {0×0 char} 0 0 0 0 {'CT'} {'EST'} {0×0 char} 1.1327e+06 {'Z'} 7 {'OKX'} 0 0 {'OTHER FEDERAL AGENCY'} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} 30-Apr-2002 2359 {0×0 char} {0×0 char} {0×0 char} 5.2897e+06 {'NORTHERN NEW HAVEN (ZONE)' } {0×0 char} 01-Apr-2002 {'Drought'} {0×0 char} {0×0 char} 0 0 0 0 {'CT'} {'EST'} {0×0 char} 1.1327e+06 {'Z'} 6 {'OKX'} 0 0 {'OTHER FEDERAL AGENCY'} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} 30-Apr-2002 2359 {0×0 char} {0×0 char} {0×0 char} 5.2968e+06 {'SOUTHERN FAIRFIELD (ZONE)' } {0×0 char} 01-May-2002 {'Drought'} {0×0 char} {0×0 char} 0 0 0 0 {'CT'} {'EST'} {0×0 char} 1.1349e+06 {'Z'} 9 {'OKX'} 0 0 {'OTHER FEDERAL AGENCY'} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} 31-May-2002 2359 {0×0 char} {0×0 char} {0×0 char} 5.2968e+06 {'SOUTHERN MIDDLESEX (ZONE)' } {0×0 char} 01-May-2002 {'Drought'} {0×0 char} {0×0 char} 0 0 0 0 {'CT'} {'EST'} {0×0 char} 1.1349e+06 {'Z'} 11 {'OKX'} 0 0 {'OTHER FEDERAL AGENCY'} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} 31-May-2002 2359 {0×0 char} {0×0 char} {0×0 char} 5.2968e+06 {'NORTHERN NEW LONDON (ZONE)'} {0×0 char} 01-May-2002 {'Drought'} {0×0 char} {0×0 char} 0 0 0 0 {'CT'} {'EST'} {0×0 char} 1.1349e+06 {'Z'} 8 {'OKX'} 0 0 {'OTHER FEDERAL AGENCY'} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} 31-May-2002 2359 {0×0 char} {0×0 char} {0×0 char} 5.2968e+06 {'SOUTHERN NEW HAVEN (ZONE)' } {0×0 char} 01-May-2002 {'Drought'} {0×0 char} {0×0 char} 0 0 0 0 {'CT'} {'EST'} {0×0 char} 1.1349e+06 {'Z'} 10 {'OKX'} 0 0 {'OTHER FEDERAL AGENCY'} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} 31-May-2002 2359 {0×0 char} {0×0 char} {0×0 char} 5.2968e+06 {'NORTHERN MIDDLESEX (ZONE)' } {0×0 char} 01-May-2002 {'Drought'} {0×0 char} {0×0 char} 0 0 0 0 {'CT'} {'EST'} {0×0 char} 1.1349e+06 {'Z'} 7 {'OKX'} 0 0 {'OTHER FEDERAL AGENCY'} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} 31-May-2002 2359 {0×0 char} {0×0 char} {0×0 char} 5.2968e+06 {'NORTHERN NEW HAVEN (ZONE)' } {0×0 char} 01-May-2002 {'Drought'} {0×0 char} {0×0 char} 0 0 0 0 {'CT'} {'EST'} {0×0 char} 1.1349e+06 {'Z'} 6 {'OKX'} 0 0 {'OTHER FEDERAL AGENCY'} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} 31-May-2002 2359 {0×0 char} {0×0 char} {0×0 char} 5.2968e+06 {'SOUTHERN NEW LONDON (ZONE)'} {0×0 char} 01-May-2002 {'Drought'} {0×0 char} {0×0 char} 0 0 0 0 {'CT'} {'EST'} {0×0 char} 1.1349e+06 {'Z'} 12 {'OKX'} 0 0 {'OTHER FEDERAL AGENCY'} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} 31-May-2002 2359 {0×0 char} {0×0 char} {0×0 char} 5.2968e+06 {'NORTHERN FAIRFIELD (ZONE)' } {0×0 char} 01-May-2002 {'Drought'} {0×0 char} {0×0 char} 0 0 0 0 {'CT'} {'EST'} {0×0 char} 1.1349e+06 {'Z'} 5 {'OKX'} 0 0 {'OTHER FEDERAL AGENCY'} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} 31-May-2002 2359 {0×0 char} {0×0 char} {0×0 char}
[G,YEAR,STATE_ABBR] = findgroups(year(T.BEGIN_DATE),T.STATE_ABBR);
COUNT = accumarray(G,ones(numel(G),1));
out = table(YEAR,COUNT,STATE_ABBR) % you can SORTROWS this if required
out = 39×3 table
YEAR COUNT STATE_ABBR ____ _____ __________ 2000 4 {'DE'} 2000 5 {'MD'} 2000 20 {'NJ'} 2000 11 {'PA'} 2001 20 {'DE'} 2001 26 {'MD'} 2001 85 {'ME'} 2001 60 {'NH'} 2001 94 {'NJ'} 2001 29 {'NY'} 2001 41 {'PA'} 2002 24 {'CT'} 2002 44 {'DE'} 2002 55 {'MD'} 2002 61 {'NY'} 2005 4 {'DE'}
  4 Comments
Stephen23
Stephen23 on 15 Oct 2021
Edited: Stephen23 on 15 Oct 2021
COUNT = accumarray(G,ones(numel(G),1));
counts the ones corresponding to each group in G:
[G,YEAR,STATE_ABBR] = findgroups(T.BEGIN_DATE.Year,T.STATE_ABBR);
returns a group number for each unique pair of data-points in the two input arguments.

Sign in to comment.

More Answers (1)

KSSV
KSSV on 15 Oct 2021
Edited: KSSV on 15 Oct 2021
T = readtable('https://in.mathworks.com/matlabcentral/answers/uploaded_files/768171/SED_FULL_NE.xlsx') ;
id = [1 4 12] ;
T = T(:,id) ;
thedates = T.(2) ;
[theyear,ia,ib]=unique(year(thedates)) ;
count=accumarray(ib,1) ;
state_abbr = T.(3)(ia) ;
T = table(theyear,count,state_abbr)
T = 12×3 table
theyear count state_abbr _______ _____ __________ 2000 40 {'DE'} 2001 355 {'DE'} 2002 184 {'CT'} 2005 9 {'DE'} 2007 76 {'DE'} 2008 96 {'DE'} 2010 59 {'MD'} 2012 96 {'CT'} 2015 39 {'NJ'} 2016 156 {'CT'} 2017 53 {'CT'} 2020 137 {'CT'}
  10 Comments
Rabeca Mohammed
Rabeca Mohammed on 15 Oct 2021
Could you please explain the following codes?
[year,ia,ib]=unique(thedates.Year) ;
count=accumarray(ib,1);
state_abbr = T.(3)(ia) ;
What are ia, ib? Why is ib in the accumarray function? Why is 1 in it too? What is T.(3)(ia)?
Rabeca Mohammed
Rabeca Mohammed on 15 Oct 2021
My bad, the ouput is wrong. In the Excel file, for the year 2000 (BEGIN_DATE = 2000), DE (STATE_ABBR = DE) has only 4 entries. So, count should be 4, not 40. COUNTs of other STATE_ABBRs are also inconsistent to the excel file.

Sign in to comment.

Categories

Find more on Data Import from MATLAB in Help Center and File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!