Find unique groups in table with members both before and after date

I have a table with strings in cells and a datetime. I need to find the unique groups of entries that occur before and after a particular date- June 1, 2022 in this case.
Here is my example table:
T=table({'Joe','Joe','Joe','Joe','Steve','Steve','Mary','Mary','Susan','Susan'}',...
{'2022-02-03','2021-12-23','2022-08-01','2022-07-15','2021-02-08',...
'2022-04-03','2022-09-03','2022-02-03','2022-08-13','2022-04-23'}',...
{'Car','Van','Car','Van','Car','Car','Truck','Truck','Car','Van'}',...
{'B213','G344','A342','G6767','N1356','A34','M235','A4567','G2345','A23'}',...
'VariableNames',{'Name' 'Date' 'Type' 'Data'}); % my data comes as cells from a database
T.Date=datetime(T.Date,'InputFormat','yyyy-MM-dd'); % Converting to datetime
T = convertvars(T,@iscell,'categorical'); % Convert to categorical - preferred to cells?
T
T = 10×4 table
Name Date Type Data _____ ___________ _____ _____ Joe 03-Feb-2022 Car B213 Joe 23-Dec-2021 Van G344 Joe 01-Aug-2022 Car A342 Joe 15-Jul-2022 Van G6767 Steve 08-Feb-2021 Car N1356 Steve 03-Apr-2022 Car A34 Mary 03-Sep-2022 Truck M235 Mary 03-Feb-2022 Truck A4567 Susan 13-Aug-2022 Car G2345 Susan 23-Apr-2022 Van A23
Now I want to find the groups of matching Name and Type where at least one occurs after 01-Jun-2022 and at least on occurred before that date. I also need to carry the Data field with me. So the solution would be:
T_ans=T;
T_ans([5:6 9:10],:)=[];
sortrows(T_ans,[1 3])
ans = 6×4 table
Name Date Type Data ____ ___________ _____ _____ Joe 03-Feb-2022 Car B213 Joe 01-Aug-2022 Car A342 Joe 23-Dec-2021 Van G344 Joe 15-Jul-2022 Van G6767 Mary 03-Sep-2022 Truck M235 Mary 03-Feb-2022 Truck A4567
With Steve elimanted becsue his matching Type dates were all before 01-Jun-2022, and Susan eliminated because she did not have 2 or more matching Types.
I'm not sure how to proceed- I am thinking of looping through all the unique 'Name' and 'Type' groups and seeing if there is one before and one after 01-Jun-2022, then listing all memebers if true.
I can;t figure out what to do after this though:
[a,b,c]=unique([T.Name,T.Type],'rows');
a
a = 6×2 categorical array
Joe Car Joe Van Mary Truck Steve Car Susan Car Susan Van
Thanks for any help.

 Accepted Answer

T=table({'Joe','Joe','Joe','Joe','Steve','Steve','Mary','Mary','Susan','Susan'}',...
{'2022-02-03','2021-12-23','2022-08-01','2022-07-15','2021-02-08',...
'2022-04-03','2022-09-03','2022-02-03','2022-08-13','2022-04-23'}',...
{'Car','Van','Car','Van','Car','Car','Truck','Truck','Car','Van'}',...
{'B213','G344','A342','G6767','N1356','A34','M235','A4567','G2345','A23'}',...
'VariableNames',{'Name' 'Date' 'Type' 'Data'}); % my data comes as cells from a database
T.Date=datetime(T.Date,'InputFormat','yyyy-MM-dd'); % Converting to datetime
T = convertvars(T,@iscell,'categorical'); % Convert to categorical - preferred to cells?
T
T = 10×4 table
Name Date Type Data _____ ___________ _____ _____ Joe 03-Feb-2022 Car B213 Joe 23-Dec-2021 Van G344 Joe 01-Aug-2022 Car A342 Joe 15-Jul-2022 Van G6767 Steve 08-Feb-2021 Car N1356 Steve 03-Apr-2022 Car A34 Mary 03-Sep-2022 Truck M235 Mary 03-Feb-2022 Truck A4567 Susan 13-Aug-2022 Car G2345 Susan 23-Apr-2022 Van A23
uName = unique(T.Name);
uType = unique(T.Type);
idx = [];
for i=1:length(uName)
for j=1:length(uType)
T1 = T(T.Name == uName(i) & T.Type == uType(j), :);
if (sum(T1.Date > datetime("01-Jun-2022")) >= 1) && (sum(T1.Date < datetime("01-Jun-2022")) >=1)
idx =[idx; [i j]];
end
end
end
idx
idx = 3×2
1 1 1 3 2 2
Tout = [];
for i=1:size(idx, 1)
Tout = [Tout;
T((T.Name==uName(idx(i,1))) & (T.Type==uType(idx(i,2))), :)];
end
Tout
Tout = 6×4 table
Name Date Type Data ____ ___________ _____ _____ Joe 03-Feb-2022 Car B213 Joe 01-Aug-2022 Car A342 Joe 23-Dec-2021 Van G344 Joe 15-Jul-2022 Van G6767 Mary 03-Sep-2022 Truck M235 Mary 03-Feb-2022 Truck A4567

3 Comments

Thanks, this is close, but a couple isues: this only includes Names that have two or more dates on either side of 01-Jun-2022, but it should be one or more. That's an easy fix.
if (sum(T1.Date > datetime("01-Jun-2022")) >= 1) && (sum(T1.Date < datetime("01-Jun-2022")) >=1)
The triciker one is to include both T.Name and T.Type as the unique groups.
uName = unique([T.Name,T.Type],'rows');
This gives an index of where there are matches to uName, but I'm not sure how to only choose the ones where both match.
V{:,[1 3]} == uName(i,:)
The answer I am looking for should include Joe and Mary types as indicated above.
Thanks!
In case anyone ever reads this with a similar issue, I did come up with a method using ismember, but this solution is much better.

Sign in to comment.

More Answers (1)

The groupfilter function was introduced in R2019b to simplify these types of workflows. Just specify which variables define the groups, which variable to send through a filter, and specify a filter function. Using groupfilter, you can do this in one function call.
myfilter = @(t)(sum(t > datetime("01-Jun-2022")) >= 1) && (sum(t < datetime("01-Jun-2022")) >=1);
T2 = groupfilter(T,["Name" "Type"],myfilter,"Date")

Categories

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!