Find max within due date and find max within due date by class section
    8 views (last 30 days)
  
       Show older comments
    
I am able to find the max of the date_submitted. How do I find the max of the date_submitted but within the Lab_due_date column? Also, how do I  find the max of the date_submitted within the Lab_due_date by class_section?
0 Comments
Accepted Answer
  Ayush Modi
      
 on 20 May 2024
        
      Edited: Ayush Modi
      
 on 20 May 2024
  
      Hi,
To find the latest date_submitted that is before the Lab_due_date for each class_section, you first need to format the data in the two column to be comparable. You can acheive this using datatime function.
data.date_submitted = datetime(data.date_submitted, 'InputFormat', 'MM/dd/yyyy HH:mm');
data.Lab_due_date = datetime(data.Lab_due_date, 'InputFormat', 'MM/dd/yyyy, hh:mm a zzz', 'TimeZone', 'America/Chicago', 'Locale', 'en_US');
data.Lab_due_date.TimeZone = '';% Group by user_id
Then, group the data by the desired category using findgroups function.
grouped = findgroups(data.user_id);
To get the date_submitted that is before the Lab_due_date:
uniqueGroups = unique(data.class_section); % Assuming class_section is a cell array of strings
maxDatesByGroup = NaT(size(uniqueGroups)); % Preallocate with Not-a-Time (NaT)
for i = 1:length(uniqueGroups)
    % Filter data for the current group using strcmp for cell array comparison
    currentGroupFilter = strcmp(data.class_section, uniqueGroups{i});
    currentGroupData = data(currentGroupFilter, :);
    % Ensure Lab_due_date and date_submitted are datetime arrays before comparison
    validSubmissions = currentGroupData.date_submitted <= currentGroupData.Lab_due_date(1);
    % Find the maximum date_submitted that meets the criteria
    if any(validSubmissions)
        maxDatesByGroup(i) = max(currentGroupData.date_submitted(validSubmissions));
    end
end
Please refer to the following MathWorks documentation for more information on:
3 Comments
More Answers (1)
  Peter Perkins
    
 on 29 May 2024
        This is much simpler than the accepted answers suggests. And you want to use categorical for categorical data, not text.
%t = readtable("lab3_14_37_lower_participants_copy.csv");
%t.date_submitted.TimeZone = "America/Chicago";
%t.Lab_due_date = datetime(t.Lab_due_date,InputFormat="MM/dd/uuuu, HH:mm a z",TimeZone="America/Chicago")
I'm gonna assume you don't want to work with time zones; your updated file does not contain the CDT offsets.
t = readtable("https://www.mathworks.com/matlabcentral/answers/uploaded_files/1702791/lab3_14_37_lower_participants_copy_update.csv");
t = convertvars(t,["user_id" "email" "class_section"],"categorical");
Your file has exactly one unique due date. Let's add some more data.
n = 50;
t = [t; t; t]
t.date_submitted(51:100) = t.date_submitted(51:100) + calmonths(1);
t.Lab_due_date(51:100) = t.Lab_due_date(51:100) + calmonths(1);
t.date_submitted(101:150) = t.date_submitted(101:150) + calmonths(2);
t.Lab_due_date(101:150) = t.Lab_due_date(101:150) + calmonths(2);
Now that that's done, computing the max submit dates is one line. No loops.
groupsummary(t,"Lab_due_date","day","max","date_submitted")
With two grouping vars, still one line.
groupsummary(t,["Lab_due_date" "class_section"],["day" 'none'],"max","date_submitted")
2 Comments
  Peter Perkins
    
 on 29 May 2024
				doc groupsummary: "You also can specify method as a function handle that returns one output per group whose first dimension has length 1. For table input data, the function operates on each table variable separately."
See Also
Categories
				Find more on Dates and Time in Help Center and File Exchange
			
	Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!

