How do I validate a string to ensure it contains a valid date?
I know how to convert a string to a date, but how do I check whether the string is a valid date?
Example
MyStr = '23max2023'; % invalid: 'max' is a typo for 'may'
infmt = "ddMMMyyyy";
MyDate = datetime(MyStr,"InputFormat",infmt)
This generates an error if the string is invalid. Is there an elegant way to check whether a date is valid without triggering an error and then inspecting the error type?
Answers (1)
the cyclist
on 11 Mar 2023
Edited: the cyclist
on 11 Mar 2023
Here is one way:
MyStr = '23max2023'; % Invalid string will give an empty array
isMyDateFormat = regexp(MyStr, '\d{2}(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\d{4}')
MyStr = '23may2023'; % Valid string will give a 1
isMyDateFormat = regexp(MyStr, '\d{2}(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\d{4}')
This method specifically checks for the pattern of exactly 2 digits, then a lower-case standard month abbreviation, then exactly 4 digits. One could get even more sophisticated if needed.
Alternatively, you could wrap your code in a try-catch structure, to try the datetime command, then do something else if it fails:
MyStr = '23max2023';
infmt = "ddMMMyyyy";
try
MyDate = datetime(MyStr,"InputFormat",infmt)
catch
fprintf("Invalid date format\n")
end
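If you need this check in several places, the try-catch can be wrapped in a small function that returns a logical instead of printing a message. This is only a minimal sketch: isValidDate is a hypothetical name (not a built-in), and the bare catch treats any error raised by datetime as "invalid", not just a parsing failure.

```matlab
function tf = isValidDate(str, infmt)
% isValidDate  Hypothetical helper (not a built-in function).
%   Returns true when datetime can parse str using the input format
%   infmt, and false when datetime throws any error.
    try
        datetime(str, "InputFormat", infmt);  % result discarded, we only care whether it errors
        tf = true;
    catch
        tf = false;
    end
end
```

Saved as isValidDate.m, it could be called as isValidDate('23max2023', "ddMMMyyyy"), which returns false for the typo in the question, or isValidDate('23May2023', "ddMMMyyyy"), which returns true.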
8 Comments
Matt O'Brien
on 11 Mar 2023
Moved: Stephen23
on 11 Mar 2023
Matt O'Brien
on 11 Mar 2023
Moved: Stephen23
on 11 Mar 2023
Matt O'Brien
on 11 Mar 2023
Moved: Stephen23
on 11 Mar 2023
"I used the lower function to create a string with all lowercase alpha chars."
Simpler and more efficient: use REGEXPI.
"I then checked that the results of the regexp has a length of 0"
Note that this is not robust (nor is the provided answer) to prefixed/suffixed text:
rgx = '\d{2}(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\d{4}';
regexpi('NotValidDate23Jan2023NorThis', rgx)
regexpi('012345678923Jan2023123456789', rgx)
You can ensure the entire text is checked by using ^ and $:
rgx = '^\d{2}(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\d{4}$';
regexpi('23Jan2023', rgx)
regexpi('012345678923Jan2023123456789', rgx)
In practice, defining a boolean as the negation of something is usually a bad approach (it leads to confusion when writing code later), so we define it as the positive statement:
IsValidDate = isscalar(regexpi('23Jan2023', rgx))
Rather than trying to reverse engineer the date validation, I would recommend using the TRY/CATCH approach.
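To illustrate what "reverse engineering" would involve, here is a small sketch using the anchored pattern from this comment. It only shows that the pattern checks the shape of the text and not whether the day exists; the test string is made up for illustration.

```matlab
% Anchored, case-insensitive pattern from earlier in this comment
rgx = '^\d{2}(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\d{4}$';

% Any two digits are accepted as the day, so this text passes the
% pattern check even though no month has a day 99.
shapeOK = isscalar(regexpi('99jan2023', rgx))   % true
```

Covering day ranges, month lengths and leap years in the pattern itself quickly gets complicated, which is part of why letting datetime do the parsing is attractive.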
the cyclist
on 11 Mar 2023
The regexp based approach will work, as long as you don't need to deal with month names in a different locale as shown in the "Date and Time from Text in Foreign Language" example on the datetime documentation page. Looking at the try / catch based approach from @the cyclist's original answer with a bit of modification:
MyStr = '23max2023';
infmt = "ddMMMyyyy";
try
MyDate = datetime(MyStr,"InputFormat",infmt)
catch ME
fprintf("Invalid date format. Error ID:\n%s\nError message:\n%s\n", ...
ME.identifier, ME.message)
end
Compare this with a format that is completely mismatched with the date string:
MyStr = '23max2023';
infmt = "ddyyyy";
try
MyDate = datetime(MyStr,"InputFormat",infmt)
catch ME
fprintf("Invalid date format. Error ID:\n%s\nError message:\n%s\n", ...
ME.identifier, ME.message)
end
The first error identifier suggests that you misspelled the month abbreviation (or you're using an abbreviation in a different locale). The second suggests a larger error. You could check which error was thrown and handle those two cases differently (and if something else happened, rethrow the error and let the calling code handle it or not as appropriate).
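A rough sketch of that "handle what you expect, rethrow the rest" pattern is below. The 'MATLAB:datetime:' prefix is an assumption about how the identifiers printed by the two snippets above are named, so confirm it against the ME.identifier values you see on your release. isValid is just an illustrative variable name.

```matlab
MyStr = '23max2023';
infmt = "ddMMMyyyy";

% ASSUMPTION: datetime parsing errors carry identifiers that start with
% this prefix. Verify against the IDs printed on your MATLAB release.
datetimePrefix = 'MATLAB:datetime:';

try
    MyDate = datetime(MyStr, "InputFormat", infmt);
    isValid = true;
catch ME
    if startsWith(ME.identifier, datetimePrefix)
        % Expected failure: the text could not be parsed as a date
        isValid = false;
        fprintf("Could not parse '%s' (%s)\n", MyStr, ME.identifier);
    else
        % Anything else is unexpected, so let the caller deal with it
        rethrow(ME)
    end
end
```

To treat the two cases differently, the if could be replaced by a switch on ME.identifier using the exact identifiers printed above.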
Matt O'Brien
on 11 Mar 2023
Steven Lord
on 11 Mar 2023
"To me, a recent MatLab coder, it is staggering to think that the following format does not exist...
[ MyDateVal, MyDateStatus] = datetime(............."
If datetime behaved that way instead of throwing an error, every single call to datetime would need to be followed by an if statement checking that MyDateStatus indicated success before proceeding to use MyDateVal. With the error-based approach, as long as the next line runs, the datetime call succeeded.
I wouldn't want to see code that was forced to look like the following, to take the "return result and status" pattern to the extreme:
x = 1:10;
y = 11;
[result, didSucceed] = plus(x, y);
if ~didSucceed % We couldn't add x and y
% handle the error
end
IMO better to just let plus (or datetime) throw the error itself. The function has more information about what exactly went wrong, so it may be able to distinguish between the "you gave me a month name I didn't recognize but that could be a valid month name in a different locale" and "your data doesn't match the format you asked for at all" cases and provide more specifically tailored messages.