How do I validate a string to ensure it contains a valid date?

I know how to convert a string to a date, but how do I validate whether the string is a valid date?
Example:
MyStr = '23max2023'; % invalid: 'max' is a typo for 'may'
infmt = "ddMMMyyyy";
MyDate = datetime(MyStr,"InputFormat",infmt)
This generates an error if the string is invalid. Is there an elegant method to check whether a date is valid without having to generate an error and check the type of error?

Answers (1)

Here is one way:
MyStr = '23max2023'; % Invalid string will give an empty array
isMyDateFormat = regexp(MyStr, '\d{2}(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\d{4}')
isMyDateFormat = []
MyStr = '23may2023'; % Valid string will give a 1
isMyDateFormat = regexp(MyStr, '\d{2}(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\d{4}')
isMyDateFormat = 1
This method specifically checks for the pattern of exactly 2 digits, then a lower-case standard month abbreviation, then exactly 4 digits. One could get even more sophisticated if needed.
Alternatively, you could wrap your code in a try-catch structure, to try the datetime command, then do something else if it fails:
MyStr = '23max2023';
infmt = "ddMMMyyyy";
try
    MyDate = datetime(MyStr,"InputFormat",infmt)
catch
    fprintf("Invalid date format")
end
Invalid date format

8 Comments

What a super quick response... appreciated....
I like the regex solution for a few reasons... especially as I can repurpose this syntax for similar scenarios. I have always avoided regex, as I was not sure how much time it would take me to become comfortable with it.
A quick question... is it easy to cater for any combination of capitals for months, i.e. May, may, MAY?
I am nervous about converting the text to lower case, as I am not sure what will happen to non-alpha / special chars, etc.
I see the simplicity in the try-catch... but for the moment, I wish to avoid setting up try-catch, and I also feel I should not have to trigger an error condition to complete the most basic of tasks.
I will experiment with the regex / lower case scenario... but any suggestions on the most efficient means to cater for combos of upper and lower case letters would be great.
I re-read the description for 'lower' function and can see converting the string to lower will only adjust upper alphas... Will try that.
I used the lower function to create a string with all lowercase alpha chars.
I then checked that the results of the regexp has a length of 0 ... to determine if an error condition arises.
This seems to work. [I can fine tune the code in my app]
isMyDateFormat = regexp(MyStr, '\d{2}(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\d{4}');
Chk = length(isMyDateFormat); % 0 when there is no match
% Comparing with isMyDateFormat == [] does not work: the comparison returns an
% empty array, which "if" treats as false. isempty(isMyDateFormat) is an alternative.
if Chk == 0
    My_ErrCount = My_ErrCount + 1;
    continue % skip to the next loop iteration
end
Testing with isMyDateFormat = [] or isMyDateFormat == [] does not work.
Also, checking for the valid scenario with isMyDateFormat == 1 works, but checking for a value ~= 1 does not work... so I reverted to using the length of the regexp result to check for an error condition.
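The behavior described above can be reproduced in a few lines. This is a minimal sketch, assuming the same pattern as in the answer; the variable names idx, cmp, and noMatch are only for illustration. Comparing an empty array with a scalar yields another empty array, which an if statement treats as false, so isempty is one way to test for "no match" directly.

```matlab
rgx = '\d{2}(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\d{4}';

idx = regexp('23max2023', rgx);  % no match, so idx is an empty array
cmp = (idx ~= 1);                % comparing empty with a scalar gives empty, not true
noMatch = isempty(idx);          % true when the pattern was not found

if cmp
    disp('This branch is never reached: an empty condition counts as false.')
end
if noMatch
    disp('No match found.')
end
```

The same reasoning explains why idx == [] does not behave as a test for emptiness: it also evaluates to an empty array.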
I wish to flag your solution, but will wait to get your feedback on my comments.
"I used the lower function to create a string with all lowercase alpha chars."
Simpler and more efficient: use REGEXPI.
"I then checked that the results of the regexp has a length of 0"
Note that this is not robust (nor is the provided answer) to prefixed/suffixed text:
rgx = '\d{2}(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\d{4}';
regexpi('NotValidDate23Jan2023NorThis', rgx)
ans = 13
regexpi('012345678923Jan2023123456789', rgx)
ans = 11
You can ensure the entire text is checked by using ^ and $:
rgx = '^\d{2}(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\d{4}$';
regexpi('23Jan2023', rgx)
ans = 1
regexpi('012345678923Jan2023123456789', rgx)
ans = []
In practice, defining booleans as the negative of something is usually a bad approach (it leads to confusion when writing the code later), so we define it as the positive statement:
IsValidDate = isscalar(regexpi('23Jan2023', rgx))
IsValidDate = logical
1
Rather than trying to reverse engineer the date validation, I would recommend using the TRY/CATCH approach.
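One reason a pattern check falls short of real date validation is that it only tests the shape of the text. The sketch below applies the anchored pattern from above to a few hypothetical test strings (the cell array candidates is an assumption for illustration). A string such as '99Jan2023' has the right shape and passes the pattern, even though 99 is not a valid day of the month.

```matlab
rgx = '^\d{2}(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\d{4}$';

% Hypothetical test strings, chosen only for illustration
candidates = {'23Jan2023', '23max2023', '99Jan2023', 'x23Jan2023'};

% For a cell array input, regexpi returns a cell array of start indices,
% one cell per string; an empty cell means the pattern did not match.
starts = regexpi(candidates, rgx);
matchesPattern = ~cellfun(@isempty, starts)
```

Here matchesPattern is true for the first and third strings, so the impossible day 99 is accepted by the pattern.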
The power of the community coming through here. Thanks for the additional info, @Stephen23.
The regexp based approach will work, as long as you don't need to deal with month names in a different locale as shown in the "Date and Time from Text in Foreign Language" example on the datetime documentation page. Looking at the try / catch based approach from @the cyclist's original answer with a bit of modification:
MyStr = '23max2023';
infmt = "ddMMMyyyy";
try
    MyDate = datetime(MyStr,"InputFormat",infmt)
catch ME
    fprintf("Invalid date format. Error ID:\n%s\nError message:\n%s\n", ...
        ME.identifier, ME.message)
end
Invalid date format. Error ID:
MATLAB:datetime:ParseErrSuggestLocale
Error message:
Unable to convert '23max2023' to datetime using the format 'ddMMMyyyy'. If the date/time text contains day, month, or time zone names in a language foreign to the 'en_US' locale, those might not be recognized. You can specify a different locale using the 'Locale' parameter.
Compare this with a format that is completely mismatched with the date string:
MyStr = '23max2023';
infmt = "ddyyyy";
try
    MyDate = datetime(MyStr,"InputFormat",infmt)
catch ME
    fprintf("Invalid date format. Error ID:\n%s\nError message:\n%s\n", ...
        ME.identifier, ME.message)
end
Invalid date format. Error ID:
MATLAB:datetime:ParseErr
Error message:
Unable to convert '23max2023' to datetime using the format 'ddyyyy'.
The first error identifier suggests that you misspelled the month abbreviation (or you're using an abbreviation in a different locale.) The second suggests a larger error. You could check which error was thrown and handle those two cases differently (and if something else happened, rethrow the error and let the calling code handle it or not as appropriate.)
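Handling the two cases differently might look like the sketch below. It assumes the two error identifiers shown in the output above, which come from the release used in this thread and may differ in other releases; the status variable and its values are made up for illustration.

```matlab
MyStr = '23max2023';
infmt = "ddMMMyyyy";
try
    MyDate = datetime(MyStr, "InputFormat", infmt);
    status = "ok";
catch ME
    switch ME.identifier
        case 'MATLAB:datetime:ParseErrSuggestLocale'
            % Possibly a misspelled month name, or a name from another locale
            status = "unrecognized name";
        case 'MATLAB:datetime:ParseErr'
            % The text does not fit the requested format at all
            status = "format mismatch";
        otherwise
            % Anything else is unexpected: pass it on to the caller
            rethrow(ME)
    end
end
disp(status)
```

Rethrowing in the otherwise branch keeps unrelated problems (for example, a typo in a parameter name) from being silently reported as an invalid date.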
Thank you. I will use this approach. I understand and agree with the approach of not re-engineering a date validation function.
To me, a recent MATLAB coder, it is staggering to think that the following format does not exist...
[ MyDateVal, MyDateStatus] = datetime(.............
Anyway...I will use the try ... catch syntax.
All feedback appreciated.
I need to figure out how to flag this to be the correct answer.
"To me, a recent MATLAB coder, it is staggering to think that the following format does not exist...
[ MyDateVal, MyDateStatus] = datetime(............."
If datetime behaved that way instead of throwing an error, every single call to datetime would need to be followed by an if statement checking that MyDateStatus indicated success before proceeding to use MyDateVal. With the error-based approach, if the next line runs, the datetime call succeeded.
I wouldn't want to see code that was forced to look like the following, to take the "return result and status" pattern to the extreme:
x = 1:10;
y = 11;
[result, didSucceed] = plus(x, y);
if ~didSucceed % We couldn't add x and y
    % handle the error
end
IMO better to just let plus (or datetime) throw the error itself. The function has more information about what exactly went wrong, so it may be able to distinguish between the "you gave me a month name I didn't recognize but that could be a valid month name in a different locale" and "your data doesn't match the format you asked for at all" cases and provide more specifically tailored messages.
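For code that does want a value plus a status flag at one particular call site, the try/catch can be hidden in a small wrapper. The sketch below is only an illustration: tryParseDate is a hypothetical helper, not a built-in function, and it assumes that parse failures use identifiers beginning with "MATLAB:datetime:", as the two identifiers shown earlier do. It is written as a script with a local function at the end.

```matlab
[d1, ok1] = tryParseDate('23May2023', "ddMMMyyyy");
[d2, ok2] = tryParseDate('23max2023', "ddMMMyyyy");
fprintf("ok1 = %d, ok2 = %d\n", ok1, ok2)

function [dateVal, isValid] = tryParseDate(txt, infmt)
% tryParseDate  Hypothetical helper: parse txt with infmt and report success.
%   Returns NaT and false when datetime reports a parsing problem.
    try
        dateVal = datetime(txt, "InputFormat", infmt);
        isValid = true;
    catch ME
        % Assumption: parse failures use identifiers starting with this prefix
        if ~startsWith(ME.identifier, "MATLAB:datetime:")
            rethrow(ME)  % unrelated error, let the caller see it
        end
        dateVal = NaT;
        isValid = false;
    end
end
```

The trade-off is the one described above: the caller gets a simple flag, but the detail carried by the error identifier and message is discarded unless the wrapper also returns it.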


Release: R2022b

Asked: 11 Mar 2023

Edited: 11 Mar 2023
