Readtable not reading time as expected

46 views (last 30 days)
Hi there
I have run into a puzzling mishap by readtable. I'm running it as such::
T = readtable('file.data','FileType','text','NumHeaderLines',7);
Here is a cropped screenshot of the dataset I'm running it on (headerlines not shown):
I am specifically puzzled by the last column shown. This is the time of day. In the resulting table it comes back looking like this:
So it seems readtable has misinterpreted what is a time signature as a duration value. And that it is only able to determine this duration value when the third column (a nanosecond value) cycles back to zero.
Any ideas as to how I might resolve this problem?
Thanks in advance!
  2 Comments
Jon
Jon on 16 Nov 2023
Someone may be able to help you just by looking at your screenshot, but it would be helpful for me if you could attach your data file so I could try to import it myself and see what is going on.
Jakob Sievers
Jakob Sievers on 16 Nov 2023
Ah yes, ofcourse. I've attached a zipped version here :)

Sign in to comment.

Accepted Answer

Jon
Jon on 16 Nov 2023
Edited: Jon on 16 Nov 2023
I think you are having problems because your time column is of the form 19:10:54:300 and in particular the last separator is a colon not a decimal point. I couldn't find a simple way of dealing with this as it seems all of the MATLAB time formats use a decimal fraction for the last portion.
Here is some code that I wrote to handle this. Probably could be streamlined but I think it does what you want
filename = '2022-08-29T191054_AIU-2371.data'
% Define import options
opts = detectImportOptions(filename,"FileType","text")
% Modify options to have time column read in as a character type for further
% manipulation
opts = setvaropts(opts,'Time','Type','char')
% Read in the table using the customized options
T = readtable('2022-08-29T191054_AIU-2371.data',opts);
% Replace milliseconds with fractions of a second to allow using MATLAB time
% formatting
T.Time = cellfun(@fun,T.Time,'UniformOutput',false);
% Convert to duration
T.Time = duration(T.Time,'InputFormat','hh:mm:ss.SSSSSSSSS')
function t = fun(tChar)
% Replace nanoseconds with fractions of a second to allow using MATLAB time
% formatting
parts = strsplit(tChar,':');
fsec = str2double(parts{4})/1e9; % fraction of second
parts{4} = num2str(fsec,'%.9f'); % e.g '0.000000300'
parts{4} = parts{4}(2:end); % e.g. '.000000300'
% build new character vector with decimal time
t = [parts{1},':',parts{2},':',parts{3},parts{4}];
end
  9 Comments
Jakob Sievers
Jakob Sievers on 17 Nov 2023
Thank you very much for taking your time to look deeper into this question. I'm glad we both learned something from this :-)
Jakob Sievers
Jakob Sievers on 22 Nov 2023
I finally had a moment to test this out and it did the job splendedly. Thanks again. I will accept this as the answer :)

Sign in to comment.

More Answers (1)

Walter Roberson
Walter Roberson on 16 Nov 2023
Use detectImportOptions() on the file. Then use setvartype() to set variable 8 to datetime instead of duration. Then use setvaropts() to set the InputFormat to 'HH:mm:ss:SSS'
Now readtable() the file passing in the modified options.
Variables 7 and 8 will now both be datetime.
Take
day_offset = TheTable{:,8} - dateshift(TheTable{:,8}, 'start', 'day');
TheTable.DateTime = TheTable{:,7} + day_offset;
Now TheTable.DateTime variable will hold the full-precision date and time together.
Yes, there are other ways, including passing a format to readtable() indicating what the datetype is for each column... but using %D and %T properly gets a bit messy. And in order for a time format to not automatically end at the space between date and time, you have to do the hack of telling the parser that whitespace is not a field delimiter... gets ugly.
  4 Comments
Jakob Sievers
Jakob Sievers on 17 Nov 2023
Thank you so much for looking into this question! I'll explore all answers as soon as I can!
Peter Perkins
Peter Perkins on 17 Nov 2023
Yes, the root cause here is that reading in duration text timestamps in the format hh:mm:ss:SSS is not currently supported. So Walter is correct, use datetime's more flexible parsing, and Chris is right, use timeofday to convert those to durations.
I have made note of adding support for this duration format, it's something we've had questions about before.

Sign in to comment.

Products


Release

R2021a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!