Data Import from PostgreSQL produces NaT's

8 views (last 30 days)
Paul
Paul on 1 Feb 2023
Edited: Paul on 24 Mar 2023
I need to fetch data from a Postgres database. One of the columns in the table I am querying has type "timestamp with time zone". The values in this column can have fractional seconds down to microseconds. I am connecting to the database server using the native postgresql data source. I do a simple fetch from this table. Values in the timestamp column which are a whole number of seconds (e.g. 31-Jan-2023 17:02:00.000000) appear correctly in the returned matlab table object, but where the timestamp has a non-zero fractional component (e.g. 31-Jan-2023 17:02:00.001234), the imported value is "NaT".
If I inspect the original table in pgAdmin, all the timestamp values are correctly displayed. How can I ensure that all the timestamp values are correctly imported?
My Postgres version is 9.3.25
Here's what I'm doing (exact database names, passwords etc. obfuscated).
vendor = 'PostgreSQL';
dbname = 'target_database';
opts = databaseConnectionOptions('native',vendor);
opts = setoptions(opts, ...
'DataSourceName', 'PostgreSQLDataSource', ...
'DatabaseName', dbname, ...
'Server', 'localhost', ...
'PortNumber', 5432);
saveAsDataSource(opts);
%% Make connection to database
conn = postgresql('PostgreSQLDataSource','myuser','mypassword');
%Set query to execute on the database
query = 'select * from target_index_table';
%% Execute query and fetch results
data = fetch(conn,query);
%% Close connection to database
close(conn)
%% Clear variables
clear conn query

Answers (1)

HimeshNayak
HimeshNayak on 17 Mar 2023
Hi Paul,
As per my understanding, when you are importing data from PostgreSQL database, the “timestamp” values with a non-zero fractional component are getting converted to “NaT”.
The timestamp with all zeros in the sub-second precision is returned by the driver in a different format than the values that do have non-zero sub-second precision. When we convert these string values to datetimes, the datetime function chooses one of these formats, and then sets any values that don't meet that format to NaT.
The workaround I'd suggest for now is to use “databaseImportOptions”. You can use the options to change the data type from a datetime to a ‘string’ or ‘char vector’. This will allow you to read in the raw text values sent back from the database. If you require datetime values, you can correct the inconsistent formats once imported and then pass them to the datetime function.
For more information, refer the following links:
Regards
HimeshNayak
  2 Comments
Paul
Paul on 17 Mar 2023
Thank you very much, Himesh, I will try that out!
Paul
Paul on 24 Mar 2023
Edited: Paul on 24 Mar 2023
The workaround does sort-of work.
This works:
conn = postgresql('PostgreSQLDataSource','myuser','mypassword');
iopts = databaseImportOptions(conn,'target_index_table');
iopts = setoptions(iopts,{'timestamp'}, 'Type', 'string');
%Set query to execute on the database
query = 'select * from target_index_table';
%% Execute query and fetch results
data = fetch(conn,query,iopts);
But if I make the query more complicated:
conn = postgresql('PostgreSQLDataSource','myuser','mypassword');
iopts = databaseImportOptions(conn,'target_index_table');
iopts = setoptions(iopts,{'timestamp'}, 'Type', 'string');
%Set query to execute on the database
query = 'select * from target_index_table where entrynumber > 100';
%% Execute query and fetch results
data = fetch(conn,query,iopts);
I get an error:
Error using database.relational.connection/fetch
Unable to use database import options with
select * from target_index_table where entrynumber > 100'.
Am I doing something wrong here?
Edit to add: I don't want to import the entire table in one go - it could potentially be huge (hundreds of thousands of rows), and doing so negates the reason for using an RDBMS in the first place.

Sign in to comment.

Products


Release

R2022b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!