Performance issues with readtable vs. xlsread

26 views (last 30 days)
I am working in R2020a and have converted the Excel file read in one of my MATLAB scripts from xlsread to readtable, since xlsread is now deprecated.
However, I am having performance issues with readtable compared to xlsread.
The Excel file is a 15MB .xlsb file with six sheets, and I am using named ranges to locate the data on the approrpriate sheet.
tic; [perforations, perforations_header] = xlsread(xls_file,sheet,'PERFORATIONS'); toc
Elapsed time is 4.354259 seconds.
tic; T=readtable(xls_file,'Sheet',sheet,'Range','PERFORATIONS'); toc
Elapsed time is 15.777517 seconds.
perforations = table2array(T); perforations_header=T.Properties.VariableNames; toc
Elapsed time is 15.803216 seconds.
So, two questions:
(1) Any suggestions on how to improve performance for readtable?
(2) Despite xlsread being deprecated, it clearly performs better than readtable. Is there any downside with staying with xlsread?
My thanks!
Mike
P.s. Attached is a cut down version of the Excel .xlsb file. It contains two named ranges: 'PERFORATIONS' and 'CELLS'. The full file was too large to attach.
  2 Comments
Walter Roberson
Walter Roberson on 3 Dec 2020
Consider experimenting with readtable() with the 'UseExcel', true flag.
Mike
Mike on 3 Dec 2020
Very interesting!
Rough figures for the 'PERFORATIONS' range:
xlsread => 4 seconds
readtable => 16 seconds
readtable('UseExcel',true) => 14 seconds
So, some improvement but still hasn't reached the performance of xlsread.
I'm starting with a very small example despite the use of an .xlsb file. Most of my "in anger" files will be x100 larger...
Thanks for the suggestion!
Mike

Sign in to comment.

Accepted Answer

Star Strider
Star Strider on 2 Dec 2020
If you need to use xlsread, use it as long as you have it available.
Since you obviously want the header information as well, one option could be to use readtable once to read everything in once (likely using 'VariableNamingRule','preserve' in your call to it), perhaps extracting the variable names separately as <tablename>.Properties.VariableNames to a cell, then using save to put all the information you want in a .mat file. Then load the .mat file as necessary to retrieve the necessary information into your workspace. Either save the headers and data separately as different variables, or save it as a table.
I have no idea how efficient this would be, since I have no idea what your Excel file contains, or how large it is. It is simply an option to consider.
  2 Comments
Mike
Mike on 2 Dec 2020
Edited: Mike on 2 Dec 2020
Thank you for the rapid reply. I have some control over the workflow for the project I'm working on, so I'll follow your lead and look into replacing the Excel files with a .mat save/load instead.
I don't have much experience with MATHWORKS/MATLAB. What does it mean for a function to be deprecated? Is there a management of change process to review possible transition issues before a function is removed?
Mike
Star Strider
Star Strider on 3 Dec 2020
My pleasure!
My idea is to read them in once with readtable, save them to .mat files, then load them as necessary when you need them later.
If a function is deprecated, it will not be supported (and will likely not appear) beginning in some future release, although code using it may still work. (It’s complicated!)

Sign in to comment.

More Answers (0)

Categories

Find more on Data Import from MATLAB in Help Center and File Exchange

Products


Release

R2020a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!