readtable produces apparent gibberish when reading csv file

OK, I have a readtable puzzle.
I have a comma-separated file, 'a.csv', that I'd like to read:
```
bathTargets,mixtures,firstFile,lastFile
lower,normal_saline,745_043_0000.abf,745_043_0060.abf
upper,"normal_saline,ptx",745_043_0000.abf,745_043_0060.abf
```
When I call readtable('a.csv'), I get very odd output:
```
>> readtable('a.csv')
Warning: Column headers from the file were modified to make them valid MATLAB identifiers before
creating variable names for the table. The original column headers are saved in the
VariableDescriptions property.
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names.
ans =
1×6 table
lower_normal saline_745 x043 x0000_abf_745 x043_1 x0060_abf
_________________ ___________________ ____ ________________ ______ ____________
{'upper,"normal'} {'saline,ptx",745'} 43 {'0000.abf,745'} 43 {'0060.abf'}
```
If I make a similar csv table, b.csv, as
```
bathTargets,mixtures,firstFile,lastFile
lower,"normal_saline",745_043_0000.abf,1
upper,"normal_saline,ptx",745_043_0000.abf,2
```
then readtable does what I expect:
```
readtable('b.csv')
ans =
2×4 table
bathTargets mixtures firstFile lastFile
___________ _____________________ ____________________ ________
{'lower'} {'normal_saline' } {'745_043_0000.abf'} 1
{'upper'} {'normal_saline,ptx'} {'745_043_0000.abf'} 2
```
If I open a.csv in Excel, it looks correct, and if I save it as a.xlsx I get what I expect:
```
readtable('a.xlsx')
ans =
2×4 table
bathTargets mixtures firstFile lastFile
___________ _____________________ ____________________ ____________________
{'lower'} {'normal_saline' } {'745_043_0000.abf'} {'745_043_0060.abf'}
{'upper'} {'normal_saline,ptx'} {'745_043_0000.abf'} {'745_043_0060.abf'}
```
Does anyone know why reading a.csv fails?
Thanks
Steve

Accepted Answer

Voss on 31 Oct 2024 at 17:44
Looks like readtable decides the delimiter for a.csv is '_' (underscore). I don't know why.
```
opts = detectImportOptions('a.csv');
opts.Delimiter
ans = 1x1 cell array
{'_'}
```
Specifying that ',' (comma) is the delimiter returns the correct result:
```
readtable('a.csv','Delimiter',',')
ans = 2x4 table
bathTargets mixtures firstFile lastFile
___________ _____________________ ____________________ ____________________
{'lower'} {'normal_saline' } {'745_043_0000.abf'} {'745_043_0060.abf'}
{'upper'} {'normal_saline,ptx'} {'745_043_0000.abf'} {'745_043_0060.abf'}
```
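For anyone who prefers to keep using import options rather than passing name-value pairs to readtable, here is a minimal sketch of the same fix. It recreates a.csv from the contents posted in the question, pins the delimiter at detection time, and then reads with the resulting options. It assumes writelines is available (R2022a or later); the variable name T is only for illustration.

```matlab
% Recreate the example file from the question (contents copied from the post).
% Inside a double-quoted string, "" is an escaped double quote.
csvLines = [
    "bathTargets,mixtures,firstFile,lastFile"
    "lower,normal_saline,745_043_0000.abf,745_043_0060.abf"
    "upper,""normal_saline,ptx"",745_043_0000.abf,745_043_0060.abf"
    ];
writelines(csvLines, 'a.csv');   % assumption: R2022a or later

% Detect import options, but fix the delimiter instead of letting it be guessed.
opts = detectImportOptions('a.csv', 'Delimiter', ',');
disp(opts.Delimiter)             % inspect the delimiter that will be used

% Read the file with the pinned options.
T = readtable('a.csv', opts);
disp(T)
```

Checking opts.Delimiter before reading is a cheap way to catch a wrong guess like the underscore above, especially for files whose fields contain several candidate separator characters.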
  1 Comment
Steve Van Hooser on 31 Oct 2024 at 18:57
Thanks...I had tried
```
readtable('a.csv',"NumHeaderLines",1,"Delimiter",',')
Warning: Column headers from the file were modified to make them valid MATLAB identifiers before
creating variable names for the table. The original column headers are saved in the
VariableDescriptions property.
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names.
ans = 1x4 table
lower normal_saline x745_043_0000_abf x745_043_0060_abf
_________ _____________________ ____________________ ____________________
{'upper'} {'normal_saline,ptx'} {'745_043_0000.abf'} {'745_043_0060.abf'}
```
But just using the delimiter works:
```
readtable('a.csv',"Delimiter",',')
ans = 2x4 table
bathTargets mixtures firstFile lastFile
___________ _____________________ ____________________ ____________________
{'lower'} {'normal_saline' } {'745_043_0000.abf'} {'745_043_0060.abf'}
{'upper'} {'normal_saline,ptx'} {'745_043_0000.abf'} {'745_043_0060.abf'}
```
Not sure why setting "NumHeaderLines" to 1 causes it to fail.
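One plausible reading, based on how the readtable documentation describes NumHeaderLines (the number of lines to skip before the variable names and data), is that a value of 1 skips the real header row, so the "lower,..." row gets consumed as the column names. That would match the 1x4 table and the renamed headers shown above. The sketch below compares the two settings side by side. It recreates a.csv from the question, assumes writelines is available (R2022a or later), and uses T0 and T1 as illustrative names. The second call is expected to repeat the column-header warning.

```matlab
% Recreate the example file from the question (contents copied from the post).
csvLines = [
    "bathTargets,mixtures,firstFile,lastFile"
    "lower,normal_saline,745_043_0000.abf,745_043_0060.abf"
    "upper,""normal_saline,ptx"",745_043_0000.abf,745_043_0060.abf"
    ];
writelines(csvLines, 'a.csv');   % assumption: R2022a or later

% NumHeaderLines = 0: nothing is skipped, so line 1 supplies the variable names.
T0 = readtable('a.csv', 'Delimiter', ',', ...
    'NumHeaderLines', 0, 'ReadVariableNames', true);

% NumHeaderLines = 1: line 1 is skipped, so line 2 is used for the variable names
% and only the "upper,..." row is left as data.
T1 = readtable('a.csv', 'Delimiter', ',', ...
    'NumHeaderLines', 1, 'ReadVariableNames', true);

disp(size(T0))
disp(size(T1))
```

If that reading is right, NumHeaderLines should count only lines that come before the column-name row, which is none in this file.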
But thanks, you found the issue.
Best
Steve

Release

R2024a
