readmatrix collapses blanks/NaNs, but I want to keep those empty cells

Attached are several .csv files of sample data. They're simple 11x3-ish matrices.
Here are the various screenshots if you don't want to download anything.
In Figure 1 there are several files: one is a "full" set, another is identical except some rows are deleted, a third has the deleted rows instead replaced with a character string, and the last is a single vector taken from one of the partial sets.
Figure 2 shows the code used to import, which is simply as follows:
clear
clc
close all
opts = detectImportOptions('nonan.csv')
opts =
  DelimitedTextImportOptions with properties:

   Format Properties:
                    Delimiter: {','}
                   Whitespace: '\b\t '
                   LineEnding: {'\n' '\r' '\r\n'}
                 CommentStyle: {}
    ConsecutiveDelimitersRule: 'split'
        LeadingDelimitersRule: 'keep'
       TrailingDelimitersRule: 'ignore'
                EmptyLineRule: 'skip'
                     Encoding: 'UTF-8'

   Replacement Properties:
                  MissingRule: 'fill'
              ImportErrorRule: 'fill'
             ExtraColumnsRule: 'addvars'

   Variable Import Properties: Set types by name using setvartype
                VariableNames: {'X', 'Y', 'Z'}
                VariableTypes: {'double', 'double', 'double'}
        SelectedVariableNames: {'X', 'Y', 'Z'}
              VariableOptions: [1-by-3 matlab.io.VariableImportOptions]
        Access VariableOptions sub-properties using setvaropts/getvaropts
           VariableNamingRule: 'modify'

   Location Properties:
                    DataLines: [2 Inf]
            VariableNamesLine: 1
               RowNamesColumn: 0
            VariableUnitsLine: 0
     VariableDescriptionsLine: 0
        To display a preview of the table, use preview
full=readmatrix('nonan.csv',opts)
full = 10×3
     1    11    21
     2    12    22
     3    13    23
     4    14    24
     5    15    25
     6    16    26
     7    17    27
     8    18    28
     9    19    29
    10    20    30
partial=readmatrix('nans.csv',opts)
partial = 8×3
     1    11    21
     2    12    22
     5    15    25
     6    16    26
     7    17    27
     8    18    28
     9    19    29
    10    20    30
vectornan=readmatrix('vectornan.csv',opts)
vectornan = 8×3
     1   NaN   NaN
     2   NaN   NaN
     5   NaN   NaN
     6   NaN   NaN
     7   NaN   NaN
     8   NaN   NaN
     9   NaN   NaN
    10   NaN   NaN
nanpoop=readmatrix('nanpoop.csv',opts)
nanpoop = 10×3
     1    11    21
     2    12    22
   NaN   NaN   NaN
   NaN   NaN   NaN
     5    15    25
     6    16    26
     7    17    27
     8    18    28
     9    19    29
    10    20    30
Figures 3 and 4 are the detectImportOptions and VariableImportOptions settings.
I want to import the data set that contains blanks while keeping the NaNs, because the size of the vector/matrix is important. However, you can see in the workspace that the blank rows are ignored and collapsed (the matrices are smaller), while the character strings are correctly imported as NaN. I realize that blanks are technically different from NaN, but this never used to be a problem. Now importing correctly as part of the script is just not working, and I can't figure out why. Also, there is no reason why "vectornan" should be importing as a matrix because I copy-pasted only one column; it should be a vector.
I CAN import manually using the import menu, and that DOES correctly identify blank spaces as NaN. So I don't know why that works but readmatrix doesn't.

Accepted Answer

Voss on 29 Oct 2024 at 15:17
Use 'EmptyLineRule','read' to keep the lines that are all-NaN. Using only that option, all sample files are read correctly:
full=readmatrix('nonan.csv','EmptyLineRule','read')
full = 10×3
     1    11    21
     2    12    22
     3    13    23
     4    14    24
     5    15    25
     6    16    26
     7    17    27
     8    18    28
     9    19    29
    10    20    30
partial=readmatrix('nans.csv','EmptyLineRule','read')
partial = 10×3
     1    11    21
     2    12    22
   NaN   NaN   NaN
   NaN   NaN   NaN
     5    15    25
     6    16    26
     7    17    27
     8    18    28
     9    19    29
    10    20    30
vectornan=readmatrix('vectornan.csv','EmptyLineRule','read')
vectornan = 10×1
     1
     2
   NaN
   NaN
     5
     6
     7
     8
     9
    10
nanpoop=readmatrix('nanpoop.csv','EmptyLineRule','read')
nanpoop = 10×3
     1    11    21
     2    12    22
   NaN   NaN   NaN
   NaN   NaN   NaN
     5    15    25
     6    16    26
     7    17    27
     8    18    28
     9    19    29
    10    20    30
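To reproduce the difference without downloading the attachments, here is a minimal sketch. It assumes a small made-up CSV, written to a temporary file, with an X,Y,Z header and two completely blank lines; the file contents and the variable names skipped and kept are for illustration only and are not the attached files.

```matlab
% Write a small hypothetical CSV: header, two data rows,
% two blank lines, then one more data row.
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, 'X,Y,Z\n1,11,21\n2,12,22\n\n\n5,15,25');
fclose(fid);

% Default behavior: EmptyLineRule is 'skip', so blank lines are dropped.
skipped = readmatrix(fname);

% 'read' keeps the blank lines; they are imported as rows of missing values.
kept = readmatrix(fname, 'EmptyLineRule', 'read');

delete(fname);   % clean up the temporary file

disp(size(skipped))
disp(size(kept))
```

If the rule behaves as described above, kept should have two more rows than skipped, with NaN in the rows where the blank lines were.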
Also, to address this:
"there is no reason why "vectornan" should be importing as a matrix because I copy-pasted only one column; it should be a vector"
The reason vectornan is a matrix with three columns is that you are calling readmatrix on vectornan.csv using the options detected from nonan.csv, which include three variables, so the result of that readmatrix call has three columns.
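If you prefer to keep using an options object, one way to avoid the mismatch is to detect the options from the file actually being read and then change the rule on that object. A minimal sketch, assuming a made-up one-column file written to a temporary location (it stands in for vectornan.csv and is not the attached file):

```matlab
% Write a hypothetical one-column CSV with a header and two blank lines.
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, 'X\n1\n2\n\n\n5\n6');
fclose(fid);

% Detect options from this file, not from a different three-column file.
opts = detectImportOptions(fname);

% Keep blank lines instead of skipping them.
opts.EmptyLineRule = 'read';

v = readmatrix(fname, opts);

delete(fname);   % clean up the temporary file

disp(size(v))
```

Reusing one options object across several files is still fine when those files share the same column layout.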
  3 Comments
Voss on 29 Oct 2024 at 15:43
"Do you know if there's a way to permanently and/or globally apply that rule so that I don't have to make that change to each line?"
I'm not aware of a way to set the readmatrix default options globally.
Star Strider on 29 Oct 2024 at 16:01
Moved: Voss on 29 Oct 2024 at 16:27
"Do you know if there's a way to permanently and/or globally apply that rule so that I don't have to make that change to each line?"
Create your own function file (or anonymous function) that contains the required function call (readtable or readmatrix) and the desired name-value pair arguments, then call it with the file name and return the result (matrix or table) as the function output. Be careful to not name it the same as an existing MATLAB function.
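A minimal sketch of the anonymous-function variant. The name readKeepBlanks is made up for illustration, and the sample file is a hypothetical temporary CSV rather than one of the attachments:

```matlab
% Hypothetical wrapper: always applies 'EmptyLineRule','read' and
% forwards any extra name-value arguments to readmatrix.
readKeepBlanks = @(filename, varargin) ...
    readmatrix(filename, 'EmptyLineRule', 'read', varargin{:});

% Small made-up CSV with two blank lines to try it on.
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, 'X,Y,Z\n1,11,21\n2,12,22\n\n\n5,15,25');
fclose(fid);

m = readKeepBlanks(fname);

delete(fname);   % clean up the temporary file

disp(m)
```

The same call could instead live in its own function file (for example a readKeepBlanks.m on the path) so that every script can use it without redefining the handle.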

Release

R2024a
