Get every nth row of a tall array
I have a tall array and would like to collect every 26th row of one variable into an array. I tried:
U = tall(udata);
hhws = [];
udata.ReadSize = 26*500; % data is in 26 row chunks, so sizing so below works
while hasdata(udata)
U = read(udata);
hhws = [hhws;U.Var13(14:26:end)]; % want every 26th row starting with the 14th row
end
This produced the error:
Error using matlab.io.datastore.TabularTextDatastore/readData (line 78)
Unable to parse a "Numeric" field when reading row 10765, field 1.
Actual Text: "******** 7.909"
Expected: A number or literal "NaN", "Inf". (possibly signed, case insensitive)
Error in matlab.io.datastore.TabularDatastore/read (line 174)
[t, info] = ds.readData();
Caused by:
Reading the variable name 'Var1' using format '%f' from file: '<file path and file name>' starting at offset 1011702139.
Seems like maybe there's a problem with how I'm reading the file in? Is the method above viable assuming I get through this error? Thanks!
Accepted Answer
dpb on 18 Aug 2022
Edited: dpb on 20 Aug 2022
Actual Text: "******** 7.909"
The problem is in the data file itself: a numeric field contains the overflow indicator "*" (the value was too wide for its field when the file was written), and the formatted read fails because that text can't be converted to a numeric value.
You would need to add
'TreatAsMissing',{'********',''}
to the datastore call when you create it.
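For illustration, here is a minimal sketch of that option on a tiny made-up file. The file contents, the stride of 2, and the ReadSize of 2 are all assumptions chosen to keep the demo short (the question uses a stride of 26 starting at row 14), and the formats are forced to '%f' to mimic what the datastore chose in the error message. Only the asterisk token is listed here; empty fields are left to the default handling.

```matlab
% Hypothetical demo file: comma-delimited, two numeric columns, with one
% overflowed field written as asterisks (mimicking the file in the question).
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, '1.5,10\n');
fprintf(fid, '********,20\n');
fprintf(fid, '3.5,30\n');
fprintf(fid, '4.5,40\n');
fclose(fid);

% Force numeric formats and tell the datastore to read the asterisk token
% as missing (NaN) instead of raising a parse error.
ds = tabularTextDatastore(fname, ...
    'ReadVariableNames', false, ...
    'TextscanFormats', {'%f','%f'}, ...
    'TreatAsMissing', '********');

stride = 2;           % assumed for the demo; 26 in the question
first  = 2;           % assumed for the demo; 14 in the question
ds.ReadSize = stride; % keep ReadSize a multiple of the stride

hhws = [];
while hasdata(ds)
    T = read(ds);
    % Each chunk starts on a stride boundary, so the same offset applies.
    hhws = [hhws; T.Var1(first:stride:end)]; %#ok<AGROW>
end
delete(fname);
```

The chunked loop only lines up if every chunk returned by read is a whole multiple of the stride, which is why ReadSize is tied to it; with several files in one datastore that assumption would need checking at the file boundaries.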
I haven't really used the datastore much. With detectImportOptions and the resulting import options object there is also an 'ImportErrorRule' property that can substitute a 'FillValue'; in that case the fill could be set to Inf instead of NaN to flag the overflow instances specifically and leave the missing ones just as empty. I didn't see an equivalent for the datastore. That seems an oversight unless I just missed it in the doc, but I surely didn't find it; the options available for the datastore aren't as extensive, it seems.
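A rough sketch of the import-options route described above, again on a made-up three-row file. To keep the demo deterministic the options object is built directly with delimitedTextImportOptions rather than detected; detectImportOptions returns the same kind of object for delimited text. The variable names Var1 and Var2 are the defaults, and the file contents are assumptions.

```matlab
% Hypothetical demo file with one overflowed field in the first column.
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, '1.5,10\n');
fprintf(fid, '********,20\n');
fprintf(fid, '4.5,30\n');
fclose(fid);

% Build import options for two numeric variables (default names Var1, Var2).
opts = delimitedTextImportOptions('NumVariables', 2);
opts = setvartype(opts, {'Var1','Var2'}, 'double');

% On a conversion error, fill instead of erroring, and use Inf as the fill
% value for Var1 so the overflowed fields stand out from ordinary NaN.
opts.ImportErrorRule = 'fill';
opts = setvaropts(opts, 'Var1', 'FillValue', Inf);

T = readtable(fname, opts);
overflowRows = find(isinf(T.Var1)); % rows where the asterisks were
delete(fname);
```

One caveat worth checking against the documentation: as far as I can tell, the same FillValue is also applied to missing (empty) fields under the default MissingRule of 'fill', so empty fields in that variable may come back as Inf as well unless the two cases are separated some other way.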
4 Comments
dpb on 22 Aug 2022
That does seem peculiar; the empty record is the default, and it's supposed to use either.
That might be worth a support question to TMW (MathWorks) to ask whether that is the expected result.