Read from text file between header and footer

Is there an easy way to read numerical data from a text file that sits below a 3-line header and above a footer that starts with a line of asterisks? The data is in 3 complete columns of unknown length. The delimiter and the asterisks can be modified if necessary.
I tried readtable, but the footer causes problems, and detectImportOptions does not appear to offer an option for it.
See the attached example data file:
ld dim real
ratio ht speed
[] [s] [m/s]
5.0000000e-01 3.4497665e+02 2.1712214e+03
1.0000000e+00 3.8184070e+02 2.3923590e+03
1.5000000e+00 3.9916530e+02 2.4919967e+03
2.0000000e+00 4.0763144e+02 2.5342191e+03
2.5000000e+00 4.1118036e+02 2.5473084e+03
...
**********************************************
2.8682373e+00 4.1181815e+02
2.6080322e+00 2.5476604e+03

 Accepted Answer

One option is textscan
type('data.txt')
ld dim real
ratio ht speed
[] [s] [m/s]
5.0000000e-01 3.4497665e+02 2.1712214e+03
1.0000000e+00 3.8184070e+02 2.3923590e+03
1.5000000e+00 3.9916530e+02 2.4919967e+03
2.0000000e+00 4.0763144e+02 2.5342191e+03
2.5000000e+00 4.1118036e+02 2.5473084e+03
3.0000000e+00 4.1174642e+02 2.5434127e+03
3.5000000e+00 4.1034564e+02 2.5274253e+03
4.0000000e+00 4.0753190e+02 2.5019786e+03
4.5000000e+00 4.0361995e+02 2.4695291e+03
5.0000000e+00 3.9881033e+02 2.4323257e+03
5.5000000e+00 3.9323845e+02 2.3920574e+03
6.0000000e+00 3.8701946e+02 2.3499201e+03
6.5000000e+00 3.8029476e+02 2.3068784e+03
7.0000000e+00 3.7327720e+02 2.2638295e+03
7.5000000e+00 3.6629421e+02 2.2215880e+03
8.0000000e+00 3.5957346e+02 2.1807905e+03
8.5000000e+00 3.5315798e+02 2.1418418e+03
9.0000000e+00 3.4706390e+02 2.1049334e+03
9.5000000e+00 3.4130076e+02 2.0701014e+03
1.0000000e+01 3.3586329e+02 2.0372850e+03
1.0500000e+01 3.3073554e+02 2.0063724e+03
1.1000000e+01 3.2589619e+02 1.9772286e+03
**********************************************
2.8682373e+00 4.1181815e+02
2.6080322e+00 2.5476604e+03
fidi = fopen('data.txt','rt')
fidi = 3
C = textscan(fidi, '%f%f%f', 'HeaderLines',3, 'CollectOutput',1)
C = 1×1 cell array
{22×3 double}
fclose(fidi);
A = cell2mat(C)
A = 22×3
1.0e+03 *
0.0005 0.3450 2.1712
0.0010 0.3818 2.3924
0.0015 0.3992 2.4920
0.0020 0.4076 2.5342
0.0025 0.4112 2.5473
0.0030 0.4117 2.5434
0.0035 0.4103 2.5274
0.0040 0.4075 2.5020
0.0045 0.4036 2.4695
0.0050 0.3988 2.4323
figure
plot(A(:,1), A(:,[2 3]))
grid
It stops automatically at the line of asterisks. If you want to restart it to read the last two lines, that is also an option.
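A minimal sketch of why the read stops, assuming a small hypothetical file (the temporary file name and its contents are made up for illustration): textscan keeps applying the format until a field fails to convert as %f, then returns what it has read so far without raising an error.

```matlab
% Write a small hypothetical test file: 3 header lines, 3 data rows,
% one asterisk line and one footer row.
fname = [tempname '.txt'];
fid = fopen(fname, 'wt');
fprintf(fid, '%s\n', 'ld dim real', 'ratio ht speed', '[] [s] [m/s]', ...
    '5.0e-01 3.4e+02 2.1e+03', ...
    '1.0e+00 3.8e+02 2.3e+03', ...
    '1.5e+00 3.9e+02 2.4e+03', ...
    '**********', ...
    '2.8e+00 4.1e+02');
fclose(fid);

% Read three numeric columns after skipping the header lines.
fid = fopen(fname, 'rt');
C = textscan(fid, '%f%f%f', 'HeaderLines',3, 'CollectOutput',true);
moreToRead = ~feof(fid);   % true if unread text remains after the stop
fclose(fid);
delete(fname);

A = C{1};                  % numeric rows found before the asterisk line
```

Here A holds only the rows above the asterisks, and moreToRead indicates whether a second textscan call would have anything left to read.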
EDIT — Forgot fclose call, now added.

4 Comments

Many thanks for that.
Perhaps a silly question, but how does it know to stop at the asterisks? Is it simply that the line doesn't match the expected format, so it stops without issuing an error?
Also, how might I restart it to read the remaining data? Something to do with the returned position argument, I suspect.
My pleasure!
That becomes slightly more involved for this specific file —
fidi = fopen('data.txt','rt');
k1 = 1;
while ~feof(fidi)
    if k1 == 1
        % First section: skip the 3 header lines
        C = textscan(fidi, '%f%f%f', 'HeaderLines',3, 'CollectOutput',true, 'Delimiter','\t');
    else
        % Later sections: skip the 1 line (asterisks) that stopped the previous read
        C = textscan(fidi, '%f%f%f', 'HeaderLines',1, 'CollectOutput',true, 'Delimiter','\t');
    end
    A = cell2mat(C);
    if isempty(A)               % Empty matrix indicates end-of-file
        break
    end
    D{k1,:} = A;                % Store each section in its own cell
    fseek(fidi, 0, 0);
    k1 = k1 + 1;
end
fclose(fidi);
Out = cell2mat(D);              % Concatenate all sections vertically
D{1}
ans = 22×3
1.0e+03 *
0.0005 0.3450 2.1712
0.0010 0.3818 2.3924
0.0015 0.3992 2.4920
0.0020 0.4076 2.5342
0.0025 0.4112 2.5473
0.0030 0.4117 2.5434
0.0035 0.4103 2.5274
0.0040 0.4075 2.5020
0.0045 0.4036 2.4695
0.0050 0.3988 2.4323
D{2}
ans = 2×3
1.0e+03 *
0.0029 0.4118 NaN
0.0026 2.5477 NaN
figure
hold on
for k = 1:numel(D)
    plot(D{k}(:,1), D{k}(:,[2 3]), 'DisplayName',"Section "+k)
end
hold off
grid
legend('Location','best')
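The only problem is in the second part, where it does not put the NaN in the correct position in the second line. There is apparently no provision in this textscan call to detect the blank field, so the two values present in each footer line are packed into the first two columns and the third column is NaN in both rows (see D{2} above), rather than NaN appearing in the third column of the first row and the second column of the second row.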
However, readtable has an app for that (or rather an import options fix for it) —
opts = fixedWidthImportOptions('NumVariables',3, 'VariableWidths',[15 16 16], 'DataLines',[4 25; 27 28]);
T1 = readtable('data.txt', opts)
T1 = 24×3 table
      Var1                 Var2                 Var3
_________________    _________________    _________________
{'5.0000000e-01'}    {'3.4497665e+02'}    {'2.1712214e+03'}
{'1.0000000e+00'}    {'3.8184070e+02'}    {'2.3923590e+03'}
{'1.5000000e+00'}    {'3.9916530e+02'}    {'2.4919967e+03'}
{'2.0000000e+00'}    {'4.0763144e+02'}    {'2.5342191e+03'}
{'2.5000000e+00'}    {'4.1118036e+02'}    {'2.5473084e+03'}
{'3.0000000e+00'}    {'4.1174642e+02'}    {'2.5434127e+03'}
{'3.5000000e+00'}    {'4.1034564e+02'}    {'2.5274253e+03'}
{'4.0000000e+00'}    {'4.0753190e+02'}    {'2.5019786e+03'}
{'4.5000000e+00'}    {'4.0361995e+02'}    {'2.4695291e+03'}
{'5.0000000e+00'}    {'3.9881033e+02'}    {'2.4323257e+03'}
{'5.5000000e+00'}    {'3.9323845e+02'}    {'2.3920574e+03'}
{'6.0000000e+00'}    {'3.8701946e+02'}    {'2.3499201e+03'}
{'6.5000000e+00'}    {'3.8029476e+02'}    {'2.3068784e+03'}
{'7.0000000e+00'}    {'3.7327720e+02'}    {'2.2638295e+03'}
{'7.5000000e+00'}    {'3.6629421e+02'}    {'2.2215880e+03'}
{'8.0000000e+00'}    {'3.5957346e+02'}    {'2.1807905e+03'}
T1(end-1:end,:)
ans = 2×3 table
      Var1                 Var2                 Var3
_________________    _________________    _________________
{'2.8682373e+00'}    {'4.1181815e+02'}    {0×0 char       }
{'2.6080322e+00'}    {0×0 char       }    {'2.5476604e+03'}
missing = varfun(@ismissing, T1);
missing(end-1:end,:)
ans = 2×3 table
ismissing_Var1    ismissing_Var2    ismissing_Var3
______________    ______________    ______________
    false             false             true
    false             true              false
T2 = varfun(@(x)fillmissing(x,'constant',{'NaN'}), T1)
T2 = 24×3 table
    Fun_Var1             Fun_Var2             Fun_Var3
_________________    _________________    _________________
{'5.0000000e-01'}    {'3.4497665e+02'}    {'2.1712214e+03'}
{'1.0000000e+00'}    {'3.8184070e+02'}    {'2.3923590e+03'}
{'1.5000000e+00'}    {'3.9916530e+02'}    {'2.4919967e+03'}
{'2.0000000e+00'}    {'4.0763144e+02'}    {'2.5342191e+03'}
{'2.5000000e+00'}    {'4.1118036e+02'}    {'2.5473084e+03'}
{'3.0000000e+00'}    {'4.1174642e+02'}    {'2.5434127e+03'}
{'3.5000000e+00'}    {'4.1034564e+02'}    {'2.5274253e+03'}
{'4.0000000e+00'}    {'4.0753190e+02'}    {'2.5019786e+03'}
{'4.5000000e+00'}    {'4.0361995e+02'}    {'2.4695291e+03'}
{'5.0000000e+00'}    {'3.9881033e+02'}    {'2.4323257e+03'}
{'5.5000000e+00'}    {'3.9323845e+02'}    {'2.3920574e+03'}
{'6.0000000e+00'}    {'3.8701946e+02'}    {'2.3499201e+03'}
{'6.5000000e+00'}    {'3.8029476e+02'}    {'2.3068784e+03'}
{'7.0000000e+00'}    {'3.7327720e+02'}    {'2.2638295e+03'}
{'7.5000000e+00'}    {'3.6629421e+02'}    {'2.2215880e+03'}
{'8.0000000e+00'}    {'3.5957346e+02'}    {'2.1807905e+03'}
T2(end-1:end,:)
ans = 2×3 table
    Fun_Var1             Fun_Var2             Fun_Var3
_________________    _________________    _________________
{'2.8682373e+00'}    {'4.1181815e+02'}    {'NaN'          }
{'2.6080322e+00'}    {'NaN'          }    {'2.5476604e+03'}
format long
A = str2double(table2array(T2))
A = 24×3
1.0e+03 *
0.000500000000000 0.344976650000000 2.171221400000000
0.001000000000000 0.381840700000000 2.392359000000000
0.001500000000000 0.399165300000000 2.491996700000000
0.002000000000000 0.407631440000000 2.534219100000000
0.002500000000000 0.411180360000000 2.547308400000000
0.003000000000000 0.411746420000000 2.543412700000000
0.003500000000000 0.410345640000000 2.527425300000000
0.004000000000000 0.407531900000000 2.501978600000000
0.004500000000000 0.403619950000000 2.469529100000000
0.005000000000000 0.398810330000000 2.432325700000000
A(end-1:end,:)
ans = 2×3
1.0e+03 *
0.002868237300000 0.411818150000000 NaN
0.002608032200000 NaN 2.547660400000000
... and we have liftoff!
(It took a few minutes for me to surf the fixedWidthImportOptions documentation because I don’t use it very often, then experiment until I got the result I wanted.)
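As a side note on the final conversion step, str2double returns NaN for any cell it cannot convert, including an empty character vector, so the blanks should end up as NaN in the numeric matrix either way. A minimal sketch with made-up values (the variable names are only for illustration):

```matlab
% Hypothetical cell array mimicking one table row: a number stored as
% text, an empty field, and a field already filled with the text 'NaN'.
row = {'5.0000000e-01', '', 'NaN'};

% str2double converts element by element; anything that is not a valid
% number (including an empty char) becomes NaN.
vals = str2double(row);
```

Filling the empty cells with 'NaN' first, as above, still makes the intermediate table easier to read.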
That was very kind of you.
It would perhaps be better to insert NaNs for blanks.
I can see you are checking for the end of file as things proceed through the 'header' and 'footer', and breaking when an empty matrix appears.
One more question, if I may?
In layman's terms, what exactly is fseek(fidi, 0, 0) doing?
Thank you.
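The fseek(fidi, 0, 0) call sets the file position to an offset of 0 bytes from the current position (an origin of 0 means 'cof', the current position), so it leaves the pointer where textscan stopped at the non-numeric text. The 'HeaderLines',1 option in the next textscan call is what skips past the line of asterisks.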
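One way to check what an fseek call does to the file position is to compare ftell before and after it. The sketch below does that on a small hypothetical file (file name and contents are made up for illustration), using the same fseek arguments as in the loop:

```matlab
% Small hypothetical file: one data row, an asterisk line, one more row.
fname = [tempname '.txt'];
fid = fopen(fname, 'wt');
fprintf(fid, '%s\n', '1 2 3', '*****', '4 5 6');
fclose(fid);

fid = fopen(fname, 'rt');
C = textscan(fid, '%f%f%f', 'CollectOutput',true);  % stops at the asterisks
posBefore = ftell(fid);        % byte position where textscan stopped
status = fseek(fid, 0, 0);     % offset 0 from origin 0 ('cof', current position)
posAfter = ftell(fid);         % byte position after the fseek call
fclose(fid);
delete(fname);
```

A status of 0 means the call succeeded, and comparing posBefore with posAfter shows whether the pointer moved.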
If you have more of these files, use readtable with fixedWidthImportOptions, as I did here, in the last part of my previous Comment. That includes inserting the NaN value in the blanks.
Reprising that section here —
opts = fixedWidthImportOptions('NumVariables',3, 'VariableWidths',[15 16 16], 'DataLines',[4 25; 27 28]);
T1 = readtable('data.txt', opts)
T1 = 24×3 table
      Var1                 Var2                 Var3
_________________    _________________    _________________
{'5.0000000e-01'}    {'3.4497665e+02'}    {'2.1712214e+03'}
{'1.0000000e+00'}    {'3.8184070e+02'}    {'2.3923590e+03'}
{'1.5000000e+00'}    {'3.9916530e+02'}    {'2.4919967e+03'}
{'2.0000000e+00'}    {'4.0763144e+02'}    {'2.5342191e+03'}
{'2.5000000e+00'}    {'4.1118036e+02'}    {'2.5473084e+03'}
{'3.0000000e+00'}    {'4.1174642e+02'}    {'2.5434127e+03'}
{'3.5000000e+00'}    {'4.1034564e+02'}    {'2.5274253e+03'}
{'4.0000000e+00'}    {'4.0753190e+02'}    {'2.5019786e+03'}
{'4.5000000e+00'}    {'4.0361995e+02'}    {'2.4695291e+03'}
{'5.0000000e+00'}    {'3.9881033e+02'}    {'2.4323257e+03'}
{'5.5000000e+00'}    {'3.9323845e+02'}    {'2.3920574e+03'}
{'6.0000000e+00'}    {'3.8701946e+02'}    {'2.3499201e+03'}
{'6.5000000e+00'}    {'3.8029476e+02'}    {'2.3068784e+03'}
{'7.0000000e+00'}    {'3.7327720e+02'}    {'2.2638295e+03'}
{'7.5000000e+00'}    {'3.6629421e+02'}    {'2.2215880e+03'}
{'8.0000000e+00'}    {'3.5957346e+02'}    {'2.1807905e+03'}
T1(end-1:end,:)
ans = 2×3 table
      Var1                 Var2                 Var3
_________________    _________________    _________________
{'2.8682373e+00'}    {'4.1181815e+02'}    {0×0 char       }
{'2.6080322e+00'}    {0×0 char       }    {'2.5476604e+03'}
% missing = varfun(@ismissing, T1);
% missing(end-1:end,:)
T2 = varfun(@(x)fillmissing(x,'constant',{'NaN'}), T1)
T2 = 24×3 table
    Fun_Var1             Fun_Var2             Fun_Var3
_________________    _________________    _________________
{'5.0000000e-01'}    {'3.4497665e+02'}    {'2.1712214e+03'}
{'1.0000000e+00'}    {'3.8184070e+02'}    {'2.3923590e+03'}
{'1.5000000e+00'}    {'3.9916530e+02'}    {'2.4919967e+03'}
{'2.0000000e+00'}    {'4.0763144e+02'}    {'2.5342191e+03'}
{'2.5000000e+00'}    {'4.1118036e+02'}    {'2.5473084e+03'}
{'3.0000000e+00'}    {'4.1174642e+02'}    {'2.5434127e+03'}
{'3.5000000e+00'}    {'4.1034564e+02'}    {'2.5274253e+03'}
{'4.0000000e+00'}    {'4.0753190e+02'}    {'2.5019786e+03'}
{'4.5000000e+00'}    {'4.0361995e+02'}    {'2.4695291e+03'}
{'5.0000000e+00'}    {'3.9881033e+02'}    {'2.4323257e+03'}
{'5.5000000e+00'}    {'3.9323845e+02'}    {'2.3920574e+03'}
{'6.0000000e+00'}    {'3.8701946e+02'}    {'2.3499201e+03'}
{'6.5000000e+00'}    {'3.8029476e+02'}    {'2.3068784e+03'}
{'7.0000000e+00'}    {'3.7327720e+02'}    {'2.2638295e+03'}
{'7.5000000e+00'}    {'3.6629421e+02'}    {'2.2215880e+03'}
{'8.0000000e+00'}    {'3.5957346e+02'}    {'2.1807905e+03'}
T2(end-1:end,:)
ans = 2×3 table
    Fun_Var1             Fun_Var2             Fun_Var3
_________________    _________________    _________________
{'2.8682373e+00'}    {'4.1181815e+02'}    {'NaN'          }
{'2.6080322e+00'}    {'NaN'          }    {'2.5476604e+03'}
format longG
A = str2double(table2array(T2))
A = 24×3
1.0e+00 *
0.5 344.97665 2171.2214
1 381.8407 2392.359
1.5 399.1653 2491.9967
2 407.63144 2534.2191
2.5 411.18036 2547.3084
3 411.74642 2543.4127
3.5 410.34564 2527.4253
4 407.5319 2501.9786
4.5 403.61995 2469.5291
5 398.81033 2432.3257
A(end-1:end,:)
ans = 2×3
1.0e+00 *
2.8682373 411.81815 NaN
2.6080322 NaN 2547.6604
You would need to manually keep track of the number of lines after the asterisks line (here the last two), and make appropriate changes to the 'NumVariables', 'VariableWidths', and 'DataLines' name-value pairs in the fixedWidthImportOptions call. That should be straightforward.
I no longer have access to R2017a, and while there used to be online documentation for a few previous releases, that no longer appears to be an option. Everything I have listed here should be available in, and compatible with, R2017a, but I cannot check to be certain.
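If counting the lines by hand becomes tedious, the 'DataLines' ranges could be worked out from the file itself. The following is only a sketch under stated assumptions: a 3-line header, a single footer separator line that begins with an asterisk, and a small hypothetical file written here for illustration. The names nHeader, kStar and dataLines are not from the code above.

```matlab
% Write a small hypothetical file with the same layout as the example:
% 3 header lines, 3 data rows, an asterisk line, 2 footer rows.
fname = [tempname '.txt'];
fid = fopen(fname, 'wt');
fprintf(fid, '%s\n', 'ld dim real', 'ratio ht speed', '[] [s] [m/s]', ...
    '5.0e-01 3.4e+02 2.1e+03', ...
    '1.0e+00 3.8e+02 2.3e+03', ...
    '1.5e+00 3.9e+02 2.4e+03', ...
    '**********', ...
    '2.8e+00 4.1e+02', ...
    '2.6e+00 2.5e+03');
fclose(fid);

% Split the file into lines and drop empty trailing lines.
txt = fileread(fname);
lines = regexp(txt, '\r?\n', 'split');
while ~isempty(lines) && isempty(strtrim(lines{end}))
    lines(end) = [];
end
delete(fname);

% Locate the first line that starts with an asterisk (assumed separator).
nHeader = 3;
kStar = find(strncmp(strtrim(lines), '*', 1), 1, 'first');

% Rows of [first last] line numbers: data block, then footer block.
dataLines = [nHeader+1, kStar-1; kStar+1, numel(lines)];
```

The resulting dataLines matrix has the same two-row shape as the [4 25; 27 28] value passed as 'DataLines' earlier, so it could be supplied in its place.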


Release

R2017a
