fprintf and fscanf same format fail to read file. appreciate help

Question

0 votes

I wrote a text file on a Windows PC. (A is a large matrix of size [29909961, 4].)

fid=fopen(filename,'w')
fprintf(fid,'%5i%5i   %17.4E%17.4E\r\n', A);
fclose(fid)

But fscanf fails to read it back in. I have tried a few different options, without success:

fid=fopen(filename,'r')
K51=fscanf(fid,'%5i%5i   %17.4E%17.4E\r\n');
%K51=fscanf(fid,'%5i%5i%20.4E%17.4E%\r%*c%\n%*c');
fclose(fid)

appreciate the help!


Answer 1

dpb on 7 Feb 2020

Edited: dpb on 8 Feb 2020


1 vote

MATLAB is NOT Fortran.

As Stephen notes above and is documented, fscanf returns a single array of a given class; if there are any doubles in the input file, then the output returned has to be double.

What is also documented but you have to read every stinkin' line through to the end to find it is:

"Format specifiers for the reading functions sscanf and fscanf differ from the formats for the writing functions sprintf and fprintf. The reading functions do not support a precision field."

So, the thing that is breaking your input read is the ".4" in the e-format fields in your format string. It simply isn't supported by MATLAB. (I don't know if this is true of the underlying C fscanf or is only a limitation in the TMW-supplied vectorized versions in MATLAB.)
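As a minimal sketch of that point, the round trip below keeps the field widths on input but drops the ".4" precision. It uses sprintf/sscanf on an in-memory string instead of a file so it is self-contained; the sample values in A are made up for illustration and are not from the original data.

```matlab
% Hypothetical sample data: two integer columns, two floating-point columns.
A = [1 2 0.5 1234.5; 3 4 -0.25 6.0e-3];

% Write with width AND precision; transpose so each row of A becomes one line.
txt = sprintf('%5i%5i   %17.4E%17.4E\r\n', A.');

% Read with width only: the reading functions do not accept a precision field.
B = sscanf(txt, '%5d%5d%17e%17e', [4, Inf]).';

disp(B)
```

The [4, Inf] size argument fills a 4-row array column by column, so the final transpose restores the original row layout.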

To read all numeric data you don't need the specific format and are better off without it...just use '%f'

>> A=[randi(100,20,2) rand(20,2)];   % dummy data set
>> fid=fopen('jorgen.dat','w');
>> fprintf(fid,'%5i %5i   %17.4E %17.4E\n', A.');
>> fid=fclose(fid);
>> fid=fopen('jorgen.dat','r');
>> fscanf(fid,'%f',[4,inf]).'
ans =
0000   17.0000    0.1066    0.8530
0000   80.0000    0.9619    0.6221
0000   32.0000    0.0046    0.3509
0000   53.0000    0.7749    0.5132
0000   17.0000    0.8173    0.4018
0000   61.0000    0.8687    0.0760
0000   27.0000    0.0844    0.2399
0000   66.0000    0.3998    0.1233
0000   69.0000    0.2599    0.1839
0000   75.0000    0.8001    0.2399
0000   46.0000    0.4314    0.4173
0000    9.0000    0.9106    0.0497
0000   23.0000    0.1819    0.9027
0000   92.0000    0.2638    0.9448
0000   16.0000    0.1455    0.4909
0000   83.0000    0.1361    0.4893
0000   54.0000    0.8693    0.3377
0000  100.0000    0.5797    0.9001
0000    8.0000    0.5499    0.3693
0000   45.0000    0.1449    0.1112
>> fid=fclose(fid);
>> type jorgen.dat  % to compare
 17          1.0665E-01       8.5303E-01
 80          9.6190E-01       6.2206E-01
 32          4.6342E-03       3.5095E-01
 53          7.7491E-01       5.1325E-01
 17          8.1730E-01       4.0181E-01
 61          8.6869E-01       7.5967E-02
 27          8.4436E-02       2.3992E-01
 66          3.9978E-01       1.2332E-01
 69          2.5987E-01       1.8391E-01
 75          8.0007E-01       2.3995E-01
 46          4.3141E-01       4.1727E-01
  9          9.1065E-01       4.9654E-02
 23          1.8185E-01       9.0272E-01
 92          2.6380E-01       9.4479E-01
 16          1.4554E-01       4.9086E-01
 83          1.3607E-01       4.8925E-01
 54          8.6929E-01       3.3772E-01
100          5.7970E-01       9.0005E-01
  8          5.4986E-01       3.6925E-01
 45          1.4495E-01       1.1120E-01
>> 

Even simpler is the now sadly deprecated textread --

>> textread('jorgen.dat')
ans =
0000   17.0000    0.1066    0.8530
0000   80.0000    0.9619    0.6221
0000   32.0000    0.0046    0.3509
0000   53.0000    0.7749    0.5132
0000   17.0000    0.8173    0.4018
0000   61.0000    0.8687    0.0760
0000   27.0000    0.0844    0.2399
0000   66.0000    0.3998    0.1233
0000   69.0000    0.2599    0.1839
0000   75.0000    0.8001    0.2399
0000   46.0000    0.4314    0.4173
0000    9.0000    0.9106    0.0497
0000   23.0000    0.1819    0.9027
0000   92.0000    0.2638    0.9448
0000   16.0000    0.1455    0.4909
0000   83.0000    0.1361    0.4893
0000   54.0000    0.8693    0.3377
0000  100.0000    0.5797    0.9001
0000    8.0000    0.5499    0.3693
0000   45.0000    0.1449    0.1112
>> 

It saves messing around with file handles and format strings entirely for all-numeric data arrays. textscan can do the same, except that you have to use a file handle and then convert the returned cell array to a double array manually.
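The textscan route might look roughly like the following. This is a sketch under assumptions: the scratch file name comes from tempname and the sample matrix is invented for the example; only fopen, fprintf, textscan, fclose and delete are used.

```matlab
% Hypothetical scratch file and sample data, for illustration only.
fname = [tempname '.dat'];
A = [1 2 0.5 1234.5; 3 4 -0.25 6.0e-3];

% Write one row of A per line.
fid = fopen(fname, 'w');
fprintf(fid, '%5i %5i   %17.4E %17.4E\n', A.');
fclose(fid);

% textscan returns a 1x4 cell array, one column vector per format specifier.
fid = fopen(fname, 'r');
C = textscan(fid, '%f %f %f %f');
fclose(fid);

% Concatenate the cell contents into an ordinary double matrix.
M = [C{:}];
delete(fname);
```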

4 Comments

Jörgen on 7 Feb 2020

Thank you very much. I see that I should have spent more time on RTFM. Then the problem is solved!

dpb on 7 Feb 2020

Edited: dpb on 8 Feb 2020

The limitation certainly ought to be highlighted in the section on input arguments for the format string at a minimum, instead of being stuck down at the bottom as a "Tip" :(

It seems like it ought to also generate an error, or at least a warning....


Answer 2

dpb on 7 Feb 2020


0 votes

fprintf(fid,'%5i%5i %17.4E%17.4E\r\n', A);

will write A in column-major order, not row-major as you're trying to read it in later. Use

fprintf(fid,'%5i%5i %17.4E%17.4E\n', A.')
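A small illustration of the ordering issue, using sprintf on a made-up 3-by-2 matrix so the result can be inspected without a file:

```matlab
% Hypothetical 3x2 matrix; the intent is one row per output line.
A = [1 2; 3 4; 5 6];

% fprintf/sprintf consume the elements in column-major order: 1 3 5 2 4 6.
wrong = sprintf('%d %d\n', A);

% Transposing first makes the column-major traversal follow the rows of A.
right = sprintf('%d %d\n', A.');

disp(wrong)   % lines: "1 3", "5 2", "4 6"
disp(right)   % lines: "1 2", "3 4", "5 6"
```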

6 Comments

dpb on 8 Feb 2020

Edited: dpb on 8 Feb 2020


"Your fprintf format has a potential bug because you did not include any delimiters between the numbers. I strongly recommend adding spaces to ensure that the numbers do not end up merging together"

That's even more strongly fraught with danger--on write, extra white space is transmitted verbatim to the output line. If building a fixed-width format field for a Fortran application, as appears might be the object here, that could be a fatal error.

Examine the following:

>> length(sprintf('%5i %5i',1,1))
ans =
    11
>> length(sprintf('%5i%5i',1,1))
ans =
    10
>> 

Off the top of my head, I don't know what the C Standard says about embedded blanks in a format string on input, nor whether MATLAB does anything unique in its implementation; I haven't tried testing it.

C formatted i/o is a real bugger as implemented--would have been much nicer if TMW had stayed with the roots of MATLAB and FORTRAN and implemented it via emulating FORMAT instead--we would then have field repeat specifiers, complex variables and all kinds of other niceties lacking the way they chose (the easy way out from a programming/development standpoint, I'm sure, though).

Stephen23 on 9 Feb 2020

Edited: Stephen23 on 9 Feb 2020

"That's even more strongly fraught with danger--on write extra white space is transmitted verbatim to the output line. If building a fixed-width format field for Fortran application as appears might be the object here, that could be a fatal error."

I presumed that the author of the code would reduce the fieldwidth by one to compensate.

Are these two exactly equivalent? No, of course not, but given the differences between FORTRAN and C-style parsing there is no 100% equivalence. But with a bit of flexibility, it is possible to define file formats that write/read robustly using both.

dpb on 9 Feb 2020

Edited: dpb on 9 Feb 2020

The big problem though and the reason for making the observation is that often the Fortran is legacy code that can't be modified and that does indeed use fixed format READ statements.

Since input parsing with Fortran FORMAT field widths counts every character, the white space can cause silent errors in input interpretation.

If one is building both at the same time then yes, one can do much better in designing the input format and also in the coding; using '*' list-directed input instead of fixed formats is one of the best options for inputs that users modify by hand. But for automated data transfer between the two, it's more reliable to just use a format identical to that of the application; even if the two I5 fields "run together" visually, the Fortran application will have no problem reading them correctly, despite the difficulty that causes when inspecting the input file by eye.

Most often, the input format will have been set up with the expectation that the input for a field will never have as many significant digits as the field width, so in practice it won't happen even if it could.
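For checking such a file from the MATLAB side, a rough sketch of reading two run-together I5 fields follows. The record values are invented for the example. Slicing by character position mimics the column counting of a fixed-format READ most directly; the width-limited sscanf call is an alternative that, as far as I know, caps each field at five characters but skips leading blanks before counting, so it is not a column-exact parser.

```matlab
% Hypothetical record: two full-width I5 fields with no separator.
rec = sprintf('%5i%5i', 12345, 67890);   % '1234567890'

% Column-position parsing, analogous to a fixed-format read of two I5 fields.
a = str2double(rec(1:5));
b = str2double(rec(6:10));

% Width-limited read: each %5d takes at most five characters.
v = sscanf(rec, '%5d%5d');

disp([a b])
disp(v.')
```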

ADDENDUM:

Also, the comment was intended to ensure the OP was aware of the difference between the two. His format string might itself be causing a problem he wasn't aware of: blanks in the ML/C-style formatting string are significant, whereas blanks embedded in a Fortran FORMAT statement are insignificant.


Answer 3

Jörgen on 7 Feb 2020

0 votes

The fprintf function does what it should do. We have read the resulting text file with a Fortran executable using the same format spec, and that worked fine. But I wanted to check some numbers and read it with fscanf to speed up reading compared to MATLAB's import function. Unfortunately, I think I will have to spend time on this another time. Thank you for your answers though.

