fprintf and fscanf with the same format fail to read the file back. Appreciate help
I wrote a text file on a Windows PC. (A is a large matrix of size [29909961, 4].)
fid=fopen(filename,'w')
fprintf(fid,'%5i%5i %17.4E%17.4E\r\n', A);
fclose(fid)
But fscanf fails to read it back in. I have tried a few different options, without success:
fid=fopen(filename,'r')
K51=fscanf(fid,'%5i%5i %17.4E%17.4E\r\n');
%K51=fscanf(fid,'%5i%5i%20.4E%17.4E%\r%*c%\n%*c');
fclose(fid)
appreciate the help!
Accepted Answer
dpb
on 7 Feb 2020
fprintf(fid,'%5i%5i %17.4E%17.4E\r\n', A);
will write A in column-major order, not row-major as you're trying to read it in later. Use
fprintf(fid,'%5i%5i %17.4E%17.4E\n', A.')
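A minimal sketch of the ordering issue, using a small hypothetical integer matrix and sprintf so the result can be inspected without writing a file. The matrix and the simplified format are assumptions for illustration only:

```matlab
% Hypothetical 2-by-4 matrix standing in for the large A
A = [1 2 3 4; 5 6 7 8];

% Passing A directly: elements are consumed column by column
byColumn = sprintf('%d %d %d %d\n', A);

% Passing the transpose: each output line is one row of A
byRow = sprintf('%d %d %d %d\n', A.');

disp(byColumn)   % first line is "1 5 2 6"
disp(byRow)      % first line is "1 2 3 4"
```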
6 Comments
Stephen23
on 7 Feb 2020
Thank you for the answers. Sorry, I did transpose the matrix using A'; I just forgot to include the ' when posting the question here. The file is written to disk correctly (checked by opening it in Notepad++). I wonder if it is a problem with reading first 2 integers and then 2 floating point values from each line. Do I need to declare the matrix for reading in values to "double" format before using the fscanf? Still did not succeed, not even with the 'rt' option in fopen.
Stephen23
on 8 Feb 2020
"I wonder if it is a problem with reading first 2 integers and then 2 floating point values from each line."
Just like array A has one numeric type (being one numeric array), so will the output of fscanf have one class: for mixed integer/floating formats the output will be double. This is explained in the fscanf documentation.
"Do I need to declare the matrix for reading in values to "double" format before using the fscanf?"
No, the class of the output is determined solely by the fscanf format string.
"Still did not succeed, not even with the 'rt' option in fopen."
Your fprintf format has a potential bug because you did not include any delimiters between the numbers. I strongly recommend adding spaces to ensure that the numbers do not end up merging together.
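As a rough illustration of the points above, the following sketch round-trips a small hypothetical matrix through the original output format, using sprintf and sscanf in place of fprintf and fscanf so that no file is needed. It assumes every field is separated from its neighbor by at least one blank, so a plain '%f' with a size argument of [4 Inf] is enough on input. This is one possible way to read the data back, not necessarily what resolved the original problem.

```matlab
% Hypothetical 2-by-4 matrix: two integer columns, two floating-point columns
A = [1 2 1.5 2.5; 3 4 3.5 4.5];

% Write row by row (note the transpose), same format as in the question
txt = sprintf('%5i%5i %17.4E%17.4E\n', A.');

% Read all numbers back; the result is filled column by column as 4-by-N
B = sscanf(txt, '%f', [4 Inf]);
B = B.';                 % transpose to recover the N-by-4 layout

disp(class(B))           % the output is a single double array
disp(isequal(A, B))      % true for these exactly representable values
```

If an integer ever fills all five columns of its field, adjacent numbers run together and this whitespace-based read no longer applies.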
"Your fprintf format has a potential bug because you did not include any delimiters between the numbers. I strongly recommend adding spaces to ensure that the numbers do not end up merging together."
That's even more strongly fraught with danger--on write, extra white space is transmitted verbatim to the output line. If the goal is to build a fixed-width record for a Fortran application, as appears might be the case here, that could be a fatal error.
Examine the following:
>> length(sprintf('%5i %5i',1,1))
ans =
11
>> length(sprintf('%5i%5i',1,1))
ans =
10
>>
Off the top of my head, I don't know what the C Standard says about embedded blanks in a format string on input, nor whether MATLAB does anything unique in its implementation, and I haven't tried testing it.
C formatted i/o is a real bugger as implemented--would have been much nicer if TMW had stayed with the roots of MATLAB and FORTRAN and implemented it via emulating FORMAT instead--we would then have field repeat specifiers, complex variables and all kinds of other niceties lacking the way they chose (the easy way out from a programming/development standpoint, I'm sure, though).
Walter Roberson
on 8 Feb 2020
In C, the width is supposed to be taken into account and at most that many characters are to be read. However, any whitespace or invalid character for number forming interrupts the reading and leaves the current position there, so this cannot reliably be used for fixed width that might not be perfect (and I would need to test how leading space is handled.)
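A small sketch of the width behavior described above, using sscanf on a string in which two five-digit integers run together. The values are made up, and the case of leading blanks inside a field is deliberately not covered here, since how those are counted is the open question:

```matlab
% Two five-digit values written with no delimiter between them
txt = sprintf('%5i%5i', 12345, 67890);

% A field width on input limits how many characters each conversion consumes
v = sscanf(txt, '%5d%5d');

disp(txt)
disp(v.')
```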
"That's even more strongly fraught with danger--on write extra white space is transmitted verbatim to the output line. If building a fixed-width format field for Fortran application as appears might be the object here, that could be a fatal error."
I presumed that the author of the code would reduce the fieldwidth by one to compensate.
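For instance, a sketch of that compensation, shrinking the second field from 5 to 4 so that the added blank keeps the record at ten characters (the values are arbitrary):

```matlab
% Second field narrowed by one to make room for the explicit blank
withBlank = sprintf('%5i %4i', 1, 1);

% Original layout with two adjacent five-character fields
noBlank = sprintf('%5i%5i', 1, 1);

disp(length(withBlank))
disp(length(noBlank))
```

Both records have the same length, but the second value can then hold at most four characters before it overflows its field.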
Are these two exactly equivalent? No, of course not, but given the differences between FORTRAN and C-style parsing there is no 100% equivalence. But with a bit of flexibility, it is possible to define file formats that write/read robustly using both.
The big problem though and the reason for making the observation is that often the Fortran is legacy code that can't be modified and that does indeed use fixed format READ statements.
Since input parsing with Fortran FORMAT field widths counts every character, the white space can cause silent errors in input interpretation.
If one is building both at the same time then yes, one can do much better in designing input formatting and also in coding; using '*' list-directed input instead of fixed formats is one of the best options for inputs that users modify by hand. But for automated data transfer between the two, it's more reliable to use an output format identical to the input format of the application; even if the two I5 fields "run together", the Fortran application will have no problem reading them correctly, however awkward that makes inspecting the input file by eye.
Frequently/Most(?) often, the input format will have been set up with the expectation the input for that field will never have as many significant digits as the field width so in practice it won't happen even if it could.
ADDENDUM:
Also, the comment was intended to ensure the OP was aware of the difference between the two if doing so as his FORMAT statement itself might be causing a problem he wasn't actually aware of in that the blanks in the ML/C-style formatting string are significant whereas embedded in a FORMAT statement blanks are insignificant.
Jörgen
on 7 Feb 2020