How to make a small change to a big text file efficiently
Hi, I have a large plain-text file (5000 lines) with mixed content: data, strings, and symbols. I want to change the data in only two locations.
e.g. data file content:
line 1-1000th ......
line 1001th ABCD
line 1002th 3 5 7
line till 5000th ......
What I know how to do is first read the file line by line:
tline = fgetl(fid); data01{i} = tline;
then find the line after "ABCD" and replace it with my new data:
data01{1002} = [9 0 0]; then write data01 line by line into the new file.
However, this takes too long because every line has to be read, and I have a lot of files to work with.
Is there any faster way to do this? Any comment or hint will be appreciated! Thank you!
/Pengfei
Accepted Answer
More Answers (2)
Walter Roberson
on 14 May 2012
1 vote
If your new data is exactly the same length as the old, not a single character difference, then:
Use the 'rt+' permission when you fopen() the file (this is important)
fgetl() as many times as you need to skip over the data you wish to leave the same.
Before you read in the line that is to be changed, use ftell() and record the value.
fgetl() the line you will be changing. Compute the new line as a string: it must be exactly the same length as the existing line. Warning: 'ABCD' is not the same length as '9 0 0'!
fseek() on the file, relative to the beginning of the file, with "offset" the value you got from ftell(). This will reposition you to the beginning of the line you wish to change.
fprintf(fid, '%s\n', TheNewLine)
fseek() on the file, 0 bytes relative to your current position. This is needed in order to switch from writing mode to reading mode.
You are now positioned to the beginning of the line after the one you changed, and can fgetl() or whatever is needed to change the second line.
After changing the last line you need to change, you can fclose().
WARNING: if anything goes wrong your file is likely to be ruined.
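The steps above can be sketched as a small function. This is only an illustration under stated assumptions: the name replaceLineAfterMarker and its arguments are hypothetical, the marker is assumed to sit alone on its line, and the sketch writes only the characters of the new line (leaving the existing line terminator untouched) instead of using '%s\n'. Try it on a copy of the file first.

```matlab
function replaceLineAfterMarker(filename, marker, newLine)
% Overwrite in place the line following the first line that equals MARKER.
% NEWLINE must contain exactly as many characters as the line it replaces.
% (Function name and arguments are hypothetical, for illustration only.)
fid = fopen(filename, 'rt+');          % read/update access, as in the steps above
if fid == -1
    error('Could not open file %s.', filename);
end
closer = onCleanup(@() fclose(fid));   % close the file even if an error occurs

tline = fgetl(fid);
while ischar(tline) && ~strcmp(strtrim(tline), marker)
    tline = fgetl(fid);                % skip the lines that stay unchanged
end
if ~ischar(tline)
    error('Marker %s not found.', marker);
end

pos = ftell(fid);                      % byte offset of the line to change
oldLine = fgetl(fid);
if ~ischar(oldLine) || numel(oldLine) ~= numel(newLine)
    error('The new line must have the same length as the old one.');
end

fseek(fid, pos, 'bof');                % back to the start of that line
fprintf(fid, '%s', newLine);           % overwrite the characters only
fseek(fid, 0, 'cof');                  % needed before switching back to reading
end
```

For the file in the question, the call would be replaceLineAfterMarker('data.txt', 'ABCD', '9 0 0'), where 'data.txt' is a placeholder file name. Refusing to write when the lengths differ is a deliberate choice: it avoids overwriting part of the following line.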
Please consider rewriting the whole operation in Perl. Perl is provided with MATLAB and is designed for efficiency in these kinds of operations.
2 Comments
Pengfei
on 14 May 2012
Walter Roberson
on 14 May 2012
Using %.3f would be safer; otherwise you do not know how many decimal places it is going to emit. Safer yet is to use a string,
NewValue = '60.865';
and
fprintf(fid,'%s\n', NewValue);
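As a small illustration of the point about decimal places, the new text can be built with a fixed width and precision and then checked against the old line before anything is written. The values of oldLine and value below are hypothetical; only the '60.865' string comes from the comment above.

```matlab
oldLine = '12.345';                  % hypothetical text currently in the file
value   = 60.865;                    % hypothetical new number

% '%*.3f' takes the field width from the argument list: pad to the old
% line's width and always emit exactly three decimal places.
newLine = sprintf('%*.3f', numel(oldLine), value);

% Refuse to continue if the formatted text does not fit the old line exactly.
if numel(newLine) ~= numel(oldLine)
    error('Formatted value does not match the length of the old line.');
end
```

If the number needs more characters than the old line provides, sprintf widens the field and the length check fails, so nothing is written.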
You are not using feof() properly. Please see http://www.mathworks.com/matlabcentral/answers/21210-arrarys
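The loop being criticised is not shown in this thread, so the following is only one common pattern, not a correction of that specific code: test the value returned by fgetl() instead of calling feof() before each read, since fgetl() returns -1 (a non-character value) once there is nothing left to read. The function name readAllLines is hypothetical.

```matlab
function lines = readAllLines(filename)
% Read every line of a text file into a cell array of character vectors.
% (Hypothetical helper, for illustration only.)
fid = fopen(filename, 'rt');
if fid == -1
    error('Could not open file %s.', filename);
end
closer = onCleanup(@() fclose(fid));   % close the file even if an error occurs

lines = {};
tline = fgetl(fid);
while ischar(tline)                    % fgetl returns -1 at end of file
    lines{end+1} = tline;              %#ok<AGROW>
    tline = fgetl(fid);
end
end
```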
After you do the fprintf() you have to fseek() to the current location (0 bytes from 'cof', the current position) to switch back to reading mode.
The fseek() / ftell() requirements are not something that is obvious. They have to do with the standards about how reading and writing files actually works. Each time you switch between reading and writing (or writing and reading), you need to fseek(), even if it is an fseek() that leaves you in the same position.
You could replace the ftell() / fseek() that you have in your code with fseek(fid, 0, 'cof').
Titus Edelhofer
on 14 May 2012
0 votes
Hi,
if it is the same length, I would suggest using memmapfile: open the file with memmapfile, use e.g. strfind to find where you want to replace something ('ABCD'), then just overwrite the data and close the file. Open the file with write permissions, though ;-).
Titus
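A minimal sketch of the memmapfile suggestion, under stated assumptions: the function name replaceBytesAfterMarker is hypothetical, the file is assumed to be single-byte text, and the line to change is taken to be the one following the first occurrence of the marker. Try it on a copy of the file first.

```matlab
function replaceBytesAfterMarker(filename, marker, newText)
% Overwrite in place the line that follows the first occurrence of MARKER,
% using a writable memory map. NEWTEXT must match the old line's length.
% (Function name and arguments are hypothetical, for illustration only.)
m = memmapfile(filename, 'Writable', true, 'Format', 'uint8');
txt = char(m.Data');                          % whole file as one row of characters
LF = char(10);

k = strfind(txt, marker);
if isempty(k)
    error('Marker %s not found.', marker);
end
lf1 = k(1) - 1 + find(txt(k(1):end) == LF, 1);   % end of the marker line
if isempty(lf1) || lf1 == numel(txt)
    error('No line follows the marker.');
end
start = lf1 + 1;                              % first byte of the line to change
stop = lf1 + find(txt(start:end) == LF, 1) - 1;  % last byte before the next LF
if isempty(stop)
    stop = numel(txt);                        % last line has no terminator
end
if stop >= start && txt(stop) == char(13)
    stop = stop - 1;                          % keep the CR of a CR/LF ending
end
if stop - start + 1 ~= numel(newText)
    error('The new text must have the same length as the old line.');
end
m.Data(start:stop) = uint8(newText(:));       % write through the map
clear m                                       % release the map
end
```

For the file in the question, the call would be replaceBytesAfterMarker('data.txt', 'ABCD', '9 0 0'), with 'data.txt' as a placeholder file name.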