
How to write an array to a pre-defined column in an existing .csv file

I'm using writematrix() to write to a .csv file. The file is pre-defined in that it already has column headers. I would like to locate a specific column by name and then write an array to that column. I've read through the help section and don't see that this is possible. What is the solution?

Answers (1)

Walter Roberson on 24 Sep 2020
You have to read in the existing file in full, add the new data to the in-memory version of the file, and write the complete file out again.
The easiest approach would probably be readtable(), then creating a new variable in the table with the appropriate name, and then writetable().
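A minimal sketch of that read-modify-write workflow; the filename "mydata.csv" and the column name "Pressure" are placeholders for your actual file and header:

```matlab
% Read the entire existing file into a table (column headers become
% table variable names).
T = readtable('mydata.csv');

% The array to write; it must have one value per row of the table.
newcol = rand(height(T), 1);

% Assign to the named column: this overwrites the column if it exists,
% or creates it if it does not.
T.Pressure = newcol;

% Write the complete file back out, headers included.
writetable(T, 'mydata.csv');
```

If the original header contains spaces or other characters that are not valid MATLAB identifiers, readtable may adjust the variable names; in recent releases you can address such a column as T.("Column Name").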
I've read through the help section and don't see that this is possible.
There are three fundamentally different ways to represent text files.
  • The oldest way is fixed-length lines with no explicit line terminators, using either blank padding or NUL (binary 0) padding. The padding characters are stripped out when the line is read by the program. This approach has the advantage that a program can jump quickly to a specific line, because the offset of the line relative to the beginning of the file can be calculated exactly. It is now quite uncommon in everyday computing, but persists in some mainframe processing where performance is more important than immediate user-friendliness.
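To illustrate the direct-offset property of fixed-length records, here is a hypothetical sketch (the file "fixed.dat" and the 80-byte record size are assumptions, not anything from the question):

```matlab
% With fixed-length records, the byte offset of line n is a simple
% arithmetic expression -- no scanning of earlier lines is needed.
recordLength = 80;                       % assumed fixed record size in bytes
n = 9324;                                % line we want to read
offset = (n - 1) * recordLength;         % 0-based byte offset of line n

fid = fopen('fixed.dat', 'r');
fseek(fid, offset, 'bof');               % jump straight to the record
raw = fread(fid, recordLength, '*char')';
fclose(fid);

lineText = deblank(raw);                 % strip the trailing blank padding
```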
  • The second oldest way is as a data structure that in some form has a section with information about each line, such as the location and size of the actual data in the file. Modifying an individual line in the middle of the file can be done by locating the appropriate index record and determining whether the new information is no larger than the old. If so, the old line contents are updated directly (possibly returning some of the buffer to the datapool if the new line is shorter); if the new content is larger, a new databuffer is obtained, the new content is written to it, the index is patched to point to the new content, and the old content buffer is returned to the datapool. One way of describing this is that the text files were a kind of database, and each line was a variable-length record that could be updated independently. This approach has some advantages, but it can be notably slower for plain sequential access of the contents of the file. And recall that adding a new column to every line in the file requires updating every record, so you have the record-management overhead for every line. Digital Equipment Corporation (DEC) RSX and VMS operating systems implemented this kind of filesystem; it was kind of interesting. But it was out-competed by the third major way of representing files.
  • The newest way, which has taken over for all computing you are likely to run into (except databases, perhaps), is that files are treated as streams of bytes, with either no separator between lines (for binary files) or else a fixed sequence of bytes indicating end of line. The favored line termination has changed over time; at one point it was CR (carriage return) alone, at another it was CR followed by LF (carriage return followed by linefeed), but these days it is nearly always just LF (linefeed). A text file is then just the characters for the first line, then a linefeed, then the characters for the second line, then a linefeed, and so on (the file does not need to end in a linefeed). No indexing of lines is done, and there are no "pointers" saying "look at such-and-such an offset to find line #9324". Just a stream of bytes. It is a very simple arrangement that provides a lot of flexibility: fast for sequential access, and any application that needs non-sequential access can build whatever data structures it wants to optimize for its needs. But the disadvantage is that there are no "holes" to expand a line into. If you want to add (say) 8 characters at the end of a line, then the only way in such files is to rewrite the entire rest of the file.
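A sketch of why extending one line in a byte-stream file forces a rewrite of everything after it; the filename "stream.txt", the line number, and the inserted text are all placeholders:

```matlab
% Read the entire file as one character vector.
txt = fileread('stream.txt');

% Locate the LF line terminators; the end of line k is just before
% the k-th linefeed.
nl = find(txt == newline);
k = 2;                                   % line whose end we want to extend
insertAt = nl(k);

% Splice the new bytes in: every byte from the insertion point onward
% shifts, so the whole tail of the file must be rewritten.
txt = [txt(1:insertAt-1), ',extra', txt(insertAt:end)];

fid = fopen('stream.txt', 'w');          % rewrite the complete file
fwrite(fid, txt);
fclose(fid);
```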
Thus, it is not the case that the tools such as writematrix() and csvwrite() "forgot" to implement appending a column to a text file: instead the problem is that the underlying filesystems do not provide any operations to add more to a line without rewriting pretty much all of the file.
So... the easiest course is to take advantage of the fact that you know the structure of the file: read the file, add the new data in memory, and write out the revised version.
