how to read complicated CSV files into matlab?
17 views (last 30 days)
Show older comments
My raw CSV file looks like the 1st pic. And I wants to use Matlab read it into the format as the 2rd pic. I have over 1000 the same kind of CSV files, it will be painful if I do it by copy/paste. How can I do this? Any examples? thanks.
I am a new user here, do not know how to post images, Please click the links to see. Thanks!
-------------------------------- raw data:
ID
Height
1 2 3
4 5 6
Weight
1 2 3
4 5 6
---------------------------
output data:
id height weight
1 1 1
1 2 2
1 3 3
...
0 Comments
Accepted Answer
Geoff
on 21 May 2012
Easiest is to read your file one line at a time, and separate your sections. If your files follow exactly the format you've listed with no missing values, it's simple... Maybe a bit clunky, but serves its purpose. If there's more than height and weight, I'd combine this into a single loop with section logic, rather than create loops for each.
Try this... might contain errors. I just hacked it up here to illustrate.
fid = fopen('data.csv', 'rt');
header = fgetl(fid);
id = regexp(header, ',(.*),', 'tokens');
id = id{1}{1};
% Read height section
fgetl(fid); fgetl(fid); % skip headers
height = [];
while ~feof(fid)
line = fgetl(fid);
vals = sscanf(line, '%f,%f,%f', 3);
if isempty(vals)
break;
end
height(end+1, 1:3) = vals;
end
% Do the same thing for weight.
fgetl(fid); fgetl(fid); % skip headers
weight = [];
while ~feof(fid)
line = fgetl(fid);
vals = sscanf(line, '%f,%f,%f', 3);
if isempty(vals)
break;
end
weight(end+1, 1:3) = vals;
end
fclose(fid);
[EDIT] To select appropriate action from arbitrary order - the dirty way =)
while ~feof(fid)
header = regexp(fgetl(fid), '^(.*),', 'tokens');
if ~isempty(header)
header = header{1}{1}
end
fgetl(fid); % skip second header line
switch lower(header)
case 'height'
% height-processing loop goes here
case 'weight'
% weight-processing loop goes here
otherwise
error('Unexpected section header');
end
end
There are nicer ways to do all this, but my philosophy is if the quick and dirty option is good enough, there's no need to be clever.
3 Comments
Geoff
on 22 May 2012
Yep, in that case, pull the value out the same way I pulled out the ID. See where I have done the two fgetl() calls to "skip headers"? I ignored those results on the assumption that order was consistent. I'll edit my answer accordingly, as putting code in these comments is a bit futile.
More Answers (1)
Matt Kindig
on 21 May 2012
Can you excerpt the actual text of your CSV file (not the file imported into Excel that you screenshot, but the actual original file with the commas shown and everything)? I ask because I don't think you'll be able to simply use csvread/dlmread, but will need to use a more sophisticated approach using something like fscanf or regexp. Please post an excerpt of your file.
Also, where does the ID come from? Are there other fields marked 'id' later in the file (such as A002)? Note that the raw/output data shown in your posting and in the screenshots is inconsistent--this makes it a bit confusing for me.
See Also
Categories
Find more on Large Files and Big Data in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!