Regarding reading mulitple csv file with specific rows

3 views (last 30 days)
I have .csv 30 files with name inst0.csv to inst30.csv. First 24 row contain unnecessary data and I do not want remove it manually and due to unnecessary data I am facing problem with importing data with header issue.
I want to read all data including header Scan,Time,101 <M1_RW1> (C),102 <M1_RW2> (C),103 <M1_RS1> (C),104 <M1_RS2> (C),105 <M1_RN2> (C),106 <M1_RD2> (C),107 <M1_RW3> (C),108 <M1_RW4>
those data will find row 25 or 26 but the problem is the Header row of Scan and time keep changing for every file and I want to be read and process the data having header of scan time for multiple files. So how to do it?
Also,How to get all the files from folder and read it as table using drop down in live editor?

Accepted Answer

dpb
dpb on 27 May 2021
Read the headerline containing the number of channels in the dataset so you can compute the number of headerlines before the dataset of interest.
I just did the following at the command line--
for i=1:numel(d)
fid=fopen(d(i).name,'r');
nhdr=8+cell2mat(textscan(fid,'Total Channels:%f',1,'headerlines',5,'delimiter',','));
fid=fclose(fid);
opt{i}=detectImportOptions(d(i).name,'headerlines',nhdr);
preview(d(i).name,opt{i})
end
Warning: Column headers from the file were modified to make them valid MATLAB identifiers before creating variable names for the table. The original column headers are saved in the VariableDescriptions
property.
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names.
ans =
8×18 table
Scan Time x101_M1_RW1__C_ x102_M1_RW2__C_ x103_M1_RS1__C_ x104_M1_RS2__C_ x105_M1_RN2__C_ x106_M1_RD2__C_ x107_M1_RW3__C_ x108_M1_RW4__C_ x109_M1_RT3__C_ x201_M1_RM2__C_ x203_M1_RN1__C_ x204_M1_RD1__C_ x205_M1_RR1__C_ x206_M1_RSD1__C_ x207_M1_RSD2__C_ x208_M1_RF1__C_
____ ___________________________ _______________ _______________ _______________ _______________ _______________ _______________ _______________ _______________ _______________ _______________ _______________ _______________ _______________ ________________ ________________ _______________
1 {'2020-12-10 14:02:35:214'} 9.9e+37 27.855 27.884 27.841 28.037 27.857 27.707 27.769 9.9e+37 -9.9e+37 9.9e+37 -9.9e+37 27.864 -9.9e+37 -9.9e+37 9.9e+37
2 {'2020-12-10 14:02:38:097'} 9.9e+37 27.853 27.885 27.839 28.036 27.855 27.705 27.769 9.9e+37 9.9e+37 -9.9e+37 -9.9e+37 27.859 -9.9e+37 -9.9e+37 9.9e+37
3 {'2020-12-10 14:02:40:145'} 9.9e+37 27.851 27.884 27.836 28.035 27.854 27.704 27.767 9.9e+37 9.9e+37 9.9e+37 -9.9e+37 27.859 -9.9e+37 -9.9e+37 9.9e+37
4 {'2020-12-10 14:02:42:368'} 9.9e+37 27.851 27.883 27.836 28.035 27.851 27.703 27.765 9.9e+37 9.9e+37 9.9e+37 -9.9e+37 27.858 -9.9e+37 -9.9e+37 9.9e+37
5 {'2020-12-10 14:02:44:414'} 9.9e+37 27.851 27.88 27.836 28.033 27.853 27.704 27.763 9.9e+37 9.9e+37 9.9e+37 -9.9e+37 27.858 -9.9e+37 -9.9e+37 9.9e+37
6 {'2020-12-10 14:02:46:461'} 9.9e+37 27.846 27.879 27.832 28.032 27.853 27.703 27.763 9.9e+37 9.9e+37 9.9e+37 -9.9e+37 27.858 -9.9e+37 -9.9e+37 9.9e+37
7 {'2020-12-10 14:02:48:509'} 9.9e+37 27.849 27.879 27.833 28.029 27.852 27.701 27.761 9.9e+37 9.9e+37 9.9e+37 -9.9e+37 27.858 -9.9e+37 -9.9e+37 9.9e+37
8 {'2020-12-10 14:02:50:557'} 9.9e+37 27.847 27.88 27.832 28.03 27.852 27.699 27.761 9.9e+37 9.9e+37 9.9e+37 -9.9e+37 27.857 -9.9e+37 -9.9e+37 9.9e+37
Warning: Column headers from the file were modified to make them valid MATLAB identifiers before creating variable names for the table. The original column headers are saved in the VariableDescriptions
property.
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names.
ans =
8×19 table
Scan Time x101_M1_RW1__C_ x102_M1_RW2__C_ x103_M1_RS1__C_ x104_M1_RS2__C_ x105_M1_RN2__C_ x106_M1_RD2__C_ x107_M1_RW3__C_ x108_M1_RW4__C_ x109_M1_RT3__C_ x201_M1_RM2__C_ x202_M1_RM1__C_ x203_M1_RN1__C_ x204_M1_RD1__C_ x205_M1_RR1__C_ x206_M1_RSD1__C_ x207_M1_RSD2__C_ x208_M1_RF1__C_
____ ___________________________ _______________ _______________ _______________ _______________ _______________ _______________ _______________ _______________ _______________ _______________ _______________ _______________ _______________ _______________ ________________ ________________ _______________
1 {'2020-12-10 14:11:13:851'} 9.9e+37 27.623 27.652 27.607 27.814 27.678 27.483 27.544 9.9e+37 9.9e+37 9.9e+37 9.9e+37 9.9e+37 27.776 9.9e+37 9.9e+37 9.9e+37
2 {'2020-12-10 14:11:15:827'} 9.9e+37 27.623 27.651 27.607 27.81 27.681 27.483 27.542 9.9e+37 9.9e+37 9.9e+37 9.9e+37 9.9e+37 27.777 9.9e+37 9.9e+37 9.9e+37
3 {'2020-12-10 14:11:17:827'} 9.9e+37 27.622 27.65 27.606 27.812 27.678 27.481 27.542 9.9e+37 9.9e+37 9.9e+37 9.9e+37 9.9e+37 27.778 9.9e+37 9.9e+37 9.9e+37
4 {'2020-12-10 14:11:19:827'} 9.9e+37 27.623 27.65 27.606 27.813 27.676 27.482 27.541 9.9e+37 9.9e+37 9.9e+37 9.9e+37 9.9e+37 27.777 9.9e+37 9.9e+37 9.9e+37
5 {'2020-12-10 14:11:21:827'} 9.9e+37 27.62 27.649 27.602 27.81 27.678 27.479 27.539 9.9e+37 9.9e+37 9.9e+37 9.9e+37 9.9e+37 27.777 9.9e+37 9.9e+37 9.9e+37
6 {'2020-12-10 14:11:23:827'} 9.9e+37 27.621 27.648 27.602 27.808 27.675 27.48 27.54 9.9e+37 9.9e+37 9.9e+37 9.9e+37 9.9e+37 27.777 9.9e+37 9.9e+37 9.9e+37
7 {'2020-12-10 14:11:25:827'} 9.9e+37 27.619 27.647 27.601 27.808 27.674 27.478 27.537 9.9e+37 9.9e+37 9.9e+37 9.9e+37 9.9e+37 27.776 9.9e+37 9.9e+37 9.9e+37
8 {'2020-12-10 14:11:27:827'} 9.9e+37 27.618 27.646 27.602 27.809 27.676 27.477 27.539 9.9e+37 9.9e+37 9.9e+37 9.9e+37 9.9e+37 27.776 9.9e+37 9.9e+37 9.9e+37
Warning: Column headers from the file were modified to make them valid MATLAB identifiers before creating variable names for the table. The original column headers are saved in the VariableDescriptions
property.
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names.
ans =
8×19 table
Scan Time x101_M1_RW1__C_ x102_M1_RW2__C_ x103_M1_RS1__C_ x104_M1_RS2__C_ x105_M1_RN2__C_ x106_M1_RD2__C_ x107_M1_RW3__C_ x108_M1_RW4__C_ x109_M1_RT3__C_ x201_M1_RM2__C_ x202_M1_RM1__C_ x203_M1_RN1__C_ x204_M1_RD1__C_ x205_M1_RR1__C_ x206_M1_RSD1__C_ x207_M1_RSD2__C_ x208_M1_RF1__C_
____ ___________________________ _______________ _______________ _______________ _______________ _______________ _______________ _______________ _______________ _______________ _______________ _______________ _______________ _______________ _______________ ________________ ________________ _______________
1 {'2020-12-10 14:49:56:530'} 9.9e+37 26.805 26.834 26.793 27.056 27.001 26.658 26.727 9.9e+37 27.367 27.417 26.644 27.241 27.302 9.9e+37 9.9e+37 9.9e+37
2 {'2020-12-10 14:49:58:507'} 9.9e+37 26.803 26.831 26.789 27.057 27 26.656 26.728 9.9e+37 27.367 27.418 26.644 27.24 27.302 9.9e+37 9.9e+37 9.9e+37
3 {'2020-12-10 14:50:00:507'} 9.9e+37 26.799 26.831 26.786 27.054 26.999 26.652 26.724 9.9e+37 27.367 27.418 26.643 27.24 27.302 9.9e+37 9.9e+37 9.9e+37
4 {'2020-12-10 14:50:02:507'} 9.9e+37 26.795 26.828 26.785 27.056 26.998 26.647 26.723 9.9e+37 27.366 27.417 26.643 27.24 27.302 9.9e+37 9.9e+37 9.9e+37
5 {'2020-12-10 14:50:04:507'} 9.9e+37 26.791 26.827 26.785 27.053 26.999 26.642 26.72 9.9e+37 27.366 27.417 26.643 27.24 27.302 9.9e+37 9.9e+37 9.9e+37
6 {'2020-12-10 14:50:06:507'} 9.9e+37 26.786 26.824 26.782 27.052 26.996 26.639 26.72 9.9e+37 27.365 27.416 26.642 27.239 27.301 9.9e+37 9.9e+37 9.9e+37
7 {'2020-12-10 14:50:08:507'} 9.9e+37 26.78 26.821 26.779 27.05 26.997 26.631 26.716 9.9e+37 27.365 27.416 26.641 27.239 27.301 9.9e+37 9.9e+37 9.9e+37
8 {'2020-12-10 14:50:10:507'} 9.9e+37 26.771 26.821 26.774 27.051 26.996 26.626 26.716 9.9e+37 27.365 27.415 26.641 27.238 27.301 9.9e+37 9.9e+37 9.9e+37
>>
Fixup the import object to read the datetime format.
You're on your own with the live editor; never used it.
  17 Comments
Jeremy Hughes
Jeremy Hughes on 27 May 2021
To get the full table replace the call to preview with readtable.
If all the files have the same format, you don't need to call detectImportOptions each time. This will save time.
opts = detectImportOptions(d(1).name,"Delimiter",","); % already know it's csv, avoid detecting it
T = {};
for i=1:numel(d)
fid = fopen(d(i).name,'r');
nhdr = 8 + cell2mat(textscan(fid,'Total Channels:%f',1,'HeaderLines',5,'Delimiter',','));
fid = fclose(fid);
if nhdr == 8 % skip no data in file cases if that occurs (dpb conjecture)
warning(['No data. Skipped file: ' d(i).name])
continue
end
opts.DataLines = [nhdr+2 inf];
opts.VariableNamesLine = nhdr+1;
T{i} = readtable(d(i).name,opts)
end
Arshey Dhangekar
Arshey Dhangekar on 28 May 2021
Edited: Arshey Dhangekar on 31 May 2021
Hello thank you so much It is working for me.
How can we use loop for ploting grpah for two columns?
I want to plot graph x vs y for all columns present in table ?
How can I do it for multiple csv file name of "WT_201119.csv" contain 52 main header with column
where I have to do changes? similar to above one csv file
Because I got error
'DataLines' must be specified as a positive integer or a 2-column matrix of positive integers."
And Kudos to Jeremy Hughe!!

Sign in to comment.

More Answers (0)

Products


Release

R2021a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!