Read in specific range of large .csv
I have very large .csv files that I am trying to work with, e.g. 7000 * 72000.
In each file the first column is a time vector. My idea is to save these time vectors in separate files, load them in, find the row range for the dates of interest, and then use that range to read only the rows of interest from the larger .csv.
However, I can't figure out how to apply this last step. Here is what I have so far...
%get time period of interest
startdate=datetime(2019,08,20);
enddate=datetime(2019,09,10);
timeperiod=datenum(startdate:enddate);
timeperiod=timeperiod';
%load in time vector
tvec_folder=('H:\SoundTrap\Boats\PSD Output\PSD_tvec');
tvecfile1=('TVEC_002Tiritiri_5280_PSD_1sHammingWindow_50%Overlap_2min_output.csv');
PSD_tvec=readtable(fullfile(tvec_folder,tvecfile1)); %read tvec and get times
PSD_tvec_t=PSD_tvec.Var1;
%get row range of interest
idx=PSD_tvec_t>timeperiod(1) & PSD_tvec_t<timeperiod(end); %find rows in tvec
%which correspond to date range of interest
x=find(idx(:,1)>0); %get row numbers for reading in PSD
PSDfolder=('H:\SoundTrap\Boats\PSD Output\Duty cycle data'); %folder where PSD output files are
PSDfile1=('002Tiritiri_5280_PSD_1sHammingWindow_50%Overlap_2min_output.csv');
%PSDfile1=readtable(fullfile(PSDfolder,PSDfile1)); %read in PSD file
How can I select a range of interest as I read the .csv?
In addition to selecting specific rows, I could also cut the data down by selecting different columns. I have tried that this way:
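opts=detectImportOptions(fullfile(PSDfolder,PSDfile1)); %assumed: how opts was created (not shown in the original post)
opts.SelectedVariableNames=opts.VariableNames(2:24000); %keep columns 2 to 24000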
T=readtable(fullfile(PSDfolder,PSDfile1),opts);
...but for some reason, whilst this does select the desired column range, it doesn't read the full number of rows in the file and there are no error messages.
Alternative ways of solving the problem would be equally appreciated. I need to read in these large files but since it is time consuming and I don't always need all of the data, I am looking to be more efficient. Thanks
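One possible way to restrict both rows and columns at read time is to set the DataLines and SelectedVariableNames properties of the import options before calling readtable. The sketch below is only an illustration on a small stand-in file: the file name, firstRow and lastRow are hypothetical (firstRow and lastRow stand in for x(1) and x(end) from the code above), and it assumes the file has no header line. DataLines counts lines in the file, so any header lines would shift the row numbers.

```matlab
% Small stand-in file (hypothetical data); column 1 plays the role of the time vector
fname = fullfile(tempdir, 'psd_subset_demo.csv');
data = reshape(1:60, 10, 6);          % 10 rows, 6 columns of integers
writematrix(data, fname);

% Detect the import options once, then narrow them down
opts = detectImportOptions(fname);

% Hypothetical row range, standing in for x(1) and x(end)
firstRow = 4;
lastRow = 7;
opts.DataLines = [firstRow lastRow];  % file lines to read (no header line assumed)

% Keep only columns 2 to 4
opts.SelectedVariableNames = opts.VariableNames(2:4);

% Only the selected rows and columns are returned
T = readtable(fname, opts);
```

For the real files, the same idea would use fullfile(PSDfolder,PSDfile1) in place of fname and the row numbers obtained from the time vector.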
4 Comments
Mathieu NOE
on 10 Nov 2020
Hello, welcome back Louise!
For the second csv file, have you tried csvread and specifying the range?
csvread Read a comma separated value file.
M = csvread('FILENAME') reads a comma separated value formatted file
FILENAME. The result is returned in M. The file can only contain
numeric values.
M = csvread('FILENAME',R,C) reads data from the comma separated value
formatted file starting at row R and column C. R and C are zero-
based so that R=0 and C=0 specifies the first value in the file.
M = csvread('FILENAME',R,C,RNG) reads only the range specified
by RNG = [R1 C1 R2 C2] where (R1,C1) is the upper-left corner of
the data to be read and (R2,C2) is the lower-right corner. RNG
can also be specified using spreadsheet notation as in RNG = 'A1..B7'.
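As a minimal sketch of the range syntax described in that help text, the example below writes a small numeric file and reads back a sub-block. The file name and the chosen rows and columns are hypothetical. Note that csvread offsets are zero-based, and that csvread is listed as not recommended in recent MATLAB releases; readmatrix with a one-based 'Range' is shown alongside as an alternative.

```matlab
% Write a small numeric test file (hypothetical data, for illustration only)
fname = fullfile(tempdir, 'csvread_range_demo.csv');
A = magic(6);
writematrix(A, fname);

% csvread is zero-based: rows 3 to 5 and columns 2 to 4 (one-based)
% become R1=2, C1=1, R2=4, C2=3
rngSpec = [2 1 4 3];
M = csvread(fname, rngSpec(1), rngSpec(2), rngSpec);

% readmatrix alternative: 'Range' is one-based, [r1 c1 r2 c2]
M2 = readmatrix(fname, 'Range', [3 2 5 4]);
```

Both calls should return the same 3-by-3 block, A(3:5,2:4). With the row numbers found from the time vector, the first and last entries of x would take the place of the row limits.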
Louise Wilson
on 10 Nov 2020