How to speed up this code?

Ara on 1 Jul 2024
Dear All,
I have this code, but it takes a day and a half to run. How can I speed it up?
Best regards,
Ara
clear all;
clc;
% Set the folder path where the NetCDF files are located
folderPath = 'E:\data\podtc_apr';
% Get a list of all NetCDF files in the folder (pattern: names ending in "_nc")
fileList = dir(fullfile(folderPath, '*_nc'));
numFiles = numel(fileList);
% Initialize the data structures
data = cell(numFiles, 1);
dateArray = nan(1, numFiles);
typeArray = cell(numFiles, 1);
isValid = false(1, numFiles); % marks the files that were actually loaded
% Loop through each file
for fileIndex = 1:numFiles
    % Build the full path of the NetCDF file
    filePath = fullfile(folderPath, fileList(fileIndex).name);
    % Read the metadata of the NetCDF file
    ncinfo_struct = ncinfo(filePath);
    if isfield(ncinfo_struct, 'Variables')
        variable_names = {ncinfo_struct.Variables.Name};
        if all(ismember({'time', 'TEC', 'S4', 'RFI', 'elevation', 'occheight', 'caL1_SNR', 'pL2_SNR', 'x_LEO', 'y_LEO', 'z_LEO', 'x_GPS', 'y_GPS', 'z_GPS'}, variable_names))
            % Extract the date and type information from the file name
            [~, filename, ~] = fileparts(fileList(fileIndex).name);
            dateStr = regexp(filename, '\d{4}\.\d{3}', 'match', 'once');
            typeStr = regexp(filename, 'G\d{2}|R\d{2}', 'match', 'once');
            % Read the data from the NetCDF file
            data{fileIndex}.time = ncread(filePath, 'time');
            data{fileIndex}.TEC = ncread(filePath, 'TEC');
            data{fileIndex}.S4 = ncread(filePath, 'S4');
            data{fileIndex}.RFI = ncread(filePath, 'RFI');
            data{fileIndex}.elevation = ncread(filePath, 'elevation');
            data{fileIndex}.occheight = ncread(filePath, 'occheight');
            data{fileIndex}.caL1_SNR = ncread(filePath, 'caL1_SNR');
            data{fileIndex}.pL2_SNR = ncread(filePath, 'pL2_SNR');
            data{fileIndex}.x_LEO = ncread(filePath, 'x_LEO');
            data{fileIndex}.y_LEO = ncread(filePath, 'y_LEO');
            data{fileIndex}.z_LEO = ncread(filePath, 'z_LEO');
            data{fileIndex}.x_GPS = ncread(filePath, 'x_GPS');
            data{fileIndex}.y_GPS = ncread(filePath, 'y_GPS');
            data{fileIndex}.z_GPS = ncread(filePath, 'z_GPS');
            % Store the date and type at this file's own index, so that the
            % arrays stay aligned with data even when other files are skipped
            dateArray(fileIndex) = str2double(dateStr);
            typeArray{fileIndex} = typeStr;
            isValid(fileIndex) = true;
        else
            % Skip this file and move on to the next one
            fprintf('File "%s" does not contain all the required variables. Skipping this file.\n', fileList(fileIndex).name);
            continue;
        end
    else
        % Skip this file and move on to the next one
        fprintf('File "%s" does not contain the "Variables" field. Skipping this file.\n', fileList(fileIndex).name);
        continue;
    end
end
% Sort the loaded files by date; skipped files are left out
validIndices = find(isValid);
[sortedDates, order] = sort(dateArray(validIndices));
sortedIndices = validIndices(order);
sortedTypes = typeArray(sortedIndices);
% Create the sorted data structure
sortedData = data(sortedIndices);
% Save the data to a .mat file
save('podtc_apr_data.mat', 'sortedData', 'sortedDates', 'sortedTypes');

Answers (1)

Matlab Pro on 1 Jul 2024
Hi @Ara
First of all, I would try to identify the bottlenecks that consume the most time.
A good way to do that is MATLAB's profiler.
Here is a simple example:
function test_profiling()
    profile on
    my_time_consuming_method();
    profile off
    profile viewer
end
%----------------------------------
function my_time_consuming_method()
    %% time-consuming code
    for i = 1:4
        magic(10e3);
    end
end
I suspect the bottleneck is most likely the I/O operations.
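One way to check that suspicion before touching the real data is to time the reads on a throwaway file. The sketch below is an illustration only and is not part of the original answer: it writes a small synthetic NetCDF file (the three variable names and the sample count are assumptions chosen for the demo), then compares repeated ncread calls with a single netcdf.open followed by netcdf.getVar. Note that netcdf.getVar returns the raw stored values and, unlike ncread, does not apply scale_factor, add_offset or fill-value conversion, so the two paths are only interchangeable for files without those attributes.

```matlab
% Build a small synthetic NetCDF file so the sketch runs anywhere.
% Assumption: the real files are larger and hold more variables.
tmpFile  = [tempname, '.nc'];
varNames = {'time', 'TEC', 'S4'};     % demo subset, not the full list
nSamples = 1000;                      % hypothetical sample count
for k = 1:numel(varNames)
    nccreate(tmpFile, varNames{k}, 'Dimensions', {'n', nSamples});
    ncwrite(tmpFile, varNames{k}, rand(nSamples, 1));
end

% High-level path: one ncread call per variable.
tic
hiData = struct();
for k = 1:numel(varNames)
    hiData.(varNames{k}) = ncread(tmpFile, varNames{k});
end
tHigh = toc;

% Low-level path: open the file once, read every variable, close once.
tic
loData = struct();
ncid = netcdf.open(tmpFile, 'NOWRITE');
for k = 1:numel(varNames)
    varid = netcdf.inqVarID(ncid, varNames{k});
    loData.(varNames{k}) = netcdf.getVar(ncid, varid);
end
netcdf.close(ncid);
tLow = toc;

fprintf('ncread: %.4f s, netcdf.getVar: %.4f s\n', tHigh, tLow);
delete(tmpFile);                      % clean up the throwaway file
```

If the low-level path is clearly faster on the real files as well, the per-call overhead of the high-level functions is a likely contributor. If both are equally slow, the drive itself is the more likely culprit.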
Consider creating a small-scale problem (a smaller number of files, maybe even a single file) just for the profiling phase, to keep it short.
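A minimal sketch of such a small-scale profiling run, assuming the folder and file pattern from the question. The names nProfile and readVars are hypothetical and only serve the illustration; the loop body is a shortened stand-in for the full loop in the question.

```matlab
% Profile the loading loop on a handful of files only.
folderPath = 'E:\data\podtc_apr';     % folder from the question
nProfile   = 5;                       % hypothetical sample size
readVars   = {'time', 'TEC'};         % hypothetical subset of the variables

fileList = dir(fullfile(folderPath, '*_nc'));
fileList = fileList(1:min(nProfile, numel(fileList)));   % keep the first few files

profile on
for fileIndex = 1:numel(fileList)
    filePath = fullfile(folderPath, fileList(fileIndex).name);
    info = ncinfo(filePath);                     % metadata query
    for k = 1:numel(readVars)
        values = ncread(filePath, readVars{k});  % one read per variable
    end
end
profile off

% Print the ten most expensive functions instead of opening the viewer.
stats = profile('info');
[~, order] = sort([stats.FunctionTable.TotalTime], 'descend');
for k = order(1:min(10, numel(order)))
    fprintf('%-50s %8.3f s\n', stats.FunctionTable(k).FunctionName, ...
        stats.FunctionTable(k).TotalTime);
end
```

Multiplying the time per file by the total number of files gives a rough estimate of the full run, which helps to judge whether a change is worth keeping.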
Then think of a way to reduce the I/O time.
A possible solution: copy all relevant files to a local drive ahead of time, and only then read them from the local drive. This usually makes a big difference in run time.
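The copy step could look roughly like the sketch below. It assumes that E: is a network or external drive (if it is already a fast local disk, copying will not help), and the cache folder name is hypothetical.

```matlab
% Copy the files to a local folder once, then read from there.
srcFolder   = 'E:\data\podtc_apr';                    % folder from the question
localFolder = fullfile(tempdir, 'podtc_apr_local');   % hypothetical local cache

if ~isfolder(localFolder)
    mkdir(localFolder);
end

% copyfile raises an error when the wildcard matches nothing, so check first.
if ~isempty(dir(fullfile(srcFolder, '*_nc')))
    copyfile(fullfile(srcFolder, '*_nc'), localFolder);
end

% From here on, point the loading loop at the local copy.
folderPath = localFolder;
fileList   = dir(fullfile(folderPath, '*_nc'));
fprintf('%d files available in %s\n', numel(fileList), folderPath);
```

The only change needed in the loading code is the value of folderPath. The copy itself takes time as well, so it is worth timing both variants on a few files first.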
  1 Comment
Ara on 1 Jul 2024
Thank you Matlab Pro.
Would you please tell me which lines need to be changed or added in my code?
