LSTM learning error with load data
Hello.
I'm teaching myself machine learning, and I can't work out what is wrong with the code below.
Could someone please help me?
I have attached the CSV data file.
clc; clear; close all;
%% 0. Load data from CSV file
% filename = 'PuB_Demand_2024_v394.csv'; % Replace with the actual file name
filename = 'PUB_Demand_2024_v394.csv'; % actual file name
% The top 3 rows contain meta information, row 4 contains column headers, data starts from row 5
opts = detectImportOptions(filename);
opts.VariableNamesLine = 4; % Row containing the column headers
opts.DataLines = [5, Inf]; % Actual data (from row 5 to the end of the file)
T = readtable(filename, opts);
% Combine Date and Hour into a single datetime (e.g., 1/1/2024, Hour=1 → 2024-01-01 01:00:00)
T.DateTime = datetime(T.Date,'InputFormat','MM/dd/yyyy') + hours(T.Hour - 1);
% Use 'MarketDemand' column in T as the target time series for prediction
dataLoad = T.MarketDemand;
% (Optional) You can remove columns like Date and Hour if they are not needed
% T(:, {'Date','Hour'}) = [];
%% 1. Data Split (Train/Test)
% Length of the entire data
N = length(dataLoad);
% Set the training data ratio
trainRatio = 0.8;
numTrain = floor(trainRatio * N);
% Split data into training and test sets
dataTrain = dataLoad(1:numTrain);
dataTest = dataLoad(numTrain+1:end);
%% 2. Create Sequences (Input, Target)
% Window size (number of past time steps to consider at once)
windowSize = 10;
% X: Past data of length windowSize as 'features',
% Y: The next time step data (demand) as 'target'
X = [];
Y = [];
for i = 1:(N - windowSize)
X = [X; dataLoad(i : i+windowSize-1)'];
Y = [Y; dataLoad(i + windowSize)];
end
% Training windows: those whose target index (i + windowSize) falls within 1..numTrain
XTrain = X(1 : (numTrain - windowSize), :);
YTrain = Y(1 : (numTrain - windowSize));
XTest = X((numTrain - windowSize + 1) : end, :);
YTest = Y((numTrain - windowSize + 1) : end);
% Convert to cell array format required by LSTM
XTrain = num2cell(XTrain, 2); % Each row is one cell
YTrain = num2cell(YTrain, 2);
XTest = num2cell(XTest, 2);
YTest = num2cell(YTest, 2);
%% 3. Define LSTM Network
% Number of input features: 1 (single time series)
% Number of LSTM hidden units: Adjust as needed (e.g., 50)
% Number of outputs: 1 (predicting the next time step demand)
inputSize = 1;
numHiddenUnits = 50;
numResponses = 1;
layers = [ ...
sequenceInputLayer(inputSize)
lstmLayer(numHiddenUnits, 'OutputMode','last') % sequence-to-one structure
fullyConnectedLayer(numResponses)
regressionLayer
];
%% 4. Set training options
options = trainingOptions('adam', ...
'MaxEpochs', 100, ... % Number of epochs
'GradientThreshold', 1, ... % Prevent gradient explosion
'InitialLearnRate', 0.01, ... % Initial learning rate
'LearnRateSchedule','piecewise', ...
'LearnRateDropPeriod',50, ...
'LearnRateDropFactor',0.1, ...
'Plots','training-progress', ... % Display training progress plot
'Verbose',0);
%% 5. Train the LSTM Network
net = trainNetwork(XTrain, YTrain, layers, options);
%% 6. Prediction and Performance Evaluation
% Predict on the test set
YPred = predict(net, XTest);
% Convert from cell array to numeric vector
YPred = cell2mat(YPred);
YTestVec = cell2mat(YTest);
% Calculate RMSE
rmse = sqrt(mean((YPred - YTestVec).^2));
fprintf('Test RMSE: %.4f\n', rmse);
% Visualize the results
figure;
plot(YTestVec, 'b-','LineWidth',1.5); hold on;
plot(YPred, 'r--','LineWidth',1.5);
legend('Actual','Predicted','Location','best');
title(['Load Prediction Results (RMSE: ' num2str(rmse) ')']);
xlabel('Sample Index');
ylabel('Load (Market Demand)');
grid on;
Answers (1)
Gayathri
on 1 Feb 2025
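As the error message "Invalid training data. For regression tasks, responses must be a vector, a matrix, or a 4-D array of real numeric responses. Responses must not contain NaNs" indicates, the responses "YTrain" and "YTest" must be numeric (a vector or a matrix), not cell arrays. The network is sequence-to-one (the LSTM layer uses 'OutputMode','last'), so each input sequence maps to a single numeric response, and the responses are passed as a numeric column vector with one row per sequence. Only the predictors "XTrain" and "XTest" need to be cell arrays. You can therefore remove the following lines from the code: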
%YTrain = num2cell(YTrain, 2);
%YTest = num2cell(YTest, 2);
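To make the expected shapes concrete, here is a minimal sketch on a synthetic series. The sine data and the names demoSeries, numWindows, idx and XCell are placeholders for illustration and are not taken from the attached CSV. It builds the sliding windows without growing arrays in a loop, converts only the predictors to a cell array, and leaves the responses as a numeric column vector:

```matlab
% Synthetic stand-in for T.MarketDemand (assumption: a numeric column vector)
demoSeries = sin((1:200)' / 10);
windowSize = 10;
numWindows = numel(demoSeries) - windowSize;

% Row i of idx holds the indices i : i+windowSize-1
% (uses implicit expansion, available from R2016b onwards)
idx = (1:numWindows)' + (0:windowSize-1);
X = demoSeries(idx);                   % numWindows-by-windowSize matrix
Y = demoSeries(windowSize+1 : end);    % numWindows-by-1 numeric targets

% Predictors: one cell per sequence, each 1-by-windowSize
% (1 feature, windowSize time steps)
XCell = num2cell(X, 2);

% Responses stay numeric: no num2cell on Y
fprintf('Predictors: %s with %d sequences of size %s\n', ...
    class(XCell), numel(XCell), mat2str(size(XCell{1})));
fprintf('Responses:  %s of size %s\n', class(Y), mat2str(size(Y)));
```

With the data in this form, XCell and Y would be passed to trainNetwork in place of XTrain and YTrain after the train/test split.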
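Also, for this sequence-to-one network, "predict" returns the predictions "YPred" as a numeric array, so there is no need to convert them with cell2mat (which expects a cell array as input). Since "YTest" is now numeric as well, the variable "YTestVec" is no longer needed. Please modify the evaluation code as shown below to avoid further errors.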
YPred = predict(net, XTest);
% Calculate RMSE
rmse = sqrt(mean((YPred - YTest).^2));
fprintf('Test RMSE: %.4f\n', rmse);
% Visualize the results
figure;
plot(YTest, 'b-','LineWidth',1.5); hold on;
plot(YPred, 'r--','LineWidth',1.5);
legend('Actual','Predicted','Location','best');
title(['Load Prediction Results (RMSE: ' num2str(rmse) ')']);
xlabel('Sample Index');
ylabel('Load (Market Demand)');
grid on;
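The error message also mentions NaNs. If the error persists after the changes above, it may be worth checking whether the demand column contains missing values. The sketch below uses a small made-up vector (demoLoad is a placeholder for dataLoad) and fills the gaps by linear interpolation. Whether interpolation suits this data set is an assumption; removing the affected rows is another option.

```matlab
% Made-up series with two missing values (stand-in for dataLoad)
demoLoad = [100; 102; NaN; 106; 108; NaN; 112];

% Count missing entries before building the training windows
numMissing = sum(isnan(demoLoad));
fprintf('Missing values: %d\n', numMissing);

% One possible repair: linear interpolation between neighbouring samples
demoLoadFilled = fillmissing(demoLoad, 'linear');

% Confirm that no NaNs remain before the data is windowed and trained on
assert(~any(isnan(demoLoadFilled)));
```

Running the same check on dataLoad right after reading the table shows whether the import step introduced any NaNs.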
For more information on LSTM networks, please refer to the Deep Learning Toolbox documentation.
Hope you find this information helpful!