Optimize Model Hyperparameters Using directforecaster

Dear all,
Is it possible to optimize the hyperparameters when using the directforecaster function to predict time series?
clear
close
clc
Tbl = importAndPreprocessPortData;
Tbl.Year = year(Tbl.Time);
Tbl.Quarter = quarter(Tbl.Time);
slidingWindowPartition = tspartition(height(Tbl),"SlidingWindow", 4, "TestSize", 24)
Mdl = directforecaster(Tbl, "TEU", "Horizon", 1:12, "Learner", "lsboost", "ResponseLags", 1:12, ...
    "LeadingPredictors", "all", "LeadingPredictorLags", {0:12, 0:12}, ...
    "Partition", slidingWindowPartition, "CategoricalPredictors", "Quarter") % I would like to optimize the lsboost hyperparameters
predY = cvpredict(Mdl)

Answers (1)

Taylor about 7 hours ago
Yes. Instead of Learner="lsboost", tune an LSBoost ensemble's hyperparameters with fitrensemble, then recreate the tuned settings as an ensemble template and pass that template into directforecaster.
Step 1: Optimize LSBoost with fitrensemble
Use a representative regression problem (same predictors and response as your forecasting setup, but no time lags/partition yet) and let fitrensemble do Bayesian HPO on LSBoost.
% Example: X, Y built to mimic your forecasting design
X = Tbl(:, setdiff(Tbl.Properties.VariableNames, "TEU"));
Y = Tbl.TEU;
treeTmpl = templateTree("Surrogate","on"); % or your preferred base tree
MdlBoostOpt = fitrensemble(X, Y, ...
    "Method", "LSBoost", ...
    "Learners", treeTmpl, ...
    "OptimizeHyperparameters", {"NumLearningCycles","LearnRate","MaxNumSplits"}, ...
    "HyperparameterOptimizationOptions", struct( ...
        "AcquisitionFunctionName", "expected-improvement-plus", ...
        "MaxObjectiveEvaluations", 30)); % tuning budget
MdlBoostOpt now contains the tuned LSBoost settings (number of trees, learning rate, tree depth, etc.).
Step 2: Build an equivalent LSBoost template
Pull the chosen hyperparameters from MdlBoostOpt and recreate them as a templateEnsemble to feed into directforecaster.
% Extract tuned hyperparameters
% Note: the ModelParameters internals below are undocumented and may change
numTrees  = MdlBoostOpt.NumTrained;                 % number of trees actually trained
learnRate = MdlBoostOpt.ModelParameters.LearnRate;
maxSplits = MdlBoostOpt.ModelParameters.LearnerTemplates{1}.ModelParams.MaxSplits;
minLeaf   = MdlBoostOpt.ModelParameters.LearnerTemplates{1}.ModelParams.MinLeaf;
% Base tree with tuned depth / leaf size, etc.
treeTuned = templateTree( ...
    "MaxNumSplits", maxSplits, ...
    "MinLeafSize", minLeaf, ...
    "Surrogate", "on"); % match the setting used during optimization
% LSBoost ensemble template with tuned settings
lsboostTuned = templateEnsemble( ...
    "LSBoost", numTrees, treeTuned, ...
    "LearnRate", learnRate);
(If some properties are awkward to extract, you can hard‑code them from MdlBoostOpt.HyperparameterOptimizationResults.XAtMinObjective instead.)
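That route can be sketched as follows (assuming the same three hyperparameters were optimized; XAtMinObjective is a table whose variable names match the entries passed to OptimizeHyperparameters):

```matlab
% Read the best observed point from the Bayesian optimization results
best = MdlBoostOpt.HyperparameterOptimizationResults.XAtMinObjective;
numTrees  = best.NumLearningCycles;
learnRate = best.LearnRate;
maxSplits = best.MaxNumSplits;
```

This avoids relying on undocumented ModelParameters fields, since the optimization results are a documented part of the fitted model.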
Step 3: Use tuned LSBoost in directforecaster
Now replace "lsboost" in your original code with the tuned template:
Mdl = directforecaster(Tbl, "TEU", ...
    "Horizon", 1:12, ...
    "Learner", lsboostTuned, ... % << tuned LSBoost template
    "ResponseLags", 1:12, ...
    "LeadingPredictors", "all", ...
    "LeadingPredictorLags", {0:12, 0:12}, ...
    "Partition", slidingWindowPartition, ...
    "CategoricalPredictors", "Quarter");
directforecaster will then train one tuned LSBoost model per horizon step, using your sliding‑window cross‑validation partition, but without doing any further hyperparameter search on its own.
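To see how accuracy changes across horizon steps, the cross-validated model can also be scored with cvloss, the companion to cvpredict (a sketch; as I understand it, cvloss returns one loss value per horizon step, computed over the sliding-window test sets):

```matlab
% Cross-validated loss for each horizon step (1:12 in this example)
L = cvloss(Mdl);
figure
bar(L)
xlabel("Horizon step")
ylabel("Cross-validated loss")
```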
  1 Comment
Geovane Gomes about 3 hours ago
Thank you, Taylor. With a few small changes the code ran successfully.
I still have a couple of questions and would appreciate your help.
  1. Could data leakage be happening, since I am using the entire table Tbl when fitting the ensemble with fitrensemble?
  2. The optimization uses no lags, while in directforecaster I use response and leading-predictor lags, so I may end up optimizing a model that differs from the one directforecaster will actually fit.
clear
close
clc
Tbl = importdata("Tbl.mat")
Tbl.Year = year(Tbl.Time);
Tbl.Month = month(Tbl.Time);
X = Tbl(:, setdiff(Tbl.Properties.VariableNames, "TEUs"));
X = [X.Month, X.Year];
Y = Tbl.TEUs;
treeTmpl = templateTree("Surrogate", "on");
MdlBoostOpt = fitrensemble(X, Y, ...
    "Method", "LSBoost", ...
    "Learners", treeTmpl, ...
    "OptimizeHyperparameters", {'NumLearningCycles','LearnRate','MaxNumSplits'}, ...
    "HyperparameterOptimizationOptions", struct( ...
        "AcquisitionFunctionName", "expected-improvement-plus", ...
        "MaxObjectiveEvaluations", 30)) % tuning budget
% Extract tuned hyperparameters
numTrees = MdlBoostOpt.NumTrained; % number of trees actually trained
learnRate = MdlBoostOpt.ModelParameters.LearnRate;
minLeaf = MdlBoostOpt.ModelParameters.LearnerTemplates{1}.ModelParams.MinLeaf;
maxSplits = MdlBoostOpt.ModelParameters.LearnerTemplates{1}.ModelParams.MaxSplits;
% Base tree with tuned depth / leaf size, etc.
treeTuned = templateTree( ...
    "MaxNumSplits", maxSplits, ...
    "MinLeafSize", minLeaf, ...
    "Surrogate", "on"); % if used in optimization
% LSBoost ensemble template with tuned settings
lsboostTuned = templateEnsemble( ...
    "LSBoost", numTrees, treeTuned, ...
    "LearnRate", learnRate);
slidingWindowPartition = tspartition(height(Tbl), "SlidingWindow", 4, "TestSize", 24)
Mdl = directforecaster(Tbl, "TEUs", ...
    "Horizon", 1:12, ...
    "Learner", lsboostTuned, ... % << tuned LSBoost template
    "ResponseLags", 1:12, ...
    "LeadingPredictors", "all", ...
    "LeadingPredictorLags", {0:12, 0:12}, ...
    "Partition", slidingWindowPartition);
predY = cvpredict(Mdl)
figure
plot(predY,"Time","TEUs_Step12")
axis padded
hold on
plot(Tbl, "Time", "TEUs")
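One idea I am considering to address both points (a sketch, assuming the same Tbl, TEUs, Month, and Year variables and the same partition sizes as above): exclude the final test window from the hyperparameter search, and hand-build the same lagged response features the forecaster will use, so the model being tuned is closer to the one directforecaster fits.

```matlab
% Hold the last test window out of the hyperparameter search entirely
nTest    = 24;                          % matches "TestSize" in tspartition
TblTrain = Tbl(1:end-nTest, :);

% Recreate the lagged response (ResponseLags 1:12) as explicit predictors
lags = 1:12;
n    = height(TblTrain);
Xlag = NaN(n, numel(lags));
for k = 1:numel(lags)
    Xlag(:, k) = [NaN(lags(k), 1); TblTrain.TEUs(1:n-lags(k))];
end
X = [TblTrain.Month, TblTrain.Year, Xlag];
Y = TblTrain.TEUs;

% Drop the initial rows made incomplete by the lags
ok = all(~isnan(X), 2);
X  = X(ok, :);
Y  = Y(ok);

% Tune on the lagged design. A simple holdout keeps the search cheap;
% note that fitrensemble's internal splits are random, not time-ordered.
MdlBoostOpt = fitrensemble(X, Y, ...
    "Method", "LSBoost", ...
    "Learners", templateTree("Surrogate", "on"), ...
    "OptimizeHyperparameters", {'NumLearningCycles','LearnRate','MaxNumSplits'}, ...
    "HyperparameterOptimizationOptions", struct( ...
        "Holdout", 0.2, ...
        "MaxObjectiveEvaluations", 30));
```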

