Multivariate regression for constrained dependent variables

I have a database formed by 6 independent variables and 2 dependent variables. The equation for each dependent variable should be something like:
y1 = β11*x1 + β12*x2 + β13*x3 + β14*x4 + β15*x5 + β16*x6
y2 = β21*x1 + β22*x2 + β23*x3 + β24*x4 + β25*x5 + β26*x6
I want to find βs that satisfy y1 + y2 = 1.
Which methodology should I use?
I tried using mvregress, but I couldn't constrain the dependent variables. I am also thinking of using optimization, but I still couldn't figure out how to constrain the dependent variables.
Thank you so much!
  8 Comments
Umar on 11 Aug 2024
Hi @Ismail,
To address your comment: "Ah I see, so I don't need to put any constrained values in that case, right? Also, does that mean that there is no function that could deliberately put dependent variables as the constraints, right?"
You are correct that if you choose not to impose constraints on your beta values, you allow greater flexibility in fitting your model. However, imposing equality constraints on the dependent variables (such as requiring their sum to equal a certain value) does create challenges. As you noted, with N data points this generates one constraint per data point, which reduces your degrees of freedom significantly. With many data points the system quickly becomes over-constrained: each constraint removes an independent piece of information, which can prevent you from finding a unique solution for the beta values.

Since traditional methods may not accommodate this well, consider reformulating the problem as an optimization goal rather than applying strict equality constraints. For instance, you could frame it as minimizing the deviation of y1 + y2 from 1 within some tolerance, expressed as a penalty term or through regularization. If overfitting is a concern, or if you want some control over the estimated parameters, regularization techniques such as Lasso or Ridge regression are also worth considering. I would also suggest taking @John D'Errico's advice into account.
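As a rough illustration of the regularization route, here is a minimal sketch (it assumes the Statistics and Machine Learning Toolbox is available, and the data is made up purely to show the calls):
% Made-up data standing in for your 6 predictors and 2 responses
N = 50;
X = randn(N, 6);
y1 = rand(N, 1);
y2 = 1 - y1 + 0.01*randn(N, 1);   % so that y1 + y2 is close to 1
k = 0.1;                          % ridge parameter, to be tuned (e.g. by cross-validation)
b1 = ridge(y1, X, k, 0);          % 4th argument 0: coefficients on the original data scale, intercept included
b2 = ridge(y2, X, k, 0);
Whether the intercept that ridge adds is appropriate depends on your model; see @John D'Errico's remark below about the missing constant term.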
John D'Errico on 11 Aug 2024
Edited: John D'Errico on 11 Aug 2024
A ridge regression as suggested by @Umar may be a great idea here. Classically, ridge regression is a technique to bias a regression estimator, pushing the estimated coefficients towards zero for a singular or nearly singular problem. But you can also use different bias forms. One common one I have used is to bias coefficients towards smoothness, in the context of a curve fitting problem (think about splines, or low order piecewise functions here.)
In your context, I would build a scheme to bias the coefficients such that the expression (y1hat+y2hat-1) is always small. That is, write the target expression as a quadratic form. The nice thing is, you could implement that scheme without needing to use a nonlinear tool like lsqnonlin. Backslash would be sufficient in the end if you do it carefully.
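One way that scheme might look, as a rough sketch with made-up data (lambda is a bias weight you would have to choose; larger values push the sum of the two predictions harder towards 1):
% Stack the two linear models plus the soft sum-to-one rows, then solve with backslash
N = 50;
X1 = randn(N, 6);  X2 = randn(N, 6);           % made-up predictors for the two models
Y1 = rand(N, 1);   Y2 = 1 - Y1 + 0.01*randn(N, 1);
lambda = 10;                                   % weight on the (y1hat + y2hat - 1) rows
A = [X1,          zeros(N, 6);                 % rows fitting Y1
     zeros(N, 6), X2;                          % rows fitting Y2
     lambda*X1,   lambda*X2];                  % rows biasing X1*b1 + X2*b2 towards 1
b = [Y1; Y2; lambda*ones(N, 1)];
B = A\b;                                       % B(1:6) is the model for Y1, B(7:12) the model for Y2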


Answers (3)

Umar on 10 Aug 2024
Hi @Ismail,
I would suggest a constrained optimization approach: formulate the problem as an optimization task in which you minimize a cost function subject to an equality constraint. To start, define your data. Here I generate random data for the independent variables X and dependent variables Y.
X = randn(10, 6); % Independent variables (10 samples, 6 features)
Y = randn(10, 2); % Dependent variables (10 samples, 2 outputs)
Then, define the equality constraint Aeq*beta = beq by specifying the coefficient row Aeq and the right-hand side beq. The values below are placeholders for whatever linear constraint you want to impose on the coefficients.
Aeq = [1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6]; % 1-by-12 coefficient row for the equality constraint on the stacked beta
beq = 10; % Right-hand side of the equality constraint
Afterwards, define the objective function by creating a function that represents the error you want to minimize. In this case, it can be the least squares error between the actual Y and the predicted values based on X and beta.
fun = @(beta) norm(Y - X*reshape(beta, 6, 2), 'fro')^2; % sum of squared errors over both outputs (beta stacks 6 coefficients per output)
Now, provide an initial guess for the coefficients beta.
beta0 = zeros(12, 1); % Initial guess for beta (6 coefficients per output, stacked)
Use fmincon function for constrained optimization with the specified equality constraint.
options = optimoptions('fmincon', 'Algorithm', 'interior-point');
beta = fmincon(fun, beta0, [], [], Aeq, beq, [], [], [], options);
Display optimal beta values by outputting the optimal values of beta that satisfy the equation constraint.
disp('Optimal beta values:');
disp(beta);
By implementing this method, you should be able to use constrained optimization to find the βs that satisfy your equality constraint while accounting for the dependencies between the variables. Please let me know if you have any further questions.
  6 Comments
Umar on 10 Aug 2024
Hi @Ismail,
Have you noticed the comments shared by @Torsten and @John D'Errico? They have provided excellent examples addressing your questions. Please let us know if all your questions have been answered, and don't forget to click "Accept Answer" and vote for my friends. They have years of experience and have helped tackle far more complex problems than this one.



Torsten on 10 Aug 2024
Edited: Torsten on 10 Aug 2024
Define a Nx12 matrix A as
A = [x11,x12,x13,x14,x15,x16,x21,x22,x23,x24,x25,x26]
and a Nx1 vector b as
b = ones(N,1)
and solve for beta = [beta11;beta12;beta13;beta14;beta15;beta16;beta21;beta22;beta23;beta24;beta25;beta26] by using
beta = A\b
If you also have values for the y, your matrix A becomes 3*N x 12 as
A = [x11,x12,x13,x14,x15,x16,x21,x22,x23,x24,x25,x26;x11,x12,x13,x14,x15,x16,zeros(N,6);zeros(N,6),x21,x22,x23,x24,x25,x26];
your vector b becomes 3*N x 1 as
b = [ones(N,1);y1;y2]
and you solve for beta = [beta11;beta12;beta13;beta14;beta15;beta16;beta21;beta22;beta23;beta24;beta25;beta26] by using
beta = A\b
If it's the latter case you have to consider, you can also give weights to the three different error components as @John D'Errico does in his suggestion (alpha1, alpha2, alpha3).
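For example, writing x1 = [x11, x12, x13, x14, x15, x16] and x2 = [x21, x22, x23, x24, x25, x26] for the two N-by-6 blocks, a weighted version might look like this sketch (the alpha values are yours to choose; a larger alpha makes that block of residuals count more):
alpha1 = 1; alpha2 = 1; alpha3 = 1;   % weights for the three error components
A = [alpha1*[x1, x2];                 % rows for the y1 + y2 = 1 target
     alpha2*[x1, zeros(N, 6)];        % rows fitting y1
     alpha3*[zeros(N, 6), x2]];       % rows fitting y2
b = [alpha1*ones(N, 1); alpha2*y1; alpha3*y2];
beta = A\b;                           % beta(1:6) for y1, beta(7:12) for y2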

John D'Errico on 10 Aug 2024
Sorry. You CANNOT ensure that y1+y2 == 1, EXACTLY. At least not if there is any noise in your data.
All you can do is to use it as a target. And then that target will conflict with your goal of minimizing the errors in y1 and y2. As such, you might set it up as a problem of the form
alpha1*norm(1 - y1hat - y2hat)^2 + alpha2*norm(y1 - y1hat)^2 + alpha3*norm(y2 - y2hat)^2
where y1hat and y2hat are the predictions from the separate regression models, and alpha1, alpha2, and alpha3 are parameters you must choose to tradeoff the residuals from each piece of the problem.
An important point I want to make is that the lack of a constant term in your models may well be a significant problem.
Anyway, how might you do this? I suppose you could solve it using lsqnonlin, though, since the models do have parameters that enter in linearly, it would seem a linear modeling tool must be an option too.
Let me see if I can give an example. I'll need to make up some data, since you have provided none that I see.
N = 50;
X1 = randn(N,6);
X2 = rand(N,6);
Y1 = 0.4 + randn(N,1)/100;
Y2 = 0.6 + randn(N,1)/100;
As you can see, the data does have the property that y1+y2 should be 1, but for the noise in the system.
If I model each process separately, I get this:
B1 = X1\Y1;
B2 = X2\Y2;
[B1,B2]
ans = 6x2
    0.0587    0.3051
   -0.0032    0.1956
    0.0102    0.1781
    0.0732    0.1198
    0.0657    0.2308
   -0.0492    0.1005
It is just random crap data, so I really don't care about how well the models fit. I expect the models will be pretty poor, in fact.
Y1hat = X1*B1;
Y2hat = X2*B2;
[norm(Y1 - Y1hat),norm(Y2 - Y2hat)]
ans = 1x2
    2.6661    1.0332
But what happens with the predictions? As done separately, how well does it do?
norm(1 - (Y1 + Y2))
ans = 0.1054
norm(1 - (Y1hat + Y2hat))
ans = 3.0541
So, with no target put on the parallel models, things actually got WAY worse in terms of the sum of the predictions being 1. Again, the lack of a constant term in your model is probably a huge factor here.
Anyway, having built the models separately, can we formulate this as a regression model, but with the sum target built in?
alph = [1/3 1/3 1/3];
B12start = randn(12,1);
[B12est,~,~,exitflag] = lsqnonlin(@(B12) obj(B12,X1,X2,Y1,Y2,alph),B12start);
Local minimum found. Optimization completed because the size of the gradient is less than the value of the optimality tolerance.
reshape(B12est,6,2)
ans = 6x2
    0.0599    0.4327
   -0.0137    0.2133
    0.0158    0.2106
    0.0582    0.1975
    0.0538    0.3339
   -0.0636    0.0829
Y1hatest = X1*B12est(1:6);
Y2hatest = X2*B12est(7:12);
[norm(Y1 - Y1hatest),norm(Y2 - Y2hatest)]
ans = 1x2
    2.6715    1.6373
norm(1 - (Y1hatest + Y2hatest))
ans = 2.1121
As you can see, the models are a little different. The coefficients have changed, but not by a lot. But now the target for the sum of the two models has had its error reduced.
function res = obj(B12,X1,X2,Y1,Y2,alph)
Y1hat = X1*B12(1:6);Y2hat = X2*B12(7:12);
res = [alph(1)*(1 - (Y1hat + Y2hat));alph(2)*(Y1-Y1hat);alph(3)*(Y2-Y2hat)];
end
