MATLAB Answers

Confidence interval on dependent variable obtained through two consecutive linear regressions

FastCar on 18 Jun 2020
Commented: the cyclist on 18 Jun 2020
Dear all,
I have an independent variable vector x and two dependent variables, y1 and y2.
y1 is given by
y1 = a * x + b
where a and b come from a linear regression, so I have the standard deviation of both parameters.
y2 is given by
y2 = c * y1 + d
where c and d also come from a linear regression, so I have the standard deviation of both parameters.
I would like to compute the confidence interval for the variable y2 expressed as a function of the variable x.
Many thanks in advance

  5 Comments

the cyclist on 18 Jun 2020
I'm assuming you do not have access to the actual data underlying the regression. (If you did, you could just run a regression on y2 vs. x.)
Your phrase "the confidence interval for the variable y2 expressed as a function of the variable x" is not clear to me.
Do you mean that if you could do the regression
y2 = e * x + f
what is the confidence interval on the parameters e and f?
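If the underlying data are available, the direct regression mentioned above can be sketched as follows. This is only a minimal illustration in base MATLAB: the data are synthetic, and the names X, beta, se and ci are assumptions made for the example, not anything from the original post. The intervals use the normal quantile 1.96 as an approximation.

```matlab
% Synthetic data, purely illustrative (not from the original post)
rng(0);                                  % reproducible noise
x  = (1:20)';                            % independent variable
y2 = 3*x + 2 + 0.5*randn(size(x));       % hypothetical noisy y2

% Ordinary least squares for y2 = e*x + f
X    = [x, ones(size(x))];               % design matrix: slope and intercept columns
beta = X \ y2;                           % beta(1) = e, beta(2) = f
res  = y2 - X*beta;                      % residuals
dof  = numel(x) - 2;                     % degrees of freedom
s2   = (res' * res) / dof;               % residual variance estimate
C    = s2 * ((X' * X) \ eye(2));         % covariance matrix of [e; f]
se   = sqrt(diag(C));                    % standard errors of e and f

% Approximate 95% intervals using the normal quantile 1.96.
% For small samples the Student t quantile is more appropriate, e.g.
% tinv(0.975, dof) from the Statistics and Machine Learning Toolbox.
ci = [beta - 1.96*se, beta + 1.96*se];   % row 1: e, row 2: f
```

Each row of ci holds the lower and upper bound for one parameter.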
FastCar on 18 Jun 2020
I think you are right: I can do a regression of y2 vs. x, but I was wondering whether there is a relation between the standard deviations of a, b, c and d and the standard deviations of e and f.
the cyclist on 18 Jun 2020
It gets complicated quickly.
One reason why is that when you do an ordinary linear regression of the form
y = a * x + b;
one of the assumptions is that there is no error in the measurement of x (or at least that it is negligible). Your second regression explicitly violates that assumption, because there is error in y1 (which carries over to the estimation of the parameters for y2). You will not have a valid estimate of c and d.
Technically, you should do a Deming regression (or some other errors-in-variables model) for the second regression, unless the error in y1 happens to be very small compared to the variability of y2.
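For illustration, a Deming fit of y2 against y1 can be sketched with the standard closed-form slope. This is a rough sketch under stated assumptions, not a definitive implementation: the data are synthetic, and delta (the ratio of the error variance in y2 to the error variance in y1) is assumed to be known and is set to 1 here.

```matlab
% Hypothetical data: both y1 and y2 are observed with noise
rng(1);
t   = linspace(0, 10, 200)';             % hypothetical noise-free values of y1
y1  = t + 0.3*randn(size(t));            % observed y1 (error std 0.3)
y2  = 2*t + 1 + 0.3*randn(size(t));      % observed y2 (error std 0.3)
delta = 1;   % ASSUMED ratio var(error in y2) / var(error in y1)

% Sample means and second moments
m1  = mean(y1);
m2  = mean(y2);
s11 = mean((y1 - m1).^2);
s22 = mean((y2 - m2).^2);
s12 = mean((y1 - m1) .* (y2 - m2));

% Closed-form Deming estimates for y2 = c*y1 + d
c = (s22 - delta*s11 + sqrt((s22 - delta*s11)^2 + 4*delta*s12^2)) / (2*s12);
d = m2 - c*m1;
```

The result depends on the choice of delta, so in practice that ratio has to come from knowledge of the two measurement errors.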
Another reason is that you don't know (or at least haven't specified) whether y2 potentially has dependence directly on x, in addition to the dependence mediated by y1.
So, I do think it is probably possible to derive what the relationships are amongst the uncertainties in the parameters of the regression, but one would need to map out a bunch of assumptions first.
It's certainly not something I would expect to see in a built-in MATLAB function, as you hoped.
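To make one such set of assumptions concrete, here is a first-order (delta-method) sketch. Substituting the first relation into the second gives y2 = (c*a)*x + (c*b + d), so e = c*a and f = c*b + d. The sketch ASSUMES that y2 depends on x only through y1, that the estimates (a, b) are uncorrelated with (c, d), and that the parameter covariances are known. None of this is established in the thread, it ignores the errors-in-variables issue raised above, and all numbers are hypothetical placeholders. If only the standard deviations are known, the off-diagonal covariance terms would have to be assumed as well.

```matlab
% All numbers below are hypothetical placeholders, not values from this thread
a = 2.0;  b = 1.0;                 % first fit:  y1 = a*x + b
c = 0.5;  d = 3.0;                 % second fit: y2 = c*y1 + d
Cab = [0.04 -0.01; -0.01 0.09];    % assumed covariance matrix of [a; b]
Ccd = [0.01 -0.02; -0.02 0.16];    % assumed covariance matrix of [c; d]

e = c*a;                           % implied slope of y2 vs. x
f = c*b + d;                       % implied intercept of y2 vs. x

% Jacobian of [e; f] with respect to [a; b; c; d]
J = [c 0 a 0;
     0 c b 1];
Sigma = blkdiag(Cab, Ccd);         % ASSUMES the two fits are uncorrelated
Cef   = J * Sigma * J';            % first-order covariance of [e; f]
se    = sqrt(diag(Cef));           % approximate standard errors of e and f

% Approximate standard error of the fitted line e*x0 + f at a chosen x0
x0   = 4;
sey2 = sqrt([x0 1] * Cef * [x0; 1]);
```

Multiplying se or sey2 by a suitable quantile (for example 1.96 for a rough 95% level) would give approximate intervals, subject to all of the assumptions listed above.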

Answers (0)
