# Multiple Regression and Intercept

형현 on 25 May 2024 at 8:48
Commented: Star Strider on 27 May 2024 at 4:38
% Load the data from the Excel file
% Define the dependent variable
y = data.Arrive;
% Define the independent variables
X = [data.Price_m, data.Volme, data.Relative_y, data.Relative_m, ...
data.mine, data.debt, data.Quin, data.Cpi, data.Rate, data.Depo, ...
data.Bull, data.Sale, data.Move, data.Sub];
% Add a column of ones to the independent variables matrix for the intercept
X = [ones(size(X, 1), 1), X];
% Perform the multiple linear regression
[b, ~, ~, ~, stats] = regress(y, X);
% Display the results
disp('Regression Coefficients:');
disp(b);
disp('R-squared:');
disp(stats(1));
disp('F-statistic:');
disp(stats(2));
disp('p-value:');
disp(stats(3));
disp('Error Variance:');
disp(stats(4));
I'm running a multiple linear regression with the data column Arrive as the dependent variable, and the result is as follows. Is it OK?
Regression Coefficients:
1.0e+06 *
4.1453
-0.0190
0.0040
-0.0960
-0.6115
-0.0022
-0.0140
0.0259
0.0070
-0.0602
-0.0196
-0.0003
-0.0000
0.0000
0.0000
R-squared:
0.3997
F-statistic:
4.5189
p-value:
3.5809e-06
Error Variance:
3.8687e+09

Star Strider on 25 May 2024 at 13:21
I see nothing wrong with the code, and it conforms to the example in the regress documentation.
The only suggestion I have is to use table indexing to build the initial ‘X’:
data = 110x16 table
Date         Arrive      Price_m  Volme   Relative_y  Relative_m  mine    debt    Quin  Cpi       Rate  Depo  Bull  Sale   Move   Sub
___________  __________  _______  ______  __________  __________  ______  ______  ____  ________  ____  ____  ____  _____  _____  __________
01-Jan-2015  6.1513e+05  84.854   99.224  0.90087     1.0464      57.982  72.6    8.8   0.67762   2     25.7  57    12546  22145  9.4723e+06
01-Feb-2015  6.6337e+05  85.05    99.845  0.89997     1.0553      57.783  73.078  8.8   -0.05917  2     25.7  68.7  10484  24046  1.5481e+07
01-Mar-2015  7.7112e+05  85.322   102.01  0.89923     1.0714      57.604  73.543  8.8   0.009515  1.75  25.7  86.2  27303  10931  1.5779e+07
01-Apr-2015  6.4945e+05  85.656   102.89  0.90282     1.0768      57.445  73.994  8.8   0.030657  1.75  25.7  81.8  34230  20550  1.605e+07
01-May-2015  6.0572e+05  85.965   102.38  0.90459     1.083       57.304  74.428  8.8   0.28005   1.75  25.7  78.2  38583  15743  1.6232e+07
01-Jun-2015  6.504e+05   86.288   102.58  0.90773     1.0863      57.182  74.844  8.9   0.020023  1.5   25.7  79    33416  30593  1.6469e+07
01-Jul-2015  6.271e+05   86.579   101.88  0.9152      1.0744      57.078  75.241  9.6   0.18017   1.5   25.7  83.5  28688  25411  1.6677e+07
01-Aug-2015  6.1878e+05  86.829   101.88  0.91788     1.0769      56.991  75.616  9.7   0.13988   1.5   25.7  79.2  27539  22182  1.6866e+07
01-Sep-2015  5.5042e+05  87.085   102.37  0.91793     1.0752      56.921  75.967  9.7   -0.25942  1.5   25.7  79.1  16934  23636  1.7093e+07
01-Oct-2015  6.5344e+05  87.299   103.25  0.92311     1.0807      56.867  76.293  9.7   0         1.5   25.6  83.5  67650  39265  1.7348e+07
01-Nov-2015  6.502e+05   87.516   103.41  0.92408     1.0846      56.83   76.592  9.7   -0.18954  1.5   25.6  65.5  54509  22046  1.7536e+07
01-Dec-2015  7.0015e+05  87.657   100.56  0.91532     1.065       56.807  76.861  9.7   0.29962   1.5   25.5  54.5  21638  25647  1.7673e+07
01-Jan-2016  5.9513e+05  87.73    99.79   0.91401     1.0541      56.8    77.1    9.7   0.1704    1.5   25.4  47.8  9552   19836  1.7761e+07
01-Feb-2016  7.0941e+05  87.758   98.834  0.91328     1.0468      56.807  77.312  9.7   0.42843   1.5   25.4  47.2  10983  29027  1.7954e+07
01-Mar-2016  6.8613e+05  87.778   97.904  0.91169     1.036       56.828  77.505  9.7   -0.25826  1.5   25.4  46    22978  11587  1.8112e+07
01-Apr-2016  5.6423e+05  87.787   97.601  0.9105      1.0316      56.863  77.681  9.7   0.18869   1.5   25.4  49.3  30154  24567  1.822e+07
% Define the dependent variable
y = data.Arrive;
% Add a column of ones for the intercept and take all predictor columns by table indexing
X = [ones(size(data{:,1})) data{:,3:end}];
% Perform the multiple linear regression
[b, ~, ~, ~, stats] = regress(y, X);
% Display the results
disp('Regression Coefficients:');
Regression Coefficients:
disp(b);
1.0e+06 *
4.1453
-0.0190
0.0040
-0.0960
-0.6115
-0.0022
-0.0140
0.0259
0.0070
-0.0602
-0.0196
-0.0003
-0.0000
0.0000
0.0000
disp('R-squared:');
R-squared:
disp(stats(1));
0.3997
disp('F-statistic:');
F-statistic:
disp(stats(2));
4.5189
disp('p-value:');
p-value:
disp(stats(3));
3.5809e-06
disp('Error Variance:');
Error Variance:
disp(stats(4));
3.8687e+09
This is slightly more efficient code, and the result is the same.
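To illustrate why the two ways of building ‘X’ agree, here is a minimal sketch on a small made-up table. The table `T` and its values are hypothetical (only the column names are borrowed from the thread); the point is that brace indexing `T{:,3:end}` returns the same numeric matrix as concatenating the columns by name.

```matlab
% Hypothetical 4-row table laid out like the thread's data:
% column 1 = date, column 2 = response, columns 3:end = predictors.
T = table(datetime(2015,1:4,1)', [10; 12; 9; 11], [1; 2; 3; 4], [0.5; 0.4; 0.6; 0.3], ...
    'VariableNames', {'Date','Arrive','Price_m','Volme'});

% Long form: name every predictor column explicitly.
Xlong = [ones(size(T,1),1), T.Price_m, T.Volme];

% Short form: brace indexing extracts columns 3:end as one numeric matrix.
Xshort = [ones(height(T),1), T{:,3:end}];

% Both design matrices are identical, so regress would give the same fit.
same = isequal(Xlong, Xshort);
disp(same)
```

The short form also keeps working if predictor columns are added to or removed from the table, as long as the predictors stay in columns 3 onward.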
형현 on 27 May 2024 at 3:04
Is there no problem with statistical significance, looking at the R^2 or the regression coefficients?
Star Strider on 27 May 2024 at 4:38
There is a problem with statistical significance, because only four variables (including the Intercept term) are statistically significant, in the usual sense of having p ≤ 0.05. I used fitlm to get those statistics:
VN = data.Properties.VariableNames;
mdl = fitlm(data{:,3:end}, data.Arrive, 'VarNames',{VN{3:end},VN{2}})
mdl =
Linear regression model:
    Arrive ~ 1 + Price_m + Volme + Relative_y + Relative_m + mine + debt + Quin + Cpi + Rate + Depo + Bull + Sale + Move + Sub

Estimated Coefficients:
                   Estimate       SE            tStat       pValue
                   ___________    __________    ________    _________
    (Intercept)    4.1453e+06     1.3912e+06    2.9797      0.0036636
    Price_m        -19030         8333.1        -2.2837     0.024619
    Volme          3965.4         2458.5        1.613       0.11007
    Relative_y     -95964         4.8213e+05    -0.19904    0.84265
    Relative_m     -6.1154e+05    3.8759e+05    -1.5778     0.11794
    mine           -2239          2986.8        -0.74964    0.45532
    debt           -14013         10099         -1.3876     0.16852
    Quin           25869          25464         1.0159      0.31227
    Cpi            6957.2         20007         0.34773     0.72881
    Rate           -60201         30164         -1.9958     0.048817
    Depo           -19627         8482.7        -2.3137     0.022838
    Bull           -265.79        754.17        -0.35243    0.7253
    Sale           -0.44722       0.71444       -0.62597    0.53284
    Move           0.81287        0.96829       0.83949     0.4033
    Sub            0.0019539      0.0098312     0.19874     0.84289

Number of observations: 110, Error degrees of freedom: 95
Root Mean Squared Error: 6.22e+04
R-squared: 0.4, Adjusted R-Squared: 0.311
F-statistic vs. constant model: 4.52, p-value = 3.58e-06
Significant_Independent_Variables = mdl.CoefficientNames(mdl.Coefficients.pValue <= 0.05)
Significant_Independent_Variables = 1x4 cell array
{'(Intercept)'} {'Price_m'} {'Rate'} {'Depo'}
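A similar 5% screen can be sketched with regress alone, using its second output (the 95% confidence intervals for the coefficients): a coefficient is significant at the 5% level when its interval excludes zero. The data below are synthetic and purely illustrative, not the thread's data; the names `x1`, `x2`, and `significant` are assumptions made for the example.

```matlab
% Synthetic example (hypothetical data): y depends on x1 but not on x2.
rng(0);                              % fixed seed so the run is repeatable
n  = 100;
x1 = randn(n,1);
x2 = randn(n,1);
y  = 3 + 2*x1 + 0*x2 + randn(n,1);

% Design matrix with an explicit intercept column, as in the thread.
X = [ones(n,1), x1, x2];

% bint is a p-by-2 matrix of 95% confidence bounds (regress default alpha = 0.05).
[b, bint] = regress(y, X);

% Significant at the 5% level when the interval does not contain zero.
significant = bint(:,1) > 0 | bint(:,2) < 0;
disp(table(b, bint, significant, 'RowNames', {'(Intercept)','x1','x2'}))
```

This gives the same yes/no answer as comparing the t-test p-values from fitlm against 0.05, but fitlm additionally reports the standard errors, t-statistics, and p-values themselves.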
However, judging by the F-statistic (4.52, p-value = 3.58e-06), the regression as a whole is highly significant.
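As a cross-check, the summary figures in the fitlm display can be recovered from the four regress statistics with a little arithmetic. This is only a sketch using the rounded values printed in this thread (n = 110 observations, k = 14 predictors plus the intercept), so the results agree with the fitlm output only to display precision.

```matlab
% Values copied from the regress output above (rounded as printed).
R2 = 0.3997;        % stats(1), R-squared
s2 = 3.8687e+09;    % stats(4), error variance
n  = 110;           % number of observations
k  = 14;            % number of predictors, excluding the intercept

dfe = n - k - 1;    % error degrees of freedom (95, as fitlm reports)

% Adjusted R-squared penalises R-squared for the number of predictors.
adjR2 = 1 - (1 - R2) * (n - 1) / dfe;

% Root mean squared error is the square root of the error variance.
rmse = sqrt(s2);

% Overall F-statistic against the constant-only model.
F = (R2 / k) / ((1 - R2) / dfe);

fprintf('Adjusted R^2 = %.3f, RMSE = %.3g, F = %.2f\n', adjR2, rmse, F);
```

The adjusted R-squared of about 0.31 is the more cautious figure to quote here, since with 14 predictors and 110 observations the plain R-squared overstates the fit.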
These are your data. I defer to you to interpret them and the regression results. (I am not even certain what the variables are.)