Load the sample data.
This simulated data is from a manufacturing company that operates 50 factories across the world, with each factory running a batch process to create a finished product. The company wants to decrease the number of defects in each batch, so it developed a new manufacturing process. To test the effectiveness of the new process, the company selected 20 of its factories at random to participate in an experiment: Ten factories implemented the new process, while the other ten continued to run the old process. In each of the 20 factories, the company ran five batches (for a total of 100 batches) and recorded the following data:
- Flag to indicate whether the batch used the new process (`newprocess`)
- Processing time for each batch, in hours (`time`)
- Temperature of the batch, in degrees Celsius (`temp`)
- Categorical variable indicating the supplier (`A`, `B`, or `C`) of the chemical used in the batch (`supplier`)
- Number of defects in the batch (`defects`)
The data also includes `time_dev` and `temp_dev`, which represent the absolute deviation of time and temperature, respectively, from the process standard of 3 hours at 20 degrees Celsius.
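The loading step and the variables described above can be sketched as follows. This is a minimal sketch that assumes the sample data ships with Statistics and Machine Learning Toolbox under the name `mfr`; the variable `timeDevCheck` is a hypothetical name introduced here for illustration only.

```matlab
% Load the simulated manufacturing data.
% Assumption: the sample data set is available on the path as mfr.
load mfr

% Inspect the dimensions and the first few rows.
size(mfr)
mfr(1:5,:)

% Recompute the absolute deviation of processing time from the
% 3-hour process standard, as described in the text, and compare it
% with the stored time_dev column. The comparison is only a sanity
% check; small differences could arise from rounding in the stored data.
timeDevCheck = abs(mfr.time - 3);
maxDifference = max(abs(timeDevCheck - mfr.time_dev))
```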
Fit a generalized linear mixed-effects model using `newprocess`, `time_dev`, `temp_dev`, and `supplier` as fixed-effects predictors. Include a random-effects intercept grouped by `factory`, to account for quality differences that might exist due to factory-specific variations. The response variable `defects` has a Poisson distribution, and the appropriate link function for this model is log. Use the Laplace fit method to estimate the coefficients. Specify the dummy variable encoding as `'effects'`, so the dummy variable coefficients sum to 0.
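A fit matching the description above might look like the following. This is a sketch, assuming the data is loaded as `mfr` and that the model is stored in a variable named `glme` (the name the later `disp(glme)` reference suggests); the formula string is written from the predictors and grouping variable named in the text.

```matlab
% Assumption: the sample data set is available on the path as mfr.
load mfr

% Fixed effects: newprocess, time_dev, temp_dev, supplier.
% Random effect: an intercept grouped by factory, written (1|factory).
formula = 'defects ~ 1 + newprocess + time_dev + temp_dev + supplier + (1|factory)';

% Poisson response with log link, Laplace fit method, and
% effects (sum-to-zero) coding for the categorical predictor.
glme = fitglme(mfr, formula, ...
    'Distribution', 'Poisson', ...
    'Link', 'log', ...
    'FitMethod', 'Laplace', ...
    'DummyVarCoding', 'effects');

% Display the fitted model, including the order of the
% fixed-effects coefficients used later for contrast matrices.
disp(glme)
```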
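The number of defects can be modeled using a Poisson distribution:

$$\text{defects}_{ij} \sim \text{Poisson}(\mu_{ij})$$

This corresponds to the generalized linear mixed-effects model

$$\log(\mu_{ij}) = \beta_0 + \beta_1\,\text{newprocess}_{ij} + \beta_2\,\text{time\_dev}_{ij} + \beta_3\,\text{temp\_dev}_{ij} + \beta_4\,\text{supplier\_C}_{ij} + \beta_5\,\text{supplier\_B}_{ij} + b_i,$$

where

- $\text{defects}_{ij}$ is the number of defects observed in the batch produced by factory $i$ during batch $j$.
- $\mu_{ij}$ is the mean number of defects corresponding to factory $i$ (where $i = 1, 2, \ldots, 20$) during batch $j$ (where $j = 1, 2, \ldots, 5$).
- $\text{newprocess}_{ij}$, $\text{time\_dev}_{ij}$, and $\text{temp\_dev}_{ij}$ are the measurements for each variable that correspond to factory $i$ during batch $j$. For example, $\text{newprocess}_{ij}$ indicates whether the batch produced by factory $i$ during batch $j$ used the new process.
- $\text{supplier\_C}_{ij}$ and $\text{supplier\_B}_{ij}$ are dummy variables that use effects (sum-to-zero) coding to indicate whether supplier `C` or `B`, respectively, supplied the process chemicals for the batch produced by factory $i$ during batch $j$.
- $b_i \sim N(0, \sigma_b^2)$ is a random-effects intercept for each factory $i$ that accounts for factory-specific variation in quality.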
Test if there is any significant difference between supplier C and supplier B.
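One way to set up this test is with a contrast matrix passed to `coefTest`. The sketch below assumes the model was fit as described earlier and that the fixed-effects coefficients are ordered as intercept, `newprocess`, `time_dev`, `temp_dev`, supplier C, supplier B, so that columns 5 and 6 carry the two supplier coefficients. The contrast `H = [0,0,0,0,1,-1]` is written here from that ordering and tests whether the supplier C and supplier B coefficients are equal.

```matlab
% Assumption: the sample data set is available on the path as mfr.
load mfr

% Refit the model so that this example is self-contained.
glme = fitglme(mfr, ...
    'defects ~ 1 + newprocess + time_dev + temp_dev + supplier + (1|factory)', ...
    'Distribution', 'Poisson', 'Link', 'log', ...
    'FitMethod', 'Laplace', 'DummyVarCoding', 'effects');

% Contrast: (coefficient 5) - (coefficient 6) = 0, that is,
% supplier C effect minus supplier B effect equals zero.
H = [0,0,0,0,1,-1];

% pVal: p-value of the F-test
% F:    F-statistic
% DF1:  numerator degrees of freedom
% DF2:  approximate denominator degrees of freedom
[pVal, F, DF1, DF2] = coefTest(glme, H)
```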
The large $p$-value indicates that there is no significant difference between supplier C and supplier B at the 5% significance level. Here, `coefTest` also returns the $F$-statistic, the numerator degrees of freedom, and the approximate denominator degrees of freedom.
Test if there is any significant difference between supplier A and supplier B.
If you specify the `'DummyVarCoding'` name-value pair argument as `'effects'` when fitting the model using `fitglme`, then

$$\beta_A + \beta_B + \beta_C = 0,$$

where $\beta_A$, $\beta_B$, and $\beta_C$ correspond to suppliers A, B, and C, respectively. $\beta_A$ is the effect of A minus the average effect of A, B, and C. To determine the contrast matrix corresponding to a test between supplier A and supplier B, substitute $\beta_A = -\beta_B - \beta_C$:

$$\beta_B - \beta_A = \beta_B - (-\beta_B - \beta_C) = 2\beta_B + \beta_C.$$

From the output of `disp(glme)`, column 5 of the contrast matrix corresponds to $\beta_C$, and column 6 corresponds to $\beta_B$. Therefore, the contrast matrix for this test is specified as `H = [0,0,0,0,1,2]`.
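Applying this contrast might look like the following sketch. It assumes the same model fit as described earlier; the `CoefficientNames` property is displayed only as a check that columns 5 and 6 are the supplier C and supplier B coefficients before the contrast is used.

```matlab
% Assumption: the sample data set is available on the path as mfr.
load mfr

% Refit the model so that this example is self-contained.
glme = fitglme(mfr, ...
    'defects ~ 1 + newprocess + time_dev + temp_dev + supplier + (1|factory)', ...
    'Distribution', 'Poisson', 'Link', 'log', ...
    'FitMethod', 'Laplace', 'DummyVarCoding', 'effects');

% Check the coefficient order before building the contrast.
glme.CoefficientNames

% Supplier A has no coefficient of its own under effects coding.
% The comparison of A and B is therefore expressed through the
% C and B coefficients: 1*(column 5) + 2*(column 6) = 0.
H = [0,0,0,0,1,2];

[pVal, F, DF1, DF2] = coefTest(glme, H)
```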
The large $p$-value indicates that there is no significant difference between supplier A and supplier B at the 5% significance level.