Documentation

# designMatrix

Fixed- and random-effects design matrices

## Syntax

``D = designMatrix(glme)``
``````D = designMatrix(glme,'Fixed')``````
``D = designMatrix(glme,'Random')``
``Dsub = designMatrix(glme,'Random',gnumbers)``
``````[Dsub,gnames] = designMatrix(glme,'Random',gnumbers)``````

## Description

example

````D = designMatrix(glme)` or ```D = designMatrix(glme,'Fixed')``` returns the fixed-effects design matrix for the generalized linear mixed-effects model `glme`.```

example

````D = designMatrix(glme,'Random')` returns the random-effects design matrix for the generalized linear mixed-effects model `glme`.```
````Dsub = designMatrix(glme,'Random',gnumbers)` returns a subset of the random-effects design matrix for the generalized linear mixed-effects model `glme` that corresponds to the grouping variables indicated by `gnumbers`.```
``````[Dsub,gnames] = designMatrix(glme,'Random',gnumbers)``` also returns the grouping variable names that correspond to `gnumbers`.```

## Input Arguments

expand all

Generalized linear mixed-effects model, specified as a `GeneralizedLinearMixedModel` object. For properties and methods of this object, see `GeneralizedLinearMixedModel`.

Grouping variable numbers, specified as an array of integer values containing elements in the range [1,R], where R is the length of the cell array that contains the grouping variables for the generalized linear mixed-effects model `glme`.

For example, you can specify the grouping variables g1, g3, and gr as `[1,3,r]`.

Data Types: `single` | `double`

## Output Arguments

expand all

Design matrix of a generalized linear mixed-effects model `glme` returned as one of the following:

• Fixed-effects design matrix — n-by-p matrix consisting of the fixed-effects design matrix of `glme`, where n is the number of observations and p is the number of fixed-effects terms.

• Random-effects design matrix — n-by-k matrix, consisting of the random-effects design matrix of `glme`. Here, k is equal to `length(B)`, where `B` is the random-effects coefficients vector of generalized linear mixed-effects model `glme`. The random-effects design matrix is returned as a sparse matrix. For more information, see Sparse Matrices (MATLAB).

If `glme` has R grouping variables g1, g2, ..., gR, with levels m1, m2, ..., mR, respectively, and if q1, q2, ..., qR are the lengths of the random-effects vectors that are associated with g1, g2, ..., gR, respectively, then `B` is a column vector of length q1*m1 + q2*m2 + ... + qR*mR.

`B` is made by concatenating the empirical Bayes predictors of random effects vectors corresponding to each level of each grouping variable as ```[g1level1; g1level2; ...; g1levelm1; g2level1; g2level2; ...; g2levelm2; ...; gRlevel1; gRlevel2; ...; gRlevelmR]'```.

Data Types: `single` | `double`

Submatrix of random-effects design matrix that corresponds to the grouping variables specified by `gnumbers`, returned as an n-by-k matrix, where k is length of the column vector `Bsub`.

`Bsub` contains the concatenated empirical Bayes predictors of random-effects vectors, corresponding to each level of the grouping variables, specified by `gnumbers`.

If, for example, `gnumbers` is `[1,3,r]`, this corresponds to the grouping variables g1, g3, and gr. Then, `Bsub` contains the empirical Bayes predictors of random-effects vectors corresponding to each level of the grouping variables g1, g3, and gr, such as

```[g1level1; g1level2; ...; g1levelm1; g3level1; g3level2; ...; g3levelm3; grlevel1; grlevel2; ...; grlevelmr]'```.

Thus, `Dsub*Bsub` represents the contribution of all random effects corresponding to grouping variables g1, g3, and gr to the response of `glme`.

If `gnumbers` is empty, then `Dsub` is the full random-effects design matrix.

Data Types: `single` | `double`

Names of grouping variables corresponding to the integers in `gnumbers` if the design type is `'Random'`, returned as a k-by-1 cell array. If the design type is `'Fixed'`, then `gnames` is an empty matrix `[]`.

Data Types: `cell`

## Examples

expand all

`load mfr`

This simulated data is from a manufacturing company that operates 50 factories across the world, with each factory running a batch process to create a finished product. The company wants to decrease the number of defects in each batch, so it developed a new manufacturing process. To test the effectiveness of the new process, the company selected 20 of its factories at random to participate in an experiment: Ten factories implemented the new process, while the other ten continued to run the old process. In each of the 20 factories, the company ran five batches (for a total of 100 batches) and recorded the following data:

• Flag to indicate whether the batch used the new process (`newprocess`)

• Processing time for each batch, in hours (`time`)

• Temperature of the batch, in degrees Celsius (`temp`)

• Categorical variable indicating the supplier (`A`, `B`, or `C`) of the chemical used in the batch (`supplier`)

• Number of defects in the batch (`defects`)

The data also includes `time_dev` and `temp_dev`, which represent the absolute deviation of time and temperature, respectively, from the process standard of 3 hours at 20 degrees Celsius.

Fit a generalized linear mixed-effects model using `newprocess`, `time_dev`, `temp_dev`, and `supplier` as fixed-effects predictors. Include a random-effects term for intercept grouped by `factory`, to account for quality differences that might exist due to factory-specific variations. The response variable `defects` has a Poisson distribution, and the appropriate link function for this model is log. Use the Laplace fit method to estimate the coefficients. Specify the dummy variable encoding as `'effects'`, so the dummy variable coefficients sum to 0.

The number of defects can be modeled using a Poisson distribution

`${\text{defects}}_{ij}\sim \text{Poisson}\left({\mu }_{ij}\right).$`

This corresponds to the generalized linear mixed-effects model

`$\mathrm{log}\left({\mu }_{ij}\right)={\beta }_{0}+{\beta }_{1}{\text{newprocess}}_{ij}+{\beta }_{2}{\text{time}\text{_}\text{dev}}_{ij}+{\beta }_{3}{\text{temp}\text{_}\text{dev}}_{ij}+{\beta }_{4}{\text{supplier}\text{_}\text{C}}_{ij}+{\beta }_{5}{\text{supplier}\text{_}\text{B}}_{ij}+{b}_{i},$`

where

• ${\text{defects}}_{ij}$ is the number of defects observed in the batch produced by factory $i$ during batch $j$.

• ${\mu }_{ij}$ is the mean number of defects corresponding to factory $i$ (where $i=1,2,...,20$) during batch $j$ (where $j=1,2,...,5$).

• ${\text{newprocess}}_{ij}$, ${\text{time}\text{_}\text{dev}}_{ij}$, and ${\text{temp}\text{_}\text{dev}}_{ij}$ are the measurements for each variable that correspond to factory $i$ during batch $j$. For example, ${\text{newprocess}}_{ij}$ indicates whether the batch produced by factory $i$ during batch $j$ used the new process.

• ${\text{supplier}\text{_}\text{C}}_{ij}$ and ${\text{supplier}\text{_}\text{B}}_{ij}$ are dummy variables that use effects (sum-to-zero) coding to indicate whether company `C` or `B`, respectively, supplied the process chemicals for the batch produced by factory $i$ during batch $j$.

• ${b}_{i}\sim N\left(0,{\sigma }_{b}^{2}\right)$ is a random-effects intercept for each factory $i$ that accounts for factory-specific variation in quality.

`glme = fitglme(mfr,'defects ~ 1 + newprocess + time_dev + temp_dev + supplier + (1|factory)','Distribution','Poisson','Link','log','FitMethod','Laplace','DummyVarCoding','effects');`

Extract the fixed-effects design matrix and display rows 1 through 10.

```Dfe = designMatrix(glme,'Fixed'); disp(Dfe(1:10,:))```
``` 1.0000 0 0.1834 0.2259 1.0000 0 1.0000 0 0.3035 0.0725 0 1.0000 1.0000 0 0.0717 0.1630 1.0000 0 1.0000 0 0.1069 0.0809 -1.0000 -1.0000 1.0000 0 0.0241 0.0319 1.0000 0 1.0000 0 0.1214 0.1114 0 1.0000 1.0000 0 0.0033 0.0553 1.0000 0 1.0000 0 0.2350 0.0616 1.0000 0 1.0000 0 0.0488 0.0177 0 1.0000 1.0000 0 0.1148 0.0105 1.0000 0 ```

Column 1 of the fixed-effects design matrix `Dfe` contains the constant term. Column 2, 3, and 4 contain the `newprocess`, `time_dev`, and `temp_dev` terms, respectively. Columns 5 and 6 contain dummy variables for `supplier_C` and `supplier_B`, respectively.

Extract the random-effects design matrix and display rows 1 through 10.

```Dre = designMatrix(glme,'Random'); disp(Dre(1:10,:))```
``` (1,1) 1 (2,1) 1 (3,1) 1 (4,1) 1 (5,1) 1 (6,2) 1 (7,2) 1 (8,2) 1 (9,2) 1 (10,2) 1 ```

Convert the sparse matrix `Dre` to a full matrix and display rows 1 through 10.

`full(Dre(1:10,:))`
```ans = 10×20 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ```

Each column corresponds to a level of the grouping variable `factory`.