## Pearson Distribution

The Pearson distribution is a four-parameter distribution that has an arbitrary mean, standard deviation, skewness, and kurtosis. This distribution is often used to model asymmetric data that is prone to outliers.

Statistics and Machine Learning Toolbox™ offers two ways to work with the Pearson distribution:

Use distribution-specific functions (

`pearspdf`

,`pearscdf`

,`pearsrnd`

) with specified distribution parameters. The distribution-specific functions can accept parameters of multiple Pearson distributions.Use generic distribution functions (

`cdf`

,`pdf`

,`random`

) with the distribution name`"Pearson"`

and specified distribution parameters.

### Types

The Pearson distribution has eight types, most of which correspond to other known distributions.

Pearson Distribution Type | Description |
---|---|

`0` | Normal |

`1` | 4-parameter beta |

`2` | Symmetric 4-parameter beta |

`3` | 3-parameter gamma |

`4` | Distribution specific to the Pearson system with pdf proportional to $${\left(1+{\left(\frac{x-\mu}{\sigma}\right)}^{2}\right)}^{-a}\mathrm{exp}(-b\mathrm{arctan}(\frac{x-\mu}{\sigma}))$$, where a and b are
quantities related to the differential equation that defines the Pearson
distribution |

`5` | Inverse 3-parameter gamma |

`6` | F
location scale |

`7` | Student's t location
scale |

### Parameters

The Pearson distribution uses the following parameters.

Parameter | Description |
---|---|

μ | Mean |

σ | Standard deviation |

γ | Skewness. γ is a measure of the asymmetry
of the data around the sample mean. If the skewness is negative,
the data spreads out more to the left of the mean than to the
right. If the skewness is positive, the data spreads out more to
the right. γ^{2} must
be less than κ – 1. |

κ | Kurtosis. κ is a measure of how prone a
distribution is to outliers. The kurtosis of the normal
distribution is 3. Distributions that are more prone
to outliers than the normal distribution have a kurtosis value
greater than 3; distributions that are less prone have a
kurtosis value less than 3. κ must be greater
than γ^{2} +
1. |

### Probability Density Function

The Pearson distribution probability density function (pdf) is the solution to the differential equation

$$\frac{p\text{'}(x)}{p(x)}=-\frac{a+(x-\mu )}{{b}_{0}+{b}_{1}(x-\mu )+{b}_{2}{(x-\mu )}^{2}},$$

where the system is defined by the coefficients $${b}_{j}$$ for 1 ≤ *j* ≤ 3. For most distribution types, the
pdf is a closed-form function. The following table describes the pdf for each
distribution type.

Pearson Distribution Type | pdf p(x) |
---|---|

`0` |
$$\frac{1}{\sigma \sqrt{2\pi}}{e}^{\frac{-{(x-\mu )}^{2}}{2{\sigma}^{2}}}$$ |

`1` | $$\frac{{(x-lb)}^{a-1}{(ub-x)}^{b-1}}{B(a,b){(ub-lb)}^{a+b-1}}$$, where B is the Beta Function, lb and ub are the
lower and upper bounds of the distribution (respectively), a > 0 is a shape parameter, and b > 0 is a scale parameter |

`2` | $$\frac{{(x+ub)}^{a-1}{(ub-x)}^{b-1}}{B(a,b){(2ub)}^{a+b-1}}$$ |

`3` | $$\frac{1}{{b}^{a}\Gamma (a)}{(x-lb)}^{a-1}{e}^{-\frac{x-lb}{b}}$$, where Γ is the Gamma Function |

`4` | $$\frac{{\left|\frac{\Gamma \left(m+\frac{\nu}{2}i\right)}{\Gamma (m)}\right|}^{2}}{\sigma B\left(m-\frac{1}{2},\frac{1}{2}\right)}{\left[1+{u}^{2}\right]}^{-m}\mathrm{exp}\left[-\nu \mathrm{arctan}\left(u\right)\right],u=\frac{x-\mu}{\sigma}$$, where m > 0 and ν > 0 are shape parameters |

`5` | $$\frac{{b}^{a}{e}^{\frac{-b}{u}}}{\sigma {u}^{a+1}\Gamma (a)},u=\frac{x-\mu}{\sigma}$$ |

`6` | $$\frac{1}{\sigma}\frac{\Gamma \left[\frac{({\nu}_{1}+{\nu}_{2})}{2}\right]}{\Gamma \left(\frac{{\nu}_{1}}{2}\right)\Gamma \left(\frac{{\nu}_{2}}{2}\right)}{\left(\frac{{\nu}_{1}}{{\nu}_{2}}\right)}^{{}^{\frac{{\nu}_{1}}{2}}}\frac{{u}^{\frac{{\nu}_{1}-2}{2}}}{{\left[1+\left(\frac{{\nu}_{1}}{{\nu}_{2}}\right)u\right]}^{\frac{\left({\nu}_{1}+{\nu}_{2}\right)}{2}}},u=\frac{x-\mu}{\sigma}$$, where ν_{1} >
0 and ν_{2} >
0 are shape parameters |

`7` | $$\frac{\Gamma \left(\frac{\nu +1}{2}\right)}{\sigma \sqrt{\nu \pi}\Gamma \left(\frac{\nu}{2}\right)}{\left[\frac{\nu +{u}^{2}}{\nu}\right]}^{-\left(\frac{\nu +1}{2}\right)},u=\frac{x-\mu}{\sigma}$$ |

### Cumulative Distribution Function

The Pearson distribution cumulative distribution function (cdf) is the integral of the pdf. The following table describes the cdf for each distribution type.

Pearson Distribution Type | cdf c(x) |
---|---|

`0` | $$\frac{1}{\sigma \sqrt{2\pi}}{\displaystyle {\int}_{-\infty}^{x}{e}^{\frac{-{(t-\mu )}^{2}}{2{\sigma}^{2}}}}dt$$ |

`1` | $$\frac{1}{B(a,b){(ub-lb)}^{a+b-1}}{\displaystyle {\int}_{lb}^{x}{(t-lb)}^{a-1}}{(ub-t)}^{b-1}dt$$, where B is the Beta Function, lb and ub are the
lower and upper bounds of the distribution (respectively), a > 0 is a shape parameter, and b > 0 is a scale parameter |

`2` |
$$\frac{1}{B(a,b){(2ub)}^{a+b-1}}{\displaystyle {\int}_{-ub}^{x}{(t+ub)}^{a-1}}{(ub-t)}^{b-1}dt$$ |

`3` | $$\frac{1}{b\Gamma (a)}{\displaystyle {\int}_{lb}^{x}{(t-lb)}^{a-1}}{e}^{-\frac{t-lb}{b}}dt$$ |

`4` | A type 4 Pearson distribution does not have a closed-form
cdf. You can evaluate the type 4 Pearson distribution cdf at a
point x by numerically integrating the pdf
from –∞ to x. |

`5` | $$Q\left(a,\frac{b}{u}\right),u=\frac{x-\mu}{\sigma}$$, where Q is the Incomplete Gamma Function |

`6` | $${I}_{{\nu}_{1}u/({\nu}_{1}u+{\nu}_{2})}\left(\frac{{\nu}_{1}}{2},\frac{{\nu}_{2}}{2}\right),u=\frac{x-\mu}{\sigma}$$, where I is the regularized
incomplete beta function, and ν_{1} >
0 and ν_{2} >
0 are shape parameters |

`7` | $${\int}_{-\infty}^{x}\frac{\Gamma \left(\frac{\nu +1}{2}\right)}{\Gamma \left(\frac{\nu}{2}\right)}}\frac{1}{\sigma \sqrt{\nu \pi}}\frac{1}{{\left(1+\frac{{t}^{2}}{\nu}\right)}^{\frac{\nu +1}{2}}}dt$$, where ν > 0 is a shape parameter |

### Support

For some Pearson distribution types, support for the pdf and cdf is given by the
coefficients $${b}_{j}$$ in the differential equation that defines the pdf. The following
table shows the support for the Pearson distribution pdf and cdf when *μ* = 0 and *σ* = 1. The variables `a1`

and `a2`

are
solutions to the equation $${b}_{0}+{b}_{1}(x-\mu )+{b}_{2}{(x-\mu )}^{2}=0$$, and `a1 < a2`

.

Pearson Distribution Type | Support |
---|---|

`0` | `(-Inf,Inf)` |

`1` | `(a1,a2)` |

`2` | `(-a1,a1)` |

`3` | `(a1,Inf)` when `a>0` and
`(-Inf,a1)` when
`a<0` |

`4` | `(-Inf,Inf)` |

`5` | `(-C1,Inf)` when ```
(b1-C1)/b2
<0
``` , and `(-Inf,C1)` otherwise.
`C1 = b1/(2*b2)` . |

`6` | `(a2,Inf)` when `a1` and
`a2` are negative, and
`(-Inf,a1)` when `a1` and
`a2` are positive |

`7` | `(-Inf,Inf)` |

For distributions with *μ* ≠ 0 or *σ* ≠ 1, the bounds of the support are shifted from the bounds given in
the preceding table. In this case, you can calculate the lower and upper bounds
*lb* and *ub* as follows:

*lb*= σ*lb*^{*}+μ*ub*= σ*ub*^{*}+μ

where *lb*^{*} and
*ub*^{*} are the lower and upper bounds
given in the preceding table for the same distribution type.

### Examples

#### Compare Pearson Distributions

Create the variables `mu0`

, `sigma0`

, `skew0`

, and `kurt0`

, which contain the parameters for a Pearson distribution of type 0.

mu0 = 0; sigma0 = 1; skew0 = 0; kurt0 = 3;

Use the `pearspdf`

and `pearscdf`

functions to evaluate the pdf and cdf, respectively, for the type 0 Pearson distribution between –5 and 5. You can create a vector of points between –5 and 5 by using the `linspace`

function. Confirm that `mu0`

, `sigma0`

, `skew0`

, and `kurt0`

define a Pearson distribution of type 0.

x0 = linspace(-5,5,100); [p0,type0] = pearspdf(x0,mu0,sigma0,skew0,kurt0); c0 = pearscdf(x0,mu0,sigma0,skew0,kurt0); type0

type0 = 0

The output shows that `p0`

contains the pdf for a Pearson distribution of type 0, which is the standard normal distribution.

Draw a random sample of points from the distribution by using the `pearsrnd`

function.

rng(0,"twister") % For reproducibility r0 = pearsrnd(mu0,sigma0,skew0,kurt0,[100,1]);

Repeat the process for a Pearson distribution of type 4. Define the variables `mu4`

, `sigma4`

, `skew4`

, and `kurt4`

. Evaluate the pdf and cdf between –5 and 15, and draw a random sample from the distribution.

mu4 = 5; sigma4 = 1; skew4 = 1; kurt4 = 10; x4 = linspace(-5,15,100); [p4,type4] = pearspdf(x4,mu4,sigma4,skew4,kurt4); c4 = pearscdf(x4,mu4,sigma4,skew4,kurt4); r4 = pearsrnd(mu4,sigma4,skew4,kurt4,[100,1]);

Confirm that `mu4`

, `sigma4`

, `skew4`

, and `kurt4`

define a Pearson distribution of type 4.

type4

type4 = 4

Repeat the process for a Pearson distribution of type 6, evaluating the pdf and cdf between –10 and 10.

mu6 = 0; sigma6 = 5; skew6 = 3; kurt6 = 20; x6 = linspace(-10,10,100); [p6,type6] = pearspdf(x6,mu6,sigma6,skew6,kurt6); c6 = pearscdf(x6,mu6,sigma6,skew6,kurt6); r6 = pearsrnd(mu6,sigma6,skew6,kurt6,[100,1]);

Confirm that `mu6`

, `sigma6`

, `skew6`

, and `kurt6`

define a Pearson distribution of type 6.

type6

type6 = 6

Use the `tiledlayout`

and `nexttile`

functions to display box plots of the random samples, pdfs, and cdfs for the Pearson distributions of type 0, 4 and 6. Create box plots of the random samples using the `boxchart`

function.

tiledlayout(3,3) nexttile boxchart(r0) title("Random Sample") ylabel("Type 0",FontWeight="bold") nexttile plot(x0,p0) title("PDF") nexttile plot(x0,c0) title("CDF") nexttile boxchart(r4) ylabel("Type 4",FontWeight="bold") nexttile plot(x4,p4) nexttile plot(x4,c4) nexttile boxchart(r6) ylabel("Type 6",FontWeight="bold") nexttile plot(x6,p6) nexttile plot(x6,c6)

The rows of the figure correspond to the three Pearson distribution types. The first column contains a box plot of the random samples for each distribution. The type 6 Pearson distribution has the largest number of outliers, which is consistent with it having the largest kurtosis of the three distributions. The second column contains a plot of the pdf for each distribution. The pdfs for the type 0 and type 4 Pearson distributions are unbounded, and the type 6 Pearson distribution has a lower bound. The third column shows a plot of the cdf for each distribution. The type 0 and type 4 Pearson distribution cdfs are similarly S-shaped because their pdfs have similar shapes. The type 6 Pearson distribution cdf is concave for values greater than the lower bound.

To calculate the type 6 Pearson distribution lower bound, return the coefficients of the polynomial in the denominator of the ordinary differential equation that defines the Pearson distribution pdf. For more information, see Probability Density Function and Support.

[~,~,coefs6] = pearspdf([],mu6,sigma6,skew6,kurt6)

`coefs6 = `*1×3*
0.7162 0.9324 0.0946

From left to right, the coefficients correspond to terms of increasing order.

Find the roots of the polynomial function by using the `roots`

function. Use the `fliplr`

function to format `coefs6`

so that, from left to right, the coefficients correspond to terms of decreasing order.

coefs6 = fliplr(coefs6); roots6 = roots(coefs6)

`roots6 = `*2×1*
-9.0175
-0.8396

The roots of the polynomial are negative, indicating that the type 6 Pearson pdf has a lower bound.

To calculate the lower bound, multiply the largest root by `sigma6`

and add the result to `mu6`

.

lb6 = sigma6*max(roots6) + mu6

lb6 = -4.1982

The lower bound for the support of the type 6 Pearson distribution pdf is near `–4`

, which is consistent with the plot of the pdf.

## References

[1] Johnson, Norman Lloyd, et al. "Continuous Univariate Distributions." 2nd ed, vol. 1, Wiley, 1994.

[2] Willink, R. "A Closed-Form
Expression for the Pearson Type IV Distribution Function." *Australian & New Zealand Journal of Statistics*, vol. 50, no. 2,
June 2008, pp. 199–205. https://onlinelibrary.wiley.com/doi/10.1111/j.1467-842X.2008.00508.x

## See Also

`pearspdf`

| `pearscdf`

| `pearsrnd`

| `pdf`

| `cdf`

| `random`

| `skewness`

| `kurtosis`