How to choose initial component parameters with gmdistribution.fit?

Question

Stamatis Samaras on 15 Oct 2013

https://uk.mathworks.com/matlabcentral/answers/90288-how-to-choose-initial-component-parameters-with-gmdistribution-fit

Commented: Stamatis Samaras on 15 Oct 2013

I am training a Gaussian mixture model with gmdistribution.fit. As far as I can tell, the initial component parameters are chosen at random. I would like to change this and initialize the component parameters with k-means instead of random values. Any idea how to do this?

The code for training my model is:

options = statset('MaxIter',500,'Display','final');
models(BrandId).gmm = gmdistribution.fit(data',3,'CovType',...
    'diagonal','Options',options);

Answer 1

Jonathan LeSage on 15 Oct 2013

https://uk.mathworks.com/matlabcentral/answers/90288-how-to-choose-initial-component-parameters-with-gmdistribution-fit#answer_99780

Once you have clustered your data with the k-means algorithm, you can use the cluster centers as initial conditions for the Gaussian mixture fit. The trick is that the initial conditions passed to gmdistribution.fit must be in the proper form (a structure). More information on the function can be found in the documentation:

http://www.mathworks.com/help/stats/gmdistribution.fit.html

The other trick is that the Gaussian mixture routine requires three initial conditions: the initial cluster means (which you get from the k-means centers), the initial cluster covariances, and the initial mixture weights. The k-means centers only cover the first of these, so you can set the covariances and weights to arbitrary valid values, for example small positive variances and equal weights.
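As a shortcut that avoids building the structure by hand, the documentation for the 'Start' option also lists a vector of initial component indices (one per observation) as an accepted value. The sketch below passes k-means labels in that form. The data matrix X and its three clusters are made up for illustration, and the 'CovType' and 'Options' settings are copied from the question. Treat it as a minimal sketch rather than a definitive recipe.

```matlab
% Hypothetical example data: 300 points in 2-D around three centers
rng(0);                                   % reproducible random numbers
X = [randn(100,2); ...
     randn(100,2) + 6; ...
     bsxfun(@plus, randn(100,2), [6 -6])];
k = 3;

% Hard cluster labels from k-means (one integer in 1..k per row of X)
idx = kmeans(X, k, 'Replicates', 5);

% Pass the labels as 'Start'; the initial means, covariances and
% mixing proportions are then derived from this partition
options = statset('MaxIter', 500, 'Display', 'final');
gmm = gmdistribution.fit(X, k, 'CovType', 'diagonal', ...
    'Start', idx, 'Options', options);
```

In newer releases the documentation points to fitgmdist as the replacement for gmdistribution.fit, so check the option names for your version before reusing this.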

To help get you started, here is some example code:

% Arbitrary 1-d data vector
dataLength = 5000;
muData = [5 30];
stdData = [4 10];
dataVec = [muData(1) + stdData(1)*randn(dataLength/2,1); ...
    muData(2) + stdData(2)*randn(dataLength/2,1)];
% K-means to initially cluster the data
% The second output of the kmeans function is the matrix of cluster centers
numberOfClusters = 2;
[~,kMeansClusters] = kmeans(dataVec,numberOfClusters);
% Fit GMM using the k-means centers as the initial conditions
% We only have mean initial conditions from the k-means algorithm, so we
% can specify some arbitrary initial variance and mixture weights.
gmInitialVariance = 0.1;
initialSigma = cat(3,gmInitialVariance,gmInitialVariance);
% Initial weights are set at 50%
initialWeights = [0.5 0.5];
% Initial condition structure for the gmdistribution.fit function
S.mu = kMeansClusters;
S.Sigma = initialSigma;
S.PComponents = initialWeights;
gmmOfData = gmdistribution.fit(dataVec,numberOfClusters,'Start',S);
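Instead of arbitrary variances and weights, all three initial conditions can be estimated from the k-means partition itself. The following is a minimal sketch for the setup in the question (three components, 'diagonal' covariance), assuming an n-by-d data matrix X with observations in rows. The data, the variable names, and the small 1e-6 floor on the variances are assumptions made for illustration. For diagonal covariances the documented shape of S.Sigma is 1-by-d-by-k.

```matlab
% Hypothetical data: n-by-d matrix, rows are observations
rng(1);                                   % reproducible random numbers
X = [randn(200,2); ...
     bsxfun(@plus, 2*randn(200,2), [8 0]); ...
     bsxfun(@plus, randn(200,2), [4 9])];
k = 3;
[n, d] = size(X);

% k-means returns labels (idx) and the k-by-d matrix of centers (C)
[idx, C] = kmeans(X, k, 'Replicates', 5);

% Build the Start structure from the k-means partition
S.mu = C;
S.Sigma = zeros(1, d, k);                 % 1-by-d-by-k for 'diagonal'
S.PComponents = zeros(1, k);
for j = 1:k
    members = X(idx == j, :);
    % Per-dimension sample variance, with a small floor to keep it positive
    S.Sigma(1, :, j) = var(members, 0, 1) + 1e-6;
    % Fraction of the points assigned to cluster j
    S.PComponents(j) = size(members, 1) / n;
end

options = statset('MaxIter', 500, 'Display', 'final');
gmm = gmdistribution.fit(X, k, 'CovType', 'diagonal', ...
    'Start', S, 'Options', options);
```

With the question's variables, X would be data' (the transpose already used in the original call), and the result would be stored in models(BrandId).gmm.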

Hope this helps and good luck!

1 Comment

Stamatis Samaras on 15 Oct 2013

This is definitely something to get started with, and it's really close to what I am looking for. Thanks!
