When I run the following K-means code, it gives me an error: "Normalization.....". Please help me get this code to run.
% K-means clustering

% ------------------- CLUSTERING PHASE -------------------
% NOTE: Normalize and euc_dist are not defined in this script; they are
% user-defined helpers that must be on the MATLAB path for it to run.

% Load the training set
TrSet = load('TrainingSet.txt');
[m,n] = size(TrSet);              % (m samples) x (n dimensions)
for i = 1:m
    % the output (last column) values (0,1,2,3) are mapped to (0,1)
    if TrSet(i,end) >= 1
        TrSet(i,end) = 1;
    end
end

% Find the range of each attribute (for normalization later)
for i = 1:n
    range(1,i) = min(TrSet(:,i));
    range(2,i) = max(TrSet(:,i));
end
x = Normalize(TrSet, range);      % normalize the data set to a hypercube
x(:,end) = [];                    % get rid of the output column
[m,n] = size(x);

nc = 2;                           % number of clusters = 2

% Initialize cluster centers to random points
c = zeros(nc,n);
for i = 1:nc
    rnd = randi(m);               % random row index in 1..m (int16(rand*m + 1) could reach m+1)
    c(i,:) = x(rnd,:);            % assign this vector value to cluster (i)
end

% Clustering loop
delta = 1e-5;
max_iter = 1000;                  % renamed from n so the dimension count is not overwritten
iter = 1;
while (iter < max_iter)
    % Determine the membership matrix U
    % u(i,j) = 1 if euc_dist(x(j),c(i)) <= euc_dist(x(j),c(k)) for each k ~= i
    % u(i,j) = 0 otherwise
    for i = 1:nc
        for j = 1:m
            d = euc_dist(x(j,:),c(i,:));
            u(i,j) = 1;
            for k = 1:nc
                if k ~= i
                    if euc_dist(x(j,:),c(k,:)) < d
                        u(i,j) = 0;
                    end
                end
            end
        end
    end

    % Compute the cost function J
    J(iter) = 0;
    for i = 1:nc
        JJ(i) = 0;
        for k = 1:m
            if u(i,k) == 1
                JJ(i) = JJ(i) + euc_dist(x(k,:),c(i,:));
            end
        end
        J(iter) = J(iter) + JJ(i);
    end

    % Stop if the improvement of J over the previous iteration
    % is below a certain threshold
    str = sprintf('iteration: %d, J=%g', iter, J(iter));
    disp(str);
    if (iter ~= 1) && (abs(J(iter-1) - J(iter)) < delta)
        break;
    end

    % Update the cluster centers
    % c(i) = mean of all vectors belonging to cluster (i)
    for i = 1:nc
        sum_x = 0;
        G(i) = sum(u(i,:));
        for k = 1:m
            if u(i,k) == 1
                sum_x = sum_x + x(k,:);
            end
        end
        if G(i) > 0               % keep the old center if the cluster is empty
            c(i,:) = sum_x ./ G(i);
        end
    end

    iter = iter + 1;
end % while
disp('Clustering Done.');

% ----------------- TESTING PHASE --------------------------
% Load the evaluation data set
EvalSet = load('EvaluationSet.txt');
[m,n] = size(EvalSet);
for i = 1:m
    if EvalSet(i,end) >= 1
        EvalSet(i,end) = 1;
    end
end
x = Normalize(EvalSet, range);
x(:,end) = [];
[m,n] = size(x);

% Assign evaluation vectors to their respective clusters according
% to their distance from the cluster centers
for i = 1:nc
    for j = 1:m
        d = euc_dist(x(j,:),c(i,:));
        evu(i,j) = 1;
        for k = 1:nc
            if k ~= i
                if euc_dist(x(j,:),c(k,:)) < d
                    evu(i,j) = 0;
                end
            end
        end
    end
end

% Analyze results
ev = EvalSet(:,end)';
rmse(1) = norm(evu(1,:)-ev)/sqrt(length(evu(1,:)));
rmse(2) = norm(evu(2,:)-ev)/sqrt(length(evu(2,:)));
subplot(2,1,1);
if rmse(1) < rmse(2)
    r = 1;
else
    r = 2;
end
str = sprintf('Testing Set RMSE: %f', rmse(r));
disp(str);
ctr = 0;
for i = 1:m
    if evu(r,i) == ev(i)
        ctr = ctr + 1;
    end
end
str = sprintf('Testing Set accuracy: %.2f%%', ctr*100/m);
disp(str);
[m,b,r] = postreg(evu(r,:),ev);   % Regression analysis
disp(sprintf('r = %.3f', r));
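The script calls two functions, Normalize and euc_dist, whose definitions are not included in the post. If they are missing from the MATLAB path, the call to Normalize would fail, which may be related to the reported error (the full message is not shown, so this is only a guess). Below is a minimal sketch of what these helpers might look like. It assumes, based on the script's comments, that Normalize does min-max scaling of each column to [0,1] using the 2-row range matrix (row 1 = minima, row 2 = maxima) and that euc_dist is the ordinary Euclidean distance between two row vectors. Both bodies are hypothetical, not the original author's code.

```matlab
% Small demo; local functions in a script need R2016b or later.
% Alternatively, save each function in its own file (Normalize.m, euc_dist.m).
data   = [1 10; 2 20; 3 30];
limits = [min(data, [], 1); max(data, [], 1)];   % row 1 = min, row 2 = max
xn = Normalize(data, limits);    % each column scaled to [0,1]
d  = euc_dist([0 0], [3 4]);     % distance between two row vectors

function xn = Normalize(x, limits)
% Hypothetical helper (assumption): min-max scale each column of x
% to [0,1] using limits(1,:) as minima and limits(2,:) as maxima.
lo   = limits(1,:);
span = limits(2,:) - limits(1,:);
span(span == 0) = 1;             % avoid division by zero for constant columns
xn = bsxfun(@rdivide, bsxfun(@minus, x, lo), span);
end

function d = euc_dist(a, b)
% Hypothetical helper (assumption): Euclidean distance between
% two row vectors of equal length.
d = sqrt(sum((a - b).^2));
end
```

Note that MATLAB function names are case-sensitive, so the built-in normalize (lowercase, available in newer releases) is a different function with a different calling syntax and would not be picked up by a call to Normalize(TrSet, range).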