How is it possible to get groups of data with the same distribution using kstest2?

I would like to have 30 series of data in three groups of differing distribution similarity. That is, if I have x1 and x2 drawn from two distinct distributions, kstest2 should recognise that they do not have the same distribution.
x1 = wblrnd(1,1,50,1);   % Weibull distribution (scale 1, shape 1)
x2 = gamrnd(1,1,[50 1]); % Gamma distribution (shape 1, scale 1)
[h,p,k] = kstest2(x1,x2)
But when I extend it as follows, it shows different results:
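x1 = wblrnd(1,1,50,10);    % 10 Weibull series, one per column
x2 = gamrnd(1,1,[50 10]);  % 10 Gamma series, one per column
x3 = [x1,x2];              % columns 1-10: Weibull, columns 11-20: Gamma
h = zeros(20); p = zeros(20); k = zeros(20);  % preallocate the result matrices
for i = 1:20
    for j = 1:20
        % compare column i with column j
        [h(i,j),p(i,j),k(i,j)] = kstest2(x3(:,i),x3(:,j));
    end
end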
In my mind, the Weibull distribution vectors should give h=1, and the same for the Gamma distribution vectors. However, the hypothesis test results show something different.
Now, the question is: how can I obtain 10 series with low (around 0.2), 10 series with medium (around 0.5) and 10 series with high (around 0.8) KS statistics? In total, I want to plot the CDFs of these 30 series together to see how the data group according to distribution similarity. Naturally, when the Kolmogorov-Smirnov statistic is near zero, the distributions of the two data sets are similar, and they may belong to the same cluster.

Answers (1)

Aditya Patil on 23 Dec 2020
As I understand it, the results are not the same for every pair of samples drawn from the datasets.
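To illustrate how much the outcome varies from one pair of samples to the next, here is a minimal sketch that repeats the test from the question many times. The seed, the number of repetitions (nRep) and the variable names are assumptions chosen for illustration. It is also worth noting that a Weibull distribution with shape 1 and a Gamma distribution with shape 1 both reduce to an exponential distribution with mean 1, so with the parameters in the question the two populations are in fact identical and h=0 is the expected outcome for most pairs.

```matlab
% Sketch: how often does kstest2 reject for repeated Weibull-vs-Gamma samples?
% wblrnd(1,1,...) and gamrnd(1,1,...) both draw from an exponential
% distribution with mean 1, so the null hypothesis is true here.
rng(0);                          % fixed seed (assumption, for repeatability)
nRep = 1000;                     % number of repeated experiments (illustrative)
n    = 50;                       % sample size used in the question
h    = false(nRep,1);            % test decisions
k    = zeros(nRep,1);            % KS statistics
for r = 1:nRep
    x1 = wblrnd(1,1,n,1);        % Weibull (scale 1, shape 1)
    x2 = gamrnd(1,1,[n 1]);      % Gamma (shape 1, scale 1)
    [h(r),~,k(r)] = kstest2(x1,x2);
end
rejectRate = mean(h);            % fraction of pairs with h = 1
fprintf('Rejected in %.1f%% of pairs; KS statistic ranged from %.2f to %.2f\n', ...
        100*rejectRate, min(k), max(k));
```

At the default 5% significance level, a small fraction of pairs should still come out with h=1 purely by chance, which would explain the scattered rejections in the 20-by-20 matrix.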
Because kstest2 works on finite samples, two sets of samples from different distributions can look similar. It is important to note that a statistical test should not be treated as proof, especially when the number of samples is small.
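Building on the point about sample size, one possible way to obtain series whose KS statistic against a reference is close to 0.2, 0.5 and 0.8 is to shift a normal distribution. This is only a sketch under stated assumptions, not the only approach: the normal family, the reference series ref, the sample size n and the seed are all choices made here for illustration. For N(0,1) against N(d,1), the largest gap between the two CDFs occurs at x = d/2 and equals 2*normcdf(d/2) - 1, so a target distance D corresponds to d = 2*norminv((D+1)/2).

```matlab
% Sketch: 30 series in three groups with KS statistics near 0.2, 0.5 and 0.8
% measured against a common N(0,1) reference sample (an assumed setup).
rng(1);                                   % fixed seed (assumption)
n       = 2000;                           % large n keeps the sample statistic near D
targets = [0.2 0.5 0.8];                  % desired KS statistics
nPer    = 10;                             % series per group
ref     = randn(n,1);                     % reference series
shifts  = 2*norminv((targets + 1)/2);     % mean shift giving each target distance
series  = zeros(n, nPer*numel(targets));
ks      = zeros(1, nPer*numel(targets));
for g = 1:numel(targets)
    for s = 1:nPer
        c = (g-1)*nPer + s;               % column index of this series
        series(:,c) = randn(n,1) + shifts(g);
        [~,~,ks(c)] = kstest2(ref, series(:,c));
    end
end
groupMean = mean(reshape(ks, nPer, []), 1);   % average KS statistic per group
disp(groupMean)

% Plot the empirical CDFs of all 30 series together
figure; hold on
for c = 1:size(series,2)
    [f,x] = ecdf(series(:,c));
    stairs(x,f);
end
hold off; xlabel('x'); ylabel('Empirical CDF');
```

With small samples such as n = 50, the individual statistics would scatter much more widely around the targets, so the three groups may overlap.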
