Help creating an array with uniformly distributed random numbers (row-wise) comprised between 0 and 1, with each column having a sum of 1
Hi All,
I am trying to solve this problem, but I am not even sure this is possible.
I need to run a model 1e8+ times. To do this - making sure the model is unbiased - I need to obtain 4 random values comprised between 0 and 1 at each run, and their sum must equal 1 (these values will be area fractions).
So I have tried the following:
Random_Fraction_a = rand(1, 4); % Random Numbers
Random_Fraction_b = Random_Fraction_a ./ sum(Random_Fraction_a); % Normalised so that the sum equals 1
However, as the figure below shows, 'rand' returns uniformly distributed random numbers, which is exactly what I am after, but once I normalise them so that they sum to 1, the distribution changes completely. It is my understanding that this is "statistically normal", but it is not what I need. Do you know of, or can you think of, an alternative way to obtain 4 random decimal values between 0 and 1 that sum to 1 and are uniformly distributed row-wise?
Any help is greatly appreciated!
See figure (and code) below:

% Example code
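Rand_Fa = zeros(4, 1e5); % Preallocate so the arrays do not grow inside the loop
Rand_Fb = zeros(4, 1e5);
for i=1:1e5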
Random_Fraction_a = rand(1, 4);
Random_Fraction_b = Random_Fraction_a ./ sum(Random_Fraction_a);
Rand_Fa(:,i)=Random_Fraction_a;
Rand_Fb(:,i)=Random_Fraction_b;
end
figure
for ii=1:4
subplot(2,2,ii)
histogram(Rand_Fa(ii,:))
hold on
histogram(Rand_Fb(ii,:),"FaceAlpha",0.5)
legend('Rand Output','After Normalisation')
end
Accepted Answer
More Answers (1)
This is a mistake I see made very frequently: a misunderstanding of what it means for a sample to be uniformly distributed under a sum constraint. The two ideas are somewhat at odds, since once that constraint enters the problem, a sample cannot be uniformly distributed in every sense we might expect.
One way to see this is to start in 2 dimensions. We have x1 and x2, each uniformly distributed, but we want their sum to be 1. The answer is simple: choose x1 uniform on the interval [0,1], then compute x2 = 1 - x1. We can think of the result as choosing a sample uniformly along the straight line x1 + x2 = 1.
x1 = rand(1,500);
x2 = 1 - x1;
plot(x1,x2,'o')
You can create the desired array now, as
X = [x1;x2];
So the columns of X are uniformly distributed, under a sum constraint. Sadly, things get messier if we want to live in 3 dimensions. We cannot simply choose x1 and x2 independently now, because some of the time x1+x2 will be greater than 1, which leaves no room for a non-negative x3.
The randfixedsum code solves the problem elegantly, by sampling correctly so that the set is uniform, but it will lie in a TRIANGLE. Try it.
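To make that failure concrete, here is a minimal sketch of the naive approach patched with rejection: draw x1 and x2 independently, discard any pair whose sum exceeds 1, and set x3 to whatever is left. This is only an illustration (it is not how randfixedsum works), and the variable names N and keep are my own. About half of the pairs get thrown away, since the triangle x1+x2<=1 covers half of the unit square.

```matlab
% Sketch: rejection sampling for 3 values that sum to 1 (illustrative only)
rng(0);                          % fixed seed so the run is repeatable
N  = 1e4;                        % number of candidate pairs (arbitrary choice)
x1 = rand(1, N);
x2 = rand(1, N);
keep = (x1 + x2) <= 1;           % pairs that leave room for a non-negative x3
X = [x1(keep); x2(keep); 1 - x1(keep) - x2(keep)];
fprintf('Accepted fraction: %.3f\n', mean(keep));   % roughly 0.5
```

The accepted columns are uniform over the triangle, but the cost grows quickly: with n values, only about 1/(n-1)! of the candidates survive, so this is not something you would want for large n.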
X = randfixedsum(3,1000,1,0,1);
plot3(X(1,:),X(2,:),X(3,:),'o')
box on
grid on
axis equal
view(48,28)
Now the points are seen to be uniformly distributed inside a TRIANGLE. However, you cannot now look at the marginal distribution of any single variable and expect it to look uniform as well. This is your mistake. In fact, the histogram will now look like a triangular distribution, with most of the samples near zero.
histogram(X(1,:),50)
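If you want to check that triangular shape without randfixedsum (which is not part of base MATLAB), the following sketch may help. It assumes the special case used here, lower bound 0, upper bound 1, sum 1, where a uniform sample on the simplex can also be obtained by normalising exponential draws. The marginal of any one variable is then a Beta(1, n-1) distribution, with density (n-1)*(1-x)^(n-2), which for n = 3 is the straight line 2*(1-x). The names E, n and N are my own.

```matlab
% Sketch: uniform sample on the simplex via normalised exponentials,
% with the theoretical marginal overlaid (assumes bounds [0,1] and sum 1)
rng(1);                          % fixed seed so the run is repeatable
n = 3;                           % number of values per column
N = 1e5;                         % number of columns
E = -log(rand(n, N));            % exponential(1) draws
X = E ./ sum(E, 1);              % each column now sums to 1
histogram(X(1,:), 50, 'Normalization', 'pdf')
hold on
x = linspace(0, 1, 200);
plot(x, (n-1)*(1-x).^(n-2), 'LineWidth', 1.5)   % Beta(1, n-1) density
hold off
legend('Sampled x_1', 'Theoretical marginal')
```

Changing n to 4 gives the case in the question. The histogram should follow the curve closely, and the mean of each row should sit near 1/n.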
Now, try the same thing for 10 rows. I'll use a larger sample this time to make the histogram look pretty.
X = randfixedsum(10,100000,1,0,1);
size(X)
histogram(X(1,:),50,'normalization','pdf')
Again, just because the sample is uniform in one respect does NOT mean the marginal distribution should also be uniform. If we could plot the sample on a 10-dimensional monitor (sorry, I don't have one), then the points would again be seen to be uniformly sampled inside a simplex (an analogue of a triangle) living in 10-dimensional space.
You CANNOT have everything your intuition demands. Sorry, but too often a simple intuition runs counter to mathematical fact.
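A quick arithmetic check shows why. If each of n values were uniform on [0,1], each would have mean 0.5, so the sum would average n/2. That equals 1 only when n = 2, which is why the 2-dimensional case works and the 4-value case cannot. The sketch below just confirms the n/2 figure numerically for unconstrained uniform draws; the names U and colSums are my own.

```matlab
% Sketch: unconstrained uniform draws have a mean column sum of n/2, not 1
rng(2);                          % fixed seed so the run is repeatable
n = 4;                           % values per column, as in the question
N = 1e5;                         % number of columns
U = rand(n, N);                  % every entry uniform on [0,1]
colSums = sum(U, 1);
fprintf('Mean column sum: %.3f (n/2 = %.1f)\n', mean(colSums), n/2);
```

Under the sum constraint each value must instead average 1/n, so for n = 4 the marginals are forced away from uniform on [0,1].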
1 Comment
Simone A.
on 15 Jul 2023