MATLAB Answers

Row & Column Wise Normalisation

4 views (last 30 days)
edward holt
edward holt on 12 Feb 2020
Commented: edward holt on 13 Feb 2020
Objective: Normalise a matrix such that all rows and columns sum to 1.
The below normalises each column, then row and repeats until row and column totals, equal one another.
This seems to work for randomly generated arrays.
However, the data I wish to use it on has some zeros - and that is generating lots of NaN and Infs, which is making things quite messy and sometimes when running the while loop won't execute (no error message, it just hops over it)
I've tried changing the while condition to be rounded to 3 decimal places (because that's good enough) but still no success.
a = rand(7)
rows = sum(a,2) % orginal row totals
cols = sum(a,1) % original col totals
b = a;
i = 1; % for counting how many iterations
while sum(b,1,"omitnan") ~= sum(b,2,"omitnan")' %when column totals == row totals, stop.
b = b ./ sum(b,1,"omitnan"); %divide by col totals
b = b ./ sum(b,2,"omitnan"); %divide by row totals
i = i + 1;
end
i %how many loops
b % normalised output
brows = sum(b,2,"omitnan") %check that all rows sum 1
bcols = sum(b,1,"omitnan") %check that all cols sum 1.
attached are two 7 x 7 matrices. These are the desired input for a.
Suggestions welcome.
edit:
The margfit function (row 345 - 376) in link below, is (I think) what I am trying to implement. My python is non-existant

  9 Comments

Show 6 older comments
Matt J
Matt J on 13 Feb 2020
In that context, what can we say about the solution? That is, consider an array A0, of size NxM. Do there exist vectors of L and R, length N and M respectively, such that A = diag(L)*A0*diag(R) where the matrix A has all unit row and column sums?
According to this article on Sinkhorn's theorem,
it is true for square strictly positive A0, and moreover the alternating row/column normalization approach (the Sinkhorn-Knopp algorithm) will find A. It is clearly not true for nonsquare matrices, however. For example, it is trivial to see that no L and R exist for A0=ones(N,1) for any N>1.
John D'Errico
John D'Errico on 13 Feb 2020
Thanks to Matt for providing the (now obvious) counterexample.
edward holt
edward holt on 13 Feb 2020
Matt, thank for the Sinkhom-Knopp information (a fair chunk of that is beyond my skill-set)
And John, thank you for making me realise something that now seems glaringly obvious. Removing the columns / rows that are entirely comprised of zeros is certainly the first step.
Furthermore, a solution doesn't seem possible in the data I attached, as there were a few instances of a column containing only one non-zero element, with the corresponding row containing multiple non-zero elements.
Thank you for your efforts.

Sign in to comment.

Accepted Answer

Matt J
Matt J on 13 Feb 2020
Edited: Matt J on 13 Feb 2020
For a non-negative square matrix, the attached article mentions necessary and sufficient conditions (p. 3, Theorem 1) both for the normalization you are trying to achieve to be possible and for the alternating row/column normalization approach (the Sinkhorn-Knopp algorithm ) to work. The required condition for both are the same. So basically, if you are seeing Infs and NaNs in your iterations, the normalization is known to be impossible from the get-go.
The condition is:
"A necessary and sufficient condition ... is that A has total support"
The given matrix A having total support means that for every non-zero element A(i,j)>0, a column permutation Ap of A exists such that Ap has only strictly positive elements on the diagonal, one of which is A(i,j).

  0 Comments

Sign in to comment.

More Answers (0)

Sign in to answer this question.