# Solution to Ax=b

34 views (last 30 days)
Alice Faisal on 10 Jul 2020
Commented: John D'Errico on 25 Jul 2020 at 16:52
Greetings,
I want to find a solution for x in which A is a non-square matrix and b is a column vector.
I tried using the following methods but I did not get accurate results:
Backslash, lsqminnorm(M,y), lsqr(M,y), linsolve(M,y).
Can anyone let me know what is the best way to address this issue?
Thanks

John D'Errico on 10 Jul 2020
1. Do you know a true solution exists?
2. Do you know what that solution should be? (Or think you know.)
3. Have you verified the solution you think to be correct is indeed a solution?
4. Do backslash and tools like lsqminnorm return (significantly) different results?
5. Is the matrix A of full rank? What is the condition number of A?
If the answers to #1, #2, and #3 above are anything on the order of duh, or I don't know, then how do you know if the solution you got is accurate or not?
If the answer to #4 is yes, then indeed the result is not accurate as defined above, because the matrix problem is ill-posed. A is likely to be a singular matrix, or nearly so.
If the answer to #5 is the matrix is not of full rank and thus has a very large condition number, then you need to understand that the matrix problem is not well posed. You cannot then find an "accurate" solution, since there are infinitely many possible solutions.
Alice Faisal on 14 Jul 2020 at 11:14
Thanks, indeed the matrix is not of full rank, and the results are significantly different from the true solution.
Dana on 14 Jul 2020 at 15:05
You say A is non-square. Suppose A is m by n (m rows, n columns).
• If m > n, then you're trying to solve a system of equations that has more equations than unknowns. In that case, solutions to Ax=b for x only exist under very particular conditions (i.e., when m-n of the equations are redundant, in the sense that they can be derived from the remaining n equations). If indeed m-n equations are redundant, then remove them and solve the resulting square system using the backslash operator. If you don't have m-n redundant equations, then you're out of luck: you won't be able to solve this problem. If you do try to use one of the algorithms that can handle cases like this, the "solution" you get will not actually be a solution, but rather the value of x that gets Ax as "close" to b as possible in some sense (e.g., using the backslash operator/pinv function will yield you the value of x that solves the least-squares regression problem b=Ax+u, where u is the error term).
• On the other hand, if m < n, then you're solving a system with more unknowns than equations. In that case, there is typically an uncountable infinity of solutions to Ax=b. In that case, there is no one "true solution", and depending on what algorithm you use to look for a solution numerically, you will generally get a different solution. Importantly, though, each of those results will actually be solutions. It's up to you to think more carefully about which of those infinite solutions is the one you're after.
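The two cases above can be illustrated with a small sketch (random data, so the specific numbers are illustrative only, but the behavior is generic):

```matlab
% Over-determined case (m > n): backslash returns the least-squares fit,
% so the residual is generally nonzero.
A = rand(6,3);          % 6 equations, 3 unknowns
b = rand(6,1);
x = A\b;                % least-squares solution of b = A*x + u
norm(A*x - b)           % typically > 0: x is a fit, not an exact solution

% Under-determined case (m < n): infinitely many exact solutions.
A = rand(3,5);
b = rand(3,1);
x1 = A\b;               % a "basic" solution, with some zero entries
x2 = lsqminnorm(A,b);   % the minimum-norm solution
[norm(A*x1 - b), norm(A*x2 - b)]  % both essentially zero, yet x1 ~= x2
```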

John D'Errico on 14 Jul 2020 at 17:09
Edited: John D'Errico on 14 Jul 2020 at 23:24
Ok. If your matrix is not full rank, then the answer is easy, though it will probably not be the answer you want to see. Really, it comes down to understanding linear algebra.
A matrix problem A*X=b, where A has less than full rank (is rank deficient), will have infinitely many solutions, all equally good or bad, in context of how well they solve the problem. Equally good, in the sense that NO solution is better than another. Equally bad, in the sense that you cannot simply control which solution you get, of the infinitely many possible solutions. If there is one solution that you expected to see, I'm sorry, you will get a solution, but there is no reason you will get the solution you expected to see.
Again, an understanding of linear algebra seems crucial here. But the idea is that in the problem
A*X = B
if A has less than full rank, then you can arbitrarily choose any solution provided by any of the tools you have mentioned. Then ANY other solution can be found by adding some arbitrary multiples of the null space basis vector(s).
For example, consider this simple case. I'll pick an easy matrix to generate that I know to be singular.
A = magic(10)
A =
92 99 1 8 15 67 74 51 58 40
98 80 7 14 16 73 55 57 64 41
4 81 88 20 22 54 56 63 70 47
85 87 19 21 3 60 62 69 71 28
86 93 25 2 9 61 68 75 52 34
17 24 76 83 90 42 49 26 33 65
23 5 82 89 91 48 30 32 39 66
79 6 13 95 97 29 31 38 45 72
10 12 94 96 78 35 37 44 46 53
11 18 100 77 84 36 43 50 27 59
>> rank(A)
ans =
7
What matters here is A is of less than full rank. In this case, A has rank 7, and dimensionality of the nullspace is therefore 10-7=3. Now I'll pick a vector X0.
X0 = (-4:5)'
X0 =
-4
-3
-2
-1
0
1
2
3
4
5
>> B = A*X0
B =
125
155
415
155
125
300
330
290
330
300
Now, since I created B by multiplying A*X0, we know that an exact solution exists.
I can use one of the aforementioned solvers and get a valid solution.
Xhat1 = A\B
Warning: Matrix is close to singular or badly scaled. Results may be inaccurate. RCOND = 2.477803e-18.
Xhat1 =
-6.91568828516707
-5.09823622246779
-4.91568828516707
-2.47286741366576
3.76318689003081
3.91568828516707
4.09823622246778
5.91568828516707
5.47286741366576
1.23681310996919
Xhat2 = lsqminnorm(A,B)
Xhat2 =
-2.07692307692308
0.653846153846154
-0.0769230769230779
-0.807692307692307
1.34615384615384
-0.923076923076922
-1.65384615384615
1.07692307692308
3.80769230769231
3.65384615384616
These are both equally valid solutions. Both yield zero errors to within floating point trash. This is as much as you can ask.
norm(A*Xhat1 - B)
ans =
5.30961349689659e-13
norm(A*Xhat2 - B)
ans =
1.98951966012828e-13
But when you have a rank deficient system, there are, as I said, infinitely many solutions, all equally valid. The difference is that you can add any linear combination of the null-space vectors.
Anull = null(A)
Anull =
0.0938060418210618 -0.470266575735707 -0.0286182639744013
0.643724380828035 0.0513564331028858 -0.0778354763808516
0.093806041821062 -0.470266575735707 -0.0286182639744012
0.181140183639106 0.219990275800516 -0.333316973363338
0.187637971728761 0.0816424572375614 0.617416734320226
-0.0938060418210623 0.470266575735707 0.0286182639744012
-0.643724380828036 -0.0513564331028855 0.0778354763808517
-0.0938060418210616 0.470266575735706 0.0286182639744014
-0.181140183639106 -0.219990275800516 0.333316973363338
-0.187637971728761 -0.0816424572375614 -0.617416734320226
Here, for example, I created Xhat3 by adding some random linear combination of the nullspace vectors...
Xhat3 = Xhat1 + Anull*rand(3,1)
Xhat3 =
-7.00247988035883
-4.50849923442573
-5.00247988035884
-2.41921734224954
4.33241533043219
4.00247988035884
3.50849923442573
6.00247988035884
5.41921734224954
0.667584669567805
norm(A*Xhat3 - B)
ans =
5.62900547729963e-13
Indeed, that is just as valid a solution as any other.
Now, as it turns out, if you actually wanted to find the solution that recovers X0, we could theoretically do that. Of course, this verges on the absurd, since it requires that you already know the solution you want to get. But if you insist on the absurd, we can do this:
V = Anull\(X0 - Xhat1)
V =
2.91676577215197
-5.23552575911924
-6.28917471745474
So V is the vector that allows us to recover X0 as a solution. Why, oh why you would do that, I have no clue. If you know the answer you want to see, then you can get that as an answer.
Xhat4 = Xhat1 + Anull*V
Xhat4 =
-4
-3
-2
-1
4.44089209850063e-16
1
2
3
4
5
norm(A*Xhat4 - B)
ans =
6.21389833357723e-13
But it works essentially as well as would any other solution from this infinite family.
Anyway, all of this shows (by way of example) that the solution of a matrix problem when the matrix is of less than full rank is impossible to resolve accurately. But again, you need to appreciate what accuracy means, and why that is impossible to achieve when the problem is degenerate.
Next, if there is no exact solution at all, then you will still get a result, at least from those solvers that can handle singular systems. If the vector B does not lie in the column space of A, then an exact solution does not exist, and what you get back is a least-squares approximation instead.
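A quick sketch of that situation, using hypothetical random data:

```matlab
% B outside the column space of A: no exact solution exists.
A = rand(5,2);     % the column space is only a 2-dimensional subspace of R^5
B = rand(5,1);     % a random B will almost surely not lie in that subspace
X = A\B;           % backslash still returns a result: the least-squares fit
norm(A*X - B)      % clearly nonzero: X minimizes the residual, nothing more
```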
Finally, when the problem is under-determined, that is, it has fewer rows than columns, the problem is again one where there are infinitely many solutions in general. For example...
A = rand(3,5)
A =
0.80007 0.18185 0.13607 0.54986 0.62206
0.43141 0.2638 0.86929 0.14495 0.35095
0.91065 0.14554 0.5797 0.85303 0.51325
B = rand(3,1)
B =
0.40181
0.075967
0.23992
rank(A)
ans =
3
A\B
ans =
0.86477
0
-0.26474
-0.46202
0
yet we see:
lsqminnorm(A,B)
ans =
0.25618
0.16369
-0.23953
-0.1079
0.41637
Thus two distinct solutions, although since A has full rank, they will both be exact to within floating point trash.
norm(A*(A\B) - B)
ans =
2.7756e-17
Here again, A has a non-empty null space, even though A has the maximum possible rank for a non-square matrix of that shape.
null(A)
ans =
-0.50663 -0.52589
0.74612 -0.51223
-0.077571 0.12743
0.41703 0.17497
0.08184 0.64358
So again, the general solution comprises a particular solution, plus any linear combination of the null-space basis of A. Again, there is no unique solution, and we cannot find an "accurate" solution when all of the solutions we could create are equally good.
This is NOT a question of just using the correct solver. It really IS a question of appreciating the linear algebra behind what you are doing.

John D'Errico on 14 Jul 2020 at 23:27
I've changed my wording there. I was a little sloppy in talking about singularity for a non-square linear system, where rank deficient would be a better description.
Paul on 16 Jul 2020 at 2:10
Quick question on condition number. A comment on the original question states:
"... the matrix is not of full rank and thus has a very large condition number, ..."
If the matrix is not of full rank, doesn't it have a theoretical condition number of inf (or perhaps its undefined)? Or did this statement mean the computed condition number will typically be large, but finite, because the minimum singular value will typically be greater than zero due to computational errors?
If the latter, do you have any thoughts on how big the condition number can be before the user might say "Hmm, maybe I'm dealing with a rank deficient matrix?" For example, suppose cond(M) == 7.9e17. Is that because M is rank deficient but cond(M) is finite due to rounding? Or is it because M is full rank, but very close to rank deficient? In a practical sense, does it even matter which of these two apply?
Side note: according to the doc and the actual code in cond.m, the condition number for anything other than the 2-norm is computed as
c = norm(A,p) * norm(inv(A),p);
I was surprised to see that call to inv, insofar as it seems that use of inv is frowned upon. Maybe this is one of those cases where the explicit use of inv is appropriate? Is inv(A) better than A\eye(size(A))?
John D'Errico on 25 Jul 2020 at 16:52
In double precision, any condition number that is close to
1/eps('double')
ans =
4.5035996273705e+15
or larger should be viewed as rank deficient, or close enough that the difference is impossible to distinguish from a matrix that is rank deficient.
A floating point matrix will never have an infinite condition number as computed. Even a simple and obviously singular case fails to yield inf:
cond(ones(2))
ans =
5.96177704763898e+16
This is because cond uses svd. The computed condition number is the ratio of the largest to the smallest singular value.
svd(ones(2))
ans =
2
3.35470445140523e-17
Floating point arithmetic and the SVD algorithm will essentially never result in a ZERO singular value. Therefore cond will essentially never produce inf, unless you work with syms.
svd(sym(ones(2)))
ans =
2
0
cond(sym(ones(2)))
ans =
Inf
As for the use of inv in cond for other norms, I doubt that they were terribly worried. inv(A) seems a bit faster than A\eye(size(A)) anyway.
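For what it's worth, the two forms agree to within rounding, so inside cond the choice is cosmetic; a small sketch (a well-conditioned random test matrix, purely illustrative):

```matlab
% inv(A) and A\eye(size(A)) compute the same inverse, up to rounding.
A = rand(500) + 500*eye(500);   % diagonally dominant, so well-conditioned
Ai1 = inv(A);
Ai2 = A\eye(size(A));
norm(Ai1 - Ai2, 1)              % tiny: the two agree to floating point
```

The usual warning against inv concerns solving A*x=b, where inv(A)*b is slower and less accurate than A\b. Here only norm(inv(A),p) is needed, and the explicit inverse is a reasonable way to get it.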