Uniques giving duplicates (unresolved)

I created a matrix P, and I incremented the values in the first column using a for loop...
"for X = 0.9:0.025:1..."
After plotting, I then wanted to focus a bit more on one of the values, namely X = 0.975...
So I (manually) asked MATLAB to do some more calculations using 0.975 without a loop:
"for X = 0.975"
However, for reasons which I now understand, the 0.975 which was created in the loop is not EXACTLY equal to 0.975.
So, when I ask for the unique values in the X column, I receive 0.975 twice:
>> PU = unique(P(:,1))
PU =
0.900000000000000
0.925000000000000
0.950000000000000
0.975000000000000
0.975000000000000
1.000000000000000
>> PU(4) - PU(5)
ans =
-1.110223024625157e-16
They are out by a tiny amount...
I designed my plotting routine to plot a line for each unique X value... So, there are two distinct legend lines for X = 0.975...
Two questions:
1. How do I avoid this problem in the future? (Even though I understand the cause, I don't see a solution.) I will generally be plotting data from the original loop, seeing how it looks, and manually requesting more information on specific values. So, how do I get around the problem that the for loop does not add EXACTLY what I ask it to?
2. How do I make the two "0.975...'s" equal to each other now? (They each appear a few thousand times in the first column of matrix P)
Thanks
D Howard

 Accepted Answer

xvals = 0.9:0.025:1;   % list of values for your loop
for X = xvals
    % ... calculations for each X ...
end
nearest975 = interp1(xvals, xvals, 0.975, 'nearest');   % grid value closest to 0.975
for X = nearest975
    % ... further calculations for this X ...
end
In this way, nearest975 will be an exact copy of one of the values in the list xvals, which is the list you looped over.
Also, generally speaking linspace() has higher accuracy than the colon operator.
You would use the colon operator in a "for" loop instead of linspace if memory space is tight, in that linspace will create the complete list of values and store it, but the colon operator in a for loop will not store the values ahead of time and will generate them as needed. (Note: in the code I show above, the colon operator is not being used in a for loop, so the values will be generated and stored.)
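If interp1 feels roundabout, the same snapping can be sketched with min(abs(...)). This is only an illustration of the idea above, not part of the original answer: the names target, tol and snapped are hypothetical, and the tolerance of 1e-9 is an assumed value that you would choose to suit your own data.

```matlab
xvals  = linspace(0.9, 1, 5);   % same grid as the loop, stored once
target = 0.975;                 % value typed in by hand
tol    = 1e-9;                  % assumed: how close counts as "the same X"

% Find the grid value closest to the hand-typed one
[d, idx] = min(abs(xvals - target));

if d <= tol
    snapped = xvals(idx);       % bit-for-bit copy of the grid value
else
    snapped = target;           % no grid value nearby, keep the typed value
end
```

Looping over snapped instead of the typed literal means unique() sees a single value, because the number was computed once and then copied.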

4 Comments

Hello again Walter,
Is there a rounding algorithm which will sort this out? I am setting these X values myself, to a set number of decimal places, and 10^-16 is totally immaterial to me... But, if anything, given the choice I would prefer to have 0.975 rather than the loop's closest number to it.
And what about the second question? How do I now get rid of the inequality? Should I just round the column to 10 decimal places?
D Howard
The easiest way to get rid of the inequality is to only calculate each value once and to copy (or replicate) it the other times. This is a stronger condition than "always calculate the value the same way", because very slight differences in code that "should not matter" (e.g., setting another variable in the same loop) can lead to MATLAB using different calculation paths.
0.975 cannot be exactly represented as a binary floating point number. The closest approximation to it is
0.97499999999999997779553950749686919152736663818359375
Thus it is not possible to have "0.975 rather than the loop's closest number to it".
The alternative is to switch to symbolic numbers using the Symbolic Math Toolbox, which can use rational numbers and arbitrary-precision decimal numbers.
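A small sketch of why the two entries in PU differ by about 1.11e-16: that figure is the spacing between adjacent doubles near 0.975, which eps reports. The variable names here are hypothetical.

```matlab
x = 0.975;                    % stored as the nearest representable double
spacing = eps(x);             % gap from x to the next larger double
neighbour = x + spacing;      % the adjacent double just above x

fprintf('%.17g\n', x);        % shows digits beyond the default display
fprintf('%.17g\n', spacing);  % same magnitude as PU(4) - PU(5) above
fprintf('%d\n', neighbour == x);
```

Two values that are one such step apart print identically in format long, yet unique() still treats them as different numbers.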
The X values that you are setting yourself: do you know those ahead of time, before the loop? If you do, then you could construct the array of xvals, find the closest element of xvals to each of your manually-specified X values, replace those closest elements with the manually-specified values, and then loop over the result. But your initial description makes it sound as if that is not practical.
If your calculations for your manually-specified value are strictly a superset of the values calculated at the lower resolution, then once you have been given a value manually, you could search the list of approximate values that were used, and if an approximate value was "close enough" to the manual value, throw away the corresponding approximate results, knowing the data will be "filled in" with the later calculation.
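The first suggestion above (substituting the hand-typed values into the grid before looping) could look roughly like this. It is a sketch under the assumption that the manual values are known before the loop starts; manual is a hypothetical name and its contents are made up for illustration.

```matlab
xvals  = 0.9:0.025:1;        % grid generated by the colon operator
manual = [0.925 0.975];      % hypothetical values you plan to revisit by hand

for k = 1:numel(manual)
    % Locate the grid point nearest to this hand-typed value ...
    [~, idx] = min(abs(xvals - manual(k)));
    % ... and overwrite it so the loop uses the typed value itself
    xvals(idx) = manual(k);
end

for X = xvals
    % ... calculations for each X ...
end
```

After the substitution, a later run with X = 0.975 typed by hand produces exactly the number that the loop already used.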
Hello Walter,
I will look into the "solutions for the future" today...
But, for now, I want to solve the problem in the current matrix P (which took 3 days to compute)... There are three solutions I can think of in principle, but I don't know how to perform any of them:
1. replace all the PU(5) values with PU(4) (or vice-versa)
2. round off the entire column to 5 decimal places.
3. modify the unique statement itself with some tolerance level of (0.000000001) to allow the two to be "equal" in the modified PU vector
I don't like (1) because it is not general enough to put into a proper code, and it will mean looking manually each time for duplicates in PU.
I don't know if either of (2) or (3) is possible or how to do it.
Thanks
D Howard
(...also, sometimes I will be throwing out the previous "xvals", including the one I am interested in, if I want to increase the repetitions to get lower-variance estimates, so this interpolation thing just seems so twisted and impractical)
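For options (2) and (3) in the list above, a minimal sketch follows. The vector col is a hypothetical stand-in for P(:,1), and the 5 decimal places and the 1e-9 tolerance are the figures mentioned in the comment, taken here as assumptions. Newer MATLAB releases also offer round(x, n) and uniquetol for the same purposes; the code below sticks to functions that are available in older releases as well.

```matlab
% Hypothetical stand-in for the first column of P
col = [0.9; 0.975; 0.975 + eps(0.975); 1.0; 0.975];

% Option 2: round the whole column to 5 decimal places, then take unique
colRounded = round(col * 1e5) / 1e5;
PU = unique(colRounded);

% Option 3: leave the data alone and merge values closer than a tolerance
tol  = 1e-9;
s    = sort(col);
keep = [true; diff(s) > tol];   % keep the first value of each close group
PUtol = s(keep);
```

Option 2 changes the stored data, so assigning colRounded back into P(:,1) removes the duplicates for every later step. Option 3 only changes what the plotting routine sees, so rows would still need to be matched to PUtol with a tolerance rather than with ==.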


Asked: 23 Feb 2012
Edited: 23 Oct 2013
