
Determine the index of the value closest to the mean value of each group

In the attached xls, the first 3 columns (A to C) are the raw data: multiple measurements for each test, along with the path where each measurement is stored.
The desired output is the list of paths highlighted in Column H, i.e. for each group, the path of the measurement closest to that group's mean value. Is there an easy way to do this?
  2 Comments
Stephen23
Stephen23 on 1 Nov 2018
ILoveML's "Answer" moved here:
Thanks all for the speedy reply! Let me supply you with the real raw data and the expected outcome in the attached xls. Notes:
1. The first 3 columns (A to C) are the raw data.
2. The desired output is a list of the paths highlighted in Column H.
Bruno, look forward to your solution again!


Accepted Answer

Bruno Luong
Bruno Luong on 1 Nov 2018
Edited: Bruno Luong on 1 Nov 2018
Exactly as in my previous answer, but adapted to the OP's test data read from Excel:
[~,~,C] = xlsread('MLdata.xlsx');
Test = C(2:end,1);
Val = [C{2:end,3}]';
% Compute the mean
[tnum,~,J] = unique(Test);
tmean = accumarray(J,Val,[],@mean);
% Compute the distance to the mean
dR = abs(Val-tmean(J));
% Find the closest
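% the statement below can get less messy if coded in an mfile
% (ties are resolved by taking the first minimum passed to the function)
findclosest = @(i) i(find(dR(i)==min(dR(i)),1,'first'));
% pass the row indices as a column vector matching J
selectidx = accumarray(J,(1:numel(dR))',[],findclosest);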
TestSelect = Test(selectidx);
ValSelect = Val(selectidx);
dRSelect = dR(selectidx);
% Check
n = length(TestSelect);
for k=1:n
    fprintf('Test=%s;\tmean=%f,\trow=%02d\tValue=%f\t,error=%f\n',...
        TestSelect{k}, tmean(k), selectidx(k),ValSelect(k), dRSelect(k))
end
Result:
Test=1xt; mean=0.092173, row=04 Value=0.092210 ,error=0.000037
Test=BLA; mean=0.356085, row=05 Value=0.355550 ,error=0.000535
Test=CLA; mean=0.216280, row=08 Value=0.218100 ,error=0.001820
Test=FHD; mean=0.562556, row=12 Value=0.562270 ,error=0.000286
Test=Fac; mean=0.151698, row=17 Value=0.152830 ,error=0.001132
Test=GAL; mean=0.158806, row=23 Value=0.162640 ,error=0.003834
Test=GOO; mean=0.128713, row=26 Value=0.123770 ,error=0.004943
Test=LCD; mean=0.046140, row=28 Value=0.046140 ,error=0.000000
Test=MEN; mean=0.139204, row=31 Value=0.138550 ,error=0.000654
Test=ML; mean=0.123010, row=37 Value=0.122830 ,error=0.000180
Test=MP3; mean=0.037224, row=42 Value=0.036360 ,error=0.000864
Test=MUS; mean=0.065843, row=45 Value=0.063890 ,error=0.001953
Test=RBS; mean=0.003770, row=47 Value=0.003770 ,error=0.000000
Test=TAK; mean=0.539522, row=51 Value=0.539510 ,error=0.000012
Test=VID; mean=0.123862, row=56 Value=0.123470 ,error=0.000392
Test=WAP; mean=0.193967, row=59 Value=0.189900 ,error=0.004067
Test=WHA; mean=0.164740, row=62 Value=0.164910 ,error=0.000170
Test=WIF; mean=0.182384, row=67 Value=0.181780 ,error=0.000604
Test=YOU; mean=0.162776, row=71 Value=0.163700 ,error=0.000924
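The comment in the code above mentions that the "find the closest" step gets less messy in an m-file. A minimal sketch of what that could look like follows. The function name `closestToMean` and its argument names are hypothetical, not part of the original answer, and the file is assumed to be saved as `closestToMean.m`.

```matlab
function idx = closestToMean(groups, vals)
% closestToMean  Row index of the value closest to its group mean.
%   (hypothetical helper, not from the original answer)
%   groups : cell array of strings or numeric vector of group labels
%   vals   : numeric vector with one value per label
%   idx    : one row index per group, in the order returned by unique(groups)
vals = vals(:);
[~, ~, J] = unique(groups);          % J maps each row to its group number
J = J(:);
gmean = accumarray(J, vals, [], @mean);
d = abs(vals - gmean(J));            % distance of each row to its group mean
idx = zeros(max(J), 1);
for g = 1:max(J)
    rows = find(J == g);             % rows belonging to group g
    [~, k] = min(d(rows));           % min returns the first minimum on ties
    idx(g) = rows(k);
end
end
```

With the variables from the answer above, `selectidx = closestToMean(Test, Val);` should then be able to replace the `findclosest`/`accumarray` pair.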

More Answers (3)

KSSV
KSSV on 31 Oct 2018
T = [1 1 1 1 1 2 2 2 2 2 2 2 3 3 3 3 3 3]' ; % test
R = [4 12 13 11 13 43 35 31 49 42 44 38 137 134 173 150 120 138]' ; % result
test = [T R] ;
% Get the mean of each test
[C,ia,ib] = unique(T) ;
N = length(C) ;
iwant = zeros(N,2) ; % first column: test, second column: mean
for i = 1:N
    iwant(i,:) = [C(i) mean(R(T==C(i)))] ;
end
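The loop above stops at the per-test means. One possible way to extend the same loop so that it also returns the row closest to each mean is sketched below. The variable name `closestRow` is an assumption introduced here for illustration.

```matlab
% Same sample data as in the answer above
T = [1 1 1 1 1 2 2 2 2 2 2 2 3 3 3 3 3 3]';                          % test
R = [4 12 13 11 13 43 35 31 49 42 44 38 137 134 173 150 120 138]';  % result
C = unique(T);
N = numel(C);
closestRow = zeros(N,1);   % hypothetical name: row index closest to each mean
for i = 1:N
    rows = find(T == C(i));                        % rows of this test
    [~, k] = min(abs(R(rows) - mean(R(rows))));    % first minimum on ties
    closestRow(i) = rows(k);
end
% Columns: test, row index, value closest to the mean
disp([C closestRow R(closestRow)])
```

For this sample data the selected rows are 4, 10 and 18 (values 11, 42 and 138).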

Bruno Luong
Bruno Luong on 31 Oct 2018
Edited: Bruno Luong on 31 Oct 2018
ntests = 1000;
TC = ceil(10*rand(ntests,1));
R = rand(ntests,1);
% Here is the n x 2 input matrix
A = [TC, R];
% Compute the mean
[tnum,~,J] = unique(A(:,1));
tmean = accumarray(J,A(:,2),[],@mean);
% Compute the distance to the mean
dR = abs(A(:,2)-tmean(J));
% Find the closest
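% the statement below can get less messy if coded in an mfile
% (ties are resolved by taking the first minimum passed to the function)
findclosest = @(i) i(find(dR(i)==min(dR(i)),1,'first'));
% pass the row indices as a column vector matching J
tidx = accumarray(J,(1:numel(dR))',[],findclosest);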
A(tidx,:)
tmean
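As an alternative to the anonymous function passed to `accumarray`, the closest row per group can also be picked by sorting. The sketch below is not the method used in this answer. It reuses the small sample vectors from KSSV's answer so the result is easy to check by hand, and the names `order` and `firstPos` are introduced here for illustration.

```matlab
% Sample data borrowed from KSSV's answer in this thread
T = [1 1 1 1 1 2 2 2 2 2 2 2 3 3 3 3 3 3]';
R = [4 12 13 11 13 43 35 31 49 42 44 38 137 134 173 150 120 138]';
[~, ~, J] = unique(T);
J = J(:);
tmean = accumarray(J, R, [], @mean);   % mean of each group
dR = abs(R - tmean(J));                % distance to the group mean
% Sort rows by group first, then by distance to the group mean
[~, order] = sortrows([J dR]);
% The first row of each group in the sorted order has the smallest distance
[~, firstPos] = unique(J(order), 'first');
tidx = order(firstPos);
disp([T(tidx) tidx R(tidx)])
```

For this data `tidx` comes out as rows 4, 10 and 18. The design trade-off is one extra sort in exchange for avoiding a per-group function call.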

ILoveML
ILoveML on 1 Nov 2018
Thanks so much Bruno Luong!

Release

R2015b
