How can I randomly divide a dataset (matrix) into k parts?

I have a dataset and I want to randomly divide it into k parts of equal size. If the dataset has n rows, each part will contain n/k randomly chosen rows from the dataset.

 Accepted Answer

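function [idxo, prtA] = randDivide(M, K)
% Randomly split the rows of M into K parts of floor(n/K) rows each.
% Leftover rows, if any, go into one extra final part.
n = size(M,1);
np = (n - rem(n,K))/K;          % rows per full part
[~, idx] = sort(rand(n,1));     % random permutation of the row indices
C = M(idx,:);                   % rows of M in shuffled order
i = 1;
j = 1;
prtA = {};
idxo = {};
while i <= n - mod(n,K)
    prtA{j} = C(i:i+np-1,:);
    idxo{j} = idx(i:i+np-1,1);  % original row numbers of part j
    i = i + np;
    j = j + 1;
end
if i <= n                       % leftover rows
    prtA{j} = C(i:n,:);
    idxo{j} = idx(i:n,1);
end
end
This is my algorithm. It works very well. Thank you for your answers.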

More Answers (2)

Suppose you have the N-by-M matrix A. I would randomly permute the positions 1:N and then group them into k partitions. The code follows.
% Sample inputs
N = 100;
A = rand(N,2);
% Number of partitions
k = 6;
% Scatter row positions
pos = randperm(N);
% Bin the positions into k partitions
edges = round(linspace(1,N+1,k+1));
Now you can "physically" partition A, or apply your code to the segments of A without actually separating it into blocks.
% Partition A
prtA = cell(k,1);
for ii = 1:k
idx = edges(ii):edges(ii+1)-1;
prtA{ii} = A(pos(idx),:); % or apply code to the selection of A
end
EDIT
You can also avoid the loop, but in that case you have to build a group index that maps each row to the partition it belongs to, and then use accumarray() to execute your code on the partitions.
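One possible way to build such a group index is sketched below. The names slotGrp, grp and partMeans are made up for this illustration, and the per-partition mean of column 1 only stands in for whatever code you want to run on each partition. It assumes k <= N, so that every partition starts at a distinct position.

```matlab
% Sample inputs, as in the answer above
N = 100;
A = rand(N,2);
k = 6;
pos = randperm(N);
edges = round(linspace(1,N+1,k+1));

% Partition label for each slot of pos: mark the first slot of every
% partition with 1, then cumsum turns the marks into labels 1..k
slotGrp = zeros(N,1);
slotGrp(edges(1:k)) = 1;
slotGrp = cumsum(slotGrp);

% Group index per row of A: row pos(i) belongs to partition slotGrp(i)
grp = zeros(N,1);
grp(pos) = slotGrp;

% Example operation (an assumption): mean of column 1 within each partition
partMeans = accumarray(grp, A(:,1), [k 1], @mean);
```

Here grp(r) gives the partition of row r of A, so A(grp == 2,:) selects the same rows as prtA{2} in the loop version, though in original row order rather than shuffled order.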

4 Comments

I tried your code:
N = 20;
A = rand(N,4);
% Number of partitions
k = 3;
% Scatter row positions
pos = randperm(N);
% Bin the positions into k partitions
edges = round(linspace(1,N+1,k+1));
% Partition A
prtA = cell(k,1);
for ii = 1:k
idx = edges(ii):edges(ii+1)-1;
prtA{ii} = A(pos(idx),:); % or apply code to the selection of A
end
>> prtA{1}
ans =
0.8449 0.0473 0.8380 0.9653
0.8271 0.1157 0.2192 0.7255
0.3530 0.7918 0.0578 0.7980
0.3644 0.4857 0.3543 0.1227
0.4233 0.5737 0.5924 0.1201
0.6064 0.3261 0.5511 0.9597
0.2981 0.2081 0.4948 0.5097
>> prtA{2}
ans =
0.7283 0.3060 0.6703 0.6065
0.5678 0.5613 0.3997 0.6631
0.9188 0.7568 0.8532 0.2698
0.0012 0.8892 0.3832 0.4133
0.2181 0.3098 0.6159 0.0196
0.1474 0.3711 0.1479 0.9500
>> prtA{3}
ans =
0.8401 0.1154 0.2291 0.4997
0.0225 0.8911 0.2773 0.6086
0.6951 0.6617 0.1875 0.4398
0.4731 0.3774 0.0707 0.7327
0.3307 0.4178 0.1733 0.9329
0.3416 0.0135 0.3195 0.2772
0.3953 0.6725 0.6642 0.1881
prtA =
[7x4 double]
[6x4 double]
[7x4 double]
How can I get the smaller partition at the end?
Why does it matter?
Anyways, after the loop:
% Index smaller as last
[~,idx] = sort(diff(edges),'descend');
prtA = prtA(idx);
Yes, I tried it, but I get
prtA =
[17x2 double]
[17x2 double]
[17x2 double]
[17x2 double]
[16x2 double]
[16x2 double]
and the result that I expect is
prtA =
[17x2 double]
[17x2 double]
[17x2 double]
[17x2 double]
[17x2 double]
[15x2 double]
All partitions should have the same size, with the remainder in the last partition.
How can I do that?
Adapting to your request, I build edges in a slightly different way:
% Sample inputs scrambling
N = 100;
A = rand(N,2);
k = 6;
pos = randperm(N);
% Edges
edges = 1:round(N/k):N+1;
if numel(edges) < k+1
edges = [edges N+1];
else
edges(k+1) = N+1; % make sure the last partition runs to the final row
end
% partition
prtA = cell(k,1);
for ii = 1:k
idx = edges(ii):edges(ii+1)-1;
prtA{ii} = A(pos(idx),:);
end
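A quick way to check an edges vector before using it is sketched below. The edges values are the ones the code above yields for N = 100 and k = 6; the names sizes and coversAll are only illustrative.

```matlab
N = 100;
edges = [1 18 35 52 69 86 101];    % edges from the code above for N = 100, k = 6
sizes = diff(edges);               % number of rows in each partition
coversAll = (edges(end) == N+1);   % true when the last partition reaches row N
```

For these inputs sizes is five partitions of 17 rows followed by one of 15. If coversAll is false for some other N and k, the last rows of pos would not be assigned to any partition.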


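A=rand(210,4);[n,m]=size(A);
np=20;B=A;                    % np: rows per partition; B keeps the original matrix
[c,idx]=sort(rand(n,1));      % idx: random order of the row indices
C=A(idx,:);                   % rows of A in shuffled order
idnan=mod(np-rem(n,np),np)    % NaN rows needed to reach a multiple of np
C=[C ;nan(idnan,m)];          % pad with NaN rows
[n,m]=size(C);
res=zeros(np,m,n/np);         % one np-by-m page per partition
for k=1:n/np
ind=(k-1)*np+1:k*np;
res(:,:,k)=C(ind,:);
end
idxo=reshape([idx ;nan(idnan,1)],np,1,n/np) % your original index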

9 Comments

Hi, thanks for the answer. How can I save the original index?
The original index is idxo.
Check the updated code.
It works well, but I don't understand the role of each line:
A=rand(200010,4); % create the matrix
[n,m]=size(A); % get the matrix size
np=20; % define the submatrix size
B=A; % just keep the original matrix
[c,idx]=sort(rand(n,1)); % sort a randomly generated vector of n values to get a random row order
C=A(idx,:); % the matrix A with its rows in random order
idnan=mod(np-rem(n,np),np) % what is the role of this line? Do you add rows to the sorted matrix to prevent a class with fewer than np rows?
C=[C ;nan(idnan,m)]; % pad C with NaN rows so that the final submatrix also has np rows
[n,m]=size(C); % get the new size
res=reshape(C,[np,m,n/np]) % your submatrices: n/np matrices with np rows and m columns
idxo=reshape([idx ;nan(idnan,1)],np,1,n/np) % your original index
Is that right?
idnan=mod(np-rem(n,np),np)
% If you have 205 rows and np=20, we have to pad with 15 NaN rows, which will be ignored. This lets us reshape 220 rows into 220/20 = 11 partitions, as you said.
I tried the code, but I found NaN in every partition and in every column. How can I group them in the last partition and keep them in the same rows?
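To see what the idnan expression does, here is a small sketch that evaluates it for a few example row counts. The vector of row counts and the name padded are chosen only for illustration.

```matlab
np = 20;                            % rows per partition
n  = [200 205 210];                 % example row counts
idnan  = mod(np - rem(n,np), np);   % NaN rows needed for each n
padded = n + idnan;                 % row counts after padding
```

The outer mod is what makes idnan zero when n is already a multiple of np; without it, n = 200 would get a full extra block of 20 NaN rows.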
A=rand(5,4);[n,m]=size(A);
np=2;B=A;
[c,idx]=sort(rand(n,1));
C=A(idx,:);
idnan=mod(np-rem(n,np),np)
C=[C ;nan(idnan,m)];
[n,m]=size(C);
res=reshape(C,[np,m,n/np]) % your submatrix
idxo=reshape([idx ;nan(idnan,1)],np,1,n/np); % your original index
idnan =
1
res(:,:,1) =
0.9571 0.0020 0.2607 0.5391
0.3271 0.3492 NaN 0.9611
res(:,:,2) =
0.8100 0.4447 0.3409 0.4193
0.6083 NaN 0.2305 0.8073
res(:,:,3) =
0.5130 0.7416 0.5576 0.3956
NaN 0.4363 0.2633 NaN
%instead of reshape use
for k=1:n/np
ind=(k-1)*np+1:k*np
res(:,:,k)=C(ind,:)
end
Because reshape works column by column.
Yes, I see that. :( So it isn't useful for my problem, because I need to keep the initial configuration: the matrix is a dataset of objects where each column is a descriptor, so I can't turn columns into rows the way reshape does.
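If you prefer to avoid the loop, one option (a sketch, not part of the answer above) is to transpose first, so that each row of C is contiguous in column-major order, then reshape and swap the first two dimensions back. The names resLoop and resVec and the small integer matrix are only for illustration.

```matlab
np = 2;                       % rows per partition
C  = reshape(1:24, 6, 4);     % small 6-by-4 example; n must be a multiple of np
[n,m] = size(C);

% Loop version, as in the comment above
resLoop = zeros(np, m, n/np);
for k = 1:n/np
    ind = (k-1)*np+1 : k*np;
    resLoop(:,:,k) = C(ind,:);
end

% Loop-free version: after the transpose, consecutive rows of C sit next
% to each other, so reshape groups them; permute restores rows-by-columns
resVec = permute(reshape(C.', m, np, n/np), [2 1 3]);
```

Both versions keep every row intact, so any padded NaN rows stay together at the bottom of the last page.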
