Organising large sets of data into an irregular matrix

Hello,
I have a large data set of 40000+ samples, organised in 2 columns. The first column holds a cluster number, by which I would like to sort, and the second holds a time, which I would like to allocate to the individual clusters.
The data looks something like:
A=[32 7.83425
32 8.0074
5 8.01005
5 8.0119
5 8.10775
19 8.1082
7 8.1085
4 8.1089]
I would like to organise this set into a matrix or table. One caveat is that the clusters are different sizes.
I have managed to get the individual cluster numbers out of my matrix using the unique function.
I thought this would be similar to this thread:
https://uk.mathworks.com/matlabcentral/answers/19877-organizing-data-into-a-matrix-generating-a-matrix
However, I am struggling to assign the values to the clusters because of the irregular number of values per cluster.
Can someone help me with this?

 Accepted Answer

You can use cell arrays for this.
A=[ 32 7.83425
32 8.0074
5 8.01005
5 8.0119
5 8.10775
19 8.1082
7 8.1085
4 8.1089 ]
[unA, ~, j] = unique(A(:,1)) ;
C = accumarray(j, A(:,2), [numel(unA),1], @(x) {x})
% The cell C{k} contains all values that belong to unA(k)
However, for some purposes you can also use a MATLAB table, for which dedicated functions, like splitapply, exist to operate on groups. This may be worth exploring on your side ...
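As a rough illustration of the table route mentioned above, here is a minimal sketch using findgroups and splitapply on the sample data from the question. The variable names (cluster, time, timesPerCluster and so on) are made up for this example and do not come from the thread.

```matlab
% Sample data from the question: column 1 = cluster number, column 2 = time
A = [32 7.83425; 32 8.0074; 5 8.01005; 5 8.0119; ...
     5 8.10775; 19 8.1082; 7 8.1085; 4 8.1089];

% 'cluster' and 'time' are illustrative variable names (an assumption)
T = array2table(A, 'VariableNames', {'cluster', 'time'});

% G(i) is the group index of row i; clusterID lists the clusters in sorted order
[G, clusterID] = findgroups(T.cluster);

% Groups differ in size, so wrap each group's values in a scalar cell
timesPerCluster = splitapply(@(x) {x}, T.time, G);

% Scalar results per group need no cell wrapper
nPerCluster     = splitapply(@numel, T.time, G);
firstPerCluster = splitapply(@min,   T.time, G);

% One row per cluster: its number, how many times it has, and its earliest time
summary = table(clusterID, nPerCluster, firstPerCluster);
```

timesPerCluster{k} then holds the times that belong to clusterID(k), much like C{k} and unA(k) in the accumarray version.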

4 Comments

Hi Jos, thanks for this, creating the cell array works very well! However, I would like to export the data within the cell array to either Excel or CSV format, so that I can import the data into another software package.
My cell array is a 30x1 cell array, the individual parts are not all the same size!
I have tried several versions, but I have not got it to work yet. I mostly get an error message like this: "Error using horzcat
Dimensions of arrays being concatenated are not consistent.
Error in readspiketimes2 (line 26)
fprintf(file,[sprintf(['%s'],C1{row,1:ncols-1}) C1{row,ncols} '\n']);"
Here is my code (it is only part):
[a1, ~, j] = unique(clusters(:,1 )) ;
C = accumarray(j, B(:,2), [numel(a1),1], @(x) {x}) % The cell C contains all spike-time values per cluster
C1 = C'; % I have tried both orientations (horizontal and vertical), but either gives me the error
C1 = cellfun(@string, C, 'UniformOutput', false); %convert doubles into strings
file = fopen('times.xlsx', 'w');
[nrows,ncols] = size(C1);
for row = 1:numel(C1)
fprintf(file,[sprintf(['%s'],C1{row,1:ncols-1}) C1{row,ncols} '\n']);
end
fclose(file);
I think you're better off writing to a text file. An option:
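C = {[1 2 3], 1, [1 2 3 4]} % some test data
myfile = 'myfile.txt' ;
dlmwrite(myfile, reshape(C{1},1,[])) % the first cell creates (or overwrites) the file
for k = 2:numel(C) % numel works for both row and column cell arrays
dlmwrite(myfile, reshape(C{k},1,[]), '-append') ; % one row per cell
end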
Almost perfect! I just need to switch the columns and rows of myfile, otherwise I have trouble importing the data into Excel because the rows are too long. I will try this later.
Thank you very much!
You're welcome :-)
Switching to column orientation inside the text file might not be trivial though, as you need to re-arrange the values between the cells of C.
You might also be interested in xlswrite, btw.
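One possible way to get a column orientation, sketched here as an assumption rather than something proposed in the thread, is to pad every cell to the same length with NaN and write the resulting matrix in one go. The file name myfile_columns.txt and the variable names n and M are hypothetical.

```matlab
C = {[1 2 3], 1, [1 2 3 4]};   % same test data as above
n = cellfun(@numel, C);        % number of values in each cell
M = NaN(max(n), numel(C));     % one column per cell, pre-filled with NaN
for k = 1:numel(C)
    M(1:n(k), k) = C{k}(:);    % fill column k from the top, the rest stays NaN
end
% Hypothetical output file; the padding is written out as the text NaN
dlmwrite('myfile_columns.txt', M, 'precision', '%.5f');
```

The NaN padding shows up as the text NaN in the file, so the importing program has to treat it as missing data or it has to be cleaned up afterwards.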

More Answers (1)

The xlswrite function did the trick! It took me a while to get it to work, but now I am happy! So simple!
myfile = '*.xlsx' ;
for k=1:size(C,1)
xlswrite(myfile ,reshape(C{k},[],1),k);
end
Thanks again, Jos!

Release

R2018a