Merging .mat files into 1 file, only containing variables in array form

Hi there,
I want to merge 5 .mat files into 1 .mat file. When doing this, the end result is a mat file which contains 1 structure array. I want the result to be a mat file that only contains matrix arrays.
The situation is as follows:
- Each mat file contains 11 variables (with the same name), and each variable contains 136000x1 samples.
- The end result should be 1 mat file that contains the 11 variables, but with 136000x5 samples.
Aside from the accepted answer, this code works as well:
clear all
close all
clc
%% loading data
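% Load each file into its own struct
y = load('Data_file1.mat');
z = load('Data_file2.mat');
q = load('Data_file3.mat');
vrs = fieldnames(y);
% Check that all files contain the same variables
if ~isequal(vrs,fieldnames(z)) || ~isequal(vrs,fieldnames(q))
error('Different variables in these MAT-files')
end
for k = 1:length(vrs)
% Semicolons stack the samples vertically; use commas instead to get
% one column per file
x.(vrs{k}) = [y.(vrs{k});z.(vrs{k});q.(vrs{k})];
end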
% Save result in a new file
save('Data.mat','-struct','x')

 Accepted Answer

Folder = 'C:\Your\Folder'; % Folder containing the MAT files
FileList = dir(fullfile(Folder, '*.mat')); % List of all MAT files
allData = struct();
for iFile = 1:numel(FileList) % Loop over found files
    Data = load(fullfile(Folder, FileList(iFile).name));
    Fields = fieldnames(Data);
    for iField = 1:numel(Fields) % Loop over fields of current file
        aField = Fields{iField};
        if isfield(allData, aField) % Attach new data:
            allData.(aField) = [allData.(aField), Data.(aField)];
            % [EDITED]
            % The orientation depends on the sizes of the fields. There is no
            % general method here, so it may be necessary to concatenate
            % vertically:
            % allData.(aField) = [allData.(aField); Data.(aField)];
            % Or in general with a suitable value for [dim]:
            % allData.(aField) = cat(dim, allData.(aField), Data.(aField));
        else
            allData.(aField) = Data.(aField);
        end
    end
end
save(fullfile(Folder, 'AllData.mat'), '-struct', 'allData');
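To try the approach above without touching real data, here is a minimal sketch that writes five small MAT files into a scratch folder and merges them. The folder (from tempname), the file names and the variable names "speed" and "torque" are assumptions made up for this demo, and 4x1 vectors stand in for the 136000x1 samples.

```matlab
% Assumption: a scratch folder stands in for the real data folder.
Folder = tempname;
mkdir(Folder);

% Write 5 small files, each holding the same two column vectors (4x1 here).
for k = 1:5
    S = struct('speed', k * ones(4, 1), 'torque', 10 * k * ones(4, 1));
    save(fullfile(Folder, sprintf('Data_file%d.mat', k)), '-struct', 'S');
end

% Merge: load each file into a struct and append its data as a new column.
FileList = dir(fullfile(Folder, 'Data_file*.mat'));
allData  = struct();
for iFile = 1:numel(FileList)
    Data   = load(fullfile(Folder, FileList(iFile).name));
    Fields = fieldnames(Data);
    for iField = 1:numel(Fields)
        aField = Fields{iField};
        if isfield(allData, aField)
            allData.(aField) = [allData.(aField), Data.(aField)];
        else
            allData.(aField) = Data.(aField);
        end
    end
end

% '-struct' writes each field as its own variable, not as one struct.
OutFile = fullfile(Folder, 'AllData.mat');
save(OutFile, '-struct', 'allData');
info = whos('-file', OutFile);   % lists the variables stored in the merged file
```

The '-struct' option is what keeps the result from being a single structure array: whos('-file', ...) should list the individual variables, each with one column per input file.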

16 Comments

Dear Jan,
I would be grateful if you could kindly suggest what change should be made to your above solution such that the variables in 'allData' appear in matrix form as opposed to array form. Many thanks!
Usman
Usman:
Please expand on your meaning? In MATLAB, "matrix" and "array" mean the same thing -- though sometimes people consider "matrix" to only refer to 2D arrays.
@Usman: Perhaps all you need to change is:
allData.(aField) = [allData.(aField), Data.(aField)];
to
allData.(aField) = [allData.(aField), Data.(aField)(:)];
and, in the else branch, to store the data of the first file as a column as well:
allData.(aField) = Data.(aField)(:);
Then the contents of the fields are joined as columns of a matrix instead of forming one long vector (if this is what you mean by "array"). This requires the corresponding data in every file to have the same number of elements.
If you want something else, please explain it again with a small example.
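As a small illustration of the difference between joining into one long vector and joining as columns, here is a sketch with made-up vectors a, b and c standing in for the data loaded from three files. Only the newly loaded data is reshaped with (:), so the accumulated matrix keeps its shape.

```matlab
a = [1 2 3];       % row vector, as if loaded from the first file
b = [4 5 6];       % row vector from the second file
c = [7; 8; 9];     % column vector from a third file

rowJoin = [a, b];  % 1x6: the rows are chained into one long vector

M = a(:);          % 3x1: first file stored as a column
M = [M, b(:)];     % 3x2: second file appended as a new column
M = [M, c(:)];     % 3x3: (:) makes the original orientation irrelevant
```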
@ Walter: apologies for the confusion caused by the misuse of terminology. By “array” I meant a vector i.e. a 1D array and by “matrix” I meant a 2D array (as you correctly pointed out).
@ Jan: many thanks for your solution; this is exactly what I was looking for.
@ Walter & Jan: I was wondering if I could get your respective thoughts on the following problem (which is related to my original question posted above). Any ideas on how best to formulate a solution would be greatly appreciated.
I have the following sequence of .mat files (540 files in total) saved in some directory
output_r10_tht00_imp30.mat
output_r10_tht00_imp35.mat
:
output_r10_tht00_imp70.mat
output_r10_tht15_imp30.mat
output_r10_tht15_imp35.mat
:
output_r10_tht15_imp70.mat
output_r10_tht30_imp30.mat
output_r10_tht30_imp35.mat
:
output_r10_tht30_imp70.mat
:
output_r10_tht135_imp70.mat
output_r11_tht00_imp30.mat
output_r11_tht00_imp35.mat
:
output_r11_tht00_imp70.mat
output_r11_tht15_imp30.mat
output_r11_tht15_imp35.mat
:
output_r11_tht15_imp70.mat
output_r11_tht30_imp30.mat
output_r11_tht30_imp35.mat
:
output_r11_tht30_imp70.mat
:
output_r15_tht135_imp70.mat
These are outputs of a Matlab simulation that was run for different combinations of values for the parameters ‘r’, ‘tht’ and ‘imp’ respectively. Each .mat file contains a 1x1 structure called ‘output’ that contains variables with the same names, e.g. A, B … etc. The numbers that follow the characters ‘r’, ‘tht’ and ‘imp’ in the above filenames denote the values for the respective parameters pertaining to each simulation. For example, the first one that appears in the list represents the combination r = 10, tht = 0, imp = 30, and so on. The parameter ‘r’ changes in increments of 1, the parameter ‘tht’ changes in increments of 15, and the parameter ‘imp’ changes in increments of 5. In other words, we could represent the parameter ‘r’ by the 1 x 6 array
r = [10 11 12 13 14 15];
the parameter ‘tht’ by the 1 x 10 array
tht = [0 15 30 45 60 75 90 105 120 135];
and the parameter ‘imp’ by the 1 x 9 array
imp = [30 35 40 45 50 55 60 65 70];
Now to the problem specifically. I would like to be able to concatenate the variable A (for example) from .mat files with imp = 30 into a 2D array called X (for example) whose rows correspond to the values of ‘r’ and whose columns correspond to the values of ‘tht’ (i.e. a 6 x 10 array). In this way, the element X(1,2) should be the value of the variable A contained in the file output_r10_tht15_imp30.mat, and so on. Furthermore, I would like to be able to repeat this process for imp = 35, imp = 40 etc.
Apologies for the lengthy post. Again, any suggestions would be really appreciated.
Usman
Note that this is a new question. It is better to open a new thread than to hijack an existing one, because here you cannot accept an answer.
Start with importing the files:
Folder = 'C:\Your\Folder';
FileList = dir(fullfile(Folder, '*.mat'));
nFile = numel(FileList);
A = cell(1, nFile);
imp = NaN(1, nFile);
for iFile = 1:nFile
FileName = FileList(iFile).name;
File = fullfile(Folder, FileName);
Data = load(File);
A{iFile} = Data.output.A; % the variables are fields of the struct 'output'
% Parse "output_r10_tht15_imp70.mat"
value = sscanf(FileName, 'output_r%d_tht%d_imp%d.mat'); % %d always reads base 10
imp(iFile) = value(3);
end
Add the equivalent code for collecting r and tht. Then:
match = (r == 10) & (imp == 70);
A_for_r10_imp70 = cat(1, A{match})
Replace the 1 by the dimension you want to concatenate the arrays of A over.
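For the 6 x 10 layout described in the question, one possible sketch is to parse r, tht and imp from each file name and write A directly into the matching row and column. The demo files created here are fake data in a scratch folder, A is assumed to be a scalar stored in a struct named output, and only two imp values are generated to keep the demo small.

```matlab
% Assumption: fake demo files in a scratch folder replace the real results.
Folder  = tempname;
mkdir(Folder);
rVals   = 10:15;        % 1 x 6
thtVals = 0:15:135;     % 1 x 10
impVals = [30 35];      % demo subset of the imp values
for r = rVals
    for tht = thtVals
        for imp = impVals
            output.A = r * 1e6 + tht * 1e3 + imp;   % fake, but traceable, value
            name = sprintf('output_r%d_tht%02d_imp%d.mat', r, tht, imp);
            save(fullfile(Folder, name), 'output');
        end
    end
end

% Collect A for one value of imp into X (rows: r, columns: tht).
wantedImp = 30;
FileList  = dir(fullfile(Folder, 'output_r*_tht*_imp*.mat'));
X = NaN(numel(rVals), numel(thtVals));
for iFile = 1:numel(FileList)
    FileName = FileList(iFile).name;
    value = sscanf(FileName, 'output_r%d_tht%d_imp%d.mat');   % [r; tht; imp]
    if value(3) ~= wantedImp
        continue
    end
    Data = load(fullfile(Folder, FileName));
    iRow = find(rVals == value(1));
    iCol = find(thtVals == value(2));
    X(iRow, iCol) = Data.output.A;
end
```

Repeating this for imp = 35, 40, ... could be done by looping over wantedImp and storing each X as a page of a 3D array, e.g. X(:, :, iImp).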
@ Jan: my apologies for not following the correct etiquette when posting a new question. I will now re-post the above question under a new submission.
You are welcome, Usman. Your chances to get a useful answer will grow, if you open a new thread.
@ Md Shahriar Islam: Please use flags only to inform admins and editors about content, which might conflict with the terms of use, like spam or rudeness. Thanks.
How can this code be adjusted if the files you're trying to compile are in the subdirectories? Thanks in advance!
To get MAT files from different subfolders, modify the code:
% List of MAT files in all subfolders (>= R2016b):
FileList = dir(fullfile(Folder, '**', '*.mat'));
allData = struct();
for iFile = 1:numel(FileList) % Loop over found files
aFile = FileList(iFile);
Data = load(fullfile(aFile.folder, aFile.name));
... same as above
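A complete version of the subfolder search might look like the following sketch. It builds two hypothetical subfolders (runA, runB) with one small MAT file each in a scratch folder, so the folder and variable names are assumptions for the demo only. The '**' wildcard requires R2016b or newer.

```matlab
% Assumption: a scratch folder with two made-up subfolders.
Folder = tempname;
mkdir(fullfile(Folder, 'runA'));
mkdir(fullfile(Folder, 'runB'));
v = 1:3;  save(fullfile(Folder, 'runA', 'a.mat'), 'v');
v = 4:6;  save(fullfile(Folder, 'runB', 'b.mat'), 'v');

% '**' searches Folder and all of its subfolders.
FileList = dir(fullfile(Folder, '**', '*.mat'));
allData  = struct();
for iFile = 1:numel(FileList)
    aFile  = FileList(iFile);
    Data   = load(fullfile(aFile.folder, aFile.name));   % folder differs per file
    Fields = fieldnames(Data);
    for iField = 1:numel(Fields)
        aField = Fields{iField};
        if isfield(allData, aField)
            allData.(aField) = [allData.(aField); Data.(aField)];   % new row
        else
            allData.(aField) = Data.(aField);
        end
    end
end
```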
Hello Jan,
I have multiple .MAT files, all contain one row with 8 numbers and I want to merge them into single MAT file. Your code works, but it merges the files into one long row, but I want to add the new data into new rows. How do I modify the code? Thanks.
allData.(aField) = [allData.(aField), Data.(aField)];
would become
allData.(aField) = [allData.(aField); Data.(aField)];
This requires that each file has rows with the same number of columns.
What if the number of columns is different?
If the number of columns varies, you can use a compound data structure (such as a cell array), or you can use padding to indicate missing entries. The easiest padding is zeros, but padding with other values is not too bad.
oldcols = size(all_data,2);
newcols = size(new_data,2);
if oldcols < newcols
all_data(:,oldcols+1:newcols) = Fill_value;
elseif newcols < oldcols
new_data(:,newcols+1:oldcols) = Fill_value;
end
all_data = [all_data; new_data];
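Here is the same padding logic as a small runnable sketch. The rows are made-up data, and NaN as Fill_value is just one possible choice (zeros work the same way).

```matlab
Fill_value = NaN;                 % assumption: NaN marks missing entries
all_data   = [1 2 3];             % data collected so far
new_rows   = {[4 5], [6 7 8 9]};  % made-up rows with differing column counts

for k = 1:numel(new_rows)
    new_data = new_rows{k};
    oldcols  = size(all_data, 2);
    newcols  = size(new_data, 2);
    if oldcols < newcols
        all_data(:, oldcols+1:newcols) = Fill_value;   % widen the collected data
    elseif newcols < oldcols
        new_data(:, newcols+1:oldcols) = Fill_value;   % widen the new row
    end
    all_data = [all_data; new_data];
end
```

With NaN as the fill value, functions such as mean(..., 'omitnan') can skip the padded entries later.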
I have 3 different file locations with 50 .mat files in each. I just want to put them into one file as a matrix. Please help.
@Rakesh: What have you tried so far? What do the .MAT files contain?

More Answers (4)

use the save function specifying the variable names:
1.
let's say you have saved your 5 variables in these 5 .mat files
save var1.mat v1
save var2.mat v2
save var3.mat v3
save var4.mat v4
save var5.mat v5
2.
if not already in the workspace, load them
load var1.mat
load var2.mat
load var3.mat
load var4.mat
load var5.mat
now v1 v2 v3 v4 v5 should be in the workspace
3. combine all the variables you want in a single .mat file
save vars12345.mat v1 v2 v3 v4 v5
Now you have the 5 variables that were in separate .mat files in single .mat file
if you find these lines useful would you please mark my answer as Accepted Answer?
To any other reader, please if you find this answer of any help, click on the thumbs-up vote link,
thanks in advance for time and attention
John BG

3 Comments

I am sorry, but I do not understand what you mean with save var1.mat v1.
Is var1.mat the mat file with the 11 variables, and v1 one of the variable name?
Still thank you for the quick response.
save [ your made-up filename].mat [ the variable you want to save]
is I believe what he is saying. However you need your variables already loaded in your workspace I'd think
correct, if not loaded, load first the variables, thanks Nighttalker

John BG on 26 Dec 2016
Edited: John BG on 26 Dec 2016
Ok,
If you have exactly the same variable names in different .mat files and attempt loading them, you have to change variables identically named, otherwise the last load will override previous values of variables with same names.
I wrote a short script for another answer asking to merge figures in same file. I can modify that script to help you solve this question.
But if you tell me that the names of the already stored variables cannot be changed then I will generate a list and alter variable names that attempt override.
how do you want me to proceed?

2 Comments

> If you have exactly the same variable names in different .mat files and attempt loading them, you have to change variables identically named, otherwise the last load will override previous values of variables with same names.
I was aware of the problem, but the names of the stored variables cannot be changed. If you could write a new script, that would be nice.
"If you have exactly the same variable names in different .mat files and attempt loading them, you have to change variables identically named, otherwise the last load will override previous values of variables with same names."
Keeping the names the same in every .MAT file is the way to write simple, efficient, robust code. It is very easy to avoid overwriting the LOADed data in a loop (hint: always LOAD into an output argument and access its fields).
Following the bad advice given in this answer and changing the names in every file is how you will force yourself into writing slow, complex, obfuscated, inefficient, fragile code. Best avoided.
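The hint about loading into an output argument can be sketched as follows. The three demo files and the variable name x are assumptions created only for this example; each file stores a variable with the same name, and nothing is overwritten because every load returns its own struct.

```matlab
% Assumption: three made-up files in a scratch folder, all storing "x".
Folder = tempname;
mkdir(Folder);
for k = 1:3
    x = k * [1; 1];
    save(fullfile(Folder, sprintf('f%d.mat', k)), 'x');
end
clear x

names = {'f1.mat', 'f2.mat', 'f3.mat'};
S = cell(1, numel(names));
for k = 1:numel(names)
    % With an output argument, load returns a struct and does not
    % create or overwrite any variable in the workspace.
    S{k} = load(fullfile(Folder, names{k}));
end

% Access the identically named variable as a field of each struct.
xCols = cellfun(@(s) s.x, S, 'UniformOutput', false);
xAll  = [xCols{:}];    % 2x3: one column per file
```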

the following is really basic but it does what you asked for. There are some points that you may want to improve, but improvements take time and can be built gradually:
1.
put all .mat files in same folder
cd 'folder_mat_files'
2.
get in the folder where all .mat files to merge have been placed
system('dir /o:n /b > list.txt')
3.
build list
fid=fopen('list.txt')
s1=textscan(fid,'%s')
fclose(fid)
[sz1 sz2]=size(s1{1})
4.
PENDING
remove list.txt from s1
5.
getting var names stored in .mat files
C={};
for k=1:1:sz1
if ~isempty(regexp(s1{1}{k},'\.mat$','once')) % match only names ending in .mat
C{k}=who('-file',s1{1}{k})
end;
end;
[szC1 szC2]=size(C) % szC2 should be sz1-1 amount of .mat files
6.
simple case just 2 .mat files to merge
L=combinations([1:1:numel(C{1})],[1:1:numel(C{1})])
[sz1L sz2L]=size(L)
C2=C;
7.
first in has priority, any variable in second .mat file with same name as in 1st .mat is renamed
for k=1:1:sz1L
if strcmp(C{1}{L(k,1)},C{2}{L(k,2)})
C{2}{L(k,2)}=[C{2}{L(k,2)} '_copy']
end
end
8.
create file to collect input .mat files adding var string named L12 containing all var names
L12=['merge of ' s1{1}{1} ' ' s1{1}{2}]
save('merge_file.mat','L12')
9.
% for k=1:1:szC2 % this for is to process more than 2 .mat files to merge, v1 just 2 .mat files
[sz1C1 sz2C1]=size(C{1})
for n=1:1:sz1C1
load(s1{1}{1},'-mat',C{1}{n})
save('merge_file.mat',C{1}{n},'-append')
clearvars(C{1}{n}) % function syntax, so the stored name is cleared
end
[sz1C2 sz2C2]=size(C{2})
for n=1:1:sz1C2
load(s1{1}{2},'-mat',C2{2}{n}) % load under the original name kept in C2
eval([C{2}{n} '=' C2{2}{n} ';']) % copy to the (possibly renamed) variable
save('merge_file.mat',C{2}{n},'-append')
clearvars(C{2}{n},C2{2}{n})
end
% end
awaiting answer
John BG

1 Comment

@John BG: It is very inefficient to call dir through the system command, because it can be called from Matlab directly.
Please do not post functions from the file exchange without the required license file. The BSD license is clear in this point. Better use a link to the original submission. Then future updates or bugfixes are considered also: http://www.mathworks.com/matlabcentral/fileexchange/23080-combinations
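As a sketch of what calling dir from MATLAB directly could look like for this task: list the files with dir, read the stored variable names with who('-file', ...), and find name clashes with intersect. The two demo files and their variables a, b and c are made up for the example.

```matlab
% Assumption: two made-up files in a scratch folder that share variable "b".
Folder = tempname;
mkdir(Folder);
a = 1; b = 2;  save(fullfile(Folder, 'one.mat'), 'a', 'b');
b = 3; c = 4;  save(fullfile(Folder, 'two.mat'), 'b', 'c');

% dir returns only the MAT files, so no list.txt has to be written or parsed.
FileList = dir(fullfile(Folder, '*.mat'));
names = cell(1, numel(FileList));
for k = 1:numel(FileList)
    % who with '-file' lists the variable names without loading the data.
    names{k} = who('-file', fullfile(Folder, FileList(k).name));
end

% Variable names that occur in both files.
clash = intersect(names{1}, names{2});
```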

function mergeallmatfiles % Merge all mat files in folder
% The set of var. names must be the same in all *.mat files
% NOTE: No error catching!
Folder = pwd; % Read name of current folder
FileList = dir(fullfile(Folder, '*.mat')); % List of all MAT files
allData = struct();
for iFile = 1:numel(FileList) % Loop over found files
    Data = load(fullfile(Folder, FileList(iFile).name));
    Fields = fieldnames(Data);
    for iField = 1:numel(Fields) % Loop over fields of current file
        aField = Fields{iField};
        if isfield(allData, aField) % Attach new data:
            allData.(aField) = [allData.(aField); Data.(aField)]; % Note: must be semicolon
        else
            allData.(aField) = Data.(aField);
        end
    end
end
save(fullfile(Folder, 'AllData.mat'), '-struct', 'allData');
end
% Thanks to Jan: https://se.mathworks.com/matlabcentral/answers/318025-merging-mat-files-into-1-file-only-containing-variables-in-array-form
% Regards. RA

Asked: Bas on 23 Dec 2016
Commented: on 16 Oct 2023
