Concatenate multiple tables with different variable names vertically to a big table
Show older comments
Hi,
I have 20 tables in the matlab workplace with the the same dimensions (1000x8). However, the variable names are not the same. How can i concatenate the 20 tables vertically? I do not need the variable names in the big table, I'm just interest in the values within the 20 tables...Please note that the first column is datetime in each table. This, among other aspects, creates difficulties for me. The datetimes are not necessary either.
Helpful for any advice.
Kind regards
7 Comments
Dyuman Joshi
on 15 Nov 2023
Edited: Dyuman Joshi
on 15 Nov 2023
Convert the tables to cell arrays then concatenate, then convert back to a table.
Or edit the Variables names to be same and then vertically concatenate.
Peter Perkins
on 16 Nov 2023
Converting a table to a cell array is generally not a good idea. Each element of the table (not each variable in the table) becomes a separate cell. Memory issues aside (with 1000x8, maybe not a problem, but with 1000000x8, definitely), you lose the homogeneity of each variable in the table, and run the risk of ending up with some cells containing, e.g., numbers and others containing who knows what. And the operations you can perform on a cell array are limited. I realize that you've suggested converting back, but table2cell is a risky tool to use, and your second approach is the way to go.
Dyuman Joshi
on 17 Nov 2023
@Peter Perkins - Yes, I agree that the operations that can be performed on a cell array are limited.
But doesn't MATLAB store tables as a cell arrays internally? (I remember the lengthy thread where Walter demonstrated this)
"but table2cell is a risky tool to use"
Could you elaborate on this?
Peter Perkins
on 17 Nov 2023
I interpreted "convert table to cell array" as "call table2cell". I may have misunderstood.
Consider this:
X = rand(1000000,10);
T = array2table(X);
C = table2cell(T);
whos
Name Size Bytes Class Attributes
C 1000000x10 1120000000 cell
T 1000000x10 80002889 table
X 1000000x10 80000000 double
X stores 10M doubles as one contiguous array.
T stores 10 vectors, each 1M doubles. There are 11 arrays (the eleventh is the cell array that Walter referred to, but it's not visible and you shouldn't count on it).
C stores 10M arrays, each a scalar double. It's the memory of 10M array "headers" (something that's not visible, but that whos, for a cell array, accounts for). So when I said "table2cell is risky", I meant you may use up A LOT of memory you did not mean to.
Dyuman Joshi
on 18 Nov 2023
"I interpreted "convert table to cell array" as "call table2cell". I may have misunderstood."
No, you were right to interpret it that way. That's what I meant.
Ok, yes, I kinda forgot that table2cell() converts to cell where each element of the table is stored as a scalar. And the overhead is quite hefty.
I have one more question - Why was table2cell() made to behave this way? Why not convert each column of table as a cell element? I feel that this is something that should have been discussed when the function was being written.
One way around this is to call TABLE2STRUCT with ToScalar=true (thus keeping the table columns/variables together), then STRUCT2CELL. I guess due to copy-on-write the conversion won't even copy any data in memory.
X = rand(1000000,10);
T = array2table(X);
C = struct2cell(table2struct(T,'ToScalar',true));
whos
Given that tables apparently store the column/variable data in a cell array it would be nice to have a way to access that cell array efficiently, e.g. to have ToRowVector option for TABLE2CELL (much like TABLE2STRUCT has the ToScalar option). And also the corresponding FromRowVector for CELL2TABLE.
Peter Perkins
on 20 Nov 2023
If you find yourself needing functions like these, you may be doing something unnecessary. In this case, the much more straight-forward solution is to just set the variables names all the same.
Accepted Answer
More Answers (1)
Steven Lord
on 15 Nov 2023
0 votes
Are you trying to combine rows of those tables based on the date and time information stored in the first variable? If so I would try using table2timetable to turn those tables into timetables then use synchronize to combine the timetables based on their row times.
Categories
Find more on Cell Arrays in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!