How to separate table columns by groups?

If I have a table of data (T1) such as:
Var1 Var2
1 xyz
1 xyy
2 xxx
2 xzx
3 xyy
4 yzy
4 xzz
4 yzz
I would like to output another table (T2):
Var1 Var2
1 xyz,xyy
2 xxx,xzx
3 xyy
4 yzy,xzz,yzz
I tried unstack without success, since it turns the values in Var2 into column names. Any suggestions would be much appreciated.
Denis Pesacreta on 20 Jun 2018
No, it is not. It is categorical, with values of the form 'xyz'.
Denis Pesacreta on 20 Jun 2018
Var1 is the output of a findgroups call, and Var2 is categorical. For example, Var1 could be the output of findgroups on an ID variable, and Var2 could be possessions (i.e., person number 1 has a car and a house, while person number 4 may have a car, a house, and a computer).


Accepted Answer

Guillaume on 20 Jun 2018
If Var2 is a categorical array, then I assume that your xyz,xyy is actually a 1x2 categorical array [xyz, xyy], not a char array. If so:
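% Sample data: Var1 holds the group numbers, Var2 is categorical
T1 = table([1;1;2;2;3;4;4;4], categorical({'xyz';'xyy';'xxx';'xzx';'xyy';'yzy';'xzz';'yzz'}))
% For each group in Var1, store that group's values as a 1xN categorical row vector inside a cell
T2 = varfun(@(values) {values'}, T1, 'GroupingVariables', 'Var1')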
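The desired output in the question shows comma-separated text such as xyz,xyy. If text rather than categorical arrays is what is wanted, one possible variation (a sketch, not part of the original answer) is to convert each group's values with cellstr and join them with strjoin. The name joinValues is a hypothetical helper chosen for illustration; Fun_Var2 is the default name varfun gives to its output variable.

```matlab
% Same sample data as in the answer
T1 = table([1;1;2;2;3;4;4;4], ...
    categorical({'xyz';'xyy';'xxx';'xzx';'xyy';'yzy';'xzz';'yzz'}));

% joinValues is a hypothetical helper, not a built-in function:
% convert one group's categorical values to text and join them with commas.
% The result is wrapped in a cell so groups of different sizes can coexist.
joinValues = @(values) {strjoin(cellstr(values)', ',')};

T2 = varfun(joinValues, T1, 'GroupingVariables', 'Var1');

T2.GroupCount = [];                          % optional: drop the count column varfun adds
T2.Properties.VariableNames{end} = 'Var2';   % rename Fun_Var2 back to Var2
disp(T2)
```

Here Var2 becomes a cell array of character vectors, so the table display shows the joined text directly. The trade-off is that the values are no longer categorical.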
Guillaume on 20 Jun 2018
No, that's the built-in disp of a table; you can't change that. Tables are not really designed to hold arrays in columns. They are designed more for holding scalar values in each column.
Peter Perkins on 3 Jul 2018
Guillaume, you are correct that it's sometimes difficult to display multi-column variables in a table, but tables absolutely are designed to support them (and in fact it's the reason why "variables" in tables are called "variables", not "columns" in the doc). The display works as you would expect in cases with only a few columns in each variable.
>> table(rand(3,2),rand(3,5))
ans =
  3×2 table
           Var1                                  Var2
    __________________    _______________________________________________________
    0.15761    0.48538    0.42176     0.95949    0.84913    0.75774    0.65548
    0.97059    0.80028    0.91574     0.65574    0.93399    0.74313    0.17119
    0.95717    0.14189    0.79221    0.035712    0.67874    0.39223    0.70605
Actually, in your solution, the second variable in the result from varfun is in fact a cell array, each cell containing a categorical vector. There's nothing at all wrong with that, it's completely supported by tables, and in fact if the different groups have different numbers of rows in the original table it's pretty much necessary. But you are right that the display is not as informative as it could be in some cases.
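To make the structure of that result concrete, here is a minimal sketch that rebuilds the table from the accepted answer and inspects the grouped variable. It assumes varfun's default output name Fun_Var2; the variable names outerClass, innerClass, groupSizes and lastGroup are illustrative only.

```matlab
% Rebuild the result from the accepted answer
T1 = table([1;1;2;2;3;4;4;4], ...
    categorical({'xyz';'xyy';'xxx';'xzx';'xyy';'yzy';'xzz';'yzz'}));
T2 = varfun(@(values) {values'}, T1, 'GroupingVariables', 'Var1');

outerClass = class(T2.Fun_Var2);            % the table variable itself is a cell array
innerClass = class(T2.Fun_Var2{1});         % each cell holds a categorical vector
groupSizes = cellfun(@numel, T2.Fun_Var2);  % number of values collected per group
lastGroup  = T2.Fun_Var2{4};                % brace indexing extracts one group's values

disp(outerClass)
disp(innerClass)
disp(groupSizes')
disp(lastGroup)
```

Because the groups have 2, 2, 1 and 3 members, the values could not be stored as a single rectangular multi-column variable, which is why the cell wrapper is needed in this case.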
