Using Row Labels in Table and Timetable Operations

Tables and timetables provide ways to label the rows in your data. In tables, you can label the rows with names. In timetables, you must label the rows with dates, times, or both. Row names are optional for tables, but row times are required for timetables. These row labels are part of the metadata in a table or timetable. In some functions you also can use row labels as key variables, grouping variables, and so on, just as you can use the data variables in a table or timetable. These functions are sortrows, join, innerjoin, outerjoin, varfun, rowfun, stack, and unstack. There are some limitations on using these table functions and on using row labels as key variables.

Sort on Row Labels

For example, you can sort a timetable on its row times, on one or more of its data variables, or on row times and data variables together.

Create a timetable using the timetable function. A timetable has row times along its first dimension, labeling the rows. The row times are a property of the timetable, not a timetable variable.

Date = datetime(2016,7,[10;10;11;11;10;10;11;11]);
X = [1;1;1;1;2;2;2;2];
Y = {'a';'b';'a';'b';'a';'b';'a';'b'};
Z = [1;2;3;4;5;6;7;8];
TT = timetable(X,Y,Z,'RowTimes',Date)
TT=8×4 timetable
       Time        X      Y      Z
    ___________    _    _____    _

    10-Jul-2016    1    {'a'}    1
    10-Jul-2016    1    {'b'}    2
    11-Jul-2016    1    {'a'}    3
    11-Jul-2016    1    {'b'}    4
    10-Jul-2016    2    {'a'}    5
    10-Jul-2016    2    {'b'}    6
    11-Jul-2016    2    {'a'}    7
    11-Jul-2016    2    {'b'}    8

Rename the first dimension. By default, the name of the first dimension of a timetable is Time. You can access the Properties.DimensionNames property to rename a dimension.

TT.Properties.DimensionNames{1} = 'Date';
TT.Properties.DimensionNames
ans = 1x2 cell array
    {'Date'}    {'Variables'}

As an alternative, you can specify the row times as the first input argument to timetable, without specifying 'RowTimes'. The timetable function names the row times, or the first dimension, after the first input argument, just as it names the timetable variables after the other input arguments.

TT = timetable(Date,X,Y,Z)
TT=8×4 timetable
       Date        X      Y      Z
    ___________    _    _____    _

    10-Jul-2016    1    {'a'}    1
    10-Jul-2016    1    {'b'}    2
    11-Jul-2016    1    {'a'}    3
    11-Jul-2016    1    {'b'}    4
    10-Jul-2016    2    {'a'}    5
    10-Jul-2016    2    {'b'}    6
    11-Jul-2016    2    {'a'}    7
    11-Jul-2016    2    {'b'}    8

Sort the timetable by row times. To sort on row times, refer to the first dimension of the timetable by name.

sortrows(TT,'Date')
ans=8×4 timetable
       Date        X      Y      Z
    ___________    _    _____    _

    10-Jul-2016    1    {'a'}    1
    10-Jul-2016    1    {'b'}    2
    10-Jul-2016    2    {'a'}    5
    10-Jul-2016    2    {'b'}    6
    11-Jul-2016    1    {'a'}    3
    11-Jul-2016    1    {'b'}    4
    11-Jul-2016    2    {'a'}    7
    11-Jul-2016    2    {'b'}    8

Sort by the data variables X and Y. sortrows sorts on X first, then on Y.

sortrows(TT,{'X' 'Y'})
ans=8×4 timetable
       Date        X      Y      Z
    ___________    _    _____    _

    10-Jul-2016    1    {'a'}    1
    11-Jul-2016    1    {'a'}    3
    10-Jul-2016    1    {'b'}    2
    11-Jul-2016    1    {'b'}    4
    10-Jul-2016    2    {'a'}    5
    11-Jul-2016    2    {'a'}    7
    10-Jul-2016    2    {'b'}    6
    11-Jul-2016    2    {'b'}    8

Sort by row times and X together.

sortrows(TT,{'Date' 'X'})
ans=8×4 timetable
       Date        X      Y      Z
    ___________    _    _____    _

    10-Jul-2016    1    {'a'}    1
    10-Jul-2016    1    {'b'}    2
    10-Jul-2016    2    {'a'}    5
    10-Jul-2016    2    {'b'}    6
    11-Jul-2016    1    {'a'}    3
    11-Jul-2016    1    {'b'}    4
    11-Jul-2016    2    {'a'}    7
    11-Jul-2016    2    {'b'}    8

Use Row Labels as Grouping or Key Variables

When you group rows together using the rowfun, varfun, stack, and unstack functions, you can specify row labels as grouping variables. When you join tables or timetable together using the join, innerjoin, and outerjoin functions, you can specify row labels as key variables.

For example, you can perform an inner join two tables together, using row names and a table variable together as key variables. An inner join keeps only those table rows that match with respect to the key variables.

Create two tables of patient data. A table can have row names along its first dimension, labeling the rows, but is not required to have them. Specify the last names of patients as the row names of the tables. Add the first names of the patients as table variables.

A = table({'Michael';'Louis';'Alice';'Rosemary';'Julie'},[38;43;45;40;49],...
    'VariableNames',{'FirstName' 'Age'},...
    'RowNames',{'Garcia' 'Johnson' 'Wu' 'Jones' 'Picard'})
A=5×2 table
                FirstName      Age
               ____________    ___

    Garcia     {'Michael' }    38 
    Johnson    {'Louis'   }    43 
    Wu         {'Alice'   }    45 
    Jones      {'Rosemary'}    40 
    Picard     {'Julie'   }    49 

B = table({'Michael';'Beverly';'Alice'},...
    [64;69;67],...
    [119;163;133],...
    [122 80; 109 77; 117 75],...
    'VariableNames',{'FirstName' 'Height' 'Weight' 'BloodPressure'},...
    'RowNames',{'Garcia' 'Johnson' 'Wu'})
B=3×4 table
                FirstName     Height    Weight    BloodPressure
               ___________    ______    ______    _____________

    Garcia     {'Michael'}      64       119       122     80  
    Johnson    {'Beverly'}      69       163       109     77  
    Wu         {'Alice'  }      67       133       117     75  

If a table has row names, then you can index into it by row name. Indexing by row names is a convenient way to select rows of a table. Index into B by a patient's last name to retrieve information about the patient.

B('Garcia',:)
ans=1×4 table
               FirstName     Height    Weight    BloodPressure
              ___________    ______    ______    _____________

    Garcia    {'Michael'}      64       119       122     80  

Perform an inner join on the two tables. Both tables use the last names of patients as row names, and contain the first names as a table variable. Some patients in the two tables have matching last names but different first names. To ensure that both last and first names match, use the row names and FirstName as key variables. To specify the row names as a key or grouping variable, use the name of the first dimension of the table. By default, the name of the first dimension is 'Row'.

C = innerjoin(A,B,'Keys',{'Row','FirstName'})
C=2×5 table
               FirstName     Age    Height    Weight    BloodPressure
              ___________    ___    ______    ______    _____________

    Garcia    {'Michael'}    38       64       119       122     80  
    Wu        {'Alice'  }    45       67       133       117     75  

If you rename the first dimension of a table, then you can refer to the row names by that name instead of using 'Row'. Perform the same inner join as above but use a different name to refer to the row names.

Show the dimension names of A by accessing its Properties.DimensionNames property.

A.Properties.DimensionNames
ans = 1x2 cell array
    {'Row'}    {'Variables'}

Change the name of the first dimension of the table by using its Properties.DimensionNames property. Then use the new name as a key variable.

A.Properties.DimensionNames{1} = 'LastName';
A.Properties.DimensionNames
ans = 1x2 cell array
    {'LastName'}    {'Variables'}

Perform an inner join on A and B using LastName and FirstName as key variables.

B.Properties.DimensionNames{1} = 'LastName';
D = innerjoin(A,B,'Keys',{'LastName','FirstName'})
D=2×5 table
               FirstName     Age    Height    Weight    BloodPressure
              ___________    ___    ______    ______    _____________

    Garcia    {'Michael'}    38       64       119       122     80  
    Wu        {'Alice'  }    45       67       133       117     75  

Notes on Use of Table Functions and Row Labels

  • You cannot stack or unstack row labels using the stack and unstack functions. However, you can use row labels as grouping variables.

  • You cannot perform a join using the join, innerjoin, or outerjoin functions when the first argument is a table and the second argument is a timetable. However, you can perform a join when both arguments are tables, both are timetables, or the first argument is a timetable and the second is a table.

  • The output of a join operation can have row labels if you specify row labels as key variables. For more details on row labels from a join operation, see the documentation on the 'Keys', 'LeftKeys', and 'RightKeys' arguments of the join, innerjoin, and outerjoin functions.

See Also

| | | | | | |