# setdiff

Set difference of two arrays

## Syntax

`C = setdiff(A,B)`
`C = setdiff(A,B,setOrder)`
`C = setdiff(A,B,___,'rows')`
`C = setdiff(A,B,'rows',___)`
`[C,ia] = setdiff(___)`
`[C,ia] = setdiff(A,B,'legacy')`
`[C,ia] = setdiff(A,B,'rows','legacy')`

## Description

`C = setdiff(A,B)` returns the data in `A` that is not in `B`, with no repetitions. `C` is in sorted order. If `A` and `B` are tables or timetables, then `setdiff` returns the rows from `A` that are not in `B`. For timetables, `setdiff` takes row times into account to determine equality, and sorts the output timetable `C` by row times.

`C = setdiff(A,B,setOrder)` returns `C` in a specific order. `setOrder` can be `'sorted'` or `'stable'`.

`C = setdiff(A,B,___,'rows')` and `C = setdiff(A,B,'rows',___)` treat each row of `A` and each row of `B` as single entities and return the rows from `A` that are not in `B`, with no repetitions. You must specify `A` and `B` and optionally can specify `setOrder`. The `'rows'` option does not support cell arrays, unless one of the inputs is either a categorical array or a datetime array.

`[C,ia] = setdiff(___)` also returns the index vector `ia` using any of the previous syntaxes. Generally, `C = A(ia)`. If the `'rows'` option is specified, then `C = A(ia,:)`. If `A` and `B` are tables or timetables, then `C = A(ia,:)`.

`[C,ia] = setdiff(A,B,'legacy')` and `[C,ia] = setdiff(A,B,'rows','legacy')` preserve the behavior of the `setdiff` function from R2012b and prior releases. The `'legacy'` option does not support categorical arrays, datetime arrays, duration arrays, tables, or timetables.

## Examples

Define two vectors with values in common.

```
A = [3 6 2 1 5 1 1];
B = [2 4 6];
```

Find the values in `A` that are not in `B`.

```
C = setdiff(A,B)
```
```
C = 1×3

     1     3     5
```

Define two tables with rows in common.

```
A = table([1:5]',['A';'B';'C';'D';'E'],logical([0;1;0;1;0]))
```
```
A=5×3 table
    Var1    Var2    Var3
    ____    ____    _____

     1       A      false
     2       B      true
     3       C      false
     4       D      true
     5       E      false
```
```
B = table([1:2:10]',['A';'C';'E';'G';'I'],logical(zeros(5,1)))
```
```
B=5×3 table
    Var1    Var2    Var3
    ____    ____    _____

     1       A      false
     3       C      false
     5       E      false
     7       G      false
     9       I      false
```

Find the rows in `A` that are not in `B`.

```
C = setdiff(A,B)
```
```
C=2×3 table
    Var1    Var2    Var3
    ____    ____    ____

     2       B      true
     4       D      true
```

Define two vectors with values in common.

```
A = [3 6 2 1 5 1 1];
B = [2 4 6];
```

Find the values in `A` that are not in `B` as well as the index vector `ia`, such that `C = A(ia)`.

```
[C,ia] = setdiff(A,B)
```
```
C = 1×3

     1     3     5
```
```
ia = 3×1

     4
     1
     5
```

Define a table, `A`, of gender, age, and height for five people.

```
A = table(['M';'M';'F';'M';'F'],[27;52;31;46;35],[74;68;64;61;64],...
    'VariableNames',{'Gender' 'Age' 'Height'},...
    'RowNames',{'Ted' 'Fred' 'Betty' 'Bob' 'Judy'})
```
```
A=5×3 table
             Gender    Age    Height
             ______    ___    ______

    Ted        M       27       74
    Fred       M       52       68
    Betty      F       31       64
    Bob        M       46       61
    Judy       F       35       64
```

Define a table, `B`, with the same variables as `A`.

```
B = table(['F';'M';'F';'F'],[64;68;62;58],[31;47;35;23],...
    'VariableNames',{'Gender' 'Height' 'Age'},...
    'RowNames',{'Meg' 'Joe' 'Beth' 'Amy'})
```
```
B=4×3 table
            Gender    Height    Age
            ______    ______    ___

    Meg       F         64      31
    Joe       M         68      47
    Beth      F         62      35
    Amy       F         58      23
```

Find the rows in `A` that are not in `B`, as well as the index vector `ia`, such that `C = A(ia,:)`.

```
[C,ia] = setdiff(A,B)
```
```
C=4×3 table
            Gender    Age    Height
            ______    ___    ______

    Judy      F       35       64
    Ted       M       27       74
    Bob       M       46       61
    Fred      M       52       68
```
```
ia = 4×1

     5
     1
     4
     2
```

The rows of `C` are in sorted order first by `Gender` and next by `Age`.

Define two matrices with rows in common.

```
A = [7 9 7; 0 0 0; 7 9 7; 5 5 5; 1 4 5];
B = [0 0 0; 5 5 5];
```

Find the rows from `A` that are not in `B` as well as the index vector `ia`, such that `C = A(ia,:)`.

```
[C,ia] = setdiff(A,B,'rows')
```
```
C = 2×3

     1     4     5
     7     9     7
```
```
ia = 2×1

     5
     1
```

Use the `setOrder` argument to specify the ordering of the values in `C`.

Specify `'stable'` or `'sorted'` when the order of the values in `C` is important.

```
A = [3 6 2 1 5 1 1];
B = [2 4 6];
[C,ia] = setdiff(A,B,'stable')
```
```
C = 1×3

     3     1     5
```
```
ia = 3×1

     1
     4
     5
```

Alternatively, you can specify `'sorted'` order.

```
[C,ia] = setdiff(A,B,'sorted')
```
```
C = 1×3

     1     3     5
```
```
ia = 3×1

     4
     1
     5
```

Define two vectors containing `NaN`.

```
A = [5 NaN NaN];
B = [5 NaN];
```

Find the set difference of `A` and `B`.

```
C = setdiff(A,B)
```
```
C = 1×2

   NaN   NaN
```

`setdiff` treats `NaN` values as distinct.
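Because each `NaN` is treated as distinct, a `NaN` in `B` never removes a `NaN` from `A`. The following is a minimal sketch of one way to drop `NaN` values from the result, assuming that excluding them is what you want. The input values and the variable name `Cclean` are illustrative and not part of `setdiff`.

```matlab
A = [5 NaN NaN 7];
B = [5 NaN];

% NaN values in A survive even though B contains NaN.
C = setdiff(A,B);

% Drop NaN values from the result with a logical mask from isnan.
Cclean = C(~isnan(C));
```

Filtering after the call keeps the `setdiff` semantics unchanged and makes the handling of `NaN` explicit.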

Create a cell array of character vectors, `A`.

`A = {'dog','cat','fish','horse'};`

Create a cell array of character vectors, `B`, where some of the vectors have trailing white space.

`B = {'dog ','cat','fish ','horse'};`

Find the character vectors in `A` that are not in `B`.

```
[C,ia] = setdiff(A,B)
```
```
C = 1x2 cell
    {'dog'}    {'fish'}
```
```
ia = 2×1

     1
     3
```

`setdiff` treats trailing white space in cell arrays of character vectors as distinct characters.
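If trailing white space should not count as a difference, one option is to normalize the inputs before the comparison. The sketch below assumes that trimming with `strtrim` is acceptable for your data. The variable name `Btrim` is illustrative.

```matlab
A = {'dog','cat','fish','horse'};
B = {'dog ','cat','fish ','horse'};

% strtrim removes leading and trailing white space from each element.
Btrim = strtrim(B);

% After trimming, every element of A also appears in Btrim,
% so the difference is empty.
C = setdiff(A,Btrim);
isempty(C)
```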

Create a character array, `A`, in which each row holds one word.

```
A = ['cat';'dog';'fox';'pig'];
class(A)
```
```
ans = 'char'
```

Create a cell array of character vectors, `B`.

```
B = {'dog','cat','fish','horse'};
class(B)
```
```
ans = 'cell'
```

Find the character vectors in `A` that are not in `B`.

```
C = setdiff(A,B)
```
```
C = 2x1 cell
    {'fox'}
    {'pig'}
```

The result, `C`, is a cell array of character vectors.

```
class(C)
```
```
ans = 'cell'
```

Use the `'legacy'` flag to preserve the behavior of `setdiff` from R2012b and prior releases in your code.

Find the difference of `A` and `B` with the current behavior.

```
A = [3 6 2 1 5 1 1];
B = [2 4 6];
[C1,ia1] = setdiff(A,B)
```
```
C1 = 1×3

     1     3     5
```
```
ia1 = 3×1

     4
     1
     5
```

Find the difference of `A` and `B`, and preserve the legacy behavior.

```
[C2,ia2] = setdiff(A,B,'legacy')
```
```
C2 = 1×3

     1     3     5
```
```
ia2 = 1×3

     7     1     5
```

## Input Arguments

Input arrays `A` and `B`, specified as numeric arrays, logical arrays, character arrays, string arrays, categorical arrays, datetime arrays, duration arrays, cell arrays of character vectors, tables, or timetables. If you specify the `'rows'` option, `A` and `B` must have the same number of columns.

`A` and `B` must belong to the same class with the following exceptions:

• `logical`, `char`, and all numeric classes can combine with `double` arrays.

• Cell arrays of character vectors can combine with character arrays or string arrays.

• Categorical arrays can combine with character arrays, cell arrays of character vectors, or string arrays.

• Datetime arrays can combine with cell arrays of date character vectors or single date character vectors.

There are additional requirements for `A` and `B` based on data type:

• If `A` and `B` are both ordinal categorical arrays, they must have the same sets of categories, including their order. If neither `A` nor `B` is ordinal, they need not have the same sets of categories, and the comparison is performed using the category names. In this case, the categories of `C` are the sorted union of the categories from `A` and `B`.

• If `A` and `B` are tables or timetables, they must have the same variable names (except for order). For tables, row names are ignored, so that two rows that have the same values, but different names, are considered equal. For timetables, row times are taken into account, so that two rows that have the same values, but different times, are not considered equal.

• If `A` and `B` are datetime arrays, they must be consistent with each other in whether they specify a time zone.

`A` and `B` also can be objects with the following class methods:

• `sort` (or `sortrows` for the `'rows'` option)

• `eq`

• `ne`

The object class methods must be consistent with each other. These objects include heterogeneous arrays derived from the same root class. For example, `A` and `B` can be arrays of handles to graphics objects.

Order flag `setOrder`, specified as `'sorted'` or `'stable'`. The flag indicates the order of the values (or rows) in `C`.

| Flag | Description |
| --- | --- |
| `'sorted'` | The values (or rows) in `C` are returned in sorted order, as returned by `sort`. Example: `C = setdiff([4 1 3 2 5],[2 1],'sorted')` returns `C = 3 4 5`. |
| `'stable'` | The values (or rows) in `C` are returned in the same order as in `A`. Example: `C = setdiff([4 1 3 2 5],[2 1],'stable')` returns `C = 4 3 5`. |

Data Types: `char` | `string`

## Output Arguments

Difference of `A` and `B`, returned in `C` as a vector, matrix, table, or timetable. If the inputs `A` and `B` are tables or timetables, then the order of the variables in `C` is the same as the order of the variables in `A`.

The following describes the shape of `C` when the inputs are vectors or matrices and when the `'legacy'` flag is not specified:

• If the `'rows'` flag is not specified and `A` is a row vector, then `C` is a row vector.

• If the `'rows'` flag is not specified and `A` is not a row vector, then `C` is a column vector.

• If the `'rows'` flag is specified, then `C` is a matrix containing the rows of `A` that are not in `B`.

• If all the values (or rows) of `A` are also in `B`, then `C` is an empty matrix.
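The shape rules above can be checked with a short sketch. The input values and the variable names `Crow`, `Ccol`, `Cmat`, and `Cempty` are chosen only for illustration.

```matlab
B = [2 4];

Crow   = setdiff([3 2 1],B);      % A is a row vector
Ccol   = setdiff([3;2;1],B);      % A is a column vector
Cmat   = setdiff([3 2; 1 4],B);   % A is a matrix, no 'rows' flag
Cempty = setdiff([2 4 2],B);      % every value of A is in B

isrow(Crow)       % C is a row vector
iscolumn(Ccol)    % C is a column vector
iscolumn(Cmat)    % C is a column vector
isempty(Cempty)   % C is empty
```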

The class of `C` is the same as the class of `A`, unless:

• `A` is a character array and `B` is a cell array of character vectors, in which case `C` is a cell array of character vectors.

• `A` is a character vector, cell array of character vectors, or string, and `B` is a categorical array, in which case `C` is a categorical array.

• `A` is a cell array of character vectors or single character vector and `B` is a datetime array, in which case `C` is a datetime array.

• `A` is a character vector or cell array of character vectors and `B` is a string array, in which case `C` is a string array.

Index to `A`, returned in `ia` as a column vector when the `'legacy'` flag is not specified. `ia` identifies the values (or rows) in `A` that are not in `B`. If there is a repeated value (or row) appearing exclusively in `A`, then `ia` contains the index to the first occurrence of the value (or row).
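As a rough illustration of how `ia` relates to `C`, the sketch below indexes `A` with `ia` and then rebuilds the same result from `ismember` and `unique`. This is only an equivalent formulation for plain numeric vectors, not a description of how `setdiff` is implemented. The names `mask` and `C2` are illustrative.

```matlab
A = [3 6 2 1 5 1 1];
B = [2 4 6];

[C,ia] = setdiff(A,B,'stable');

% Indexing A with ia reproduces C.
isequal(C,A(ia))

% Equivalent result built from ismember and unique.
mask = ~ismember(A,B);            % true where a value of A is not in B
C2 = unique(A(mask),'stable');    % remove repetitions, keep the order of A
isequal(C,C2)
```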

## Tips

• To find the set difference with respect to a subset of variables from a table or timetable, you can use column subscripting. For example, you can use `setdiff(A(:,vars),B(:,vars))`, where `vars` is a positive integer, a vector of positive integers, a variable name, a cell array of variable names, or a logical vector. Alternatively, you can use `vartype` to create a subscript that selects variables of a specified type.
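A minimal sketch of the column-subscripting tip follows. The tables, their variable names (`Id`, `Label`, `Score`), and the name `Afull` are hypothetical and chosen only for illustration.

```matlab
A = table([1;2;3],{'x';'y';'z'},[10;20;30], ...
    'VariableNames',{'Id','Label','Score'});
B = table([2;3;4],{'q';'z';'w'},[99;30;40], ...
    'VariableNames',{'Id','Label','Score'});

% Compare on Id and Score only, ignoring Label.
vars = {'Id','Score'};
[C,ia] = setdiff(A(:,vars),B(:,vars));

% Use ia to recover the full rows of A, including Label.
Afull = A(ia,:);
```

Here `C` contains only the variables listed in `vars`, so indexing `A` with `ia` is one way to get back the variables that were left out of the comparison.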