# setdiff

Set difference of two arrays

## Syntax

• `C = setdiff(A,B)`
• `C = setdiff(A,B,'rows')`
• `[C,ia] = setdiff(A,B)`
• `[C,ia] = setdiff(A,B,'rows')`
• `[C,ia] = setdiff(___,setOrder)`
• `[C,ia] = setdiff(A,B,'legacy')`
• `[C,ia] = setdiff(A,B,'rows','legacy')`

## Description

`C = setdiff(A,B)` returns the data in `A` that is not in `B`.

• If `A` and `B` are numeric arrays, logical arrays, character arrays, categorical arrays, datetime arrays, duration arrays, or cell arrays of strings, then `setdiff` returns the values in `A` that are not in `B`. The values of `C` are in sorted order.

• If `A` and `B` are tables, then `setdiff` returns the rows from `A` that are not in `B`, with repetitions removed. The rows of table `C` are in sorted order.

`C = setdiff(A,B,'rows')` treats each row of `A` and each row of `B` as single entities and returns the rows from `A` that are not in `B`. The rows of `C` are in sorted order.

The `'rows'` option does not support cell arrays, unless one of the inputs is either a categorical array or a datetime array.

`[C,ia] = setdiff(A,B)` also returns the index vector `ia`.

• If `A` and `B` are numeric arrays, logical arrays, character arrays, categorical arrays, datetime arrays, duration arrays, or cell arrays of strings, then `C = A(ia)`.

• If `A` and `B` are tables, then `C = A(ia,:)`.

`[C,ia] = setdiff(A,B,'rows')` also returns the index vector `ia`, such that `C = A(ia,:)`.

`[C,ia] = setdiff(___,setOrder)` returns `C` in a specific order using any of the input arguments in the previous syntaxes. `setOrder='sorted'` returns the values (or rows) of `C` in sorted order. `setOrder='stable'` returns the values (or rows) of `C` in the same order as `A`. If no value is specified, the default is `'sorted'`.

`[C,ia] = setdiff(A,B,'legacy')` and `[C,ia] = setdiff(A,B,'rows','legacy')` preserve the behavior of the `setdiff` function from R2012b and prior releases.

The `'legacy'` option does not support categorical arrays, tables, datetime arrays, or duration arrays.

## Examples

### Difference of Two Vectors

Define two vectors with values in common.

`A = [3 6 2 1 5 1 1]; B = [2 4 6];`

Find the values in `A` that are not in `B`.

`C = setdiff(A,B)`
```C = 1 3 5```

### Difference of Two Tables

Define two tables with rows in common.

```
A = table([1:5]',['A';'B';'C';'D';'E'],logical([0;1;0;1;0]))
B = table([1:2:10]',['A';'C';'E';'G';'I'],logical(zeros(5,1)))
```
```
A = 
    Var1    Var2    Var3 
    ----    ----    -----
    1       A       false
    2       B       true 
    3       C       false
    4       D       true 
    5       E       false

B = 
    Var1    Var2    Var3 
    ----    ----    -----
    1       A       false
    3       C       false
    5       E       false
    7       G       false
    9       I       false
```

Find the rows in `A` that are not in `B`.

`C = setdiff(A,B)`
```
C = 
    Var1    Var2    Var3
    ----    ----    ----
    2       B       true
    4       D       true
```

### Difference of Two Vectors and Indices to Different Values

Define two vectors with values in common.

`A = [3 6 2 1 5 1 1]; B = [2 4 6];`

Find the values in `A` that are not in `B` as well as the index vector `ia`, such that ```C = A(ia)```.

`[C,ia] = setdiff(A,B)`
```
C =
     1     3     5

ia =
     4
     1
     5
```

### Difference of Two Tables and Indices to Different Rows

Define a table, `A`, of gender, age, and height for five people.

```
A = table(['M';'M';'F';'M';'F'],[27;52;31;46;35],[74;68;64;61;64],...
    'VariableNames',{'Gender' 'Age' 'Height'},...
    'RowNames',{'Ted' 'Fred' 'Betty' 'Bob' 'Judy'})
```
```
A = 
             Gender    Age    Height
             ------    ---    ------
    Ted      M         27     74    
    Fred     M         52     68    
    Betty    F         31     64    
    Bob      M         46     61    
    Judy     F         35     64    
```

Define a table, `B`, with the same variables as `A`.

```
B = table(['F';'M';'F';'F'],[64;68;62;58],[31;47;35;23],...
    'VariableNames',{'Gender' 'Height' 'Age'},...
    'RowNames',{'Meg' 'Joe' 'Beth' 'Amy'})
```
```
B = 
            Gender    Height    Age
            ------    ------    ---
    Meg     F         64        31 
    Joe     M         68        47 
    Beth    F         62        35 
    Amy     F         58        23 
```

Find the rows in `A` that are not in `B`, as well as the index vector `ia`, such that ```C = A(ia,:)```.

`[C,ia] = setdiff(A,B)`
```
C = 
            Gender    Age    Height
            ------    ---    ------
    Judy    F         35     64    
    Ted     M         27     74    
    Bob     M         46     61    
    Fred    M         52     68    

ia =
     5
     1
     4
     2
```

The rows of `C` are in sorted order first by `Gender` and next by `Age`.

### Difference of Rows in Two Matrices

Define two matrices with rows in common.

```A = [7 9 7; 0 0 0; 7 9 7; 5 5 5; 1 4 5]; B = [0 0 0; 5 5 5];```

Find the rows from `A` that are not in `B` as well as the index vector `ia`, such that ```C = A(ia,:)```.

`[C,ia] = setdiff(A,B,'rows')`
```
C =
     1     4     5
     7     9     7

ia =
     5
     1
```

### Difference of Two Vectors with Specified Output Order

Use the `setOrder` argument to specify the ordering of the values in `C`.

Specify `'stable'` or `'sorted'` when the order of the values in `C` is important.

```
A = [3 6 2 1 5 1 1];
B = [2 4 6];
[C,ia] = setdiff(A,B,'stable')
```
```
C =
     3     1     5

ia =
     1
     4
     5
```

Alternatively, you can specify `'sorted'` order.

`[C,ia] = setdiff(A,B,'sorted')`
```
C =
     1     3     5

ia =
     4
     1
     5
```

### Difference of Vectors Containing NaNs

Define two vectors containing `NaN`.

`A = [5 NaN NaN]; B = [5 NaN];`

Find the set difference of `A` and `B`.

```C = setdiff(A,B) ```
```C = NaN NaN```

`setdiff` treats `NaN` values as distinct.
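If `NaN` values should be ignored rather than reported as differences, one possible workaround is to filter them out of `A` before calling `setdiff`. The following is a minimal sketch under that assumption; the variable names `Cnan` and `Cclean` are illustrative only.

```matlab
A = [5 NaN NaN 7];
B = [5 NaN];

% Default behavior: NaN never matches NaN, so the NaN entries of A remain.
Cnan = setdiff(A,B);

% Assumed workaround: drop NaN from A first, then take the difference.
Cclean = setdiff(A(~isnan(A)),B);
```

Here `Cclean` contains only the value `7`, while `Cnan` still contains `NaN` entries.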

### Cell Array of Strings with Trailing White Space

Create a cell array of strings, `A`.

```A = {'dog','cat','fish','horse'}; ```

Create a cell array of strings, `B`, where some of the strings have trailing white space.

`B = {'dog ','cat','fish ','horse'};`

Find the strings in `A` that are not in `B`.

`[C,ia] = setdiff(A,B)`
```
C = 
    'dog'    'fish'

ia =
     1
     3
```

`setdiff` treats trailing white space in cell arrays of strings as distinct characters.
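If trailing white space is not meaningful in your data, one option is to strip it with `deblank` before comparing. This is a sketch of that approach, not part of `setdiff` itself; the inputs here are illustrative and differ slightly from the example above so that the result is not empty.

```matlab
A = {'dog','cat','fish','horse'};
B = {'dog ','cat','fish '};      % some entries have trailing white space

% deblank removes trailing white space from each string in the cell array,
% so 'dog ' and 'fish ' now match 'dog' and 'fish'.
C = setdiff(A,deblank(B));
```

With the trailing white space removed, only `'horse'` remains in `C`.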

### Difference of Char and Cell Array of Strings

Create a character array, `A`.

```A = ['cat';'dog';'fox';'pig']; class(A)```
```ans = char```

Create a cell array of strings, `B`.

```B={'dog','cat','fish','horse'}; class(B)```
```ans = cell```

Find the strings in `A` that are not in `B`.

```C = setdiff(A,B) ```
```C = 'fox' 'pig'```

The result, `C`, is a cell array of strings.

`class(C)`
```ans = cell```

### Preserve Legacy Behavior of setdiff

Use the `'legacy'` flag to preserve the behavior of `setdiff` from R2012b and prior releases in your code.

Find the difference of `A` and `B` with the current behavior.

```
A = [3 6 2 1 5 1 1];
B = [2 4 6];
[C1,ia1] = setdiff(A,B)
```
```
C1 =
     1     3     5

ia1 =
     4
     1
     5
```

Find the difference of `A` and `B`, and preserve the legacy behavior.

`[C2,ia2] = setdiff(A,B,'legacy')`
```
C2 =
     1     3     5

ia2 =
     7     1     5
```
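The two index vectors in this example differ only in which occurrence of the repeated value `1` they point to. The sketch below checks this; the helper variables `sameValues`, `firstOne`, and `lastOne` are introduced here purely for illustration.

```matlab
A = [3 6 2 1 5 1 1];
B = [2 4 6];

[C1,ia1] = setdiff(A,B);            % current behavior
[C2,ia2] = setdiff(A,B,'legacy');   % legacy behavior

% Both index vectors select the same values from A.
sameValues = isequal(A(ia1),A(ia2));

% They differ in which occurrence of the repeated value 1 they index.
firstOne = find(A == 1,1,'first');  % position of the first 1 in A
lastOne  = find(A == 1,1,'last');   % position of the last 1 in A
```

In this example, `ia1(1)` equals `firstOne` and `ia2(1)` equals `lastOne`.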

## Input Arguments

### `A,B` — Input arrays

numeric arrays | logical arrays | character arrays | categorical arrays | datetime arrays | duration arrays | cell arrays of strings | tables

Input arrays, specified as numeric arrays, logical arrays, character arrays, categorical arrays, datetime arrays, duration arrays, cell arrays of strings, or tables.

`A` and `B` must belong to the same class with the following exceptions:

• `logical`, `char`, and all numeric classes can combine with `double` arrays.

• Cell arrays of strings can combine with `char` arrays.

• Categorical arrays can combine with cell arrays of strings or single strings.

• Datetime arrays can combine with cell arrays of date strings or single date strings.
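As a brief illustration of the first two exceptions, the sketch below combines an `int8` array with a `double` array, and a cell array of strings with a `char` array. The values are arbitrary and chosen only for demonstration.

```matlab
% An integer class combined with double.
A = int8([1 2 3]);
B = [2 4];                 % double
C = setdiff(A,B);          % values 1 and 3
classOfC = class(C);       % class of A is retained here

% A cell array of strings combined with a char array.
D = setdiff({'cat','dog'},'dog');   % leaves only 'cat'
```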

If `A` and `B` are both ordinal categorical arrays, they must have the same sets of categories, including their order. If neither `A` nor `B` is ordinal, they need not have the same sets of categories, and the comparison is performed using the category names. In this case, the categories of `C` are the sorted union of the categories from `A` and `B`.

If you specify the `'rows'` option, `A` and `B` must have the same number of columns.

If `A` and `B` are tables, they must have the same variable names. Conversely, the row names do not matter. Two rows that have the same values, but different names, are considered equal.
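To illustrate that row names are ignored when comparing table rows, consider the following small sketch. The tables and the variable name `x` are made up for this illustration.

```matlab
A = table([1;2;3],'VariableNames',{'x'},'RowNames',{'a','b','c'});
B = table(2,'VariableNames',{'x'},'RowNames',{'z'});

% Row 'b' of A and row 'z' of B hold the same value, so they are
% considered equal even though their row names differ.
C = setdiff(A,B);
```

The result `C` contains the rows of `A` named `'a'` and `'c'`.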

If `A` and `B` are datetime arrays, they must be consistent with each other in whether they specify a time zone.

Furthermore, `A` and `B` can be objects with the following class methods:

• `sort` (or `sortrows` for the `'rows'` option)

• `eq`

• `ne`

The object class methods must be consistent with each other. These objects include heterogeneous arrays derived from the same root class.

### `setOrder` — Order flag

`'sorted'` (default) | `'stable'`

Order flag, specified as `'sorted'` or `'stable'`, indicates the order of the values (or rows) in `C`.

| Order Flag | Meaning |
| --- | --- |
| `'sorted'` | The values (or rows) in `C` return in sorted order. For example: `C = setdiff([4 1 3 2],[2 1],'sorted')` returns `C = [3 4]`. |
| `'stable'` | The values (or rows) in `C` return in the same order as in `A`. For example: `C = setdiff([4 1 3 2],[2 1],'stable')` returns `C = [4 3]`. |

## Output Arguments

### `C` — Difference of `A` and `B`

vector | matrix | table

Difference of `A` and `B`, returned as a vector, matrix, or table. If the inputs `A` and `B` are tables, the order of the variables in the resulting table, `C`, is the same as the order of the variables in `A`.

The following describes the shape of `C` when the inputs are vectors or matrices and when the `'legacy'` flag is not specified:

• If the `'rows'` flag is not specified and `A` is a row vector, then `C` is a row vector.

• If the `'rows'` flag is not specified and `A` is not a row vector, then `C` is a column vector.

• If the `'rows'` flag is specified, then `C` is a matrix containing the rows of `A` that are not in `B`.

• If all the values (or rows) of `A` are also in `B`, then `C` is an empty matrix.
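The shape rules above can be checked with a short sketch such as the following. The inputs are arbitrary illustrative values.

```matlab
% A is a row vector, so C1 is a row vector.
C1 = setdiff([4 1 3 2],[2 1]);

% A is a matrix (not a row vector) and 'rows' is not specified,
% so C2 is a column vector.
C2 = setdiff([1 2; 3 4],2);

% With 'rows', C3 is a matrix holding the rows of A that are not in B.
C3 = setdiff([1 2; 3 4],[3 4],'rows');

% Every value of A is also in B, so C4 is empty.
C4 = setdiff([1 2 3],[3 2 1]);
```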

The class of `C` is the same as the class of `A`, unless:

• `A` is a character array and `B` is a cell array of strings, in which case `C` is a cell array of strings.

• `A` is a cell array of strings or single string and `B` is a categorical array, in which case `C` is a categorical array.

• `A` is a cell array of strings or single string and `B` is a datetime array, in which case `C` is a datetime array.

### `ia` — Index to `A`

column vector

Index to `A`, returned as a column vector when the `'legacy'` flag is not specified. `ia` identifies the values (or rows) in `A` that are not in `B`. If there is a repeated value (or row) appearing exclusively in `A`, then `ia` contains the index to the first occurrence of the value (or row).
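Because `ia` points only to the first occurrence of each repeated value, `A(ia)` drops repetitions. If every occurrence is needed, one alternative is a logical mask built with `ismember`. This is a sketch of that substitute technique, not something `setdiff` does itself; `keepAll` and `allIdx` are illustrative names.

```matlab
A = [3 6 2 1 5 1 1];
B = [2 4 6];

% setdiff reports each value once; ia indexes its first occurrence.
[C,ia] = setdiff(A,B,'stable');

% ismember-based alternative that keeps every occurrence, in the order of A.
mask    = ~ismember(A,B);
keepAll = A(mask);        % all elements of A that are not in B
allIdx  = find(mask);     % their positions in A
```

In this example, `keepAll` retains all three occurrences of `1`, whereas `C` contains it once.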

## More About

### Tips

• To find the set difference with respect to a subset of variables from a table, you can use column subscripting. For example, you can use `setdiff(A(:,vars),B(:,vars))`, where `vars` is a positive integer, a vector of positive integers, a variable name, a cell array of variable names, or a logical vector.
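As a sketch of this tip, the example below compares two tables on the `Gender` and `Age` variables only. The tables and their values are made up for illustration.

```matlab
A = table(['M';'M';'F'],[27;52;31],[74;68;64],...
    'VariableNames',{'Gender','Age','Height'});
B = table(['F';'M'],[31;47],[70;68],...
    'VariableNames',{'Gender','Age','Height'});

vars = {'Gender','Age'};

% Compare on the selected variables only; Height is ignored.
C = setdiff(A(:,vars),B(:,vars));

% Comparing on all variables, no row of A matches a row of B.
Cfull = setdiff(A,B);
```

Here `C` has two rows, because the `F`, `31` row matches once `Height` is excluded, while `Cfull` keeps all three rows of `A`.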

## See Also

#### Introduced before R2006a
