genelowvalfilter

Remove gene profiles with low absolute values

Description

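Mask = genelowvalfilter(Data) returns a logical vector Mask that flags the gene expression profiles in Data by absolute expression level. Elements of Mask are 0 for profiles with absolute expression levels in the lowest 10% of the data set and 1 for profiles that exceed that threshold.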

Gene expression profile experiments often include data whose absolute values are very low. Such data is frequently of poor quality because of large quantization errors or poor spot hybridization. Use this function to filter out those profiles.

[Mask,FData] = genelowvalfilter(Data) also returns FData, a data matrix containing filtered expression profiles.

[Mask,FData,FNames] = genelowvalfilter(Data,geneNames) also returns FNames, a cell array of filtered gene names or IDs. You must specify geneNames to return FNames, unless Data is a DataMatrix object with specified row names.

[___] = genelowvalfilter(___,Name,Value) uses additional options specified as one or more optional name-value pair arguments.

Examples

Load the sample yeast data.

load yeastdata;

Retrieve the genes and corresponding expression data where absolute expression levels exceed the 10th percentile.

[mask,filteredData,filteredGenes] = genelowvalfilter(yeastvalues,genes);

Compare the number of filtered genes (filteredGenes) with the number of genes in the original data set (genes).

size(filteredGenes,1)
ans = 
6394
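
To make the comparison with the original data set explicit, the following sketch counts how many profiles the default filter keeps and removes. It assumes the sample file yeastdata is available. The variable names numOriginal, numKept, and numRemoved are illustrative and are not part of the original example.

```matlab
% Sketch only: count kept and removed genes for the default filter.
load yeastdata;                       % provides yeastvalues and genes
[mask,filteredData,filteredGenes] = genelowvalfilter(yeastvalues,genes);

numOriginal = numel(genes);           % genes in the original data set
numKept     = nnz(mask);              % rows flagged 1 in the mask
numRemoved  = numOriginal - numKept;  % rows flagged 0 in the mask

fprintf('Kept %d of %d genes (%d removed).\n',numKept,numOriginal,numRemoved);
```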

Load the sample yeast data.

load yeastdata;

Mark the genes whose absolute expression levels fall below the 10th percentile of the data set.

mask = genelowvalfilter(yeastvalues);

The variable genes contains the names of all genes in the yeast data set. Use the generated logical vector mask to retrieve the genes whose expression levels exceed the 10th percentile.

filteredGenes = genes(mask);

Extract corresponding expression profile data for the selected genes from the variable yeastvalues, which contains expression profiles of every gene in the yeast data set.

filteredData = yeastvalues(mask,:);
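
According to the output argument descriptions, indexing with the mask should give the same result as requesting all three outputs in one call. The sketch below is one way to check this on the sample data. The names mask3, data3, genes3, and the same* flags are hypothetical helpers introduced only for this check.

```matlab
% Sketch only: compare manual mask indexing with the three-output call.
load yeastdata;
mask = genelowvalfilter(yeastvalues);
filteredGenes = genes(mask);
filteredData  = yeastvalues(mask,:);

% Three-output call for comparison (variable names are illustrative).
[mask3,data3,genes3] = genelowvalfilter(yeastvalues,genes);

% isequaln treats NaN entries as equal, in case the data contains NaN values;
% plain isequal would report a mismatch for any NaN.
sameMask  = isequal(mask,mask3);
sameData  = isequaln(filteredData,data3);
sameGenes = isequal(filteredGenes,genes3);
```

If any of the three flags is false, inspect the sizes and orientation of the compared arrays first.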

Load the sample yeast data.

load yeastdata;

Retrieve the genes and corresponding expression data where absolute expression levels exceed the 30th percentile of the data set.

[mask,filteredData,filteredGenes] = genelowvalfilter(yeastvalues,genes,'Percentile',30);

Compare the number of filtered genes (filteredGenes) with the number of genes in the original data set (genes).

size(filteredGenes,1)
ans = 
6384
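
To see how the choice of percentile affects the result, a short loop can tabulate the number of retained genes for several thresholds. This is a sketch: the percentile values and the variable names percentiles and numKept are arbitrary illustrative choices, not recommendations.

```matlab
% Sketch only: count retained genes for several percentile thresholds.
load yeastdata;
percentiles = [10 30 50];             % illustrative choices
numKept = zeros(size(percentiles));

for k = 1:numel(percentiles)
    mask = genelowvalfilter(yeastvalues,genes,'Percentile',percentiles(k));
    numKept(k) = nnz(mask);           % genes that pass this threshold
end

% One row per threshold: [percentile, number of genes kept]
disp([percentiles(:) numKept(:)]);
```

Because a higher percentile means a higher threshold, the count should not increase from one row to the next.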

Input Arguments

Data

Input data, specified as a DataMatrix object or numeric matrix. Each row of the matrix corresponds to the experimental results for one gene. Each column represents the results for all genes from one experiment.

geneNames

Gene names or IDs, specified as a cell array of character vectors or a string vector. The array must have the same number of rows as Data. Each row contains the name or ID of the gene in the corresponding row of Data.

Note

If Data is a DataMatrix object with specified row names, you do not need to provide the second input geneNames to return the third output FNames.
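
As a minimal sketch of the case described in this note, the code below builds a small DataMatrix object with row names and requests FNames without passing geneNames. The data values, gene IDs, experiment names, and the threshold of 1 are all made up for illustration.

```matlab
% Sketch only: hypothetical 3-gene, 2-experiment data set.
values    = [0.1 0.2; 5.0 4.0; -6.0 0.3];
geneIDs   = {'geneA','geneB','geneC'};   % made-up row names
exptNames = {'expt1','expt2'};           % made-up column names

% Build a DataMatrix whose row names carry the gene IDs.
dm = bioma.data.DataMatrix(values,geneIDs,exptNames);

% No geneNames input: the third output comes from the row names of dm.
[mask,fdata,fnames] = genelowvalfilter(dm,'AbsValue',1);
```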

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: 'AbsValue',10.5 directs genelowvalfilter to remove expression profiles with absolute values less than 10.5.
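
The two calling styles described above can be compared side by side. This sketch assumes R2021a or later (for the Name=Value form) and the sample file yeastdata. The names maskNew, maskOld, and sameResult are illustrative.

```matlab
% Sketch only: both syntaxes pass the same name-value argument.
load yeastdata;

% R2021a and later: Name=Value syntax.
maskNew = genelowvalfilter(yeastvalues,genes,Percentile=30);

% Any release: quoted name, then a comma, then the value.
maskOld = genelowvalfilter(yeastvalues,genes,'Percentile',30);

sameResult = isequal(maskNew,maskOld);
```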

Percentile

Percentile value, specified as a scalar in the range 0 to 100. genelowvalfilter removes gene expression profiles with absolute values less than this percentile of the data set. When you specify neither 'Percentile' nor 'AbsValue', the function uses the 10th percentile.

Example: 'Percentile',50

AbsValue

Absolute expression value threshold, specified as a real number. genelowvalfilter removes gene expression profiles with absolute values less than this threshold.

Example: 'AbsValue',10.5

AnyVal

Logical indicator that selects whether the minimum or the maximum absolute value is used, specified as true or false. Set it to true to select the minimum absolute value, or to false to select the maximum absolute value.

Example: 'AnyVal',true
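
One way to read the 'AnyVal' description is that each profile (row) is reduced to its minimum or maximum absolute value before being compared with the threshold. The sketch below illustrates that reading using only base MATLAB functions on a made-up matrix. It is an interpretation under that assumption, not the toolbox implementation, and all variable names and values are hypothetical.

```matlab
% Sketch only: hypothetical 4-gene, 3-experiment matrix.
data = [ 0.1 -0.2  0.3;
         5.0 -0.1  0.2;
        -4.0  6.0 -7.0;
         0.0  0.4 -0.3];
absThreshold = 1;                 % plays the role of 'AbsValue'

absData = abs(data);
rowMax  = max(absData,[],2);      % largest absolute value in each row
rowMin  = min(absData,[],2);      % smallest absolute value in each row

% Assumed reading of 'AnyVal',false: keep a row if its largest value passes.
keepIfMax = rowMax > absThreshold;   % [0; 1; 1; 0]

% Assumed reading of 'AnyVal',true: keep a row only if its smallest value passes.
keepIfMin = rowMin > absThreshold;   % [0; 0; 1; 0]
```

Under this reading, true is the stricter setting: a single low value is enough to remove a profile, whereas false removes a profile only when all of its values are low.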

Output Arguments

Mask

Logical vector, returned with one element (0 or 1) for each row in Data. Elements with value 1 correspond to rows with absolute expression levels exceeding the threshold. Elements with value 0 correspond to rows with absolute expression levels less than or equal to the threshold.

FData

Filtered data matrix, returned as a data matrix that contains only the gene expression profiles with absolute expression levels exceeding the threshold value. You can also create FData using FData = Data(Mask,:).

FNames

Array of filtered gene names, returned as a cell array of character vectors or a string vector. It contains the name or ID for each row of Data whose gene expression profile has absolute expression levels exceeding the threshold value. You can also create FNames using FNames = geneNames(Mask).

Version History

Introduced before R2006a