Return a grandR object with fewer genes than the given grandR object (usually to filter out weakly expressed genes).
FilterGenes( data, mode.slot = "count", minval = 100, mincol = ncol(data)/2, min.cond = NULL, use = NULL, keep = NULL, return.genes = FALSE )
data: the grandR object
mode.slot: the mode.slot that is used for filtering (see details)
minval: the minimal value for retaining a gene
mincol: the minimal number of columns (i.e. samples or cells) in which a gene has to have a value >= minval
min.cond: if not NULL, do not compare values per column, but per condition (see details)
use: if not NULL, directly defines the genes that are to be retained (see details)
keep: if not NULL, defines genes that should be kept even if they do not meet the filtering criteria (see details)
return.genes: if TRUE, return the gene names instead of a new grandR object
either a new grandR object (if return.genes=FALSE), or a vector containing the gene names that would be retained
By default, genes are retained if they have at least 100 read counts in at least half of the columns (i.e. samples or cells).
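The default rule can be illustrated on a plain matrix. The following is a minimal base-R sketch, not the grandR implementation: the count matrix, gene names and sample names are made up for illustration, and only the thresholding logic described above is reproduced.

```r
# Toy count matrix: 4 hypothetical genes (rows) x 4 hypothetical samples (columns)
counts <- matrix(
  c(150, 200, 120,  90,   # geneA: 3 of 4 columns reach 100
     10,   5, 300,   0,   # geneB: 1 of 4 columns reaches 100
    100, 100,  99,  98,   # geneC: 2 of 4 columns reach 100
      0,   0,   0,   0),  # geneD: no column reaches 100
  nrow = 4, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC", "geneD"), paste0("s", 1:4))
)

minval <- 100
mincol <- ncol(counts) / 2   # mirrors the documented default

# Count, per gene, the columns with a value >= minval,
# then retain genes where that number is at least mincol
n.pass <- rowSums(counts >= minval, na.rm = TRUE)
retained <- rownames(counts)[n.pass >= mincol]
retained
```

In this toy example only geneA and geneC reach the threshold in at least two of the four columns.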
The use parameter can be used to define the genes to be retained directly, bypassing the filtering criteria. The
keep parameter, in contrast, defines
additional genes to be retained on top of those that pass the filter. For both, genes can be referred to by their names, symbols, row numbers in the gene table,
or a logical vector referring to the gene table rows.
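The difference between use and keep can be sketched with plain vectors. This is an illustration under assumptions, not grandR code: the gene names and the `passes` vector (standing in for the outcome of the threshold filter) are hypothetical.

```r
genes  <- c("geneA", "geneB", "geneC", "geneD")   # hypothetical gene table rows
passes <- c(TRUE, FALSE, TRUE, FALSE)             # assumed outcome of the threshold filter

# keep: genes added to those that pass the filter (here given by name)
keep <- "geneD"
with.keep <- genes[passes | genes %in% keep]

# use: the filter is bypassed and exactly these genes are retained
# (here given as row numbers)
use <- c(2, 4)
with.use <- genes[use]

# The same selection expressed as a logical vector over the rows
use.lgl <- c(FALSE, TRUE, FALSE, TRUE)
same.selection <- identical(genes[use.lgl], with.use)
```

Here `with.keep` contains geneA, geneC and geneD, whereas `with.use` contains only geneB and geneD, regardless of `passes`.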
To refer to data slots, the mode.slot syntax can be used: Each name is either a data slot, or one of (new,old,total) followed by a dot followed by a slot. For new or old, the data slot value is multiplied by ntr or 1-ntr, respectively. This can be used e.g. to filter by new counts (mode.slot = "new.count").
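The arithmetic behind the new and old modes can be written out on small matrices. This is a sketch with made-up numbers; the matrices `count` and `ntr` are hypothetical stand-ins for the corresponding data slots, not objects extracted from a grandR object.

```r
# Hypothetical total counts and new-to-total ratios for 2 genes x 2 samples
count <- matrix(c(200,  400,
                   50, 1000),
                nrow = 2, byrow = TRUE,
                dimnames = list(c("geneA", "geneB"), c("s1", "s2")))
ntr <- matrix(c(0.25, 0.50,
                0.80, 0.05),
              nrow = 2, byrow = TRUE,
              dimnames = dimnames(count))

new.count <- count * ntr         # what "new.count" refers to
old.count <- count * (1 - ntr)   # what "old.count" refers to

# Filtering by new counts: at least 100 new reads in at least half of the columns
passes.new <- rowSums(new.count >= 100) >= ncol(new.count) / 2
names(passes.new)[passes.new]
```

In this example geneB has more total reads than geneA in the second sample, but only geneA passes when filtering by new counts.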
If the min.cond parameter is given, all columns belonging to the same
Condition are first summed up, and then the usual filtering
is performed per condition instead of per column, with min.cond taking the place of mincol.
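Per-condition filtering can be sketched in base R as follows. This is an illustration under assumptions rather than the grandR implementation: the count matrix, sample names and condition labels are made up, and `rowsum()` is used here as one way to sum columns that share a condition.

```r
# Hypothetical counts: 3 genes x 4 samples, two samples per condition
counts <- matrix(
  c(60, 50,  5,  5,
    40, 40, 40, 40,
    70, 70, 80, 90),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"),
                  c("Mock.A", "Mock.B", "SARS.A", "SARS.B"))
)
condition <- c("Mock", "Mock", "SARS", "SARS")

# rowsum() sums rows by group, so transpose to sum the columns of each condition
per.cond <- t(rowsum(t(counts), group = condition))

minval   <- 100
min.cond <- 1

# A gene is retained if at least min.cond conditions reach minval
retained <- rownames(per.cond)[rowSums(per.cond >= minval) >= min.cond]

# For comparison: the per-column rule with the same minval retains nothing here
per.column <- rownames(counts)[rowSums(counts >= minval) >= ncol(counts) / 2]
```

With these numbers, geneA and geneC are retained per condition, although no single column of any gene reaches 100 reads.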
sars <- ReadGRAND(system.file("extdata", "sars.tsv.gz", package = "grandR"),
                  design=c("Condition",Design$dur.4sU,Design$Replicate))
#> Warning: Duplicate gene symbols (n=1, e.g. MATR3) present, making unique!
nrow(sars)
#> [1] 1045
# This is already filtered and has 1045 genes
nrow(FilterGenes(sars,minval=1000))
#> [1] 966
# There are 966 genes with at least 1000 read counts in half of the samples
nrow(FilterGenes(sars,minval=10000,min.cond=1))
#> [1] 944
# There are 944 genes with at least 10000 read counts in the Mock or SARS condition
nrow(FilterGenes(sars,use=GeneInfo(sars,"Type")!="Cellular"))
#> [1] 11
# These are the 11 viral genes.