Returns a function to be used as the classify.genes parameter of ReadGRAND.

Usage

ClassifyGenes(
  ...,
  use.default = TRUE,
  drop.levels = TRUE,
  name.unknown = "Unknown"
)

Arguments

...

additional functions to define types (see details)

use.default

if TRUE, also apply the default type inference (at lower priority than the user-defined types); see details

drop.levels

if TRUE, drop unused types from the factor that is generated

name.unknown

the type to be used for all genes where no type was identified
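
To make the effect of drop.levels and name.unknown concrete, here is a minimal base R sketch on a hand-built factor. It does not call grandR; the level names "mito" and "ERCC" are assumptions chosen for illustration.

```r
# A hand-built type factor in which two levels ("mito", "ERCC") never occur.
# "Unknown" plays the role of name.unknown for a gene without any match.
type <- factor(
  c("viral", "Cellular", "Unknown"),
  levels = c("viral", "mito", "ERCC", "Cellular", "Unknown")
)

# Without dropping, unused levels are kept (and show up as zero counts in table())
levels(type)
# "viral" "mito" "ERCC" "Cellular" "Unknown"

# Dropping unused levels keeps only the types that actually occur
levels(droplevels(type))
# "viral" "Cellular" "Unknown"
```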

Value

a function that takes the original GeneInfo table and adds the Type column

Details

This function returns a function. Usually, you do not call the returned function yourself; instead, the result of ClassifyGenes is passed as the classify.genes parameter of ReadGRAND to build the Type column in the GeneInfo table. See the examples for how to use it directly.

Each ... parameter must be a function that receives the gene info table and returns a logical vector indicating, for each row of the table, whether that gene belongs to a specific type. The name of the parameter is used as the type name.
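
As a minimal sketch of such a function, the following uses a toy table in place of the real gene info table. The Gene and Symbol column names and the toy rows are assumptions for illustration.

```r
# Toy stand-in for the gene info table (column names are assumed)
gene.info <- data.frame(
  Gene   = c("ENSG00000000003", "ENSG00000198888", "ORF3a", "ERCC-00002"),
  Symbol = c("TSPAN6", "MT-ND1", "ORF3a", "ERCC-00002"),
  stringsAsFactors = FALSE
)

viral.genes <- c("ORF3a", "E", "M", "N", "S")

# A classifier function: takes the table, returns one logical value per row
is.viral <- function(gene.info) gene.info$Symbol %in% viral.genes

is.viral(gene.info)
# FALSE FALSE TRUE FALSE
```

Passed as a named argument, e.g. ClassifyGenes(viral = is.viral), the argument name "viral" would become the type name.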

If a gene matches multiple types, the first function returning TRUE for that row of the table determines its type.
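
The first-match rule can be sketched in base R as follows. The helper first.match is hypothetical and is not part of grandR; it only illustrates the rule described above, not the package's actual implementation.

```r
# Hypothetical helper (not a grandR function): give each row the name of the
# first classifier that returns TRUE, or name.unknown if none does.
first.match <- function(gene.info, classifiers, name.unknown = "Unknown") {
  type <- rep(NA_character_, nrow(gene.info))
  for (name in names(classifiers)) {
    hit <- classifiers[[name]](gene.info)
    # Only fill rows that no earlier classifier has claimed
    type[is.na(type) & hit] <- name
  }
  type[is.na(type)] <- name.unknown
  factor(type, levels = c(names(classifiers), name.unknown))
}

gene.info <- data.frame(Symbol = c("ORF3a", "S", "ACTB"), stringsAsFactors = FALSE)

# "ORF3a" matches both classifiers; the one listed first wins
classifiers <- list(
  accessory = function(gene.info) grepl("^ORF", gene.info$Symbol),
  viral     = function(gene.info) gene.info$Symbol %in% c("ORF3a", "S")
)

types <- first.match(gene.info, classifiers)
as.character(types)
# "accessory" "viral" "Unknown"
```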

By default, this function recognizes mitochondrial genes (gene symbols with an MT prefix), ERCC spike-ins, and Ensembl gene identifiers (which it labels "Cellular"). These three are checked last, so that a user-defined type passed via ... takes priority when it also matches, e.g., an Ensembl gene.
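
The priority of user-defined types over the defaults can be illustrated with a small base R sketch. The regular expressions, the type names "mito" and "ERCC", and the toy table are assumptions standing in for the defaults; they are not taken from the grandR source.

```r
gene.info <- data.frame(
  Gene   = c("ENSG00000198888", "ERCC-00002", "ENSG00000075624", "ORF3a"),
  Symbol = c("MT-ND1", "ERCC-00002", "ACTB", "ORF3a"),
  stringsAsFactors = FALSE
)

# User-defined type, listed first so that it takes priority
user <- list(
  housekeeping = function(gene.info) gene.info$Symbol %in% c("ACTB", "GAPDH")
)

# Stand-ins for the defaults; the patterns are assumptions for illustration
defaults <- list(
  mito     = function(gene.info) grepl("^MT-", gene.info$Symbol),
  ERCC     = function(gene.info) grepl("^ERCC-", gene.info$Gene),
  Cellular = function(gene.info) grepl("^ENS", gene.info$Gene)
)

classifiers <- c(user, defaults)

# Logical matrix: one row per gene, one column per classifier
hits <- vapply(classifiers, function(f) f(gene.info), logical(nrow(gene.info)))

# Index of the first classifier that returned TRUE (NA if none did)
first <- apply(hits, 1, function(row) which(row)[1])

type <- names(classifiers)[first]
type[is.na(type)] <- "Unknown"
type
# "mito" "ERCC" "housekeeping" "Unknown"
```

ACTB has an Ensembl identifier and would otherwise be labeled "Cellular", but the user-defined classifier is checked first.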

Examples


viral.genes <- c('ORF3a','E','M','ORF6','ORF7a','ORF7b','ORF8','N','ORF10','ORF1ab','S')
sars <- ReadGRAND(system.file("extdata", "sars.tsv.gz", package = "grandR"),
                  design=c("Cell",Design$dur.4sU,Design$Replicate),
                  classify.genes=ClassifyGenes(`SARS-CoV-2`=
                             function(gene.info) gene.info$Symbol %in% viral.genes),
                  verbose=TRUE)
#> Checking file...
#> Reading files...
#> Warning: Duplicate gene symbols (n=1, e.g. MATR3) present, making unique!
#> Processing...
table(GeneInfo(sars)$Type)
#> 
#> SARS-CoV-2   Cellular 
#>         11       1034 

fun<-ClassifyGenes(viral=function(gene.info) gene.info$Symbol %in% viral.genes)
table(fun(GeneInfo(sars)))
#> 
#>    viral Cellular 
#>       11     1034