Perform gene-set enrichment and overrepresentation analysis (GSEA/ORA) for a specified set of genes

Usage

AnalyzeGeneSets(
  data,
  analysis = Analyses(data)[1],
  criteria = LFC,
  genes = NULL,
  species = NULL,
  category = NULL,
  subcategory = NULL,
  verbose = TRUE,
  minSize = 10,
  maxSize = 500,
  process.genesets = NULL
)

Arguments

data

the grandR object that contains the data to analyze

analysis

the analysis to use, can be more than one and can be regexes (see details)

criteria

an expression to define criteria for GSEA/ORA (see details)

genes

specify genes directly (use analysis and criteria if NULL; see details)

species

the species the genes belong to (e.g. "Homo sapiens"); can be NULL, in which case the species is inferred from the gene ids (see details)

category

the category defining gene sets (see ListGeneSets)

subcategory

the subcategory defining gene sets (see ListGeneSets)

verbose

print status messages

minSize

the minimum size of a gene set to be considered

maxSize

the maximum size of a gene set to be considered

process.genesets

a function to process geneset names; can be NULL (see details)

Value

the clusterProfiler object representing the analysis results.

Details

The analysis parameter (just like for GetAnalysisTable) can be a regex that is matched against all available analysis names. It can also be a vector of regexes. Be careful with this: if more than one matching table contains the column used in criteria (e.g. LFC when criteria=LFC), only the first one is used.
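The following base R sketch illustrates how a vector of regexes can select more than one analysis. The analysis names are invented for illustration, and the matching shown here (plain `grep`) is an assumption, not necessarily how grandR resolves the patterns internally.

```r
# Hypothetical analysis names; real ones come from Analyses(data).
analysis_names <- c("total.4h vs 0h", "new.4h vs 0h", "total.8h vs 0h")

# A vector of regexes, each matched against all available names.
patterns <- c("^total", "8h")

# Collect every name matched by at least one pattern, without duplicates.
matched <- unique(unlist(lapply(
  patterns,
  function(p) grep(p, analysis_names, value = TRUE)
)))

print(matched)
```

Two analyses match here, so both tables would contribute an LFC column, and only the first would be used with criteria=LFC. A more specific pattern avoids the ambiguity.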

The criteria parameter defines which kind of analysis is performed. It must be an expression that evaluates to either a numeric or a logical vector. In the first case, GSEA is performed; in the second, ORA. The columns of the given analysis table(s) can be used to build this expression.
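As a minimal sketch of this rule, the base R code below evaluates an unquoted expression against a toy table and reports which analysis its type would select. The helper `classify_criteria`, the toy data and the column name `Q` are assumptions made for illustration; they are not part of grandR.

```r
# Toy stand-in for an analysis table (column names LFC and Q are assumed).
tab <- data.frame(
  LFC = c(2.1, -0.3, 1.4, -1.8),
  Q   = c(0.01, 0.70, 0.20, 0.03),
  row.names = c("geneA", "geneB", "geneC", "geneD")
)

# Hypothetical helper: evaluate the expression using the table's columns
# and report the kind of analysis that the resulting vector implies.
classify_criteria <- function(table, criteria) {
  value <- eval(substitute(criteria), table, parent.frame())
  if (is.logical(value)) {
    "ORA"
  } else if (is.numeric(value)) {
    "GSEA"
  } else {
    stop("criteria must evaluate to a numeric or logical vector")
  }
}

gsea_kind <- classify_criteria(tab, LFC)                      # numeric ranking
ora_kind  <- classify_criteria(tab, abs(LFC) > 1 & Q < 0.05)  # logical selection

print(c(gsea_kind, ora_kind))
```

In the same spirit, a call such as `AnalyzeGeneSets(data, criteria = LFC)` would rank genes for GSEA, whereas a logical expression over the table's columns would define the gene list for ORA.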

If no species is given, a very simple automatic inference is performed, which only works when the gene ids are human or mouse Ensembl identifiers.
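A simple inference of this kind might look like the sketch below. It relies only on the standard Ensembl gene id prefixes (`ENSG` for human, `ENSMUSG` for mouse). The function `infer_species` is hypothetical and is not grandR's actual implementation.

```r
# Hypothetical helper: guess the species from Ensembl gene id prefixes.
infer_species <- function(gene_ids) {
  if (all(grepl("^ENSG[0-9]+", gene_ids))) {
    "Homo sapiens"
  } else if (all(grepl("^ENSMUSG[0-9]+", gene_ids))) {
    "Mus musculus"
  } else {
    NA_character_  # anything else (e.g. gene symbols) cannot be inferred
  }
}

human <- infer_species(c("ENSG00000141510", "ENSG00000012048"))
mouse <- infer_species("ENSMUSG00000059552")
other <- infer_species("TP53")

print(c(human, mouse, other))
```

For any other kind of identifier, passing species explicitly is the safer choice.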

The process.genesets parameter can be a function that takes the character vector of all gene set names. The original names are replaced by the return value of this function.
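For example, a name-cleaning function could look like the sketch below. The function `clean_names` and the example names are assumptions chosen for illustration; any function that maps a character vector to a character vector of the same length should fit the description above.

```r
# Hypothetical cleaning function: drop a collection prefix, replace
# underscores with spaces, and convert to lower case.
clean_names <- function(names) {
  names <- sub("^HALLMARK_", "", names)
  names <- gsub("_", " ", names)
  tolower(names)
}

example_names <- c("HALLMARK_INTERFERON_ALPHA_RESPONSE", "HALLMARK_HYPOXIA")
cleaned <- clean_names(example_names)

print(cleaned)
```

Such a function could then be supplied as `process.genesets = clean_names`.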

See also

Examples

# See the differential-expression vignette!