This is the main function to access slot data for all genes as a large matrix. If data from a particular gene (or a small set of genes) must be retrieved, use the GetData function. For analysis results, use the GetAnalysisTable function.
Usage
GetTable(
data,
type = DefaultSlot(data),
columns = NULL,
genes = Genes(data),
ntr.na = TRUE,
gene.info = FALSE,
summarize = NULL,
prefix = NULL,
name.by = "Symbol"
)
Arguments
- data
A grandR object
- type
Either a mode.slot (see details) or a regex to be matched against analysis names. Can also be a vector of such values.
- columns
A vector of columns (either condition/cell names if the type is a mode.slot, or names in the output table from an analysis; use Columns(data,<analysis>) to learn which columns are available); all condition/cell names if NULL
- genes
Restrict the output table to the given genes
- ntr.na
For columns representing a 4sU-naive sample, should the types ntr, new.count and old.count be 0, 0 and count (ntr.na=FALSE; the same applies to any slot other than count) or NA, NA and NA (ntr.na=TRUE)?
- gene.info
Should the table contain the GeneInfo values as well (at the beginning)?
- summarize
Should replicates be summarized? See details.
- prefix
Prepend the given prefix to each column name in the output table (except for the gene.info columns)
- name.by
A column name of GeneInfo(data). Its values are used as the row names of the output table
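The effect of ntr.na on a 4sU-naive sample can be illustrated with a small base R sketch. It only mirrors the rule stated in the argument description; naive_values is a hypothetical helper and not part of grandR.

```r
# Hypothetical helper (not a grandR function): the values reported for one gene
# in a 4sU-naive sample, following the rule described for ntr.na.
naive_values <- function(count, ntr.na = TRUE) {
  if (ntr.na) {
    # ntr.na=TRUE: nothing can be said about new/old RNA, so everything is NA
    c(ntr = NA_real_, new.count = NA_real_, old.count = NA_real_)
  } else {
    # ntr.na=FALSE: no labeling means no new RNA, so all RNA counts as old
    c(ntr = 0, new.count = 0, old.count = count)
  }
}

as_na   <- naive_values(120, ntr.na = TRUE)
as_zero <- naive_values(120, ntr.na = FALSE)
```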
Details
This is a convenience wrapper for GetData (values from data slots) and GetAnalysisTable (values from analyses). Types can refer to either of the two (and can be mixed). If there are types from both data and analyses, columns must be NULL. Otherwise, columns must either be condition/cell names (if type refers to one or several data slots) or regular expressions to match against the names in the analysis tables.
Column definitions for data slots can be given as a logical, integer or character vector representing a selection of the columns (samples or cells). The expression is evaluated in an environment containing the Coldata, i.e. you can use the names of Coldata as variables to conveniently build a logical vector (e.g., columns=Condition=="x").
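The following base R sketch illustrates the general mechanism of evaluating a column expression against a sample table. It is not grandR's implementation: coldata is a hypothetical stand-in for Coldata(data), and select_columns is a hypothetical helper.

```r
# Hypothetical stand-in for Coldata(data): one row per sample
coldata <- data.frame(
  Name      = c("Mock.no4sU.A", "Mock.1h.A", "SARS.no4sU.A", "SARS.1h.A"),
  Condition = c("Mock", "Mock", "SARS", "SARS"),
  no4sU     = c(TRUE, FALSE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Hypothetical helper: capture the unevaluated expression and evaluate it with
# the columns of the sample table visible as variables.
select_columns <- function(expr) {
  sel <- eval(substitute(expr), coldata, parent.frame())
  # Character selections are taken as sample names; logical and integer
  # selections index the rows of the sample table.
  if (is.character(sel)) sel else coldata$Name[sel]
}

sel_sars <- select_columns(Condition == "SARS")  # logical vector built from a Coldata name
sel_4sU  <- select_columns(!no4sU)               # all samples that received 4sU
sel_idx  <- select_columns(c(1L, 2L))            # integer selection
```

This is the idea that lets a call such as GetTable(sars,type="new.count",columns=!no4sU) refer to no4sU without quoting it.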
To refer to data slots via type, the mode.slot syntax can be used: Each name is either a data slot, or one of (new,old,total) followed by a dot followed by a slot. For new or old, the data slot value is multiplied by ntr or 1-ntr, respectively. This can be used e.g. to obtain the new counts.
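The arithmetic behind the new and old modes can be sketched in base R. The matrices below hold made-up values for two genes in two samples; they only illustrate the multiplication described above and do not call grandR.

```r
# Hypothetical slot values: genes in rows, samples in columns
count <- matrix(c(100, 200, 50, 80), nrow = 2,
                dimnames = list(c("geneA", "geneB"), c("s1", "s2")))
ntr   <- matrix(c(0.25, 0.5, 0.1, 0.75), nrow = 2,
                dimnames = dimnames(count))

new_count   <- count * ntr        # what type="new.count" refers to
old_count   <- count * (1 - ntr)  # what type="old.count" refers to
total_count <- count              # type="total.count" is the slot itself
```

By construction, new_count and old_count add up to the total for every gene and sample.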
The summarize parameter can only be specified if columns is NULL. It is either a summarization matrix (see GetSummarizeMatrix) or TRUE (in which case GetSummarizeMatrix(data) is called). If there are NA values, they are imputed as the mean per group!
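As a rough illustration of summarization, the base R sketch below averages the Mock columns of two genes using values copied from the Examples output. The layout of smat (samples in rows, groups in columns, weight 1/n for group members) is an assumption made for this sketch. The weights are chosen so that the result agrees with the summarize=TRUE output in the Examples, which corresponds to averaging the five 4sU-treated Mock samples. The imputation step is likewise only a sketch of "mean per group", not grandR's code.

```r
# Normalized values for two genes in the Mock samples, copied from the Examples output
tab <- rbind(
  UHMK1 = c(2396.19060, 1532.20609, 1248.6574, 2142.93482, 2010.2812, 1681.95612),
  ATF3  = c(44.23736, 63.81363, 44.7533, 62.54815, 45.1864, 43.22928)
)
colnames(tab) <- c("Mock.no4sU.A", "Mock.1h.A", "Mock.2h.A",
                   "Mock.2h.B", "Mock.3h.A", "Mock.4h.A")

# Assumed summarization matrix: weight 0 for the 4sU-naive sample,
# 1/5 for each of the five 4sU-treated samples
smat <- matrix(c(0, rep(1 / 5, 5)), ncol = 1,
               dimnames = list(colnames(tab), "Mock"))

# Group means via matrix multiplication
summarized <- tab %*% smat

# Sketch of NA imputation: a missing value is replaced by the mean of the
# remaining values of its group before the weighted sum is taken
with_na <- tab["ATF3", ]
with_na["Mock.2h.A"] <- NA
members <- smat[, "Mock"] > 0
with_na[members & is.na(with_na)] <- mean(with_na[members], na.rm = TRUE)
imputed_mean <- sum(with_na * smat[, "Mock"])
```

With equal weights, imputing by the group mean gives the same result as averaging only the non-missing values of the group.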
Examples
sars <- ReadGRAND(system.file("extdata", "sars.tsv.gz", package = "grandR"),
design=c("Condition",Design$dur.4sU,Design$Replicate))
#> Warning: Duplicate gene symbols (n=1, e.g. MATR3) present, making unique!
sars <- Normalize(FilterGenes(sars))
head(GetTable(sars))
#> Mock.no4sU.A Mock.1h.A Mock.2h.A Mock.2h.B Mock.3h.A Mock.4h.A
#> UHMK1 2396.19060 1532.20609 1248.6574 2142.93482 2010.2812 1681.95612
#> ATF3 44.23736 63.81363 44.7533 62.54815 45.1864 43.22928
#> PABPC4 1880.08801 2036.94478 2284.0310 1910.80191 2042.8873 2013.07398
#> ROR1 1146.48504 988.09303 868.0527 938.66272 867.7842 834.84625
#> ZC3H11A 1072.75610 650.35597 662.8326 860.69777 815.1523 801.12128
#> ZBED6 1069.06965 640.17295 656.7849 845.72145 804.6260 790.69720
#> SARS.no4sU.A SARS.1h.A SARS.2h.A SARS.2h.B SARS.3h.A SARS.4h.A
#> UHMK1 1229.137 1217.2545 958.4119 1506.9346 1310.6846 1364.2703
#> ATF3 1313.490 166.8318 331.2772 280.9383 506.4009 650.2597
#> PABPC4 1482.195 1976.0305 1939.7423 1940.7771 1846.8738 1697.5967
#> ROR1 1313.490 962.6815 741.7275 846.4994 870.7346 1183.9462
#> ZC3H11A 2157.015 996.0479 1044.8773 1424.0348 1354.2214 1466.2718
#> ZBED6 2072.663 970.0963 1016.7500 1399.1649 1356.5128 1464.4503
# DefaultSlot values, i.e. size factor normalized read counts for all samples
head(GetTable(sars,summarize=TRUE))
#> Mock SARS
#> UHMK1 1723.20712 1271.5112
#> ATF3 51.90615 387.1416
#> PABPC4 2057.54779 1880.2041
#> ROR1 899.48779 921.1178
#> ZC3H11A 758.03200 1257.0906
#> ZBED6 747.60049 1241.3949
# DefaultSlot values averaged over the two conditions
head(GetTable(sars,type="new.count",columns=!no4sU))
#> Mock.1h.A Mock.2h.A Mock.2h.B Mock.3h.A Mock.4h.A SARS.1h.A SARS.2h.A
#> UHMK1 201 434 687 1619 1653 123 377
#> ATF3 118 81 106 130 111 109 318
#> PABPC4 531 1197 876 2252 2245 255 958
#> ROR1 187 633 651 1378 1745 113 365
#> ZC3H11A 129 725 849 1810 2004 114 674
#> ZBED6 127 727 839 1806 1982 112 669
#> SARS.2h.B SARS.3h.A SARS.4h.A
#> UHMK1 739 275 463
#> ATF3 305 221 357
#> PABPC4 1094 466 609
#> ROR1 532 271 639
#> ZC3H11A 828 451 797
#> ZBED6 833 455 800
# Estimated counts for new RNA for all samples with 4sU
sars<-LFC(sars,contrasts=GetContrasts(sars,group = "duration.4sU"))
head(GetAnalysisTable(sars,columns="LFC"))
#> Gene Symbol Length Type total.Mock vs SARS.1.LFC
#> UHMK1 ENSG00000152332 UHMK1 8478 Cellular 0.37014686
#> ATF3 ENSG00000162772 ATF3 2103 Cellular -1.24995147
#> PABPC4 ENSG00000090621 PABPC4 3592 Cellular 0.08373593
#> ROR1 ENSG00000185483 ROR1 5832 Cellular 0.07724080
#> ZC3H11A ENSG00000058673 ZC3H11A 11825 Cellular -0.56924380
#> ZBED6 ENSG00000257315 ZBED6 12481 Cellular -0.55395950
#> total.Mock vs SARS.2.LFC total.Mock vs SARS.3.LFC
#> UHMK1 0.4395433 0.63126971
#> ATF3 -2.4486712 -3.29885218
#> PABPC4 0.1344397 0.16166905
#> ROR1 0.1936823 0.01171992
#> ZC3H11A -0.7000287 -0.71129270
#> ZBED6 -0.6889175 -0.73232660
#> total.Mock vs SARS.4.LFC
#> UHMK1 0.3021589
#> ATF3 -3.7827628
#> PABPC4 0.2463332
#> ROR1 -0.5008342
#> ZC3H11A -0.8675652
#> ZBED6 -0.8845711
# Estimated fold changes SARS vs Mock for each time point