Compute the posterior distributions of RNA synthesis and degradation for a particular gene

Usage

FitKineticsGeneSnapshot(
  data,
  gene,
  columns = NULL,
  reference.columns = NULL,
  dispersion = NULL,
  slot = DefaultSlot(data),
  time.labeling = Design$dur.4sU,
  time.experiment = NULL,
  sample.f0.in.ss = TRUE,
  sample.level = 2,
  beta.prior = NULL,
  return.samples = FALSE,
  return.points = FALSE,
  N = 10000,
  N.max = N * 10,
  CI.size = 0.95,
  correct.labeling = FALSE
)

Arguments

data

the grandR object

gene

a gene name, symbol, or index

columns

samples or cells representing the same experimental condition (must refer to a unique labeling duration)

reference.columns

a reference matrix usually generated by FindReferences to define reference samples for each sample (see details)

dispersion

dispersion parameter for the given columns (if NULL, it is estimated from the data, which takes a lot of time!)

slot

the data slot to take f0 and totals from

time.labeling

the column in the column annotation table denoting the labeling duration or the labeling duration itself

time.experiment

the column in the column annotation table denoting the experimental time point (can be NULL, see details)

sample.f0.in.ss

whether or not to sample f0 under steady state conditions

sample.level

Define how the NTR is sampled from the hierarchical Bayesian model (must be 0, 1, or 2; see details)

beta.prior

The beta prior for the negative binomial used to sample counts; if NULL, a beta distribution is fit to all expression values and given dispersions

return.samples

return the posterior samples of the parameters?

return.points

return the point estimates per replicate as well?

N

the posterior sample size

N.max

the maximal number of posterior samples (necessary if old RNA > f0); if more are necessary, a warning is generated

CI.size

A number between 0 and 1 representing the size of the credible interval

correct.labeling

whether to correct labeling times

Value

a list containing the posterior means of s and d, their credible intervals and, if return.samples=TRUE, a data frame containing all posterior samples

Details

The kinetic parameters s (synthesis rate) and d (degradation rate) are computed using TransformSnapshot. For that, either the sample must be in steady state (this is the case if it is defined as such in the reference.columns matrix), or the levels of reference samples from a specific prior time point must be known. This time point is defined by time.experiment (i.e. the time difference between the reference samples and the samples themselves). If time.experiment is NULL, the labeling time of the samples is used (e.g. useful if labeling was started concomitantly with the perturbation, and the reference samples are unperturbed samples).
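The snapshot transformation described above can be sketched in base R. This is a minimal illustration under the usual model of constant synthesis and first-order degradation (da/dt = s - d*a); whether it matches TransformSnapshot line for line is an assumption. `snapshot_kinetics` and its arguments are hypothetical names, not part of grandR, and the sketch works on point values rather than posterior samples.

```r
# Hypothetical helper (not a grandR function): point estimates of the
# synthesis rate s and degradation rate d from one snapshot.
#   ntr   new-to-total RNA ratio at the end of labeling (0 <= ntr < 1)
#   total total expression at the end of labeling
#   t     time between the reference level f0 and the snapshot
#   f0    expression level at time 0; NULL means steady state
snapshot_kinetics <- function(ntr, total, t, f0 = NULL) {
  old <- (1 - ntr) * total       # RNA that already existed at time 0
  new <- ntr * total             # RNA synthesized during the interval
  if (is.null(f0)) f0 <- total   # steady state: level at time 0 equals current level
  # Old RNA can only decay, so old > f0 is incompatible with the model
  if (old > f0) stop("old RNA exceeds f0; no valid degradation rate")
  d <- -log(old / f0) / t        # from old = f0 * exp(-d * t)
  # From new = s / d * (1 - exp(-d * t)); the limit for d -> 0 is new / t
  s <- if (d > 0) new * d / (1 - exp(-d * t)) else new / t
  list(s = s, d = d, half.life = log(2) / d)
}

# Steady state: half of the RNA is new after 2 h of labeling
ss <- snapshot_kinetics(ntr = 0.5, total = 100, t = 2)

# Non-steady state: the reference samples had twice the current level
ns <- snapshot_kinetics(ntr = 0.5, total = 100, t = 2, f0 = 200)
```

In the steady state case the result reduces to d = -log(1 - ntr) / t and s = total * d. The `old > f0` check mirrors the situation mentioned for N.max: a sampled combination with more old RNA than f0 has no valid solution, so presumably such samples have to be redrawn.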

By default (sample.level = 2), the full hierarchical Bayesian model is used. If sample.level = 0, the NTRs are sampled from a beta distribution that approximates the mixture of betas from the replicate samples. If sample.level = 1, only the first level of the hierarchical model is sampled (corresponding to the uncertainty of estimating the biological variability). If sample.level = 2, the first and second levels are sampled (corresponding to the full hierarchical model).

Columns can be given as a logical, integer or character vector representing a selection of the columns (samples or cells). The expression is evaluated in an environment having the Coldata, i.e. you can use names of Coldata as variables to conveniently build a logical vector (e.g., columns=Condition=="x").
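The column selection described above can be illustrated with a small base R sketch. It only demonstrates the general idea of evaluating an expression with a column annotation table as its environment; the toy `coldata` table and the helper `select_columns` are hypothetical and are not grandR's actual implementation.

```r
# Toy column annotation table standing in for Coldata (hypothetical values)
coldata <- data.frame(
  Name = c("x.A", "x.B", "y.A", "y.B"),
  Condition = c("x", "x", "y", "y"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: resolve a logical, integer or character selection.
# The unevaluated expression is evaluated with the table's columns visible
# as variables, falling back to the caller's environment for other names.
select_columns <- function(columns, coldata) {
  sel <- eval(substitute(columns), coldata, parent.frame())
  if (is.logical(sel)) sel <- which(sel)                     # logical vector
  if (is.character(sel)) sel <- match(sel, coldata$Name)     # sample names
  coldata$Name[sel]                                          # integer indices
}

by_expression <- select_columns(Condition == "x", coldata)
by_index      <- select_columns(c(1, 3), coldata)
by_name       <- select_columns(c("y.A", "y.B"), coldata)
```

All three forms resolve to a set of sample names, which is why an expression such as columns=Condition=="x" can be used interchangeably with an explicit index or name vector.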