Skip to contents

Fit the standard mass action kinetics model of gene expression using least squares (i.e. assuming gaussian homoscedastic errors) for the given gene. The fit takes both old and new RNA into account and requires proper normalization, but can be performed without assuming steady state. The parameters are fit per Condition.

Usage

FitKineticsGeneLeastSquares(
  data,
  gene,
  slot = DefaultSlot(data),
  time = Design$dur.4sU,
  chase = FALSE,
  CI.size = 0.95,
  steady.state = NULL,
  use.old = TRUE,
  use.new = TRUE,
  maxiter = 250,
  compute.residuals = TRUE
)

Arguments

data

A grandR object

gene

The gene for which to fit the model

slot

The data slot to take expression values from

time

The column in the column annotation table representing the labeling duration

chase

is this a pulse-chase experiment? (see details)

CI.size

A number between 0 and 1 representing the size of the confidence interval

steady.state

either a named list of logical values representing conditions in steady state or not, or a single logical value for all conditions

use.old

a logical vector to exclude old RNA from specific time points

use.new

a logical vector to exclude new RNA from specific time points

maxiter

the maximal number of iterations for the Levenberg-Marquardt algorithm used to minimize the least squares

compute.residuals

set this to TRUE to compute the residual matrix

Value

A named list containing the model fit:

  • data: a data frame containing the observed value used for fitting

  • residuals: the computed residuals if compute.residuals=TRUE, otherwise NA

  • Synthesis: the synthesis rate (in U/h, where U is the unit of the slot)

  • Degradation: the degradation rate (in 1/h)

  • Half-life: the RNA half-life (in h, always equal to log(2)/degradation-rate

  • conf.lower: a vector containing the lower confidence bounds for Synthesis, Degradation and Half-life

  • conf.upper: a vector containing the lower confidence bounds for Synthesis, Degradation and Half-life

  • f0: The abundance at time 0 (in U)

  • logLik: the log likelihood of the model

  • rmse: the total root mean square error

  • rmse.new: the total root mean square error for all new RNA values used for fitting

  • rmse.old: the total root mean square error for all old RNA values used for fitting

  • total: the total sum of all new and old RNA values used for fitting

  • type: non-equi or equi

If Condition(data) is not NULL, the return value is a named list (named according to the levels of Condition(data)), each element containing such a structure.

Details

The start of labeling for all samples should be the same experimental time point. The fit gets more precise with multiple samples from multiple labeling durations. In particular (but not only) without assuming steady state, also a sample without 4sU (representing time 0) is useful.

The standard mass action kinetics model of gene expression arises from the following differential equation:

$$df/dt = s - d f(t)$$

This model assumes constant synthesis and degradation rates (but not necessarily that the system is in steady state at time 0). From the solution of this differential equation, it is straight forward to derive the expected abundance of old and new RNA at time t for given parameters s (synthesis rate), d (degradation rate) and f0=f(0) (the abundance at time 0). These equations are implemented in f.old.equi (old RNA assuming steady state gene expression, i.e. f0=s/d), f.old.nonequi (old RNA without assuming steady state gene expression) and f.new (new RNA; whether or not it is steady state does not matter).

This function finds s and d such that the squared error between the observed values of old and new RNA and their corresponding functions is minimized. For that to work, data has to be properly normalized.

For pulse-chase designs, only the drop of the labeled RNA is considered. Note that in this case the notion "new" / "old" RNA is misleading, since labeled RNA corresponds to pre-existing RNA!

Examples

sars <- ReadGRAND(system.file("extdata", "sars.tsv.gz", package = "grandR"),
                  design=c("Condition",Design$dur.4sU,Design$Replicate))
#> Warning: Duplicate gene symbols (n=1, e.g. MATR3) present, making unique!
sars <- Normalize(sars)
FitKineticsGeneLeastSquares(sars,"SRSF6",steady.state=list(Mock=TRUE,SARS=FALSE))
#> $Mock
#> $Mock$data
#>            Name Condition Replicate duration.4sU duration.4sU.original no4sU
#> 1  Mock.no4sU.A      Mock         A            0                 no4sU  TRUE
#> 2     Mock.1h.A      Mock         A            1                    1h FALSE
#> 3     Mock.2h.A      Mock         A            2                    2h FALSE
#> 4     Mock.2h.B      Mock         B            2                    2h FALSE
#> 5     Mock.3h.A      Mock         A            3                    3h FALSE
#> 6     Mock.4h.A      Mock         A            4                    4h FALSE
#> 7  Mock.no4sU.A      Mock         A            0                 no4sU  TRUE
#> 8     Mock.1h.A      Mock         A            1                    1h FALSE
#> 9     Mock.2h.A      Mock         A            2                    2h FALSE
#> 10    Mock.2h.B      Mock         B            2                    2h FALSE
#> 11    Mock.3h.A      Mock         A            3                    3h FALSE
#> 12    Mock.4h.A      Mock         A            4                    4h FALSE
#>        Value  use time Type
#> 1  1352.9261 TRUE    0  Old
#> 2   949.1078 TRUE    1  Old
#> 3   694.7509 TRUE    2  Old
#> 4   814.7388 TRUE    2  Old
#> 5   546.2433 TRUE    3  Old
#> 6   328.4560 TRUE    4  Old
#> 7     0.0000 TRUE    0  New
#> 8   275.2315 TRUE    1  New
#> 9   503.5086 TRUE    2  New
#> 10  626.5115 TRUE    2  New
#> 11  672.5057 TRUE    3  New
#> 12  836.2817 TRUE    4  New
#> 
#> $Mock$residuals
#>            Name Type   Absolute    Relative
#> 1  Mock.no4sU.A  old  70.210176  0.05473556
#> 2     Mock.1h.A  old -21.017981 -0.02166521
#> 3     Mock.2h.A  old -38.961101 -0.05310136
#> 4     Mock.2h.B  old  81.026835  0.11043412
#> 5     Mock.3h.A  old  -8.667473 -0.01561958
#> 6     Mock.4h.A  old -91.226323 -0.21736992
#> 7  Mock.no4sU.A  new   0.000000         NaN
#> 8     Mock.1h.A  new -37.358670 -0.11951327
#> 9     Mock.2h.A  new -45.495290 -0.08286879
#> 10    Mock.2h.B  new  77.507573  0.14117854
#> 11    Mock.3h.A  new -55.299413 -0.07598107
#> 12    Mock.4h.A  new -26.751828 -0.03099744
#> 
#> $Mock$Synthesis
#> [1] 358.2743
#> 
#> $Mock$Degradation
#> [1] 0.2793092
#> 
#> $Mock$`Half-life`
#> [1] 2.481648
#> 
#> $Mock$conf.lower
#>   Synthesis Degradation   Half-life 
#> 316.9675391   0.2509529   2.2529248 
#> 
#> $Mock$conf.upper
#>   Synthesis Degradation   Half-life 
#> 399.5811302   0.3076655   2.7620608 
#> 
#> $Mock$f0
#> [1] 1282.716
#> 
#> $Mock$logLik
#> 'log Lik.' -64.91331 (df=3)
#> 
#> $Mock$rmse
#> [1] 54.08212
#> 
#> $Mock$rmse.new
#> [1] 46.98688
#> 
#> $Mock$rmse.old
#> [1] 60.34886
#> 
#> $Mock$total
#> [1] 7600.262
#> 
#> $Mock$type
#> [1] "equi"
#> 
#> 
#> $SARS
#> $SARS$data
#>            Name Condition Replicate duration.4sU duration.4sU.original no4sU
#> 1  SARS.no4sU.A      SARS         A            0                 no4sU  TRUE
#> 2     SARS.1h.A      SARS         A            1                    1h FALSE
#> 3     SARS.2h.A      SARS         A            2                    2h FALSE
#> 4     SARS.2h.B      SARS         B            2                    2h FALSE
#> 5     SARS.3h.A      SARS         A            3                    3h FALSE
#> 6     SARS.4h.A      SARS         A            4                    4h FALSE
#> 7  SARS.no4sU.A      SARS         A            0                 no4sU  TRUE
#> 8     SARS.1h.A      SARS         A            1                    1h FALSE
#> 9     SARS.2h.A      SARS         A            2                    2h FALSE
#> 10    SARS.2h.B      SARS         B            2                    2h FALSE
#> 11    SARS.3h.A      SARS         A            3                    3h FALSE
#> 12    SARS.4h.A      SARS         A            4                    4h FALSE
#>        Value  use time Type
#> 1  2277.5191 TRUE    0  Old
#> 2  1114.9400 TRUE    1  Old
#> 3   398.9634 TRUE    2  Old
#> 4   532.4308 TRUE    2  Old
#> 5   240.0072 TRUE    3  Old
#> 6   111.2108 TRUE    4  Old
#> 7     0.0000 TRUE    0  New
#> 8   591.6879 TRUE    1  New
#> 9  1402.2259 TRUE    2  New
#> 10 1587.9626 TRUE    2  New
#> 11 1485.4221 TRUE    3  New
#> 12 1626.4579 TRUE    4  New
#> 
#> $SARS$residuals
#>            Name Type    Absolute     Relative
#> 1  SARS.no4sU.A  old    4.652953  0.002047174
#> 2     SARS.1h.A  old   35.759486  0.033135777
#> 3     SARS.2h.A  old -113.442756 -0.221392245
#> 4     SARS.2h.B  old   20.024587  0.039079518
#> 5     SARS.3h.A  old   -3.288596 -0.013516861
#> 6     SARS.4h.A  old   -4.308593 -0.037297570
#> 7  SARS.no4sU.A  new    0.000000          NaN
#> 8     SARS.1h.A  new -317.883866 -0.349487391
#> 9     SARS.2h.A  new   60.779933  0.045309267
#> 10    SARS.2h.B  new  246.516651  0.183769351
#> 11    SARS.3h.A  new  -61.082256 -0.039496984
#> 12    SARS.4h.A  new  -17.410274 -0.010591040
#> 
#> $SARS$Synthesis
#> [1] 1289.982
#> 
#> $SARS$Degradation
#> [1] 0.7448396
#> 
#> $SARS$`Half-life`
#> [1] 0.9305992
#> 
#> $SARS$conf.lower
#>    Synthesis  Degradation    Half-life 
#> 1050.3793310    0.5731293    0.7562569 
#> 
#> $SARS$conf.upper
#>    Synthesis  Degradation    Half-life 
#> 1529.5847652    0.9165499    1.2094080 
#> 
#> $SARS$f0
#> [1] 2272.866
#> 
#> $SARS$logLik
#> 'log Lik.' -74.85882 (df=4)
#> 
#> $SARS$rmse
#> [1] 123.878
#> 
#> $SARS$rmse.new
#> [1] 168.1016
#> 
#> $SARS$rmse.old
#> [1] 49.32885
#> 
#> $SARS$total
#> [1] 11368.83
#> 
#> $SARS$type
#> [1] "non.equi"
#> 
#>