
The `rtmb_model` function is the core constructor: it combines user-supplied data with model code (defined via `rtmb_code`) and compiles them into an `RTMB_Model` (R6 class) instance. That instance is the foundation for Bayesian inference, including Maximum A Posteriori (MAP) estimation, Automatic Differentiation Variational Inference (ADVI), and Markov Chain Monte Carlo (MCMC) sampling.

Usage

rtmb_model(
  data,
  code,
  par_names = list(),
  init = NULL,
  view = NULL,
  null_target = NULL
)

Arguments

data

A named list containing observation data and constants (e.g., sample size, matrices) used in the model.

code

A model definition block generated by rtmb_code(...) (must include parameters and model blocks).

par_names

A named list of character vectors giving labels for the dimensions of specific parameters (optional).

init

A list or numeric vector of initial values for parameters (optional). If not specified, parameters are initialized randomly.

view

Character vector of parameter names to display first in outputs such as summary() (optional).

null_target

Character string. To create a null model at the same time, specify the target parameter to fix and the prior distribution to disable as a formula string (e.g., "delta ~ cauchy(0, r)"). Default is NULL.
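
As a sketch of how this might be supplied (here `data_list` and `model_code` stand for the objects built in the Examples below, and `delta` is a hypothetical effect-size parameter with the documented cauchy(0, r) prior):

```r
# Build the main model and, at the same time, a null model in which the
# hypothetical parameter delta is fixed and its cauchy(0, r) prior disabled:
mod <- rtmb_model(
  data = data_list,
  code = model_code,
  null_target = "delta ~ cauchy(0, r)"  # formula string, as documented above
)
```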

Value

An RTMB_Model class instance with a compiled and pre-tested automatic differentiation function.

Details

Model Compilation and Pre-checking: When this function is called, it evaluates the provided data and model blocks. It performs a "sandbox execution" (pre-check) using dummy initial values to dynamically detect common structural errors—such as undefined variables, out-of-bounds indices, or incompatible matrix operations—before proceeding to the computationally expensive Automatic Differentiation (MakeADFun) phase. Cryptic backend errors are caught and translated into user-friendly hints.

Writing AD-Compatible Code (Important): To ensure the model is differentiable, you must follow specific syntax rules when writing code within rtmb_code. Avoid discrete branching (if, ifelse) based on parameters and use numerically stable functions. See rtmb_syntax for a detailed guide on Automatic Differentiation requirements.
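
As an illustration of the kind of rewrite this rule implies (the names here are hypothetical, not part of this package's API), a hard branch on a parameter can often be replaced by a smooth blend:

```r
# Hypothetical sketch: branching on a parameter theta breaks differentiability.
#   y_hat <- if (theta > 0) a else b     # discrete branch: avoid
# A smooth, AD-friendly alternative blends the two cases with a sigmoid:
w     <- plogis(theta / 0.01)  # smooth approximation of the step theta > 0
y_hat <- w * a + (1 - w) * b   # differentiable everywhere in theta
```

The scale constant (0.01 here) controls how sharply the blend approximates the original branch; see rtmb_syntax for the package's full guidance.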

Initial Values (init): Initial parameter values can be specified as a flat numeric vector or a named list. If a partial list is provided, the unspecified parameters are automatically initialized with random values drawn from a uniform distribution on the unconstrained scale. If init = NULL, all parameters are initialized randomly.
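
For instance (reusing `data_list` and `model_code` from the Examples below), a partial init might look like:

```r
# Sketch: initialize only mu; sigma is omitted, so it is drawn randomly
# from a uniform distribution on the unconstrained scale.
mod <- rtmb_model(
  data = data_list,
  code = model_code,
  init = list(mu = rep(0, 3))
)
```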

Parameter Labeling (par_names): By default, vector or matrix parameters are displayed with numeric indices (e.g., mu[1]). You can pass a named list of character vectors to par_names to assign meaningful labels to specific dimensions (e.g., mu[Control]), vastly improving the readability of summary outputs and trace plots.

Available Inference Methods on the Returned Object: The returned RTMB_Model instance provides the following core methods:

  • $optimize(...): Performs Maximum A Posteriori (MAP) or Maximum Likelihood estimation. Returns a MAP_Fit object.

  • $sample(...): Draws posterior samples using the NUTS (No-U-Turn Sampler) algorithm. Returns an MCMC_Fit object.

  • $variational(...): Performs Mean-field or Full-rank Automatic Differentiation Variational Inference (ADVI). Returns a VB_Fit object.

  • $null_model(target, ...): Creates a restricted model by fixing specific parameters (e.g., an effect size to zero), which is useful for Bayes Factor calculation.
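
Put together, a typical workflow on a model object `mod` might look like the following sketch (arguments to $variational() and $null_model() are illustrative; $sample() arguments follow the Examples below):

```r
# Hedged sketch of the inference methods on an RTMB_Model instance `mod`:
map_fit  <- mod$optimize()                        # MAP estimation -> MAP_Fit
mcmc_fit <- mod$sample(sampling = 500,
                       warmup = 500, chains = 2)  # NUTS -> MCMC_Fit
vb_fit   <- mod$variational()                     # ADVI -> VB_Fit
null_mod <- mod$null_model("delta ~ cauchy(0, r)")  # restricted model
```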

Examples

# \donttest{
# Simulate data for 3 groups
set.seed(123)
N <- 60
group_idx <- sample(1:3, N, replace = TRUE)
group_names <- c("Control", "Treatment_A", "Treatment_B")

# True group means: Control = 0, Treatment_A = 2, Treatment_B = -1
true_means <- c(0, 2, -1)
y <- true_means[group_idx] + rnorm(N, mean = 0, sd = 0.5)

data_list <- list(N = N, K = 3, group_idx = group_idx, y = y)

# Define the model using rtmb_code
model_code <- rtmb_code(
  parameters = {
    mu    = Dim(K)             # Vector of length K (group means)
    sigma = Dim(1, lower = 0)  # Scalar (residual standard deviation)
  },
  model = {
    # Priors
    for (k in 1:K) mu[k] ~ normal(0, 10)
    sigma ~ exponential(1)

    # Likelihood
    for (i in 1:N) {
      y[i] ~ normal(mu[group_idx[i]], sigma)
    }
  }
)

# --- 1. Basic Model Creation ---
# Create the RTMB_Model object
mod_basic <- rtmb_model(
  data = data_list,
  code = model_code
)
#> Pre-checking model code...
#> Checking RTMB setup...

# Perform Maximum A Posteriori (MAP) estimation
map_basic <- mod_basic$optimize()
#> Starting optimization...
#> 
#> Optimization converged. Final objective: 46.99
# The summary displays default parameter names: mu[1], mu[2], mu[3]
map_basic$summary()
#> 
#> Call:
#> MAP Estimation via RTMB
#> 
#> Negative Log-Posterior: 46.99
#> Approx. Log Marginal Likelihood (Laplace): -53.43
#> 
#> Point Estimates and 95% Wald CI:
#> variable  Estimate  Std. Error  Lower 95%  Upper 95% 
#> mu[1]     -0.12319     0.09499   -0.30936    0.06298 
#> mu[2]      2.15919     0.10806    1.94741    2.37098 
#> mu[3]     -1.11887     0.09722   -1.30942   -0.92832 
#> sigma      0.44511     0.04041    0.37255    0.53179 
#> 

# Perform MCMC sampling using the basic model (chains/iters reduced for speed)
mcmc_basic <- mod_basic$sample(sampling = 500, warmup = 500, chains = 2)
#> Starting sequential sampling (chains = 2)...
#> chain 1 started... 
#> chain 1: iter 100 warmup 
#> chain 1: iter 200 warmup 
#> chain 1: iter 300 warmup 
#> chain 1: iter 400 warmup 
#> chain 1: iter 500 warmup 
#> chain 1: iter 600 sampling 
#> chain 1: iter 700 sampling 
#> chain 1: iter 800 sampling 
#> chain 1: iter 900 sampling 
#> chain 1: iter 1000 sampling 
#> chain 2 started... 
#> chain 2: iter 100 warmup 
#> chain 2: iter 200 warmup 
#> chain 2: iter 300 warmup 
#> chain 2: iter 400 warmup 
#> chain 2: iter 500 warmup 
#> chain 2: iter 600 sampling 
#> chain 2: iter 700 sampling 
#> chain 2: iter 800 sampling 
#> chain 2: iter 900 sampling 
#> chain 2: iter 1000 sampling 
mcmc_basic$summary()
#> variable    mean    sd     map    q2.5   q97.5  ess_bulk  ess_tail  rhat 
#> lp        -49.82  1.45  -48.75  -53.40  -48.06       414       670  1.00 
#> mu[1]      -0.12  0.10   -0.12   -0.31    0.07      1091       690  1.00 
#> mu[2]       2.16  0.11    2.17    1.95    2.39      1036       620  1.00 
#> mu[3]      -1.12  0.10   -1.11   -1.31   -0.91       819       587  1.00 
#> sigma       0.47  0.05    0.45    0.39    0.56       818       753  1.00 

# --- 2. Optional: Adding Custom Parameter Names and initial values ---
# You can optionally use 'par_names' to assign meaningful labels
# to vector or matrix elements for easier interpretation.
mod_named <- rtmb_model(
  data = data_list,
  code = model_code,
  init = list(mu = rep(0, 3), sigma = 1),
  par_names = list(mu = group_names)
)
#> Pre-checking model code...
#> Checking RTMB setup...

map_named <- mod_named$optimize()
#> Starting optimization...
#> 
#> Optimization converged. Final objective: 46.99
# The summary now displays: mu[Control], mu[Treatment_A], mu[Treatment_B]
map_named$summary()
#> 
#> Call:
#> MAP Estimation via RTMB
#> 
#> Negative Log-Posterior: 46.99
#> Approx. Log Marginal Likelihood (Laplace): -53.43
#> 
#> Point Estimates and 95% Wald CI:
#>        variable  Estimate  Std. Error  Lower 95%  Upper 95% 
#> mu[Control]      -0.12319     0.09499   -0.30936    0.06298 
#> mu[Treatment_A]   2.15921     0.10806    1.94742    2.37099 
#> mu[Treatment_B]  -1.11887     0.09722   -1.30942   -0.92832 
#> sigma             0.44510     0.04041    0.37255    0.53178 
#> 

# Perform MCMC sampling using the named model (chains/iters reduced for speed)
mcmc_named <- mod_named$sample(sampling = 500, warmup = 500, chains = 2)
#> Starting sequential sampling (chains = 2)...
#> chain 1 started... 
#> chain 1: iter 100 warmup 
#> chain 1: iter 200 warmup 
#> chain 1: iter 300 warmup 
#> chain 1: iter 400 warmup 
#> chain 1: iter 500 warmup 
#> chain 1: iter 600 sampling 
#> chain 1: iter 700 sampling 
#> chain 1: iter 800 sampling 
#> chain 1: iter 900 sampling 
#> chain 1: iter 1000 sampling 
#> chain 2 started... 
#> chain 2: iter 100 warmup 
#> chain 2: iter 200 warmup 
#> chain 2: iter 300 warmup 
#> chain 2: iter 400 warmup 
#> chain 2: iter 500 warmup 
#> chain 2: iter 600 sampling 
#> chain 2: iter 700 sampling 
#> chain 2: iter 800 sampling 
#> chain 2: iter 900 sampling 
#> chain 2: iter 1000 sampling 
mcmc_named$summary()
#>        variable    mean    sd     map    q2.5   q97.5  ess_bulk  ess_tail  rhat 
#> lp               -49.84  1.40  -48.85  -53.22  -48.09       312       697  1.01 
#> mu[Control]       -0.12  0.10   -0.10   -0.32    0.07      1000       630  1.00 
#> mu[Treatment_A]    2.16  0.11    2.15    1.94    2.38      1108       361  1.00 
#> mu[Treatment_B]   -1.12  0.10   -1.12   -1.33   -0.92       923       406  1.00 
#> sigma              0.47  0.04    0.46    0.40    0.56       836       752  1.01 
# }