
The `rtmb_model` function is the core constructor: it combines user-supplied data with model code (defined via `rtmb_code`) and compiles them into an `RTMB_Model` (R6 class) instance. That instance is the foundation for Bayesian inference, including Maximum A Posteriori (MAP) estimation, Automatic Differentiation Variational Inference (ADVI), and Markov Chain Monte Carlo (MCMC) sampling.

Usage

rtmb_model(
  data,
  code,
  par_names = list(),
  init = NULL,
  view = NULL,
  null_target = NULL
)

Arguments

data

A named list containing observation data and constants (e.g., sample size, matrices) used in the model.

code

A model definition block generated by rtmb_code(...) (must include parameters and model blocks).

par_names

A named list of character vectors giving labels for the dimensions of specific parameters (optional).

init

A list or numeric vector of initial values for parameters (optional). If not specified, parameters are initialized randomly.

view

Character vector of parameter names to display first in outputs such as summary() (optional).

null_target

Character string. To create a null model at the same time, specify the target parameter to fix and the prior distribution to disable as a formula string (e.g., "delta ~ cauchy(0, r)"). Default is NULL.
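
As a sketch of how this might be supplied (here `data_list` and `model_code` stand for the objects built in the Examples below, and `delta` is a hypothetical effect-size parameter with the documented cauchy(0, r) prior):

```r
# Build the main model and, at the same time, a null model in which the
# hypothetical parameter delta is fixed and its cauchy(0, r) prior disabled:
mod <- rtmb_model(
  data = data_list,
  code = model_code,
  null_target = "delta ~ cauchy(0, r)"  # formula string, as documented above
)
```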

Value

An RTMB_Model class instance with a compiled and pre-tested automatic differentiation function.

Details

Model Compilation and Pre-checking: When this function is called, it evaluates the provided data and model blocks. It performs a "sandbox execution" (pre-check) using dummy initial values to dynamically detect common structural errors—such as undefined variables, out-of-bounds indices, or incompatible matrix operations—before proceeding to the computationally expensive Automatic Differentiation (MakeADFun) phase. Cryptic backend errors are caught and translated into user-friendly hints.

Writing AD-Compatible Code (Important): To ensure the model is differentiable, you must follow specific syntax rules when writing code within rtmb_code. Avoid discrete branching (if, ifelse) based on parameters and use numerically stable functions. See rtmb_syntax for a detailed guide on Automatic Differentiation requirements.
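
As an illustration of the kind of rewrite this rule implies (the names here are hypothetical, not part of this package's API), a hard branch on a parameter can often be replaced by a smooth blend:

```r
# Hypothetical sketch: branching on a parameter theta breaks differentiability.
#   y_hat <- if (theta > 0) a else b     # discrete branch: avoid
# A smooth, AD-friendly alternative blends the two cases with a sigmoid:
w     <- plogis(theta / 0.01)  # smooth approximation of the step theta > 0
y_hat <- w * a + (1 - w) * b   # differentiable everywhere in theta
```

The scale constant (0.01 here) controls how sharply the blend approximates the original branch; see rtmb_syntax for the package's full guidance.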

Initial Values (init): Initial parameter values can be specified as a flat numeric vector or a named list. If a partial list is provided, the unspecified parameters are automatically initialized with random values drawn from a uniform distribution on the unconstrained scale. If init = NULL, all parameters are initialized randomly.
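
For instance (reusing `data_list` and `model_code` from the Examples below), a partial init might look like:

```r
# Sketch: initialize only mu; sigma is omitted, so it is drawn randomly
# from a uniform distribution on the unconstrained scale.
mod <- rtmb_model(
  data = data_list,
  code = model_code,
  init = list(mu = rep(0, 3))
)
```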

Parameter Labeling (par_names): By default, vector or matrix parameters are displayed with numeric indices (e.g., mu[1]). You can pass a named list of character vectors to par_names to assign meaningful labels to specific dimensions (e.g., mu[Control]), vastly improving the readability of summary outputs and trace plots.

Available Inference Methods on the Returned Object: The returned RTMB_Model instance provides the following core methods:

  • $optimize(...): Performs Maximum A Posteriori (MAP) or Maximum Likelihood estimation. Returns a MAP_Fit object.

  • $sample(...): Draws posterior samples using the NUTS (No-U-Turn Sampler) algorithm. Returns an MCMC_Fit object.

  • $variational(...): Performs Mean-field or Full-rank Automatic Differentiation Variational Inference (ADVI). Returns a VB_Fit object.

  • $null_model(target, ...): Creates a restricted model by fixing specific parameters (e.g., an effect size to zero), which is useful for Bayes Factor calculation.
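
Put together, a typical workflow on a model object `mod` might look like the following sketch (arguments to $variational() and $null_model() are illustrative; $sample() arguments follow the Examples below):

```r
# Hedged sketch of the inference methods on an RTMB_Model instance `mod`:
map_fit  <- mod$optimize()                        # MAP estimation -> MAP_Fit
mcmc_fit <- mod$sample(sampling = 500,
                       warmup = 500, chains = 2)  # NUTS -> MCMC_Fit
vb_fit   <- mod$variational()                     # ADVI -> VB_Fit
null_mod <- mod$null_model("delta ~ cauchy(0, r)")  # restricted model
```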

Examples

# \donttest{
# Simulate data for 3 groups
set.seed(123)
N <- 60
group_idx <- sample(1:3, N, replace = TRUE)
group_names <- c("Control", "Treatment_A", "Treatment_B")

# True group means: Control = 0, Treatment_A = 2, Treatment_B = -1
true_means <- c(0, 2, -1)
y <- true_means[group_idx] + rnorm(N, mean = 0, sd = 0.5)

data_list <- list(N = N, K = 3, group_idx = group_idx, y = y)

# Define the model using rtmb_code
model_code <- rtmb_code(
  parameters = {
    mu    = Dim(K)             # Vector of length K (group means)
    sigma = Dim(1, lower = 0)  # Scalar (residual standard deviation)
  },
  model = {
    # Priors
    for (k in 1:K) mu[k] ~ normal(0, 10)
    sigma ~ exponential(1)

    # Likelihood
    for (i in 1:N) {
      y[i] ~ normal(mu[group_idx[i]], sigma)
    }
  }
)

# --- 1. Basic Model Creation ---
# Create the RTMB_Model object
mod_basic <- rtmb_model(
  data = data_list,
  code = model_code
)
#> Pre-checking model code...
#> Checking RTMB setup...

# Perform Maximum A Posteriori (MAP) estimation
map_basic <- mod_basic$optimize()
#> Starting optimization...
#> 
#> Optimization converged. Final objective: 46.99
# The summary displays default parameter names: mu[1], mu[2], mu[3]
map_basic$summary()
#> 
#> Call:
#> MAP Estimation via RTMB
#> 
#> Negative Log-Posterior: 46.99
#> Approx. Log Marginal Likelihood (Laplace): -53.43
#> 
#> Point Estimates and 95% Wald CI:
#> variable  Estimate  Std. Error  Lower 95%  Upper 95% 
#> mu[1]     -0.12319     0.09499   -0.30936    0.06298 
#> mu[2]      2.15919     0.10806    1.94741    2.37098 
#> mu[3]     -1.11887     0.09722   -1.30942   -0.92832 
#> sigma      0.44511     0.04041    0.37255    0.53179 
#> 

# Perform MCMC sampling using the basic model (chains/iters reduced for speed)
mcmc_basic <- mod_basic$sample(sampling = 500, warmup = 500, chains = 2)
#> Starting sequential sampling (chains = 2)...
#> chain 1 started... 
#> chain 1: iter 100 warmup 
#> chain 1: iter 200 warmup 
#> chain 1: iter 300 warmup 
#> chain 1: iter 400 warmup 
#> chain 1: iter 500 warmup 
#> chain 1: iter 600 sampling 
#> chain 1: iter 700 sampling 
#> chain 1: iter 800 sampling 
#> chain 1: iter 900 sampling 
#> chain 1: iter 1000 sampling 
#> chain 2 started... 
#> chain 2: iter 100 warmup 
#> chain 2: iter 200 warmup 
#> chain 2: iter 300 warmup 
#> chain 2: iter 400 warmup 
#> chain 2: iter 500 warmup 
#> chain 2: iter 600 sampling 
#> chain 2: iter 700 sampling 
#> chain 2: iter 800 sampling 
#> chain 2: iter 900 sampling 
#> chain 2: iter 1000 sampling 
mcmc_basic$summary()
#> variable    mean    sd     map    q2.5   q97.5  ess_bulk  ess_tail  rhat 
#> lp        -49.82  1.45  -48.75  -53.40  -48.06       414       670  1.00 
#> mu[1]      -0.12  0.10   -0.12   -0.31    0.07      1091       690  1.00 
#> mu[2]       2.16  0.11    2.17    1.95    2.39      1036       620  1.00 
#> mu[3]      -1.12  0.10   -1.11   -1.31   -0.91       819       587  1.00 
#> sigma       0.47  0.05    0.45    0.39    0.56       818       753  1.00 

# --- 2. Optional: Adding Custom Parameter Names and initial values ---
# You can optionally use 'par_names' to assign meaningful labels
# to vector or matrix elements for easier interpretation.
mod_named <- rtmb_model(
  data = data_list,
  code = model_code,
  init = list(mu = rep(0, 3), sigma = 1),
  par_names = list(mu = group_names)
)
#> Pre-checking model code...
#> Checking RTMB setup...

map_named <- mod_named$optimize()
#> Starting optimization...
#> 
#> Optimization converged. Final objective: 46.99
# The summary now displays: mu[Control], mu[Treatment_A], mu[Treatment_B]
map_named$summary()
#> 
#> Call:
#> MAP Estimation via RTMB
#> 
#> Negative Log-Posterior: 46.99
#> Approx. Log Marginal Likelihood (Laplace): -53.43
#> 
#> Point Estimates and 95% Wald CI:
#>        variable  Estimate  Std. Error  Lower 95%  Upper 95% 
#> mu[Control]      -0.12319     0.09499   -0.30936    0.06298 
#> mu[Treatment_A]   2.15921     0.10806    1.94742    2.37099 
#> mu[Treatment_B]  -1.11887     0.09722   -1.30942   -0.92832 
#> sigma             0.44510     0.04041    0.37255    0.53178 
#> 

# Perform MCMC sampling using the named model (chains/iters reduced for speed)
mcmc_named <- mod_named$sample(sampling = 500, warmup = 500, chains = 2)
#> Starting sequential sampling (chains = 2)...
#> chain 1 started... 
#> chain 1: iter 100 warmup 
#> chain 1: iter 200 warmup 
#> chain 1: iter 300 warmup 
#> chain 1: iter 400 warmup 
#> chain 1: iter 500 warmup 
#> chain 1: iter 600 sampling 
#> chain 1: iter 700 sampling 
#> chain 1: iter 800 sampling 
#> chain 1: iter 900 sampling 
#> chain 1: iter 1000 sampling 
#> chain 2 started... 
#> chain 2: iter 100 warmup 
#> chain 2: iter 200 warmup 
#> chain 2: iter 300 warmup 
#> chain 2: iter 400 warmup 
#> chain 2: iter 500 warmup 
#> chain 2: iter 600 sampling 
#> chain 2: iter 700 sampling 
#> chain 2: iter 800 sampling 
#> chain 2: iter 900 sampling 
#> chain 2: iter 1000 sampling 
mcmc_named$summary()
#>        variable    mean    sd     map    q2.5   q97.5  ess_bulk  ess_tail  rhat 
#> lp               -49.84  1.40  -48.85  -53.22  -48.09       312       697  1.01 
#> mu[Control]       -0.12  0.10   -0.10   -0.32    0.07      1000       630  1.00 
#> mu[Treatment_A]    2.16  0.11    2.15    1.94    2.38      1108       361  1.00 
#> mu[Treatment_B]   -1.12  0.10   -1.12   -1.33   -0.92       923       406  1.00 
#> sigma              0.47  0.04    0.46    0.40    0.56       836       752  1.01 
# }