# Bayesian Modeling Workflow
Principled, iterative model building based on Gelman, Vehtari & McElreath (2025) and Gelman et al. (2020).
## The Workflow
- Start simple — fit the simplest plausible model first
- Prior predictive check — verify priors produce plausible data (see SKILL.md § Prior Predictive)
- Fit and diagnose — check convergence before interpreting (see SKILL.md § Diagnostics)
- Posterior predictive check — find specific, interpretable misfits (see SKILL.md § Posterior Predictive)
- Expand and compare — add one piece of complexity, compare via LOO, repeat
## Start Simple
Write the simplest model that could address the question. Fit it. Understand it before adding complexity.
- Grouped data: start with complete pooling (single intercept), then no pooling, then partial pooling
- Regression: start with one or two key predictors, not all of them
- Time series: start with a simple trend or random walk before adding seasonal or autoregressive structure
Complex models are understood in relation to simpler ones. Each expansion reveals what the added complexity buys. Computational problems are easier to diagnose in simple models.
Save results.nc immediately after your first successful sampling run.
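A minimal sketch of that save-and-reload round trip; the tiny hand-built InferenceData here is a stand-in for the object a real sampling run returns:

```python
import numpy as np
import arviz as az

# Stand-in for the InferenceData returned by a real sampler
# (shape is (chains, draws))
idata = az.from_dict(
    posterior={"mu": np.random.default_rng(0).normal(size=(2, 100))}
)

idata.to_netcdf("results.nc")               # persist immediately after sampling
idata_back = az.from_netcdf("results.nc")   # reload later without refitting
```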
## Expand and Compare
Label each model explicitly (e.g., “Model 1: complete pooling”, “Model 2: partial pooling”) so the progression is clear.
Add ONE piece of complexity at a time. Fit the expanded model, diagnose, check posterior predictions, then compare:
```python
# nutpie does not store the pointwise log-likelihood during sampling,
# so compute it before comparing:
pm.compute_log_likelihood(idata_simple, model=model_simple)
pm.compute_log_likelihood(idata_complex, model=model_complex)

comparison = az.compare({
    "simple": idata_simple,
    "complex": idata_complex,
})
# dse is the standard error of elpd_diff
print(comparison[["rank", "elpd_loo", "elpd_diff", "dse", "weight"]])
```

Decision rule: if elpd_diff is small relative to its standard error (dse), the two models are effectively equivalent and will receive similar stacking weights; prefer the simpler one.
Fit the expanded model even when you believe the simpler one is sufficient. The comparison itself is informative:
- If the expansion doesn’t improve fit, you’ve demonstrated the simpler model is adequate
- If it does improve fit, you’ve identified important structure
- The difference between models tells you something about the data-generating process
## Reporting the Workflow
Report the sequence of models, not just the final one. The modeling journey IS the analysis.
Summarize with a model progression table:
| Model | Description | elpd_loo | elpd_diff | weight |
|-------|-------------|----------|-----------|--------|
| Model 2 | Partial pooling | -222.2 | 0.0 | 0.91 |
| Model 1 | Complete pooling | -234.5 | -12.3 | 0.09 |
Include: prior predictive findings, posterior predictive misfits that motivated each expansion, model comparison results, final parameter estimates with 94% HDIs, and conclusions about what the data support.
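A sketch of producing the parameter estimates with 94% HDIs; it uses ArviZ's bundled centered_eight example dataset as a stand-in for your fitted model:

```python
import arviz as az

# Example dataset shipped with ArviZ, standing in for your own idata
idata = az.load_arviz_data("centered_eight")

# hdi_prob=0.94 yields the hdi_3% / hdi_97% columns
summary = az.summary(idata, var_names=["mu", "tau"], hdi_prob=0.94)
print(summary[["mean", "sd", "hdi_3%", "hdi_97%"]])
```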
## Simulating Fake Data
For novel or complex model structures, simulate from your model with known parameter values before touching real data:
- Choose plausible parameter values (informed by domain knowledge)
- Simulate a dataset from the generative model
- Fit the model to simulated data
- Check: are the true parameters recovered within posterior intervals?
This catches specification bugs, non-identifiability, and prior-likelihood conflicts before real-data messiness obscures the picture. Most valuable for custom likelihoods, latent variable models, and models with complex indexing.
## Hierarchical Build-Up
When building hierarchical models iteratively:
- Complete pooling — single intercept, ignoring group structure
- No pooling — group-specific intercepts, no sharing — compare to pooled
- Partial pooling — hierarchical intercepts — compare to both
- Varying slopes — allow slopes to vary by group if the question warrants it
- Group-level predictors — explain between-group variance
At each step, compare via LOO and check what the added structure explains (for example, whether the posterior for the group-level standard deviation shrinks once group-level predictors are added).
## Compressed Storage
For large InferenceData objects (many draws, large posterior predictive):
```python
# InferenceData.to_netcdf does not accept an encoding argument; instead,
# compress=True (the default) applies zlib compression to every variable,
# typically reducing file size by 50-80%
idata.to_netcdf("results/model_v1.nc", engine="h5netcdf", compress=True)
```

## References
- Gelman, Vehtari, and McElreath (2025). “Statistical Workflow.” Philosophical Transactions of the Royal Society A.
- Gelman, Vehtari, Simpson, et al. (2020). “Bayesian Workflow.” arXiv:2011.01808.
- Gabry, Simpson, Vehtari, Betancourt, and Gelman (2019). “Visualization in Bayesian workflow.” Journal of the Royal Statistical Society: Series A.