Notebooks for the paper "Approximate Bayesian Computation with Deep Learning and Conformal Prediction"

This page contains the Supplementary Material accompanying the paper Approximate Bayesian Computation with Deep Learning and Conformal Prediction by Meili Baragatti, Bertrand Cloez, David Métivier, and Isabelle Sanchez (2024), https://doi.org/10.48550/arXiv.2406.04874.

The following links provide the code used to obtain and reproduce the results of the article. The code is written in Julia for the MA(2), Lotka-Volterra, and lake phytoplankton dynamics examples, and in R for the 2D Gaussian Field example. The GitLab repository containing all the files can be accessed here.

Julia Quarto Notebooks

| Notebook | Description |
|---|---|
| MA(2) | MA(2) time series model. A well-known toy Bayesian example with 2 parameters. |
| Discrete Lotka-Volterra | ABC-based inference for predator-prey dynamics with 3 parameters. Very challenging for some extreme parameters. |
| Phytoplankton Dynamics | Toy model of phytoplankton dynamics in a lake. A "high"-dimensional example with 9 parameters. |

Settings

If you are not familiar with Julia, we recommend installing the latest version using the command provided on the official Julia website.

Please ensure that you are using Julia (> 1.10) and Quarto (> 1.6).
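As a quick, optional check that your Julia installation meets this requirement, you can run the following in a Julia session (the Quarto version is checked separately from the command line with quarto --version):

```julia
using InteractiveUtils   # provides versioninfo(); loaded by default in the REPL

versioninfo()            # prints the Julia version, OS and hardware details
@assert VERSION >= v"1.10" "Please install a more recent Julia version"
```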

For each example, download the Project.toml and Manifest.toml files (as well as the datasets and pretrained models, if available) into the associated folder. Then, activate and instantiate the environment in that folder (see the notebooks for more details).
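As a sketch, assuming you start Julia in the folder containing the example's Project.toml and Manifest.toml, the environment can be activated and instantiated like this:

```julia
using Pkg

Pkg.activate(".")   # activate the environment defined by Project.toml in this folder
Pkg.instantiate()   # download and precompile the exact versions pinned in Manifest.toml
Pkg.status()        # optional: list the installed packages and versions
```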

For this project, we developed two packages that are not yet in the official Julia registry.
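Since these packages are not registered, Pkg.instantiate() with the provided Manifest.toml will fetch them from the source URLs recorded there. Should you need them in another environment, unregistered Julia packages can also be added directly by repository URL; the URL below is a placeholder, not the actual address of our packages:

```julia
using Pkg

# Hypothetical URL, for illustration only; see the repository for the real package locations.
Pkg.add(url = "https://example.org/ourgroup/SomeUnregisteredPackage.jl")
```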

GPUs and reproducibility

We trained all the neural network models (ABC-CNN and ABC-Conformal) with the Lux.jl deep learning package, using the extension LuxCUDA.jl on NVIDIA GPUs. This significantly reduced training times (by at least 20x for the larger applications).

  1. Make sure you have a compatible GPU, or consult the Lux GPU page to use other backends. If you do not have a GPU, simply comment out the line using LuxCUDA and set dev = cpu_device() (see the sketch after this list).

  2. Despite our best efforts to make the notebooks 100% reproducible from scratch (providing the Manifest.toml with exact package versions, setting the initial random seed, and configuring Lux.jl to "seed" its layers), GPU operations can still exhibit some non-determinism due to optimizations that prioritize speed over reproducibility (see, e.g., the PyTorch documentation for a discussion and references). As a result, training the models across different sessions, software versions, GPUs, or computers may yield slightly different outcomes.
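A minimal sketch of points 1 and 2 (the layer below is a toy placeholder, not one of the paper's architectures):

```julia
using Lux, Random
# using LuxCUDA                  # uncomment on a machine with a compatible NVIDIA GPU

# dev = gpu_device()             # GPU device, available once LuxCUDA is loaded
dev = cpu_device()               # CPU fallback, as described in point 1

rng = Random.default_rng()
Random.seed!(rng, 1234)          # fix the initial random seed (point 2); the value is illustrative

model = Dense(10 => 1)           # hypothetical toy layer, for illustration only
ps, st = Lux.setup(rng, model)   # "seed" the layer parameters and states
ps, st = dev(ps), dev(st)        # move parameters and states to the chosen device
```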

To speed up the notebooks, we provide, for the Discrete Lotka-Volterra and Lake toy model examples, the pre-trained neural network models as well as the datasets (even though they are fast to regenerate). The code to train them from scratch is also provided. For the MA(2) example, which is the most lightweight, we do not provide a pretrained model.
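A minimal sketch of loading such a pretrained model, assuming the parameters and states were saved with JLD2; the actual file names, keys, and storage format are given in each notebook:

```julia
using JLD2

# Hypothetical file name and keys, for illustration only:
data = JLD2.load("pretrained_model.jld2")   # returns a Dict of the saved objects
ps_trained = data["ps"]                     # trained parameters (assumed key)
st_trained = data["st"]                     # layer states (assumed key)
```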

Running the notebooks

You can run the code cell by cell, or use the Quarto render command (for example: quarto render julia/Lake_norm/ABC_Conformal_Lake.qmd --to html) to run the whole notebook at once.
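The same render step can also be launched from a Julia session (assuming the quarto CLI is on your PATH):

```julia
# Render the Lake example notebook to HTML by shelling out to the Quarto CLI.
run(`quarto render julia/Lake_norm/ABC_Conformal_Lake.qmd --to html`)
```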

We tested that the notebooks work on Linux and Windows. However, on one of our other computers we noticed the error ERROR: Malt.TerminatedWorkerException() when using the Quarto render command, which we have not yet been able to debug. If you encounter it, try deactivating the GPU package or running the notebook cell by cell.

R Markdown Notebooks

| Notebook | Description |
|---|---|
| Gaussian Fields | Gaussian Field example. Only 1 parameter to estimate. The only example with ABC Random Forest. |
| MA(2) (not in the paper) | MA(2) time series model. A well-known toy Bayesian example with 2 parameters. |
| Discrete Lotka-Volterra (not in the paper) | ABC-based inference for predator-prey dynamics with 3 parameters. Very challenging for some extreme parameters. |

We are also making the R code available for the MA(2) and Lotka-Volterra examples. Please note that our method was first developed in R, and the examples were later ported to Julia and reworked. As a result, the Julia code is more sophisticated than the R code, and the results obtained with Julia are sometimes superior to those obtained with R. The speed of Julia also allows the examples to be run on much larger datasets than with R.

Installation instructions

Please clone or download the repository and open the R project ABCDconformal.Rproj. You will need R (version >= 4.1.2), the reticulate package (version 1.28), and the following R packages: cowplot, tidyr, plyr, dplyr, ggplot2, GillespieSSA2, bigmemory, bigalgebra, keras, DT, sf, gstat, spdep, foreach, and abcrf.

It is very important to install version 1.28 of reticulate to be able to use our code. You can see our session info in all our HTML reports.