MEmilio Surrogate Model

MEmilio Surrogate Model contains machine-learning-based surrogate models that approximate the MEmilio simulation models. Currently, surrogate models exist only for ODE-type models, whose simulations are used to generate the training data. The goal is a tool that predicts infection dynamics faster than simulating an expert model (e.g., a metapopulation or agent-based model) while keeping the errors relative to the original simulations acceptable.

The package can be found in pycode/memilio-surrogatemodel.

For more details, we refer to:

Schmidt A, Zunker H, Heinlein A, Kühn MJ. (2026). Graph neural network surrogates to leverage mechanistic expert knowledge towards reliable and immediate pandemic response. Scientific Reports 16, 6361. DOI:10.1038/s41598-026-39431-5

Installation

See python_packages/Installation for a detailed installation guide.

Dependencies

Required Python packages:

  • pandas >= 1.2.2

  • numpy >= 1.22, !=1.25.*

  • tensorflow

  • matplotlib

  • scikit-learn

  • progress

Since we are running simulations to generate the data, the MEmilio memilio-simulation package also needs to be installed.

Usage

The package currently provides the following modules:

  • models: models for different tasks

    Currently we have the following models:

    • ode_secir_simple: A simple model with asymptomatic and symptomatic infection states that is not stratified by age groups.

    • ode_secir_groups: A model allowing for asymptomatic as well as symptomatic infection states stratified by age groups and including one damping.

    Each model folder contains the following files:

    • data_generation: Data generation from expert model simulation outputs.

    • model: Training and evaluation of the model.

    • network_architectures: Contains multiple network architectures.

    • grid_search: Utilities for hyperparameter optimization.

  • tests: This file contains all tests.

ODE-SECIR Simple Model

The ode_secir_simple module provides surrogate models for the basic ODE-SECIR epidemiological model. This model is not stratified by age groups and simulates disease progression through the following compartments:

  • S: Susceptible

  • E: Exposed

  • C: Infected (asymptomatic/pre-symptomatic)

  • I: Infected (symptomatic)

  • R: Recovered

  • H: Hospitalized (severe cases)

  • U: ICU (critical cases)

  • D: Dead

For more details on the model structure and parameters, we refer to the ODE-SECIR model documentation.

Data Generation

The data_generation.py module generates training data for the surrogate models by running many simulations of the basic ODE-SECIR model with randomized initial conditions. A typical call looks like this:

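from memilio.surrogatemodel.ode_secir_simple import data_generation

# Directory in which the dataset is stored (placeholder path)
path_data = "/path/to/data"

# Generate a dataset with 10,000 simulation runs,
# each with 5 days of input data and a 30-day prediction horizon
data = data_generation.generate_data(
    num_runs=10000,
    path=path_data,
    input_width=5,
    label_width=30,
    normalize=True,
    save_data=True
)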

In summary, generate_data:

  1. Randomly initializes the model parameters and initial compartment populations.

  2. Runs the ODE-SECIR simulation using the C++ backend via Python bindings.

  3. Applies logarithmic normalization to improve training stability.

  4. Splits each time series into input and label segments.

  5. Saves the dataset as a pickle file for later use.
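Steps 3 and 4 above can be illustrated with a small NumPy sketch. It is not the package's implementation: the helper name normalize_and_split is hypothetical, the use of log1p as the logarithmic normalization is an assumption, and the "simulation" is random data standing in for real ODE-SECIR output.

```python
import numpy as np


def normalize_and_split(series, input_width, label_width):
    """Log-scale a (days, compartments) array and split it into input and label windows.

    Hypothetical helper for illustration only; log1p is an assumed choice
    of logarithmic normalization.
    """
    series = np.asarray(series, dtype=float)
    if series.shape[0] < input_width + label_width:
        raise ValueError("time series is shorter than input_width + label_width")
    # log1p(x) = log(1 + x) stays finite for empty compartments (x = 0).
    scaled = np.log1p(series)
    inputs = scaled[:input_width]
    labels = scaled[input_width:input_width + label_width]
    return inputs, labels


# Synthetic stand-in for one simulation run: 35 days, 8 compartments.
rng = np.random.default_rng(seed=0)
simulation = rng.uniform(0.0, 1e5, size=(35, 8))

inputs, labels = normalize_and_split(simulation, input_width=5, label_width=30)
print(inputs.shape, labels.shape)  # (5, 8) (30, 8)
```

Predictions made on the log scale would need to be mapped back with the matching inverse (np.expm1 in this sketch) before they are compared with population counts.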

Network Architectures

The network_architectures.py module provides different neural network architectures for time series prediction:

  1. MLP (Multi-Layer Perceptron):

    • Simple feedforward networks that take flattened time series as input

    • Available in both single-output and multi-output variants

  2. LSTM (Long Short-Term Memory):

    • Recurrent neural networks specialized for sequence modeling

    • Can process variable-length time series while maintaining temporal information

  3. CNN (Convolutional Neural Network):

    • Uses 1D convolutions to detect patterns in time series data

    • Particularly efficient for capturing local temporal patterns

Model Training and Evaluation

The model.py module provides functionality for:

  1. Preparing data:

    • Splitting data into training, validation, and test sets

    • Processing data for different model architectures (classic vs. time series)

  2. Model training:

    • Initializing models with customizable hyperparameters

    • Training with early stopping and customizable loss functions

  3. Evaluation:

    • Computing error metrics (MAE, MAPE) across compartments

    • Visualizing predictions versus ground truth

Example usage:

# Define model and training parameters
model_parameters = (label_width, num_outputs, hidden_layers,
                   neurons_per_layer, activation, modelname)
training_parameters = (early_stop, max_epochs, loss, optimizer, metrics)

# Initialize and train model
model = initialize_model(model_parameters)
history = network_fit(model, modeltype, training_parameters, path_data)

# Plot results
plot_compartment_prediction_model(test_inputs, test_labels,
                                 modeltype, model, 'InfectedSymptoms')
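The distinction between "classic" and "time series" data processing mentioned above can be sketched as follows. We assume here that "classic" means flattening each input window into a single feature vector for dense networks, while the time series variant keeps the (days, compartments) layout for LSTM and CNN models; prepare_inputs is a hypothetical helper, not a function of the package.

```python
import numpy as np


def prepare_inputs(inputs, modeltype):
    """Shape a batch of input windows for the given model family.

    inputs has shape (samples, days, compartments). The interpretation of
    "classic" as flattened input is an assumption made for this sketch.
    """
    inputs = np.asarray(inputs, dtype=float)
    if modeltype == "classic":
        # Dense networks expect one flat feature vector per sample.
        return inputs.reshape(inputs.shape[0], -1)
    if modeltype == "timeseries":
        # Recurrent and convolutional networks keep the time axis.
        return inputs
    raise ValueError(f"unknown modeltype: {modeltype!r}")


batch = np.zeros((100, 5, 8))  # 100 samples, 5 input days, 8 compartments
print(prepare_inputs(batch, "classic").shape)     # (100, 40)
print(prepare_inputs(batch, "timeseries").shape)  # (100, 5, 8)
```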

Hyperparameter Optimization

The grid_search.py module provides tools for systematic hyperparameter optimization:

  1. Cross-validation:

    • K-fold cross-validation for more robust performance estimates and to detect overfitting

    • Evaluation of multiple model architectures and training configurations

  2. Grid search:

    • Systematic exploration of hyperparameter space

    • Tracking and storage of performance metrics

  3. Result analysis:

    • Visualization of hyperparameter importance

    • Selection of optimal model configurations
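To make the cross-validation step concrete, the following sketch generates K-fold index splits with NumPy. It only illustrates the general technique; the function name k_fold_indices is hypothetical and the package's grid_search.py may organize its folds differently.

```python
import numpy as np


def k_fold_indices(num_samples, k, seed=0):
    """Yield (train_idx, val_idx) pairs so that each sample is validated exactly once.

    Hypothetical helper for illustration; requires k >= 2.
    """
    rng = np.random.default_rng(seed)
    # Shuffle once, then cut the permutation into k nearly equal folds.
    folds = np.array_split(rng.permutation(num_samples), k)
    for i in range(k):
        val_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, val_idx


splits = list(k_fold_indices(num_samples=10, k=5))
for train_idx, val_idx in splits:
    print(len(train_idx), len(val_idx))  # 8 2 for every fold
```

In a grid search, each hyperparameter combination would be trained once per fold and ranked by its validation error averaged over the folds.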

Graph Neural Network (GNN) Surrogate Models

The Graph Neural Network (GNN) module provides advanced surrogate models that leverage spatial connectivity and age-stratified epidemiological dynamics. These models are designed for immediate and reliable pandemic response by combining mechanistic expert knowledge with machine learning efficiency.

Overview and Scientific Foundation

The GNN surrogate models are based on the research presented in:

Schmidt A, Zunker H, Heinlein A, Kühn MJ. (2026). Graph neural network surrogates to leverage mechanistic expert knowledge towards reliable and immediate pandemic response. Scientific Reports 16, 6361. DOI:10.1038/s41598-026-39431-5

The implementation leverages the mechanistic ODE-SECIR model (see ODE-SECIR documentation) as the underlying expert model, using Python bindings to the C++ backend for efficient simulation during data generation.

Module Structure

The GNN module is located in pycode/memilio-surrogatemodel/memilio/surrogatemodel/GNN and consists of:

  • data_generation.py: Generates training and evaluation data by simulating epidemiological scenarios with the mechanistic SECIR model

  • network_architectures.py: Defines various GNN architectures (ARMAConv, GCSConv, GATConv, GCNConv, APPNPConv) with configurable depth and channels

  • evaluate_and_train.py: Implements training and evaluation pipelines for GNN models

  • grid_search.py: Provides hyperparameter optimization through systematic grid search

  • GNN_utils.py: Contains utility functions for data preprocessing, graph construction, and population data handling

Data Generation

The data generation process in data_generation.py creates graph-structured training data through mechanistic simulations. Use generate_data to run multiple simulations and save a pickle file containing the inputs, labels, damping information, and contact matrices:

from memilio.surrogatemodel.GNN import data_generation
import memilio.simulation as mio

data = data_generation.generate_data(
    num_runs=5,
    data_dir="/path/to/memilio/data",
    output_path="/tmp/generated_datasets",
    input_width=5,
    label_width=30,
    start_date=mio.Date(2020, 10, 1),
    end_date=mio.Date(2021, 10, 31),
    mobility_file="commuter_mobility.txt",  # or commuter_mobility_2022.txt
    transform=True,
    save_data=True
)

Data Generation Workflow:

  1. Parameter Sampling: Randomly sample epidemiological parameters (transmission rates, incubation periods, recovery rates) from predefined distributions to create diverse scenarios.

  2. Compartment Initialization: Initialize epidemic compartments for each age group in each region based on realistic demographic data. Compartments are initialized using shared base factors.

  3. Mobility Graph Construction: Build a spatial graph where:

    • Nodes represent geographic regions (e.g., German counties)

    • Edges represent mobility connections with weights from commuting data

    • Node features include age-stratified population sizes

  4. Contact Matrix Configuration: Load and configure baseline contact matrices for different location types (home, school, work, other) stratified by age groups.

  5. Damping Application: Apply time-varying dampings to contact matrices to simulate NPIs:

    • Multiple damping periods with random start days

    • Location-specific damping factors (e.g., stronger school closures, moderate workplace restrictions)

    • Realistic parameter ranges based on observed intervention strengths

  6. Simulation Execution: Run the mechanistic ODE-SECIR model using MEmilio’s C++ backend through Python bindings to generate the dataset.

  7. Data Processing: Transform simulation results into graph-structured format:

    • Extract compartment time series for each node (region) and age group

    • Apply logarithmic transformation for numerical stability

    • Store graph topology, node features, and temporal sequences
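Two steps of this workflow, the mobility graph (step 3) and the dampings (step 5), can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions: the helper names are hypothetical, the damping is assumed to scale the baseline contact matrix by (1 - damping), and the directed commuter matrix is symmetrized to obtain edge weights. The package itself may handle both differently.

```python
import numpy as np


def damp_contacts(baseline, damping):
    """Scale a baseline contact matrix by (1 - damping), with damping in [0, 1].

    Assumed simplest multiplicative form, for illustration only.
    """
    if not 0.0 <= damping <= 1.0:
        raise ValueError("damping must lie in [0, 1]")
    return (1.0 - damping) * np.asarray(baseline, dtype=float)


def mobility_adjacency(commuters):
    """Build a symmetric weighted adjacency matrix from a directed commuter matrix.

    commuters[i, j] is the number of commuters from region i to region j.
    Symmetrizing is an assumption of this sketch.
    """
    commuters = np.asarray(commuters, dtype=float)
    adjacency = commuters + commuters.T
    np.fill_diagonal(adjacency, 0.0)  # no self-loops from within-region traffic
    return adjacency


# Toy example with two age groups and three regions.
baseline = [[4.0, 1.0], [1.0, 2.0]]
print(damp_contacts(baseline, damping=0.5))
# [[2.  0.5]
#  [0.5 1. ]]

commuters = [[0, 10, 0], [5, 0, 2], [0, 0, 0]]
print(mobility_adjacency(commuters))
# [[ 0. 15.  0.]
#  [15.  0.  2.]
#  [ 0.  2.  0.]]
```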

Network Architectures

The network_architectures.py module provides flexible GNN model construction for supported layer types (ARMAConv, GCSConv, GATConv, GCNConv, APPNPConv).

from memilio.surrogatemodel.GNN import network_architectures

model = network_architectures.get_model(
    layer_type="GCNConv",
    num_layers=3,
    num_channels=64,
    activation="relu",
    num_output=48  # outputs per node
)
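For readers unfamiliar with graph convolutions, the following NumPy sketch shows what a single GCN-style layer computes: each node's features are averaged with those of its neighbors using a symmetrically normalized adjacency matrix and then passed through a learned linear map. It illustrates the standard GCN propagation rule only and is not the code behind get_model; the function name gcn_layer and the random weights are placeholders. The feature size of 48 mirrors num_output above, which presumably corresponds to 8 compartments times 6 age groups.

```python
import numpy as np


def gcn_layer(adjacency, features, weights):
    """One GCN propagation step: relu(D^-1/2 (A + I) D^-1/2 X W).

    adjacency: (nodes, nodes) non-negative matrix
    features:  (nodes, in_channels)
    weights:   (in_channels, out_channels)
    """
    adjacency = np.asarray(adjacency, dtype=float)
    a_hat = adjacency + np.eye(adjacency.shape[0])  # add self-loops
    degree = a_hat.sum(axis=1)                      # >= 1 because of the self-loops
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ features @ weights, 0.0)  # ReLU activation


rng = np.random.default_rng(0)
adjacency = np.array([[0.0, 1.0, 0.0],
                      [1.0, 0.0, 1.0],
                      [0.0, 1.0, 0.0]])
features = rng.normal(size=(3, 48))   # 3 regions, 48 features per region
weights = rng.normal(size=(48, 64))   # placeholder for trained weights

hidden = gcn_layer(adjacency, features, weights)
print(hidden.shape)  # (3, 64)
```

Stacking several such layers (num_layers) lets information travel across multiple hops of the mobility graph.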

Training and Evaluation

The evaluate_and_train.py module provides the training functionality:

from tensorflow.keras.losses import MeanAbsolutePercentageError
from tensorflow.keras.optimizers import Adam
from memilio.surrogatemodel.GNN import evaluate_and_train, network_architectures

dataset = evaluate_and_train.load_gnn_dataset(
    "/tmp/generated_datasets/GNN_data_30days_3dampings_classic5.pickle",
    "/path/to/memilio/data/Germany/mobility",
    number_of_nodes=400
)

model = network_architectures.get_model(
    layer_type="GCNConv",
    num_layers=3,
    num_channels=32,
    activation="relu",
    num_output=48
)

results = evaluate_and_train.train_and_evaluate(
    data=dataset,
    batch_size=32,
    epochs=50,
    model=model,
    loss_fn=MeanAbsolutePercentageError(),
    optimizer=Adam(learning_rate=0.001),
    es_patience=10,
    save_dir="/tmp/model_results",
    save_name="gnn_model"
)

Training Features:

  1. Mini-batch Training: Graph batching for efficient training on large datasets

  2. Custom Loss Functions: MSE, MAE, MAPE, or custom compartment-weighted losses

  3. Early Stopping: Monitors validation loss to prevent overfitting

  4. Save Best Weights: Saves best model weights based on validation performance
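The interplay of early stopping and best-weight saving can be sketched in plain Python. The function below is a hypothetical illustration of the usual patience logic, not the code in evaluate_and_train.py: training stops once the validation loss has failed to improve for patience consecutive epochs, and the epoch with the lowest loss marks the weights to keep.

```python
def early_stopping_epoch(val_losses, patience):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses.

    Hypothetical helper: stops after `patience` consecutive epochs without
    improvement; best_epoch is where the weights would have been saved.
    """
    best_loss = float("inf")
    best_epoch = -1
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # Improvement: this is the point at which weights would be saved.
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch
    # Training ran to the end without triggering early stopping.
    return len(val_losses) - 1, best_epoch


losses = [1.0, 0.8, 0.7, 0.75, 0.72, 0.71, 0.9]
print(early_stopping_epoch(losses, patience=3))  # (5, 2)
```

In the example, the loss is lowest in epoch 2 and does not improve in epochs 3 to 5, so training stops in epoch 5 and the weights from epoch 2 are kept.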

Evaluation Metrics:

  • Mean Absolute Error (MAE): Average absolute prediction error per compartment

  • Mean Absolute Percentage Error (MAPE): Average absolute error relative to the true value, expressed as a percentage

  • R² Score: Coefficient of determination for prediction quality
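As a reference for how these three metrics are defined, the sketch below computes them per compartment with NumPy. The helper regression_metrics is hypothetical and the numbers are made up for the example; the MAPE as written assumes that no true value is zero.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return per-column MAE, MAPE (in percent), and R² for (samples, compartments) arrays."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    abs_err = np.abs(y_true - y_pred)
    mae = abs_err.mean(axis=0)
    # Assumes y_true contains no zeros; otherwise the ratio is undefined.
    mape = 100.0 * (abs_err / np.abs(y_true)).mean(axis=0)
    ss_res = ((y_true - y_pred) ** 2).sum(axis=0)
    ss_tot = ((y_true - y_true.mean(axis=0)) ** 2).sum(axis=0)
    r2 = 1.0 - ss_res / ss_tot
    return mae, mape, r2


# Made-up values for two compartments over three samples.
y_true = [[100.0, 10.0], [200.0, 20.0], [300.0, 30.0]]
y_pred = [[110.0, 10.0], [190.0, 20.0], [300.0, 30.0]]

mae, mape, r2 = regression_metrics(y_true, y_pred)
print(mae)   # first compartment: 20/3 ≈ 6.67, second: 0
print(mape)  # first compartment: 5 %, second: 0 %
print(r2)    # first compartment: 0.99, second: 1
```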

Data Splitting:

  • Training Set (70%): For model parameter optimization

  • Validation Set (15%): For hyperparameter tuning and early stopping

  • Test Set (15%): For final performance evaluation
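A 70/15/15 split like the one listed above can be produced with a shuffled index permutation. This is a minimal sketch with a hypothetical helper name; whether the package shuffles before splitting, and with which seed, is not stated here and is assumed for illustration.

```python
import numpy as np


def split_indices(num_samples, train_frac=0.7, val_frac=0.15, seed=0):
    """Return shuffled, disjoint index arrays for training, validation, and test sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(num_samples)
    n_train = int(round(train_frac * num_samples))
    n_val = int(round(val_frac * num_samples))
    train_idx = order[:n_train]
    val_idx = order[n_train:n_train + n_val]
    test_idx = order[n_train + n_val:]  # the remainder goes to the test set
    return train_idx, val_idx, test_idx


train_idx, val_idx, test_idx = split_indices(1000)
print(len(train_idx), len(val_idx), len(test_idx))  # 700 150 150
```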

Hyperparameter Optimization

The grid_search.py module enables systematic exploration of hyperparameter space:

from pathlib import Path
from memilio.surrogatemodel.GNN import grid_search, evaluate_and_train

data = evaluate_and_train.create_dataset(
    "/tmp/generated_datasets/GNN_data_30days_3dampings_classic5.pickle",
    "/path/to/memilio/data/Germany/mobility",
    number_of_nodes=400
)

parameter_grid = grid_search.generate_parameter_grid(
    layer_types=["GCNConv", "GATConv"],
    num_layers_options=[2, 3],
    num_channels_options=[16, 32],
    activation_functions=["relu", "elu"]
)

grid_search.perform_grid_search(
    data=data,
    parameter_grid=parameter_grid,
    save_dir=str(Path("/tmp/grid_results")),
    batch_size=32,
    max_epochs=50,
    es_patience=10,
    learning_rate=0.001
)
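Conceptually, a parameter grid is the Cartesian product of all option lists. The sketch below shows this with itertools.product; build_parameter_grid is a hypothetical stand-in, and the actual return format of grid_search.generate_parameter_grid is not documented here and may differ.

```python
from itertools import product


def build_parameter_grid(layer_types, num_layers_options,
                         num_channels_options, activation_functions):
    """Return one dict per hyperparameter combination (hypothetical format)."""
    return [
        {
            "layer_type": layer_type,
            "num_layers": num_layers,
            "num_channels": num_channels,
            "activation": activation,
        }
        for layer_type, num_layers, num_channels, activation in product(
            layer_types, num_layers_options,
            num_channels_options, activation_functions)
    ]


grid = build_parameter_grid(
    layer_types=["GCNConv", "GATConv"],
    num_layers_options=[2, 3],
    num_channels_options=[16, 32],
    activation_functions=["relu", "elu"],
)
print(len(grid))  # 16
print(grid[0])    # {'layer_type': 'GCNConv', 'num_layers': 2, 'num_channels': 16, 'activation': 'relu'}
```

Because the number of combinations grows multiplicatively (2 x 2 x 2 x 2 = 16 here), each added option can noticeably increase the total training time of a grid search.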

Utility Functions

The GNN_utils.py module provides essential helper functions used throughout the GNN workflow:

Data Preprocessing:

from memilio.surrogatemodel.GNN import GNN_utils

# Remove confirmed compartments (simplify model)
simplified_data = GNN_utils.remove_confirmed_compartments(
    dataset_entries=dataset,
    num_groups=6
)

# Apply logarithmic scaling
scaled_inputs, scaled_labels = GNN_utils.scale_data(
    data=dataset,
    transform=True
)

Graph Construction:

# Create mobility graph from commuting data
graph = GNN_utils.create_mobility_graph(
    mobility_dir='path/to/mobility',
    num_regions=401,            # German counties
    county_ids=county_list,
    models=models_per_region    # SECIR models for each region
)

# Get baseline contact matrix
contact_matrix = GNN_utils.get_baseline_contact_matrix(
    data_dir='path/to/contact_matrices'
)

Practical Usage Example

Here is a complete example workflow from data generation to model evaluation:

import memilio.simulation as mio
from tensorflow.keras.losses import MeanAbsolutePercentageError
from tensorflow.keras.optimizers import Adam
from memilio.surrogatemodel.GNN import (
    data_generation,
    network_architectures,
    evaluate_and_train
)

# Step 1: Generate and save training data
data_generation.generate_data(
    num_runs=100,
    data_dir="/path/to/memilio/data",
    output_path="/tmp/generated_datasets",
    input_width=5,
    label_width=30,
    start_date=mio.Date(2020, 10, 1),
    end_date=mio.Date(2021, 10, 31),
    save_data=True,
    mobility_file="commuter_mobility.txt"
)

# Step 2: Load dataset and build model
dataset = evaluate_and_train.load_gnn_dataset(
    "/tmp/generated_datasets/GNN_data_30days_3dampings_classic100.pickle",
    "/path/to/memilio/data/Germany/mobility",
    number_of_nodes=400
)

model = network_architectures.get_model(
    layer_type="GCNConv",
    num_layers=4,
    num_channels=128,
    activation="relu",
    num_output=48
)

# Step 3: Train and evaluate
results = evaluate_and_train.train_and_evaluate(
    data=dataset,
    batch_size=32,
    epochs=100,
    model=model,
    loss_fn=MeanAbsolutePercentageError(),
    optimizer=Adam(learning_rate=0.001),
    es_patience=20,
    save_dir="/tmp/model_results",
    save_name="gnn_weights_best"
)

GPU Acceleration:

  • TensorFlow automatically uses GPU when available

  • Spektral layers are optimized for GPU execution

  • Training time can be reduced substantially with appropriate GPU hardware
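To check whether TensorFlow actually sees a GPU before starting a long training run, a short helper such as the following can be used. It relies on tf.config.list_physical_devices; the wrapper function available_gpus is our own illustrative name and returns an empty list if TensorFlow is not installed.

```python
import importlib.util


def available_gpus():
    """Return the names of GPUs visible to TensorFlow, or [] if none are found."""
    if importlib.util.find_spec("tensorflow") is None:
        # TensorFlow is not installed in this environment.
        return []
    import tensorflow as tf
    return [device.name for device in tf.config.list_physical_devices("GPU")]


gpus = available_gpus()
if gpus:
    print("GPUs visible to TensorFlow:", gpus)
else:
    print("No GPU visible to TensorFlow; training will run on the CPU.")
```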
