MEmilio Surrogate Model

The MEmilio Surrogate Model package contains machine-learning-based surrogate models that make predictions based on the MEmilio simulation models. Currently, surrogate models exist only for ODE-type models; simulations of these models are used for data generation. The goal is a tool that predicts infection dynamics faster than a simulation of an expert model (e.g., a metapopulation or agent-based model) while keeping the errors with respect to the original simulations acceptable.

The package can be found in pycode/memilio-surrogatemodel.

For more details, we refer to:

Schmidt A, Zunker H, Heinlein A, Kühn MJ. (2026). Graph neural network surrogates to leverage mechanistic expert knowledge towards reliable and immediate pandemic response. Scientific Reports 16, 6361. DOI:10.1038/s41598-026-39431-5

Installation

See python_packages/Installation for a detailed installation guide.

Dependencies

Required Python packages:

pandas >= 1.2.2
numpy >= 1.22, !=1.25.*
tensorflow
matplotlib
scikit-learn
progress

Since we are running simulations to generate the data, the MEmilio memilio-simulation package also needs to be installed.

Usage

The package currently provides the following modules:

models: models for different tasks

Currently we have the following models:
- ode_secir_simple: A simple model allowing for asymptomatic as well as symptomatic infection states not stratified by age groups.
- ode_secir_groups: A model allowing for asymptomatic as well as symptomatic infection states stratified by age groups and including one damping.
Each model folder contains the following files:
- data_generation: Data generation from expert model simulation outputs.
- model: Training and evaluation of the model.
- network_architectures: Contains multiple network architectures.
- grid_search: Utilities for hyperparameter optimization.
tests: This file contains all tests.

ODE-SECIR Simple Model

The ode_secir_simple module provides surrogate models for the basic ODE-SECIR epidemiological model. This model is not stratified by age groups and simulates disease progression through the following compartments:

S: Susceptible
E: Exposed
C: Infected (asymptomatic/pre-symptomatic)
I: Infected (symptomatic)
R: Recovered
H: Hospitalized (severe cases)
U: ICU (critical cases)
D: Dead

For more details on the model structure and parameters, we refer to the ODE-SECIR model documentation.

Data Generation

The data_generation.py module provides functionality to generate training data for the surrogate models by running multiple simulations of the basic ODE-SECIR model with randomized initial conditions. A typical call looks as follows:

from memilio.surrogatemodel.ode_secir_simple.data_generation import generate_data

# Directory in which the dataset is stored (placeholder path)
path_data = "/tmp/generated_datasets"

# Generate dataset with 10,000 simulation runs
# Each with 5 days of input data and 30 days of prediction horizon
data = generate_data(
    num_runs=10000,
    path=path_data,
    input_width=5,
    label_width=30,
    normalize=True,
    save_data=True
)

The data generation process can be summarized as follows:

Randomly initializes the model parameters and initial compartment populations.
Runs the ODE-SECIR simulation using the C++ backend via Python bindings.
Applies logarithmic normalization to improve training stability.
Splits each time series into input and label segments.
Saves the dataset as a pickle file for later use.
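
The normalization and splitting steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the package's implementation: it assumes a log(1 + x) transform, the helpers log_normalize and split_window are hypothetical names, and the 35-day series is synthetic.

```python
import math


def log_normalize(series):
    """Apply log(1 + x) to every compartment value (assumed transform)."""
    return [[math.log1p(value) for value in day] for day in series]


def split_window(series, input_width, label_width):
    """Split one time series into input days and label days."""
    if len(series) < input_width + label_width:
        raise ValueError("time series is shorter than input_width + label_width")
    inputs = series[:input_width]
    labels = series[input_width:input_width + label_width]
    return inputs, labels


# Synthetic series: 35 days, 8 compartments (S, E, C, I, R, H, U, D).
series = [[1000.0 + day, 10.0, 5.0, 3.0, 0.0, 1.0, 0.0, 0.0] for day in range(35)]

inputs, labels = split_window(log_normalize(series), input_width=5, label_width=30)
print(len(inputs), len(labels), len(inputs[0]))
```

Using log(1 + x) rather than log(x) keeps empty compartments (value 0) well defined, which matters early in an outbreak when several compartments are still empty.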

Network Architectures

The network_architectures.py module provides different neural network architectures for time series prediction:

MLP (Multi-Layer Perceptron):
- Simple feedforward networks that take flattened time series as input
- Available in both single-output and multi-output variants
LSTM (Long Short-Term Memory):
- Recurrent neural networks specialized for sequence modeling
- Can process variable-length time series while maintaining temporal information
CNN (Convolutional Neural Network):
- Uses 1D convolutions to detect patterns in time series data
- Particularly efficient for capturing local temporal patterns
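
To make the difference between these input conventions concrete, the following sketch shows how a window of daily compartment values could be flattened for an MLP, and how a 1D convolution slides a small kernel along a single time series. It is a dependency-free illustration only; flatten and conv1d_valid are hypothetical helpers and are not part of network_architectures.py.

```python
def flatten(window):
    """Turn a (days x compartments) window into one flat feature vector (MLP input)."""
    return [value for day in window for value in day]


def conv1d_valid(signal, kernel):
    """Slide a kernel along one time series without padding (CNN building block)."""
    width = len(kernel)
    return [
        sum(signal[start + offset] * kernel[offset] for offset in range(width))
        for start in range(len(signal) - width + 1)
    ]


# Toy window: 5 days, 2 compartments.
window = [[1.0, 0.0], [2.0, 0.0], [4.0, 1.0], [8.0, 1.0], [16.0, 2.0]]

mlp_input = flatten(window)  # length 5 * 2 = 10
first_compartment = [day[0] for day in window]
daily_change = conv1d_valid(first_compartment, [-1.0, 1.0])  # difference filter
print(mlp_input)
print(daily_change)
```

The flattened vector loses the explicit day structure, whereas the convolution output still has one entry per (shortened) time step. This is the sense in which CNNs capture local temporal patterns.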

Model Training and Evaluation

The model.py module provides functionality for:

Preparing data:
- Splitting data into training, validation, and test sets
- Processing data for different model architectures (classic vs. time series)
Model training:
- Initializing models with customizable hyperparameters
- Training with early stopping and customizable loss functions
Evaluation:
- Computing error metrics (MAE, MAPE) across compartments
- Visualizing predictions versus ground truth
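
The two error metrics can be written down in a few lines. The sketch below computes them per compartment on toy numbers; the function names are hypothetical, and the MAPE variant assumes that no target value is zero.

```python
def mae_per_compartment(predictions, targets):
    """Mean absolute error for each compartment, averaged over samples."""
    num_samples = len(targets)
    num_compartments = len(targets[0])
    return [
        sum(abs(predictions[i][c] - targets[i][c]) for i in range(num_samples)) / num_samples
        for c in range(num_compartments)
    ]


def mape_per_compartment(predictions, targets):
    """Mean absolute percentage error per compartment; targets must be non-zero."""
    num_samples = len(targets)
    num_compartments = len(targets[0])
    return [
        100.0
        * sum(
            abs((predictions[i][c] - targets[i][c]) / targets[i][c])
            for i in range(num_samples)
        )
        / num_samples
        for c in range(num_compartments)
    ]


# Two samples, two compartments (toy values).
targets = [[100.0, 10.0], [200.0, 20.0]]
predictions = [[110.0, 9.0], [190.0, 22.0]]

mae = mae_per_compartment(predictions, targets)
mape = mape_per_compartment(predictions, targets)
print(mae, mape)
```

MAE is dominated by large compartments such as S, while MAPE weights all compartments relative to their size, so reporting both gives a more balanced picture.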

Example usage:

# Define model and training parameters
# (label_width, hidden_layers, loss, etc. must be set by the user beforehand)
model_parameters = (label_width, num_outputs, hidden_layers,
                   neurons_per_layer, activation, modelname)
training_parameters = (early_stop, max_epochs, loss, optimizer, metrics)

# Initialize and train model
model = initialize_model(model_parameters)
history = network_fit(model, modeltype, training_parameters, path_data)

# Plot results
plot_compartment_prediction_model(test_inputs, test_labels,
                                 modeltype, model, 'InfectedSymptoms')

Hyperparameter Optimization

The grid_search.py module provides tools for systematic hyperparameter optimization:

Cross-validation:
- K-fold cross-validation to obtain robust performance estimates and detect overfitting
- Evaluation of multiple model architectures and training configurations
Grid search:
- Systematic exploration of hyperparameter space
- Tracking and storage of performance metrics
Result analysis:
- Visualization of hyperparameter importance
- Selection of optimal model configurations
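
The combination of grid search and K-fold cross-validation can be sketched as follows. This is an illustration of the general technique, not of the grid_search.py API: parameter_grid and k_fold_indices are hypothetical helpers, and the hyperparameter names are borrowed from the model parameters used in the example above.

```python
import itertools


def parameter_grid(options):
    """Yield one dict per combination of the given hyperparameter options."""
    names = list(options)
    for values in itertools.product(*(options[name] for name in names)):
        yield dict(zip(names, values))


def k_fold_indices(num_samples, k):
    """Split sample indices into k contiguous folds; return (train, validation) pairs."""
    fold_sizes = [num_samples // k + (1 if i < num_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return [
        ([idx for j, fold in enumerate(folds) if j != i for idx in fold], folds[i])
        for i in range(k)
    ]


options = {
    "hidden_layers": [1, 2],
    "neurons_per_layer": [32, 64],
    "activation": ["relu", "elu"],
}
grid = list(parameter_grid(options))
splits = k_fold_indices(num_samples=10, k=5)
print(len(grid), "configurations,", len(splits), "folds each")
```

Each configuration in grid would be trained once per (train, validation) pair in splits, and the validation errors averaged over the folds to rank the configurations.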

Graph Neural Network (GNN) Surrogate Models

The Graph Neural Network (GNN) module provides advanced surrogate models that leverage spatial connectivity and age-stratified epidemiological dynamics. These models are designed for immediate and reliable pandemic response by combining mechanistic expert knowledge with machine learning efficiency.

Overview and Scientific Foundation

The GNN surrogate models are based on the research presented in:

Schmidt A, Zunker H, Heinlein A, Kühn MJ. (2026). Graph neural network surrogates to leverage mechanistic expert knowledge towards reliable and immediate pandemic response. Scientific Reports 16, 6361. DOI:10.1038/s41598-026-39431-5

The implementation leverages the mechanistic ODE-SECIR model (see ODE-SECIR documentation) as the underlying expert model, using Python bindings to the C++ backend for efficient simulation during data generation.

Module Structure

The GNN module is located in pycode/memilio-surrogatemodel/memilio/surrogatemodel/GNN and consists of:

data_generation.py: Generates training and evaluation data by simulating epidemiological scenarios with the mechanistic SECIR model
network_architectures.py: Defines various GNN architectures (ARMAConv, GCSConv, GATConv, GCNConv, APPNPConv) with configurable depth and channels
evaluate_and_train.py: Implements training and evaluation pipelines for GNN models
grid_search.py: Provides hyperparameter optimization through systematic grid search
GNN_utils.py: Contains utility functions for data preprocessing, graph construction, and population data handling

Data Generation

The data generation process in data_generation.py creates graph-structured training data through mechanistic simulations. Use generate_data to run multiple simulations and save a pickle file containing inputs, labels, damping information, and contact matrices:

from memilio.surrogatemodel.GNN import data_generation
import memilio.simulation as mio

data = data_generation.generate_data(
    num_runs=5,
    data_dir="/path/to/memilio/data",
    output_path="/tmp/generated_datasets",
    input_width=5,
    label_width=30,
    start_date=mio.Date(2020, 10, 1),
    end_date=mio.Date(2021, 10, 31),
    mobility_file="commuter_mobility.txt",  # or commuter_mobility_2022.txt
    transform=True,
    save_data=True
)

Data Generation Workflow:

Parameter Sampling: Randomly sample epidemiological parameters (transmission rates, incubation periods, recovery rates) from predefined distributions to create diverse scenarios.
Compartment Initialization: Initialize epidemic compartments for each age group in each region based on realistic demographic data. Compartments are initialized using shared base factors.
Mobility Graph Construction: Build a spatial graph where:
- Nodes represent geographic regions (e.g., German counties)
- Edges represent mobility connections with weights from commuting data
- Node features include age-stratified population sizes
Contact Matrix Configuration: Load and configure baseline contact matrices for different location types (home, school, work, other) stratified by age groups.
Damping Application: Apply time-varying dampings to contact matrices to simulate NPIs:
- Multiple damping periods with random start days
- Location-specific damping factors (e.g., stronger school closures, moderate workplace restrictions)
- Realistic parameter ranges based on observed intervention strengths
Simulation Execution: Run the mechanistic ODE-SECIR model using MEmilio’s C++ backend through Python bindings to generate the dataset.
Data Processing: Transform simulation results into graph-structured format:
- Extract compartment time series for each node (region) and age group
- Apply logarithmic transformation for numerical stability
- Store graph topology, node features, and temporal sequences
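
The damping step of this workflow can be illustrated with a small sketch. It assumes a purely multiplicative reduction of the contact rates and draws start days by rejection sampling; MEmilio's own damping machinery is richer, and the helper names, the minimum gap of 5 days, and the toy contact matrix are all assumptions made for this example.

```python
import random


def sample_damping_days(num_dampings, first_day, last_day, min_gap, rng):
    """Draw sorted start days that are at least min_gap days apart (rejection sampling)."""
    while True:
        days = sorted(rng.sample(range(first_day, last_day + 1), num_dampings))
        if all(later - earlier >= min_gap for earlier, later in zip(days, days[1:])):
            return days


def apply_damping(contact_matrix, factor):
    """Scale every contact rate by (1 - factor): 0 keeps the baseline, 1 removes all contacts."""
    return [[rate * (1.0 - factor) for rate in row] for row in contact_matrix]


rng = random.Random(42)
days = sample_damping_days(num_dampings=3, first_day=1, last_day=30, min_gap=5, rng=rng)

# Hypothetical baseline contacts for two age groups at one location type.
baseline_school = [[4.0, 1.0], [1.0, 0.5]]
damped_school = apply_damping(baseline_school, factor=0.75)
print(days, damped_school)
```

Location-specific interventions then correspond to applying different factors to the home, school, work, and other matrices before summing them into the total contact pattern.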

Network Architectures

The network_architectures.py module provides flexible GNN model construction for supported layer types (ARMAConv, GCSConv, GATConv, GCNConv, APPNPConv).

from memilio.surrogatemodel.GNN import network_architectures

model = network_architectures.get_model(
    layer_type="GCNConv",
    num_layers=3,
    num_channels=64,
    activation="relu",
    num_output=48  # outputs per node
)
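
For intuition about what a graph convolution layer computes, the following sketch implements the widely used GCN propagation rule, relu(D^(-1/2) (A + I) D^(-1/2) X W), in plain Python on a three-node toy graph. It is a conceptual illustration only: how the package's layers pre-process the adjacency matrix is not described here, and all names and numbers below are assumptions.

```python
import math


def normalize_adjacency(adjacency):
    """Return D^(-1/2) (A + I) D^(-1/2) for a weighted adjacency matrix."""
    n = len(adjacency)
    with_loops = [
        [adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)
    ]
    degree = [sum(row) for row in with_loops]
    return [
        [with_loops[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
        for i in range(n)
    ]


def matmul(a, b):
    """Plain matrix product of two nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def gcn_layer(adjacency_norm, features, weights):
    """One graph convolution: relu(A_norm @ X @ W)."""
    hidden = matmul(matmul(adjacency_norm, features), weights)
    return [[max(0.0, value) for value in row] for row in hidden]


# Three regions in a line: 0 -- 1 -- 2 (symmetric toy mobility weights).
adjacency = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 2 features per region
weights = [[1.0], [-1.0]]  # 2 inputs -> 1 output channel

adjacency_norm = normalize_adjacency(adjacency)
output = gcn_layer(adjacency_norm, features, weights)
print(output)
```

Each region's output mixes its own features with those of its direct neighbours. Stacking several such layers (num_layers) lets information travel across more distant regions.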

Training and Evaluation

The evaluate_and_train.py module provides the training functionality:

from tensorflow.keras.losses import MeanAbsolutePercentageError
from tensorflow.keras.optimizers import Adam
from memilio.surrogatemodel.GNN import evaluate_and_train, network_architectures

dataset = evaluate_and_train.load_gnn_dataset(
    "/tmp/generated_datasets/GNN_data_30days_3dampings_classic5.pickle",
    "/path/to/memilio/data/Germany/mobility",
    number_of_nodes=400
)

model = network_architectures.get_model(
    layer_type="GCNConv",
    num_layers=3,
    num_channels=32,
    activation="relu",
    num_output=48
)

results = evaluate_and_train.train_and_evaluate(
    data=dataset,
    batch_size=32,
    epochs=50,
    model=model,
    loss_fn=MeanAbsolutePercentageError(),
    optimizer=Adam(learning_rate=0.001),
    es_patience=10,
    save_dir="/tmp/model_results",
    save_name="gnn_model"
)

Training Features:

Mini-batch Training: Graph batching for efficient training on large datasets
Custom Loss Functions: MSE, MAE, MAPE, or custom compartment-weighted losses
Early Stopping: Monitors validation loss to prevent overfitting
Save Best Weights: Saves best model weights based on validation performance
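
The interplay of early stopping and saving the best weights can be sketched by replaying a list of validation losses. This is a simplified illustration of the patience logic in the spirit of the es_patience argument; replay_early_stopping is a hypothetical helper and the loss values are made up.

```python
def replay_early_stopping(validation_losses, patience):
    """Stop after `patience` consecutive epochs without improvement.

    Returns (best_epoch, best_loss, stopped_epoch), with 0-based epochs.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(validation_losses):
        if loss < best_loss:
            # In real training, the model weights would be saved here.
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return best_epoch, best_loss, epoch
    return best_epoch, best_loss, len(validation_losses) - 1


losses = [0.9, 0.7, 0.6, 0.65, 0.62, 0.61, 0.5]
best_epoch, best_loss, stopped_epoch = replay_early_stopping(losses, patience=3)
print(best_epoch, best_loss, stopped_epoch)
```

With a patience of 3, training stops before the later improvement to 0.5 is ever seen. A larger patience trades longer training for a lower risk of stopping on a temporary plateau.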

Evaluation Metrics:

Mean Absolute Error (MAE): Average absolute prediction error per compartment
Mean Absolute Percentage Error (MAPE): Average absolute prediction error relative to the true value, expressed as a percentage
R² Score: Coefficient of determination for prediction quality
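
As a reference for the last metric, the coefficient of determination is one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below is a minimal plain-Python version with toy values. It is not taken from the package, and r2_score is a hypothetical helper name here.

```python
def r2_score(targets, predictions):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_target = sum(targets) / len(targets)
    ss_res = sum((t - p) ** 2 for t, p in zip(targets, predictions))
    ss_tot = sum((t - mean_target) ** 2 for t in targets)
    if ss_tot == 0.0:
        raise ValueError("R^2 is undefined for constant targets")
    return 1.0 - ss_res / ss_tot


targets = [1.0, 2.0, 3.0, 4.0]
predictions = [1.0, 2.0, 3.0, 5.0]
score = r2_score(targets, predictions)
print(score)
```

A score of 1 means perfect predictions, 0 means the model is no better than always predicting the mean of the targets, and negative values indicate a model that is worse than that baseline.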

Data Splitting:

Training Set (70%): For model parameter optimization
Validation Set (15%): For hyperparameter tuning and early stopping
Test Set (15%): For final performance evaluation
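
A 70/15/15 split of this kind can be sketched as follows. The helper split_dataset is hypothetical and only illustrates the proportions; whether the package shuffles the samples before splitting is an assumption of this example.

```python
import random


def split_dataset(samples, train_percent=70, validation_percent=15, seed=0):
    """Shuffle and split samples; whatever remains after train and validation is the test set."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    num_train = len(shuffled) * train_percent // 100
    num_validation = len(shuffled) * validation_percent // 100
    train = shuffled[:num_train]
    validation = shuffled[num_train:num_train + num_validation]
    test = shuffled[num_train + num_validation:]
    return train, validation, test


train, validation, test = split_dataset(range(100))
print(len(train), len(validation), len(test))
```

Fixing the seed keeps the split reproducible, so that different architectures are compared on the same test samples.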

Hyperparameter Optimization

The grid_search.py module enables systematic exploration of hyperparameter space:

from pathlib import Path
from memilio.surrogatemodel.GNN import grid_search, evaluate_and_train

data = evaluate_and_train.create_dataset(
    "/tmp/generated_datasets/GNN_data_30days_3dampings_classic5.pickle",
    "/path/to/memilio/data/Germany/mobility",
    number_of_nodes=400
)

parameter_grid = grid_search.generate_parameter_grid(
    layer_types=["GCNConv", "GATConv"],
    num_layers_options=[2, 3],
    num_channels_options=[16, 32],
    activation_functions=["relu", "elu"]
)

grid_search.perform_grid_search(
    data=data,
    parameter_grid=parameter_grid,
    save_dir=str(Path("/tmp/grid_results")),
    batch_size=32,
    max_epochs=50,
    es_patience=10,
    learning_rate=0.001
)

Utility Functions

The GNN_utils.py module provides essential helper functions used throughout the GNN workflow:

Data Preprocessing:

from memilio.surrogatemodel.GNN import GNN_utils

# Remove confirmed compartments (simplify model)
simplified_data = GNN_utils.remove_confirmed_compartments(
    dataset_entries=dataset,
    num_groups=6
)

# Apply logarithmic scaling
scaled_inputs, scaled_labels = GNN_utils.scale_data(
    data=dataset,
    transform=True
)

Graph Construction:

# Create mobility graph from commuting data
graph = GNN_utils.create_mobility_graph(
    mobility_dir='path/to/mobility',
    num_regions=401,            # German counties
    county_ids=county_list,
    models=models_per_region    # SECIR models for each region
)

# Get baseline contact matrix
contact_matrix = GNN_utils.get_baseline_contact_matrix(
    data_dir='path/to/contact_matrices'
)

Practical Usage Example

Here is a complete example workflow from data generation to model evaluation:

import memilio.simulation as mio
from tensorflow.keras.losses import MeanAbsolutePercentageError
from tensorflow.keras.optimizers import Adam
from memilio.surrogatemodel.GNN import (
    data_generation,
    network_architectures,
    evaluate_and_train
)

# Step 1: Generate and save training data
data_generation.generate_data(
    num_runs=100,
    data_dir="/path/to/memilio/data",
    output_path="/tmp/generated_datasets",
    input_width=5,
    label_width=30,
    start_date=mio.Date(2020, 10, 1),
    end_date=mio.Date(2021, 10, 31),
    save_data=True,
    mobility_file="commuter_mobility.txt"
)

# Step 2: Load dataset and build model
dataset = evaluate_and_train.load_gnn_dataset(
    "/tmp/generated_datasets/GNN_data_30days_3dampings_classic100.pickle",
    "/path/to/memilio/data/Germany/mobility",
    number_of_nodes=400
)

model = network_architectures.get_model(
    layer_type="GCNConv",
    num_layers=4,
    num_channels=128,
    activation="relu",
    num_output=48
)

# Step 3: Train and evaluate
results = evaluate_and_train.train_and_evaluate(
    data=dataset,
    batch_size=32,
    epochs=100,
    model=model,
    loss_fn=MeanAbsolutePercentageError(),
    optimizer=Adam(learning_rate=0.001),
    es_patience=20,
    save_dir="/tmp/model_results",
    save_name="gnn_weights_best"
)

GPU Acceleration:

TensorFlow automatically uses GPU when available
Spektral layers are optimized for GPU execution
Training time can be reduced substantially with appropriate GPU hardware

Additional Resources

Related Documentation:

MEmilio Simulation Package