An adaptive model hierarchy for data-augmented training of kernel models for reactive flow
†† Funded by BMBF under contracts 05M20PMA and 05M20VSA. Funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under contracts OH 98/11-1 and SCHI 1493/1-1, as well as under Germany's Excellence Strategy EXC 2044 390685587, Mathematics Münster: Dynamics – Geometry – Structure, and EXC 2075 390740016, Stuttgart Center for Simulation Science (SimTech).

1 Reference model
We are interested in constructing efficient and accurate models to approximate time-dependent quantities of interest (QoI) in the context of reactive flow, $s: \mathcal{P} \to \mathbb{R}^K$ with $s(\mu) := (s_1(\mu), \dots, s_K(\mu))$ and where $\mathcal{P} \subset \mathbb{R}^p$ for $p \in \mathbb{N}$ denotes the set of possible input parameters. As a class of QoI functions, we consider those obtained by applying linear functionals to solution trajectories of, e.g., parametric parabolic partial differential equations. Thus, $s_k(\mu) := f_\mu(c_\mu(t_k))$ at time points $0 < t_1 < \dots < t_K = T$, where for each parameter $\mu \in \mathcal{P}$, the concentration $c_\mu \in L^2(0, T; V)$ with $\partial_t c_\mu \in L^2(0, T; V')$ and initial condition $c_\mu(0) = c_0 \in H$ is the unique weak solution of

$\langle \partial_t c_\mu(t), v \rangle + a_\mu(c_\mu(t), v) = l_\mu(v) \quad \text{for all } v \in V \text{ and a.e. } t \in (0, T).$   (1)

Here, $V \hookrightarrow H \hookrightarrow V'$ denotes a Gelfand triple of Hilbert spaces associated with a spatial Lipschitz domain $\Omega \subset \mathbb{R}^d$ and, for $\mu \in \mathcal{P}$, $l_\mu, f_\mu \in V'$ denote continuous linear functionals and $a_\mu: V \times V \to \mathbb{R}$ a continuous coercive bilinear form.
As a basic model for reactive flow in catalytic filters, (1) could stem from a single-phase one-dimensional linear advection-diffusion-reaction problem with Damköhler and Péclet numbers as input (thus $p = 2$), where $c_\mu$ models the dimensionless molar concentration of a species and the break-through curve $s(\mu)$ measures the concentration at the outflow, as detailed in Gavrilenko et al. (2022).
Since direct evaluations of $s$ are not available, we resort to a full order model (FOM) as reference model, yielding

$s_h: \mathcal{P} \to \mathbb{R}^K, \quad s_h(\mu) := \big(f_\mu(c_{h,\mu}^k)\big)_{k=1}^K,$   (2)

which we assume to be a sufficiently accurate approximation of the QoI. For simplicity, we consider a conforming Finite Element space $V_h \subset V$ with $N := \dim V_h$ and obtain the FOM solution trajectory $c_{h,\mu} := (c_{h,\mu}^k)_{k=1}^K \subset V_h$, $c_{h,\mu}^k \approx c_\mu(t_k)$, by Galerkin projection of (1) onto $V_h$ and an implicit Euler approximation of the temporal derivative.
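For illustration, a minimal self-contained sketch of such a FOM could read as follows, using upwind finite differences instead of finite elements for brevity; the concrete dimensionless scaling of the advection-diffusion-reaction equation, the boundary treatment and all names are illustrative assumptions and do not reflect the dune-gdt implementation used for the experiments below.

    import numpy as np
    import scipy.sparse as sp
    import scipy.sparse.linalg as spla

    def fom_qoi(mu, n=200, T=1.0, K=100):
        """Implicit Euler FOM for a dimensionless 1d advection-diffusion-
        reaction problem on (0, 1) with mu = (Pe, Da); returns s_h(mu),
        i.e. the break-through curve at the outflow for all K time steps."""
        Pe, Da = mu
        h, dt = 1.0 / n, T / K
        diff = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n)) / (Pe * h**2)
        adv = sp.diags([-1.0, 1.0], [-1, 0], shape=(n, n)) / h  # upwind advection
        A = (diff + adv + Da * sp.eye(n)).tolil()
        A[-1, -1] -= 1.0 / (Pe * h**2)        # homogeneous Neumann outflow
        solve = spla.factorized((sp.eye(n) + dt * A).tocsc())  # reuse LU factors
        rhs = np.zeros(n)
        rhs[0] = 1.0 / h + 1.0 / (Pe * h**2)  # unit inflow concentration
        c, s = np.zeros(n), np.empty(K)       # zero initial condition
        for k in range(K):
            c = solve(c + dt * rhs)           # one implicit Euler step
            s[k] = c[-1]                      # linear functional: outflow value
        return s

    s_h = fom_qoi((10.0, 1.0))  # one QoI trajectory for Pe = 10, Da = 1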
2 Surrogate models
The evaluation of (2) may be arbitrarily costly, in particular in multi- or large-scale scenarios where $N \gg 1$, but also if $K \gg 1$ due to long-time integration or when a high resolution of $s(\mu)$ is required. We thus seek to build a machine learning (ML) based surrogate model

$s_{\mathrm{ML}}: \mathcal{P} \to \mathbb{R}^K, \quad s_{\mathrm{ML}}(\mu) \approx s_h(\mu),$   (3)

to predict all $K$ values at once, without time-integration. Such models based on Neural Networks or Kernels typically rely on a large amount of training data

$\mathcal{D}_{\mathrm{train}} := \big\{\big(\mu, s_h(\mu)\big) \,\big|\, \mu \in S_{\mathrm{train}}\big\} \subset \mathcal{P} \times \mathbb{R}^K, \quad S_{\mathrm{train}} \subset \mathcal{P},$   (4)
with $|S_{\mathrm{train}}| \gg 1$, rendering their training prohibitively expensive in the aforementioned scenarios; we refer to Gavrilenko et al. (2022) and the references therein, in particular to Santin and Haasdonk (2021). In Gavrilenko et al. (2022) we thus proposed to employ an intermediate surrogate to generate sufficient training data.
2.1 Structure preserving Reduced Basis models
The idea of projection-based model order reduction by Reduced Basis (RB) methods is to approximate the state in a low-dimensional subspace $V_{\mathrm{RB}} \subset V_h$ and to obtain online-efficient approximations of $s_h$ by Galerkin projection of the FOM detailed in Section 1 onto $V_{\mathrm{RB}}$ and a pre-computation of all quantities involving $V_h$ in a possibly expensive offline-computation; we refer to Milk et al. (2016) and the references therein. Using such structure preserving reduced order models (ROMs) we obtain RB trajectories $c_{\mathrm{RB},\mu} := (c_{\mathrm{RB},\mu}^k)_{k=1}^K \subset V_{\mathrm{RB}}$ and an RB model

$s_{\mathrm{RB}}: \mathcal{P} \to \mathbb{R}^K, \quad s_{\mathrm{RB}}(\mu) := \big(f_\mu(c_{\mathrm{RB},\mu}^k)\big)_{k=1}^K,$   (5)

with a computational complexity independent of $N$, the solution of which, however, still requires time-integration.
The quality and efficiency of RB models hinges on the problem-adapted RB space $V_{\mathrm{RB}}$, which could be constructed in an iterative manner steered by a posteriori error estimates using the POD-greedy algorithm from Haasdonk (2013). Instead, we obtain $V_{\mathrm{RB}}$ by the method of snapshots,

$V_{\mathrm{RB}} := \mathrm{POD}\big(\{c_{h,\mu}^k \,|\, 1 \le k \le K,\ \mu \in S_{\mathrm{RB}}\}\big),$   (6)

with $S_{\mathrm{RB}} \subset \mathcal{P}$ consisting of only few a priori selected parameters (e.g. the outermost four points in $\mathcal{P}$), where we use the hierarchic approximate POD from Himpe et al. (2018) for $\mathrm{POD}$ to avoid computing the SVD of a dense snapshot Gramian of size $(K \cdot |S_{\mathrm{RB}}|) \times (K \cdot |S_{\mathrm{RB}}|)$.
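A minimal sketch of (6) and the subsequent offline projection could look as follows, using one dense SVD in place of the hierarchic approximate POD and hypothetical names:

    import numpy as np

    def pod_basis(snapshots, rtol=1e-7):
        """Method of snapshots: orthonormal POD basis from the columns of
        `snapshots` (one column per c_h^k(mu), i.e. shape N x (K * |S_RB|)),
        keeping enough modes to capture the relative energy 1 - rtol."""
        U, svals, _ = np.linalg.svd(snapshots, full_matrices=False)
        energy = np.cumsum(svals**2) / np.sum(svals**2)
        r = int(np.searchsorted(energy, 1.0 - rtol)) + 1
        return U[:, :r]                       # columns span V_RB

    def galerkin_project(V_rb, A, rhs, functional):
        """Offline pre-computation of all N-dependent quantities; for
        parametric operators, apply this to each affine component."""
        return V_rb.T @ (A @ V_rb), V_rb.T @ rhs, V_rb.T @ functional

The reduced trajectory is then computed by the same implicit Euler scheme as for the FOM, but with $r \times r$ operators, so that each online solve is independent of $N$.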
2.2 Kernel models
While still requiring time-integration, we can afford to use RB ROMs to generate a sufficient amount of training data

$\mathcal{D}_{\mathrm{RB}} := \big\{\big(\mu, s_{\mathrm{RB}}(\mu)\big) \,\big|\, \mu \in S_{\mathrm{train}}\big\},$

augmented by the FOM-data $\{(\mu, s_h(\mu)) \,|\, \mu \in S_{\mathrm{RB}}\}$ available as a side-effect from generating $V_{\mathrm{RB}}$. Using this data, we obtain the ML model from (3) using the vectorial kernel orthogonal greedy algorithm (VKOGA) from Santin and Haasdonk (2021).
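As a stand-in illustration of such a kernel model, the following sketch implements plain regularized kernel interpolation with a Gaussian kernel; the actual VKOGA additionally selects a sparse subset of the training parameters as centers, and all names here are hypothetical:

    import numpy as np

    def gauss_kernel(X, Y, eps=1.0):
        """Gaussian kernel matrix k(x, y) = exp(-eps^2 |x - y|^2)."""
        d2 = ((X[:, None, :] - Y[None, :, :])**2).sum(axis=-1)
        return np.exp(-(eps**2) * d2)

    class KernelSurrogate:
        """Vector-valued kernel model s_ML(mu) = sum_j k(mu, mu_j) alpha_j,
        predicting all K QoI values at once without time-integration."""

        def __init__(self, eps=1.0, reg=1e-10):
            self.eps, self.reg = eps, reg

        def fit(self, mus, qois):             # mus: (M, p), qois: (M, K)
            self.centers = np.atleast_2d(np.asarray(mus, dtype=float))
            G = gauss_kernel(self.centers, self.centers, self.eps)
            G += self.reg * np.eye(len(self.centers))  # stabilization
            self.alpha = np.linalg.solve(G, np.asarray(qois, dtype=float))
            return self

        def __call__(self, mu):
            k = gauss_kernel(np.atleast_2d(np.asarray(mu, dtype=float)),
                             self.centers, self.eps)
            return (k @ self.alpha)[0]        # all K values at once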
While resulting in substantial computational gains, the presented approach from Gavrilenko et al. (2022) still relies on the traditional offline/online splitting of the computational process to train the RB ROM as well as the ML model to be valid for all of $\mathcal{P}$, requiring a priori choices regarding $S_{\mathrm{RB}}$ and $S_{\mathrm{train}}$ with a significant impact on the overall performance and applicability of these models.
3 An adaptive model hierarchy
Keil et al. (2021) introduced an approach beyond the classical offline/online splitting, where an RB ROM is adaptively enriched based on rigorous a posteriori error estimates, following the path of an optimization procedure through the parameter space. Similarly, we propose an adaptive enrichment yielding a hierarchy of FOM, RB ROM and ML models, based on the standard residual-based a posteriori estimate on the RB output error, $\|s_h(\mu) - s_{\mathrm{RB}}(\mu)\| \le \Delta_{\mathrm{RB}}(\mu)$, for which we refer to the references in Milk et al. (2016).
As a means to judge if an ML model is trustworthy, we propose a manual validation using the following a posteriori error estimate on the ML QoI error. While not as cheaply computable as $s_{\mathrm{ML}}(\mu)$ itself, it still allows one to validate the ML model without computing $s_h(\mu)$.
Proposition 1 (ML model a posteriori error estimate)
Let $s_{\mathrm{RB}}(\mu)$, $s_{\mathrm{ML}}(\mu)$ denote the RB ROM and ML model approximations of $s_h(\mu)$, respectively, and
let $\Delta_{\mathrm{RB}}(\mu) \ge \|s_h(\mu) - s_{\mathrm{RB}}(\mu)\|$ denote an upper bound on the RB output error.
We then have by triangle inequality for all $\mu \in \mathcal{P}$

$\|s_h(\mu) - s_{\mathrm{ML}}(\mu)\| \le \|s_h(\mu) - s_{\mathrm{RB}}(\mu)\| + \|s_{\mathrm{RB}}(\mu) - s_{\mathrm{ML}}(\mu)\| \le \Delta_{\mathrm{RB}}(\mu) + \|s_{\mathrm{RB}}(\mu) - s_{\mathrm{ML}}(\mu)\| =: \Delta_{\mathrm{ML}}(\mu),$   (7)

where the right-hand side is computable with a computational complexity independent of $N$.
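Estimate (7) suggests the adaptive selection strategy realized by Algorithm 1: trust the ML model whenever it is certified by $\Delta_{\mathrm{ML}}$, otherwise use the RB ROM to cheaply collect further training data whenever $\Delta_{\mathrm{RB}}$ is small enough, and only fall back to the FOM (enriching the RB space on the way) when both fail. The following is a minimal sketch of such a strategy under assumed, hypothetical interfaces (reusing the KernelSurrogate sketch from Section 2.2); it is not the actual Algorithm 1:

    import numpy as np

    class AdaptiveHierarchy:
        """Sketch of an adaptive FOM/RB/ML model hierarchy (hypothetical
        interfaces): rb(mu) -> (s_RB(mu), Delta_RB(mu)), fom(mu) ->
        (s_h(mu), trajectory), rb.enrich extends V_RB with snapshots."""

        def __init__(self, fom, rb, tol, retrain_every=10):
            self.fom, self.rb, self.tol = fom, rb, tol
            self.retrain_every, self.data, self.ml = retrain_every, [], None

        def __call__(self, mu):
            s_rb, delta_rb = self.rb(mu)
            if self.ml is not None:
                s_ml = self.ml(mu)
                # a posteriori estimate (7), here in the max-norm:
                if delta_rb + np.max(np.abs(s_rb - s_ml)) <= self.tol:
                    return s_ml               # certified ML prediction
            if delta_rb <= self.tol:          # RB accurate enough:
                self._collect(mu, s_rb)       # collect cheap training data
                return s_rb
            s_h, trajectory = self.fom(mu)    # fall back to the FOM
            self.rb.enrich(trajectory)        # and enrich the RB space
            self._collect(mu, s_h)            # FOM data augments the data set
            return s_h

        def _collect(self, mu, qoi):
            self.data.append((np.asarray(mu, dtype=float), qoi))
            if len(self.data) % self.retrain_every == 0:
                mus, qois = zip(*self.data)   # periodic retraining
                self.ml = KernelSurrogate().fit(np.array(mus), np.array(qois))

Note that certifying an ML prediction via (7) still requires one RB solve; once the ML model is trusted unconditionally, as in the experiment below, this time-integration is avoided as well.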
Applying Algorithm 1 to the example of one-dimensional single-phase reactive flow from the last row of Table 1 in Gavrilenko et al. (2022) gives the behaviour shown in Figure 1, where we prescribe a fixed tolerance for the estimated QoI error, retrain the ML model every 10 collected samples and unconditionally trust the ML model as soon as its estimated error falls below this tolerance. (The experiments were performed using pyMOR from Milk et al. (2016) and dune-gdt from https://docs.dune-gdt.org/.) For the considered diffusion-dominated regime, we only require a single evaluation of the FOM (2) (yielding a low-dimensional RB ROM), which results in even further computational savings compared to the results obtained in Gavrilenko et al. (2022).
Figure 1: Behaviour of the adaptive model hierarchy from Algorithm 1 for the one-dimensional reactive flow example.
References
- Gavrilenko, P., Haasdonk, B., Iliev, O., Ohlberger, M., Schindler, F., Toktaliev, P., Wenzel, T., and Youssef, M. (2022). A full order, reduced order and machine learning model pipeline for efficient prediction of reactive flows. In Large-Scale Scientific Computing, 378–386. Springer International Publishing.
- Haasdonk, B. (2013). Convergence rates of the POD-greedy method. ESAIM Math. Model. Numer. Anal., 47(3), 859–873.
- Himpe, C., Leibner, T., and Rave, S. (2018). Hierarchical approximate proper orthogonal decomposition. SIAM J. Sci. Comput., 40(5), A3267–A3292.
- Keil, T., Mechelli, L., Ohlberger, M., Schindler, F., and Volkwein, S. (2021). A non-conforming dual approach for adaptive trust-region reduced basis approximation of PDE-constrained parameter optimization. ESAIM Math. Model. Numer. Anal., 55(3), 1239–1269.
- Milk, R., Rave, S., and Schindler, F. (2016). pyMOR – generic algorithms and interfaces for model order reduction. SIAM J. Sci. Comput., 38(5), S194–S216.
- Santin, G. and Haasdonk, B. (2021). Kernel methods for surrogate modeling. In P. Benner, S. Grivet-Talocia, A. Quarteroni, G. Rozza, W. Schilders, and L.M. Silveira (eds.), Model Order Reduction, volume 2, 311–353. De Gruyter.