Physics-informed CoKriging model of a redox flow battery

Amanda A. Howard Pacific Northwest National Laboratory, Richland, WA Alexandre M. Tartakovsky Pacific Northwest National Laboratory, Richland, WA; Department of Civil and Environmental Engineering, University of Illinois Urbana-Champaign, Urbana, IL [email protected]

Abstract

Redox flow batteries (RFBs) offer the capability to store large amounts of energy cheaply and efficiently, however, there is a need for fast and accurate models of the charge-discharge curve of a RFB to potentially improve the battery capacity and performance. We develop a multifidelity model for predicting the charge-discharge curve of a RFB. In the multifidelity model, we use the Physics-informed CoKriging (CoPhIK) machine learning method that is trained on experimental data and constrained by the so-called “zero-dimensional” physics-based model. Here we demonstrate that the model shows good agreement with experimental results and significant improvements over existing zero-dimensional models. We show that the proposed model is robust as it is not sensitive to the input parameters in the zero-dimensional model. We also show that only a small amount of high-fidelity experimental datasets are needed for accurate predictions for the range of considered input parameters, which include current density, flow rate, and initial concentrations.

In this paper, we develop a multifidelity model for a redox flow battery (RFB) that incorporates both experimental and physics-based modeling results. RFBs are among the most promising flow batteries, and accurate modeling is necessary to predict and further improve the performance of such systems.

Recent advances in machine learning have begun to make way for accurate and reliable predictions of the performance of batteries. Much of the recent work has focused on lithium-ion batteries, including predicting the long-term performance of such batteries over hundreds of cycles [1, 2, 3] and using machine learning for material development [4, 5]. Work on RFBs is more limited. [6] used machine learning approach to optimize the cost and performance of a vanadium flow battery, and [7] uses a deep neural network to couple pore-scale microstructure simulations and device-scale performance. In all cases, a key limitation to machine learning approaches for battery design is the availability of large, robust datasets for training [2, 6]. Experimental datasets are expensive and time-consuming to produce [6], while datasets generated from simulations are too computationally expensive to represent the full device at high accuracy [7]. In this work, we propose using a computationally efficient physics-based model to reduce the amount of experimental data needed for training.

Several physics-based models of RFB systems have been proposed in recent years, including the zero-dimensional (0d) lumped-parameter model [8]. This control-oriented model is derived for a unit cell RFB system and contains several fitting parameters. These parameters are determined by fitting the 0d model to experiments and vary widely between experiments even for the same battery design [8, 9]. The dependence of parameters on initial and operating conditions reduces the predictive ability of the 0d model.

In this work, we extend the multifidelity Physics-Informed CoKriging method, which was originally introduced in [10] and termed as CoPhIK. This approach is based on the Physics-Informed Kriging (PhIK) [11, 12], where unknown coefficients in the physics model, in this case the fitting parameters for the 0d model, are replaced with Gaussian random variables. This turns the 0d model in the stochastic model used to compute covariances for the Gaussian process regression (GPR) [13] model of the RFB. PhIK provides predictions of both the most probable behavior of the system (in the form of the posterior mean) and the uncertainty in the prediction (in the form of posterior variance). However, PhIK is limited to the accuracy of the underlying physics model. CoPhIK overcomes this limitation by using a small experimental dataset to establish the correlation between the physics-based model and the behavior of a real system to improve the model’s predictions. We are able to accurately capture the performance of cell voltage over a charge-discharge cycle. CoPhIK is computationally cheap to run, with a typical prediction requiring less than a minute to run and requires no computationally expensive pre-training. Additionally, CoPhIK produces accurate results with a very small experimental dataset, requiring only fifteen experiments are used for training.

I Experimental data

In this paper, we use experimental data from RFB laboratory experiments as “high-fidelity” data. We also use data generated by the 0d model that we refer to as “low-fidelity” data because of the strong assumptions and simplifications in the 0d model. In this section, we detail the high-fidelity dataset. The low-fidelity data set and 0d model are described in the Methods section.

Refer to caption — Figure 1: Schematic of the unit cell battery used in the experiments from [7].

The experimental dataset is collected in sixteen RFB experiments under conditions listed in the Supplementary Material. A schematic of the unit cells used in the experiments is shown in Figure 1. While many cycles were completed for each experiment, here we consider the performance only of the third cycle for training and validation as we did not observe any significant changes in the battery performance in the following cycles. The first two cycles are discarded as “priming” cycles during which batteries do not reach the expected capacity.

In all experiments the cell potential, $E$ , was measured as a function of time during charging and discharging cycles as shown in Fig. 2a. In our model, it is convenient to express concentration as a function of the state of charge (SOC), see Fig. 2b. The SOC varies between zero for a fully discharged battery and one for a fully charged battery and can be found as $SOC(t)=\frac{C_{V(II)}(t)}{C_{V_{n}}^{0}}$ where $C_{i}(t)$ is the concentration of species $i$ at time $t$ and $C_{i}^{0}$ is the initial concentration of species $i$ . $C_{V_{n}}^{0}$ is the total initial vanadium concentration of the initial half-cell and $C_{V(II)}(t)$ is the concentration of species V(II) at time $t$ [14]. To validate our approach, we model each experiment using data from the remaining fifteen experiments as a training set for the CoPhIK model (described in Section II.3). One challenge with the considered experimental dataset is that the experiments occur on very disparate timescales resulting in the different values of SOC at the end of the cycles, see Figs. 2a and b. Therefore, we rescale the SOC for each experiment so that the charging cycle has range $[0,1]$ and the discharge cycle has range $[1,2]$ through the following operations:

	$\displaystyle SOC^{*}_{C}$	$\displaystyle=(SOC_{C}-SOC_{C}(0))/(\max(SOC_{C})-SOC_{C}(0))$		(1)
	$\displaystyle SOC^{*}_{D}$	$\displaystyle=1+(SOC_{D}-SOC_{D}(0))/(\max(SOC_{D})-SOC_{D}(0))$		(2)

where the subscripts $C$ and $D$ denote charging and discharging, respectively. With this change, the experiments collapse onto a narrow manifold, see Fig. 2c.

Finally, for the CoPhIK model we need data at the same (rescaled) SOC instances for all the experiments. To obtain such data, we select an SOC vector $\mathbf{SOC^{*}}=[0,s_{1}^{*},s_{2}^{*},...,2.0]$ of length $N_{s}$ and interpolate the experimental results onto the $s_{i}^{*}$ components, $i=1,2,...N_{s}$ , of the vector $\mathbf{s^{*}}$ with B-spline interpolation optimization performed using the interpolate package in scipy, see Fig. 2d for an example of the interpolated fit for one of the experiments.

II Multifidelity framework

II.1 Gaussian process regression

We begin by describing the GPR method for modeling cell potential $E$ of the RFB as a function of SOC and parameters describing the battery operation conditions and properties of the battery. Defining the input space of the GPR model is an important step. The number of parameters describing the battery and its operating condition is very large, see the Supplementary Material. Ideally, the input space of the GPR model would include all these parameters and SOC such that the GPR model can predict the cell voltage $E$ at any SOC for any type and size of the RFB. However, the training of such GPR model would require vast amounts of computational time and data, with both training time of the GPR model and required data exponentially increasing with the number of input parameters. Since we only have 16 experiments (15 experiments for training to account for a test data) we must reduce the number of input parameters as much as possible. Therefore, we only include the varying operation conditions in the considered experiments and SOC in the input space and use the 0d model to describe the dependence of $E$ on the rest of parameters. Since all 16 experiments are performed with fixed velocity $v=4.17$ m/s and concentrations $C_{V(II)}^{0}=0$ mol/m³, and $C_{V(V)}^{0}=0$ mol/m³, we exclude $v$ , $C_{V(II)}^{0}$ , and $C_{V(V)}^{0}$ from the input space. The resulting input space of the GPR model is

\mathbb{D}=[s,j,V_{r},w_{m},C_{V(III)}^{0},C_{V(IV)}^{0},C_{H^{+},pos}^{0},C_{H_{2}O,pos}^{0},C_{H^{+},neg}^{0},C_{H_{2}O,neg}^{0}]\subseteq\mathbb{R}^{d},

with $d=10$ . The cell voltage, $E$ , is the output (or state of interest) of the GPR model, $E\;:\;\mathbb{D}\rightarrow\mathbb{R}$ . In other words, the GPR model aims to predict $E$ for any $s$ , $j$ , $V_{r}$ , $w_{m}$ , $C_{V(III)}^{0}$ , $C_{V(IV)}^{0}$ , $C_{H^{+},pos}^{0}$ , $C_{H_{2}O,pos}^{0}$ , $C_{H^{+},neg}^{0}$ , and $C_{H_{2}O,neg}^{0}$ with fixed $v=4.17$ m/s and $C_{V(II)}^{0}=C_{V(V)}^{0}=0$ mol/m³.

Next, we organize the measurements of $E$ at locations $\mathbf{X}=\{{\mathbf{x}^{(i)}}\}_{i=1}^{N}$ in the observation vector $\mathbf{E}=(E^{1},E^{2},\ldots,E^{N})^{T}$ , where $N=15\cdot N_{s}$ is the total number of observations for fifteen experiments, $N_{s}$ is the size of the $\mathbf{s}^{*}$ vector of SOC measurements, and $E^{i}$ is the $E$ measurement at the location $\mathbf{x}^{(i)}$ .

In the GPR approach we treat $E(\mathbf{x})$ as a Gaussian process as described in the Methods section. Estimating the mean $\mu({\mathbf{x}})$ and covariance $k({\mathbf{x}},{\mathbf{x}}^{\prime})$ models is the main challenge in GPR-based methods (see Methods.) In the standard data-driven GPR, $\mu({\mathbf{x}})$ and $k({\mathbf{x}},{\mathbf{x}}^{\prime})$ would be computed from data by minimizing the so-called marginal log-likelihood function [15]. It should be noted that the prior statistics (i.e., $\mu({\mathbf{x}})$ and $k({\mathbf{x}},{\mathbf{x}}^{\prime})$ ) computed in this approach do not satisfy the physical laws governing the system of interest unless the phase space is very densely sampled, which can lead to significant errors and nonphysical behavior [11, 12]. To address this limitation of the data-driven GPR method, in the following Section II.2 we propose to estimate the prior statistics from the 0d model where we treat unknown parameters as random variables. A similar approach was used in [11, 12] in GPR models of subsurface transport and power grid dynamics.

II.2 Estimating prior statistics from the stochastic 0d model

The motivation behind this approach is that the 0d model, which is described in Methods, has the unknown parameters $k_{1},k_{2},\sigma_{e}$ , and $S$ denoting the reference rate constants, ionic concentration of the electrolyte, and specific surface area, which depend on operating conditions and must be calibrated for each experiment. In this work we treat these parameters as Gaussian random variables with prescribed (prior) mean and variance that turns the 0d model in a stochastic model. Next, we use the Monte Carlo (MC) method to compute the prior GP model of $E$ based on the stochastic 0d model. Specifically, we generate $N_{MC}$ realizations of the $k_{1},k_{2},\sigma_{e}$ , and $S$ parameters and compute $E$ for each realization of the parameters from the 0d model as described in Methods. Then, the prior mean and covariance of $E$ are calculated from this set of $\{E^{(i)}_{L}\}_{i=1}^{N_{MC}}$ realizations as

	$\displaystyle\mu_{L}({\mathbf{x}})$	$\displaystyle=\frac{1}{N_{MC}}\sum_{m=1}^{N_{MC}}E^{(m)}_{L}({\mathbf{x}})$		(3)
	$\displaystyle k_{L}({\mathbf{x}},{\mathbf{x}}^{\prime})$	$\displaystyle=\frac{1}{N_{MC}-1}\sum_{m=1}^{N_{MC}}(E^{(m)}_{L}({\mathbf{x}})-\mu_{L}({\mathbf{x}}))(E^{(m)}_{L}({\mathbf{x}}^{\prime})-\mu_{L}({\mathbf{x}}^{\prime})).$		(4)

Here, the superscript $L$ signifies that the corresponding quantities are obtained from a numerical model that is based on approximation and has a lower fidelity than the experimental data. The estimates of $E$ in terms of the low-fidelity prior statistics are given as:

\mu_{p}({\mathbf{x}}^{*})=\mu_{L}({\mathbf{x}^{*}})+{\mathbf{c}}_{L}({\mathbf{x}^{*}})^{T}{\mathbf{C}}^{-1}_{L}(\mathbf{E}-\mathbf{\bm{\mu}}_{L})

(5)

\sigma^{2}_{p}({\mathbf{x}^{*}})=\sigma^{2}_{L}({\mathbf{x}^{*}})-{\mathbf{c}}_{L}({\mathbf{x}^{*}})^{T}{\mathbf{C}}^{-1}_{L}{\mathbf{c}}_{L}({\mathbf{x}^{*}}),

(6)

where the components of ${\mathbf{c}}_{L}({\mathbf{x}^{*}})$ and ${\mathbf{C}}_{L}$ are given by $k_{L}$ .

When the physics-based model of a considered system is accurate the prior statistics computed from Eqs (5) and (6) were shown to provide an accurate model of the system given the system’s measurements [11, 12]. However, if the physics-based model cannot capture certain features of the system (as is the case with the 0d model), then the prior statistics computed would produce a poor GPR model of the system [10]. In the following section we demonstrate how the discrepancy between the observation data and the 0d model can be accounted for in the GPR prediction.

II.3 Correcting bias in the 0d model

The systematic errors in the 0d model (also called model bias or misspecification) can be corrected by establishing the correlation between the 0d model predictions and the experimental data using the CoKriging method [10, 16]. This idea was originally formulated for systems where the data are sparse, so measurements of other states or other parameters are additionally used to model the data [17, 18]. CoKriging has recently been used with success for data-driven multifidelity modeling to predict system responses using data with two levels of fidelity [19, 20] adapted from Kennedy and O’Hagan’s CoKriging framework [16].

In the multifidelity framework, we treat $E$ observed in the experiments as high-fidelity data. We then use the 0d model to generate the low-fidelity data $\mathbf{E}_{L}=\left(E_{L}^{(1)},\ldots,E_{L}^{(N_{L})}\right)^{T}$ at locations $\mathbf{X}_{L}=\{{\mathbf{x}^{(i)}_{L}}\}_{i=1}^{N_{L}}$ , where $E_{L}^{(i)}\in{\mathbb{R}}$ , and ${\mathbf{x}}_{L}^{(i)}\in{\mathbb{D}}\subseteq{\mathbb{R}}^{d}$ . Note that in this work we generate $E_{L}^{(i)}$ using MC simulations (see Section V.2) and $N_{L}=N_{MC}$ , the number of realizations in the MC simulations.

Following [10, 16], we define an auto-regressive model for the high-fidelity data, $E({\mathbf{x}})=\rho E_{L}({\mathbf{x}})+E_{D}({\mathbf{x}})$ , where $E_{L}({\mathbf{x}})$ is the GP model of the low-fidelity data, $E_{D}({\mathbf{x}})$ is the GP model describing the mismatch between the high and low fidelity models, and $\rho$ is the regression parameter. The observation covariance matrix for this model is

\tilde{\mathbf{C}}=\begin{bmatrix}{\mathbf{C}}_{L}({\mathbf{X}}_{L},{\mathbf{X}}_{L})&\rho{\mathbf{C}}_{L}({\mathbf{X}}_{L},{\mathbf{X}})\\ \rho{\mathbf{C}}_{L}({\mathbf{X}},{\mathbf{X}}_{L})&\rho^{2}{\mathbf{C}}_{L}({\mathbf{X}},{\mathbf{X}})+{\mathbf{C}}_{D}({\mathbf{X}},{\mathbf{X}})\end{bmatrix}

(7)

where the elements of the ${\mathbf{C}}_{L}$ matrices are computed as $k_{L}(\mathbf{x},\mathbf{x}^{\prime})$ , where $k_{L}$ is the covariance function of $E_{L}$ and the elements of the ${\mathbf{C}}_{D}$ matrix are given by $k_{D}(\mathbf{x},\mathbf{x}^{\prime})$ , where $k_{D}$ is the covariance function of $E_{D}$ .

The posterior mean and variance of $E$ at any point ${\mathbf{x}}^{*}\in{\mathbb{D}}$ conditioned on the measurements of $\mathbf{E}$ and $\mathbf{E}_{L}$ are given by

	$\displaystyle\mu_{p}({\mathbf{x}}^{*})$	$\displaystyle=\mu_{H}({\mathbf{x}}^{})+\tilde{\mathbf{c}}({\mathbf{x}}^{})^{T}\tilde{\mathbf{C}}^{-1}\left(\tilde{\mathbf{E}}-\tilde{\bm{\mu}}\right)$		(8)
	$\displaystyle\sigma^{2}_{p}({\mathbf{x}}^{*})$	$\displaystyle=\rho^{2}\hat{\sigma}_{L}^{2}({\mathbf{x}}^{})+\sigma_{D}^{2}({\mathbf{x}}^{})-\tilde{\mathbf{c}}({\mathbf{x}}^{})^{T}\tilde{\mathbf{C}}^{-1}\tilde{\mathbf{c}}({\mathbf{x}}^{}),$		(9)

where $\tilde{\mathbf{E}}=(\mathbf{E}_{L}^{T},\mathbf{E}^{T})^{T}$ , $\mu({\mathbf{x}}^{*})=\rho\mu_{L}({\mathbf{x}}^{*})+\mu_{D}({\mathbf{x}}^{*})$ , $\sigma_{L}^{2}({\mathbf{x}}^{*})=k_{L}({\mathbf{x}}^{*},{\mathbf{x}}^{*})$ , $\sigma_{D}^{2}({\mathbf{x}}^{*})=k_{D}({\mathbf{x}}^{*},{\mathbf{x}}^{*})$ , $\tilde{\bm{\mu}}=(\bm{\mu}_{L}^{T},\bm{\mu}^{T})^{T}$ , $\tilde{\mathbf{c}}({\mathbf{x}}^{*})=(\rho{\mathbf{c}}_{L}({\mathbf{x}}^{*})^{T},{\mathbf{c}}({\mathbf{x}}^{*})^{T})^{T}$ , and $k({\mathbf{x}},{\mathbf{x}}^{\prime})=\rho^{2}k_{L}({\mathbf{x}},{\mathbf{x}}^{\prime})+k_{D}({\mathbf{x}},{\mathbf{x}}^{\prime}).$

In the data-driven approaches of [16, 17, 18, 19, 20], the prior statistics (i.e., $\mu_{L}({\mathbf{x}}^{*})$ , $\mu_{D}({\mathbf{x}}^{*})$ , $k_{L}({\mathbf{x}},{\mathbf{x}}^{\prime})$ , and $k_{D}({\mathbf{x}},{\mathbf{x}}^{\prime})$ ) are learned by maximizing the likelihood function of $Y$ . Here, we compute $\mu_{L}({\mathbf{x}}^{*})$ and $k_{L}({\mathbf{x}},{\mathbf{x}}^{\prime})$ from the Monte Carlo simulations based on the 0d model with random parameters $k_{1},k_{2},\sigma_{e}$ , and $S$ as described in Section II.2. Then, we compute $\mu_{L}({\mathbf{x}}^{*})$ and $k_{L}({\mathbf{x}},{\mathbf{x}}^{\prime})$ using the data set $\mathbf{E_{D}}=\mathbf{E}-\rho\mathbf{E}_{L}^{X}$ with the corresponding locations $\mathbf{X}$ (where $\mathbf{E}_{L}^{X}=[E_{L,1},E_{L,1},...,E_{L,N}]^{T}$ and $E_{L,i}=\mu_{L}(\mathbf{x}_{i})$ ( $\mathbf{x}_{i}\in\mathbf{X}$ )) by minimizing the log likelihood:

\ln L(\mu_{D},\sigma_{D},\lambda_{D})=\frac{1}{2}(\mathbf{E}_{D}-\bm{\mu}_{D})^{T}\mathbf{C}^{-1}_{D}(\mathbf{E}_{D}-\bm{\mu}_{D})-\frac{1}{2}\ln|\mathbf{C}_{D}|-\frac{N}{2}\ln 2\pi.

(10)

Here, we set $\rho=1.0$ and assume that $E_{D}(\mathbf{x})$ has the prior exponential covariance function

k_{D}(\mathbf{x},\mathbf{x}^{\prime})=\sigma_{D}^{2}\exp\left(-\frac{|\mathbf{x}-\mathbf{x}^{\prime}|}{\lambda_{D}}\right),

(11)

and constant $\mu_{D}$ . In Eq (10), the $(i,j)$ component of the covariance matrix $\mathbf{C}_{D}$ is given by $k_{D}(\mathbf{x}_{i},\mathbf{x}_{j})$ ( $(\mathbf{x}_{i},\mathbf{x}_{j})\in\mathbf{X}_{D}$ ).

II.4 Determining the SOC scaling parameter

A key step in the data preparation for training the CoKriging model is to rescale SOC in different experiments according to Eqs. (1) and (2). For data from an experiment $i$ , the scaling parameters in Eqs. (1) and (2) (i.e., $SOC_{i}(0)$ and $\max(SOC_{i})$ ) are given by the experimental data. Furthermore, for a given set of input parameters, the CoKriging model predicts voltage as a function of the scaled SOC. However, to covert the scaled SOC to time, one needs to evaluate scaling parameters as functions of the input parameters. In the following, we introduce two scaling parameters denoted by $\theta_{C}=1/(\max(SOC_{C})-SOC_{C}(0))$ and $\theta_{D}=1/(\max(SOC_{D})-SOC_{D}(0))$ . When $\max(SOC_{C})$ and $\max(SOC_{D})$ are known these scaling parameters are trivial to find. However, we may not know $SOC_{C}(0)$ and $SOC_{D}(0)$ for arbitrary conditions, so we first need to estimate them or, equivalently, the scaling parameters. We compute $\theta_{C}$ using the standard GPR model trained on a data set consisting of the scaling parameters $\theta_{C}$ and corresponding initial voltages $E_{0}=E(t=0)$ from the known experiments. We note that in the considered lab experiments, $-\theta_{D}=0.05+\theta_{C}$ , which allows us to estimate $\theta_{D}$ given $\theta_{C}$ without needing the voltage at any SOC during the discharge cycle. We assumed that there is uncertainty in the $\theta_{C}$ values obtained from the experiments with the variance $\sigma^{2}_{\theta}=0.5$ The non-zero $\sigma^{2}_{\theta}$ reflects the fact that the different values of $\theta_{C}$ were obtained from the experiments with very similar values of $E_{0}$ . The positive value of $\sigma^{2}_{\theta}$ is also needed to make the measurement covariance matrix well-conditioned. Figure 3 shows the GPR model prediction of $\theta_{C}$ as a function of $E_{0}$ and the prediction uncertainty. The GPR model uncertainty increase outside of the range $1.3-1.5$ V of $E_{0}$ , where the data are available. This uncertainty can be reduced by conducting experiments with the expended range of $E_{0}$ . Within $1.3-1.5$ V range, the uncertainty in GPR model prediction can be reduced by considering additional variables that affect $\theta_{c}$ .

III The multifidelity model prediction of charge and discharge curves

CoPhIK can accurately predict the behavior of the charge-discharge curves, as shown in Fig. 4. With only one data point used, $E(t=0)$ , CoPhIK matches the experimental data well, and accurately captures the jump between the charge and discharge cycles. The largest disagreement between the experimental voltage and the CoPhIK results occurs at the tail of the discharge curve, where we have very few measurements available to accurately capture the curve. The variance in the CoPhIK results, shown as the shaded regions in Fig. 4, decreases as more measurements are used in the CoPhIK prediction.

We performed single-fidelity PhIK and multi-fidelity CoPhIK simulations for all sixteen experiments in our dataset and computed the average of the discrete $\ell_{2}$ and $\ell_{\infty}$ norms, as shown in table 1. The average $\ell_{2}$ error with CoPhIK is approximately two orders of magnitude lower than the average error with PhIK, and the CoPhIK $\ell_{\infty}$ error is approximately one order of magnitude smaller. By including the available experimental data we are able to reduce the error in our predictions over a fully physics-informed approach.

In Fig. 5 we provide a comparison between CoPhIK and PhIK. It is clear that PhIK struggles to capture a physical shape for the charge and discharge curves, resulting in overall larger errors. PhIK shows a smaller variance than CoPhIK because the uncertainty of PhIK is bounded by the parametric uncertainty in the OD model and does not take into account the model uncertainty in the 0d model associated with the simplifying assumptions in the 0d model.

	$\ell_{2}$ , 1pt.	$\ell_{2}$ , 2pts.	$\ell_{2}$ , 3pts.	$\ell_{\infty}$ , 1pt.	$\ell_{\infty}$ , 2pts.	$\ell_{\infty}$ , 3pts.
PhIK	$1.40\times 10^{-2}$	$1.26\times 10^{-2}$	$1.28\times 10^{-2}$	$9.56\times 10^{-1}$	$9.50\times 10^{-1}$	$9.31\times 10^{-1}$
CoPhIK	$4.36\times 10^{-4}$	$1.74\times 10^{-4}$	$1.11\times 10^{-4}$	$8.81\times 10^{-2}$	$8.38\times 10^{-2}$	$8.69\times 10^{-2}$

Table 1: Average of the

\ell_{2}

and

\ell_{\infty}

errors between the experimental data with PhIK and CoPhIK.

III.1 Sensitivity to choice of initial parameters

In this section we conduct CoPhIK simulations as described above, except the input parameters to the Gaussian random variables in Eqs. (30)–(33) are taken from literature [8] instead of [21]. The rest of the data remains the same. Fig. 8 demonstrates that $E$ computed from the 0d model with these parameters overestimates the experimental values during charging and underestimates the discharging section by approximately 10 $\%$ . As a consequence, the mean of the low-fidelity dataset also over- and underestimates the experimental data. However, the results with CoPhIK can compensate for the discrepancies between the experimental and low-fidelity data. In Fig. 6, we show the 0d and CoPhIK predictions of $E$ for the same experimental data set as in Fig. 4. The mean of the low-fidelity dataset, $\langle E\rangle$ , does not agree well with the experimental data. However, the CoPhIK prediction agrees very well. In some cases, the uncertainty is higher than in Fig. 4, but the absolute error is of the same order of magnitude with both sets of input parameters to the 0d model. The average $\ell_{2}$ errors are $4.09\times 10^{-4}$ , $1.64\times 10^{-4}$ , and $9.44\times 10^{-5}$ with one, two, and three data points, respectively, and the average $\ell_{\infty}$ errors are $8.85\times 10^{-2}$ , $9.66\times 10^{-2}$ , and $9.99\times 10^{-2}$ with one, two, and three data points. Comparing with Table 1, there is no significant difference between using the input parameters in this work and the input parameters from [8] for the $\ell_{2}$ error, and results are similar for the $\ell_{\infty}$ error. Very similar results are seen when using the parameters from [9].

These results show that the CoPhIK framework removes the potential bias because of the difference in the prior and actual statistics of the parameters in the 0d model, i.e., the proposed CoPhIK framework is non-sensitive to the choice of the prior statistics of the parameters in the 0d model.

IV Conclusions

We have shown that the CoPhIK method can be used to accurately model RFB systems given a small number of experimental observations. Specifically, we demonstrated that the CoPhIK method accurately predicts the charge-discharge curves of a RFB using only initial conditions from the Supplementary Information and charge-discharge data collected in sixteen experiments with similar but not identical batteries and different initial conditions. We note that the resulting predictions are not sensitive to the prior distribution of the parameters in the 0d model, allowing for accurate predictions without the need to fit the input parameters. As there is disagreement in the literature over what these parameters should be, with suggested values ranging over several orders of magnitude, our model allows for flexibility without the need for expensive and time-consuming calibration procedures, which may require more data than is available. Additionally, the CoPhIK model is able to capture key features of the charge-discharge curve, including the tails that can occur at the beginning and end of each cycle. The 0d model, which does not incorporate data, cannot predict these features. The CoPhIK model is inexpensive to run, with a typical run taking less than a minute. This allows for accurate, rapid prediction of the RFB system’s performance.

In future work we look to use the CoPhIK model to conduct active learning to reduce uncertainty in our model predictions while running minimal expensive experiments. By targeting experiments in areas where the uncertainty is highest, the uncertainty of new predictions can be limited while minimizing the experimental cost and workload. For example, in Fig. 3, the uncertainty is highest for small and large initial voltage values. By conducting targeted experiments at higher and lower initial voltages, the uncertainty of the scaling parameter would be minimized.

V Acknowledgements

The authors thank J. Bao and R. Tipireddy for helpful discussions regarding the 0d modeling and W. Wang and L. Yan for help with interpreting the experimental design and data. This research was supported by the Energy Storage Materials Initiative (ESMI), under the Laboratory Directed Research and Development (LDRD) Program at Pacific Northwest National Laboratory (PNNL). PNNL is a multi-program national laboratory operated for the U.S. Department of Energy (DOE) by Battelle Memorial Institute under Contract No. DE-AC05-76RL01830.

References

Liu et al. [2015] D. Liu, J. Zhou, D. Pan, Y. Peng, and X. Peng, Lithium-ion battery remaining useful life estimation with an optimized relevance vector machine algorithm with incremental learning, Measurement 63, 143 (2015).
Severson et al. [2019] K. A. Severson, P. M. Attia, N. Jin, N. Perkins, B. Jiang, Z. Yang, M. H. Chen, M. Aykol, P. K. Herring, D. Fraggedakis, et al., Data-driven prediction of battery cycle life before capacity degradation, Nature Energy 4, 383 (2019).
Weigert et al. [2011] T. Weigert, Q. Tian, and K. Lian, State-of-charge prediction of batteries and battery–supercapacitor hybrids using artificial neural networks, Journal of Power Sources 196, 4061 (2011).
Sanchez-Lengeling and Aspuru-Guzik [2018] B. Sanchez-Lengeling and A. Aspuru-Guzik, Inverse molecular design using machine learning: Generative models for matter engineering, Science 361, 360 (2018).
Wu et al. [2018] B. Wu, S. Han, K. G. Shin, and W. Lu, Application of artificial neural networks in design of lithium-ion batteries, Journal of Power Sources 395, 128 (2018).
Li et al. [2020] T. Li, F. Xing, T. Liu, J. Sun, D. Shi, H. Zhang, and X. Li, Cost, performance prediction and optimization of a vanadium flow battery by machine-learning, Energy & Environmental Science 13, 4353 (2020).
Bao et al. [2020] J. Bao, V. Murugesan, C. J. Kamp, Y. Shao, L. Yan, and W. Wang, Machine learning coupled multi-scale modeling for redox flow batteries, Advanced Theory and Simulations 3, 1900167 (2020).
Shah et al. [2011] A. A. Shah, R. Tangirala, R. Singh, R. G. Wills, and F. C. Walsh, A dynamic unit cell model for the all-vanadium flow battery, Journal of the Electrochemical Society 158, 10 (2011).
Eapen et al. [2019] D. E. Eapen, S. R. Choudhury, and R. Rengaswamy, Low grade heat recovery for power generation through electrochemical route: Vanadium Redox Flow Battery, a case study, Applied Surface Science 474, 262 (2019).
Yang et al. [2019] X. Yang, D. Barajas-Solano, G. Tartakovsky, and A. M. Tartakovsky, Physics-informed cokriging: A gaussian-process-regression-based multifidelity method for data-model convergence, Journal of Computational Physics 395, 410 (2019).
Tartakovsky and Tipireddy [2019] A. Tartakovsky and R. Tipireddy, Physics-informed machine learning method for forecasting and uncertainty quantification of partially observed and unobserved states in power grids, in Proceedings of the 52nd Hawaii International Conference on System Sciences (2019).
Yang et al. [2018] X. Yang, G. Tartakovsky, and A. Tartakovsky, Physics-informed kriging: A physics-informed gaussian process regression method for data-model convergence, arXiv preprint arXiv:1809.03461 (2018).
Rasmussen [2003] C. E. Rasmussen, Gaussian processes in machine learning, in Summer School on Machine Learning (Springer, 2003) pp. 63–71.
Chen et al. [2014] C. L. Chen, H. K. Yeoh, and M. H. Chakrabarti, An enhancement to Vynnycky’s model for the all-vanadium redox flow battery, Electrochimica Acta 120, 167 (2014).
Williams and Rasmussen [2006] C. K. Williams and C. E. Rasmussen, Gaussian processes for machine learning, Vol. 2 (MIT press Cambridge, MA, 2006).
Kennedy and O’Hagan [2000] M. C. Kennedy and A. O’Hagan, Predicting the output from a complex computer code when fast approximations are available, Biometrika 87, 1 (2000).
Stein and Corsten [1991] A. Stein and L. Corsten, Universal kriging and cokriging as a regression procedure, Biometrics , 575 (1991).
Knotters et al. [1995] M. Knotters, D. Brus, and J. O. Voshaar, A comparison of kriging, co-kriging and kriging combined with regression for spatial interpolation of horizon depth with censored observations, Geoderma 67, 227 (1995).
Le Gratiet and Garnier [2014] L. Le Gratiet and J. Garnier, Recursive co-kriging model for Design of Computer experiments with multiple levels of fidelity, International Journal for Uncertainty Quantification 4, 365 (2014).
Perdikaris et al. [2015] P. Perdikaris, D. Venturi, J. O. Royset, and G. E. Karniadakis, Multi-fidelity modelling via recursive co-kriging and gaussian–markov random fields, Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences 471, 20150018 (2015).
Chen et al. [2021] Y. Chen, Z. Xu, C. Wang, J. Bao, B. Koeppel, L. Yan, P. Gao, and W. Wang, Analytical modeling for redox flow battery design, Journal of Power Sources 482, 228817 (2021).
Shah et al. [2008] A. A. Shah, M. J. Watt-Smith, and F. C. Walsh, A dynamic performance model for redox-flow batteries involving soluble species, Electrochimica Acta 53, 8087 (2008).

Methods

V.1 Gaussian process regression

In GPR we wish to find the cell voltage $E\in{\mathbb{R}}$ at input variables $\mathbf{x}\in{\mathbb{D}}$ , $E(\mathbf{x})$ , as a Gaussian process:

\hat{E}(\mathbf{x})\sim\mathcal{GP}\left(\mu({\mathbf{x}}),k({\mathbf{x}},{\mathbf{x}}^{\prime})\right)

(12)

where $\mu(\cdot):{\mathbb{D}}\rightarrow{\mathbb{R}}$ and and $k(\cdot,\cdot):{\mathbb{D}}\times{\mathbb{D}}\rightarrow{\mathbb{R}}$ are the mean and covariance function, given by

	$\displaystyle\mu({\mathbf{x}})$	$\displaystyle=\mathbb{E}\{\hat{E}({\mathbf{x}})\}$		(13)
	$\displaystyle k({\mathbf{x}},{\mathbf{x}}^{\prime})$	$\displaystyle=Cov\{\hat{E}({\mathbf{x}}),\hat{E}({\mathbf{x}}^{\prime})\}=\mathbb{E}\left\{[\hat{E}({\mathbf{x}})-\mu({\mathbf{x}})][\hat{E}({\mathbf{x}}^{\prime})-\mu({\mathbf{x}}^{\prime})]\right\},$		(14)

where the operator $\mathbb{E}\{\cdot\}$ is the ensemble mean. We denote the covariance matrix to be the $N\times N$ matrix given by

{\mathbf{C}}=\begin{bmatrix}k(\mathbf{x}^{(1)},\mathbf{x}^{(1)})&\ldots&k(\mathbf{x}^{(1)},\mathbf{x}^{(N)})\\ \vdots&\ddots&\vdots\\ k(\mathbf{x}^{(N)},\mathbf{x}^{(1)})&\ldots&k(\mathbf{x}^{(N)},\mathbf{x}^{(N)})\end{bmatrix}

(15)

and define the covariance vector as

{\mathbf{c}}({\mathbf{x}}^{*})=\left(k(\mathbf{x}^{*},\mathbf{x}^{(1)}),\ldots,k(\mathbf{x}^{*},\mathbf{x}^{(N)})\right)^{T}.

(16)

The GPR prediction of $E$ at ${\mathbf{x}}^{*}$ is then given by the posterior distribution $\hat{E}_{p}({\mathbf{x}}^{*})\sim\mathcal{N}(\mu_{p}({\mathbf{x}}^{*}),\sigma^{2}_{p}({\mathbf{x}}^{*}))$ , where the posterior mean and variance are given by

\mu_{p}({\mathbf{x}}^{*})=\mu({\mathbf{x}^{*}})+{\mathbf{c}}({\mathbf{x}^{*}})^{T}{\mathbf{C}}^{-1}(\mathbf{E}-\mathbf{\bm{\mu}})

(17)

\sigma^{2}_{p}({\mathbf{x}^{*}})=\sigma^{2}({\mathbf{x}^{*}})-{\mathbf{c}}({\mathbf{x}^{*}})^{T}{\mathbf{C}}^{-1}{\mathbf{c}}({\mathbf{x}^{*}}),

(18)

where $\bm{\mu}=(\mu(\mathbf{x}^{(1)}),\mu(\mathbf{x}^{(2)}),\ldots,\mu(\mathbf{x}^{(N)}))^{T}$ and $\sigma^{2}({\mathbf{x}^{*}})=k(\mathbf{x},\mathbf{x})$ .

To ensure that ${\mathbf{C}}$ is invertible we substitute ${\mathbf{C}}={\mathbf{C}}+\epsilon\mathbf{I}$ , where $\epsilon\ll 1$ . This is equivalent to assuming there is i.i.d. observation noise with variance $\epsilon$ [10].

V.2 Monte-Carlo estimates of the mean and covariance of $E$ given by the 0d model

The 0d model of an RFB was originally proposed in [22, 8] and modified by [9] to give a fast estimate of redox-flow battery performance. The 0d model gives the cell voltage, $E$ , due to an applied current density, $j_{app}$ , in the form

E=E_{cell}^{rev}-\eta_{act}-\eta_{ohm},

(19)

where $E_{cell}^{rev}$ is the reversible Open Current Voltage (OCV), $\eta_{act}$ is the activation overpotential, and $\eta_{ohm}$ is the ohmic losses. A short explanation of the 0d model is given below.

The two half reactions in the RFB are given by

	$\displaystyle V^{3+}+e^{-}$	$\displaystyle\rightleftarrows V^{2+}$		(20)
	$\displaystyle VO^{2+}+H_{2}O$	$\displaystyle\rightleftarrows VO_{2}^{+}+2H^{+}e^{-}$		(21)

From Nernst’s equation [9],

E_{cell}^{rev}(T)=E_{2}^{0}(T)-E_{1}^{0}(T)+\frac{RT}{F}\ln\left(\frac{C_{V(II)}C_{V(V)}C^{2}_{H_{p}^{+}}C_{H_{p}^{+}}}{C_{V(III)}C_{V(IV)}C_{H_{2}O_{p}}C_{H_{n}^{+}}}\right).

(22)

where $C_{i}$ is the molar concentration of species $i$ , $E_{1}^{0}$ and $E_{2}^{0}$ are the formal potentials for the reactions at the negative and positives electrodes, $R$ is the molar gas constant, $T$ is the cell temperature, and $F$ is the Faraday constant.

The activation overpotential is given by

\eta_{act}=|\eta_{pos}|+|\eta_{neg}|

(23)

where

\eta_{neg}=-\frac{2RT}{F}a\sinh\left(\frac{j_{app}}{2Fk_{1}^{*}\sqrt{C_{V(II)}C_{V(III)}}}\right)

(24)

\eta_{pos}=\frac{2RT}{F}a\sinh\left(\frac{j_{app}}{2Fk_{2}^{*}\sqrt{C_{V(IV)}C_{V(V)}}}\right)

(25)

and

	$\displaystyle k_{1}^{*}$	$\displaystyle=k_{1}\exp\left(-\frac{FE_{1}^{0}(T_{ref})}{R}\left[T_{ref}^{-1}-T^{-1}\right]\right)$		(26)
	$\displaystyle k_{2}^{*}$	$\displaystyle=k_{2}\exp\left(\frac{FE_{2}^{0}(T_{ref})}{R}\left[T_{ref}^{-1}-T^{-1}\right]\right).$		(27)

The coefficients $k_{1}$ and $k_{2}$ are the reference rate constants for reactions (20) and (21) at $T_{ref}=293K$ , respectively, and are typically determined from experiments.

The ohmic losses are given by

\eta_{ohm}=\left(\frac{w_{m}}{\sigma_{m}}+\frac{w_{e}}{\sigma_{e}\epsilon^{1.5}}+\frac{w_{c}}{\sigma_{c}}\right)j_{app}

(28)

with the current collector conductivity $\sigma_{c}$ , the membrane conductivity

\sigma_{m}=(0.5139\lambda-0.326)\exp\left(1268\left[303^{-1}-T^{-1}\right]\right)

(29)

where $\lambda=22$ is the membrane water content for a fully saturated membrane, and the electrolyte ionic conductivity $\sigma_{e}$ to be determined from experimental results. The widths of the membrane, electrolyte, and current collector are denoted by $w_{m}$ , $w_{e}$ , and $w_{c}$ , respectively. The specific area is given by $A_{s}=SV_{e}$ , where $S$ is the specific surface area that is determined from experiments and $V_{e}$ is the electrode volume. Mass balance for each species is used to determine $C_{i}$ for $i=V(II),$ $V(III),$ $V(IV),$ $V(V),$ $H^{+},$ and $H_{2}O$ . For details see [22, 8].

While most of the parameters in the 0d model can be directly measured or estimated based on the battery design and material properties, the parameters $k_{1},k_{2},\sigma_{e}$ , and $S$ denoting the reference rates for reactions (20) and (21), the electrode conductivity, and the specific surface area, must be calibrated [8, 9, 21]. Table 2 presents values of these parameters from the calibration studies in [8, 9, 21]. The fixed parameters are given in Table 3.

	[8]	[9]	[21]
$k_{1}$ ( $m/s$ )	$3.56\times 10^{-6}$	$1.28\times 10^{-5}$	$5.0\times 10^{-8}$
$k_{2}$ ( $m/s$ )	$3\times 10^{-9}$	$3\times 10^{-5}$	$1\times 10^{-7}$
$\sigma_{e}$ ( $s/m$ )	100	1000	500
$S$ ( $m^{2}$ / $m^{3}$ )	$1.62\times 10^{4}$	$1.62\times 10^{4}$	$3.48\times 10^{4}$

Table 2: Comparison of the 0d model parameters from [8], [9], and [21].

A comparison of the 0d model predictions with these three sets of parameters for two experiments is given in Fig. 7. For all considered experiments, the parameters in [9] underestimate $E$ , while the parameters in [8] overestimate $E$ . The parameters from [21] were obtained from an experiment in a cell similar to the one used in the experiments considered in this work, and so they provide the best agreement in Fig. 7 and serve as the starting point for the MC simulations.

To model the $i$ th experiment ( $i=1,...,16$ ), in the MC simulation method, we solve the 0d equations $N_{MC}=1000$ times with the input parameters

{\mathbf{x}}=[t,j,V_{r},w_{m},C_{V(III)}^{0},C_{V(IV)}^{0},C_{H^{+},pos}^{0},C_{H_{2}0,pos}^{0},C_{H^{+},neg}^{0},C_{H_{2}0,neg}^{0}]

taken from Table 5 for experiment $i$ . The parameters $\mathbf{v}=[k_{1},k_{2},\sigma_{e},S]^{T}$ are taken as Gaussian random variables with the mean values $\langle v_{i}\rangle$ given by [21] and standard deviation $\sigma_{v_{i}}=0.25\langle v_{i}\rangle$ :

$\displaystyle k_{1}^{m}$	$\displaystyle=\xi^{m}(k_{1},0.25k_{1})$	(30)
$\displaystyle k_{2}^{m}$	$\displaystyle=\xi^{m}(k_{2},0.25k_{2})$	(31)
$\displaystyle\sigma_{e}^{m}$	$\displaystyle=\xi^{m}(\sigma_{e},0.25\sigma_{e})$	(32)
$\displaystyle S^{m}$	$\displaystyle=\xi^{m}(S,0.25S)$	(33)

where $\xi^{m}(\mu,\sigma)$ denotes a Gaussian random variable with mean $\mu$ and standard deviation $\sigma.$ Fig. 8 shows 50 realizations of $E(t)$ computed from the 0d model for one of the experiments as well as $E(t)$ observed in the experiment and computed from the 0d model with $\mathbf{v}=\langle\mathbf{v}\rangle$ . Because the 0d model involves logarithm functions, for some values of $\mathbf{v}$ the output may be undefined. Therefore, we only consider MC realizations where the value is defined for all time. This figure shows that the mean cell voltage $\mu_{L}$ obtained from the MC method cannot accurately predict the tails of the charge and discharge curves. Similar results are obtained for the other experiments.

Parameters	Values
Electrode porosity	$\epsilon=0.67$
Membrane water content	$\lambda=22$
Current collector conductivity	$\sigma_{c}=91000$ s m^-1
Reference potential, reaction 20	$E^{0}_{1}=-0.255$ V
Reference potential, reaction 21	$E^{0}_{2}=1.0$ V
Faraday constant	$F=96485.3329$ sA/mol
Gas constant	$R=8.314462618$ J/K/mol
Reference temperature	$T_{ref}=293$ K

Table 3: Parameters used in the 0d model.

Supplementary information

Detailed description of high-fidelity data

The experiments use Nafion 115 and 212 membranes. The positive and negative electrodes are made of 3 mm thick graphite felt with active areas of 10 cm². The electrolytes are prepared by dissolving 1.5 M $VOSO_{4}$ (Aldrich, 99%) in 3.5 M $H_{2}SO_{4}$ solution (Aldrich, 96–98%). The electrolytes are placed in two glass reservoirs and circulated at a flow rate of 20 mL/min by a peristaltic pump. The flow cell electrochemical performance are measured and recorded using a potentiostat/galvanostat (Arbin Instrument, USA) with the fixed voltage range of 0.8 to 1.6 V. Table 5 lists parameters that were varied between the experiments while the parameters that were kept constant in all experiments are given in Table 4. The current density in each experiment is kept constant, although it varied between experiences.

Parameters	Values
$v$	4.17 m/s
$C_{V(II)}(0)$	0.0 mol/m³
$C_{V(V)}(0)$	0.0 mol/m³
Temperature	298.15 K
Electrode height	0.05 m
Electrode breadth	0.025 m
Electrode width	0.004 m
Collector width	0.015 m

Table 4: Initial parameters for the high fidelity data set that are fixed in this paper.

Exp.	$j$	$V_{r}$	$C_{V(III)}^{0}$	$C_{V(IV)}^{0}$	$C_{H^{+},pos}^{0}$	$C_{H_{2}0,pos}^{0}$	$C_{H^{+},neg}^{0}$	$C_{H_{2}0,neg}^{0}$	Membrane	Notes
	(A)	(m³)	(mol/m³)	(mol/m³)	(mol/m³)	(mol/m³)	(mol/m³)	(mol/m³)
1	0.5	$20\times 10^{-6}$	$1.5\times 10^{3}$	$1.5\times 10^{3}$	$3.85\times 10^{3}$	$4.46\times 10^{4}$	$3.025\times 10^{3}$	$4.61\times 10^{4}$	115
2	0.75	$80\times 10^{-6}$	$1.5\times 10^{3}$	$1.5\times 10^{3}$	$3.85\times 10^{3}$	$4.46\times 10^{4}$	$3.025\times 10^{3}$	$4.61\times 10^{4}$	115
3	0.75	$80\times 10^{-6}$	$1.5\times 10^{3}$	$1.5\times 10^{3}$	$3.85\times 10^{3}$	$4.46\times 10^{4}$	$3.025\times 10^{3}$	$4.61\times 10^{4}$	115
4	0.4	$30\times 10^{-6}$	$1.5\times 10^{3}$	$1.5\times 10^{3}$	$3.85\times 10^{3}$	$4.46\times 10^{4}$	$3.025\times 10^{3}$	$4.61\times 10^{4}$	115	Sulfuric acid electrolyte
5	0.5	$50\times 10^{-6}$	$2.0\times 10^{3}$	$2.0\times 10^{3}$	$5.0\times 10^{3}$	$4.753\times 10^{4}$	$3.0\times 10^{3}$	$4.953\times 10^{4}$	115	Membrane soaked in $H_{2}O_{2}$ for 1 hour pretreatment
6	0.5	$50\times 10^{-6}$	$2.0\times 10^{3}$	$2.0\times 10^{3}$	$5.0\times 10^{3}$	$4.753\times 10^{4}$	$3.0\times 10^{3}$	$4.953\times 10^{4}$	212	Membrane soaked in $H_{2}O_{2}$ and $O_{2}$ plasma pretreatment
7	0.69	$30\times 10^{-6}$	$2.0\times 10^{3}$	$2.0\times 10^{3}$	$5.0\times 10^{3}$	$4.753\times 10^{4}$	$3.0\times 10^{3}$	$4.953\times 10^{4}$	115
8	0.75	$45\times 10^{-6}$	$2.0\times 10^{3}$	$2.0\times 10^{3}$	$5.0\times 10^{3}$	$4.753\times 10^{4}$	$3.0\times 10^{3}$	$4.953\times 10^{4}$	115
9	0.69	$45\times 10^{-6}$	$2.0\times 10^{3}$	$2.0\times 10^{3}$	$5.0\times 10^{3}$	$4.753\times 10^{4}$	$3.0\times 10^{3}$	$4.953\times 10^{4}$	115
10	0.75	$45\times 10^{-6}$	$2.0\times 10^{3}$	$2.0\times 10^{3}$	$5.0\times 10^{3}$	$4.753\times 10^{4}$	$3.0\times 10^{3}$	$4.953\times 10^{4}$	115
11	0.8	$45\times 10^{-6}$	$2.0\times 10^{3}$	$2.0\times 10^{3}$	$5.0\times 10^{3}$	$4.753\times 10^{4}$	$3.0\times 10^{3}$	$4.953\times 10^{4}$	212	Varying temperature
12	0.4	$50\times 10^{-6}$	$2.0\times 10^{3}$	$2.0\times 10^{3}$	$5.0\times 10^{3}$	$4.753\times 10^{4}$	$3.0\times 10^{3}$	$4.953\times 10^{4}$	212	Sulfuric-hydrochloric acid electrolyte
13	0.4	$25\times 10^{-6}$	$2.0\times 10^{3}$	$2.0\times 10^{3}$	$5.0\times 10^{3}$	$4.753\times 10^{4}$	$3.0\times 10^{3}$	$4.953\times 10^{4}$	212	10% of $NH_{4}Cl$
14	0.5	$40\times 10^{-6}$	$2.0\times 10^{3}$	$2.0\times 10^{3}$	$5.0\times 10^{3}$	$4.753\times 10^{4}$	$3.0\times 10^{3}$	$4.953\times 10^{4}$	212	Acetic acid additive
15	0.5	$40\times 10^{-6}$	$2.0\times 10^{3}$	$2.0\times 10^{3}$	$5.0\times 10^{3}$	$4.753\times 10^{4}$	$3.0\times 10^{3}$	$4.953\times 10^{4}$	212
16	0.5	$40\times 10^{-6}$	$2.0\times 10^{3}$	$2.0\times 10^{3}$	$5.0\times 10^{3}$	$4.753\times 10^{4}$	$3.0\times 10^{3}$	$4.953\times 10^{4}$	212	Different electrode from Toyobo

Table 5: Initial parameters for the high fidelity data set considered in this paper. We consider the current

j

, the reservoir volume

V_{r}

, the membrane, and the initial concentrations of

V(III)

V(IV)

H^{+}

, and

H_{2}O

, denoted by

C_{i}^{0}

for

i=V(III)

V(IV)

H^{+}

, and

H_{2}O

. For

H^{+}

, and

H_{2}O

the

pos

and

neg

subscripts denote the initial concentrations in the positive and negative reservoirs, respectively. The membranes used are Nafion 115 and Nafion 212, and the membrane width is

w_{m}=1.27\times 10^{-4}

m for Nafion 115 and

w_{m}=5.08\times 10^{-5}

m for Nafion 212.

Physics-informed CoKriging model of a redox flow battery

Abstract

I Experimental data

II Multifidelity framework

II.1 Gaussian process regression

II.2 Estimating prior statistics from the stochastic 0d model

II.3 Correcting bias in the 0d model

II.4 Determining the SOC scaling parameter

III The multifidelity model prediction of charge and discharge curves

III.1 Sensitivity to choice of initial parameters

IV Conclusions

V Acknowledgements

References

Methods

V.1 Gaussian process regression

V.2 Monte-Carlo estimates of the mean and covariance of EE given by the 0d model

Supplementary information

Detailed description of high-fidelity data

V.2 Monte-Carlo estimates of the mean and covariance of $E$ given by the 0d model