
OptiGrad: A Fair and More Efficient Price Elasticity Optimization via Gradient-Based Learning

\nameVincent Grari \email[email protected]
\addrAXA Group Operations
TRAIL, LIP6, Sorbonne Université, Paris, France
\AND\nameMarcin Detyniecki \email[email protected]
\addrAXA Group Operations
\addrPolish Academy of Science, IBS PAN, Warsaw, Poland
TRAIL, LIP6, Sorbonne Université, Paris, France
Abstract

This paper presents a novel approach to optimizing profit margins in non-life insurance markets through a gradient descent-based method, targeting three key objectives: 1) maximizing profit margins, 2) ensuring conversion rates, and 3) enforcing fairness criteria such as demographic parity (DP). Traditional pricing optimization, which leans heavily on linear and semidefinite programming, encounters challenges in balancing profitability and fairness. These challenges become especially pronounced in situations that necessitate continuous rate adjustments and the incorporation of fairness criteria. Specifically, indirect Ratebook optimization, a widely used method for new business price setting, relies on predictor models such as XGBoost or GLMs/GAMs to estimate downstream individually optimized prices. However, this strategy is prone to sequential errors and struggles to effectively manage optimization over continuous rate scenarios. In practice, to save time, actuaries frequently opt for optimization within discrete intervals (e.g., a range of [-20%, +20%] with fixed increments), leading to approximate estimations. Moreover, to circumvent infeasible solutions they often use relaxed constraints, leading to suboptimal pricing strategies. The reverse-engineered nature of traditional models complicates the enforcement of fairness and can lead to biased outcomes. Our method addresses these challenges by employing a direct optimization strategy in the continuous space of rates and by embedding fairness through an adversarial predictor model. This innovation not only reduces sequential errors and simplifies the complexities found in traditional models but also directly integrates fairness measures into the commercial premium calculation. We demonstrate improved margin performance and stronger enforcement of fairness, highlighting the critical need to evolve existing pricing strategies.

Keywords: Price Elasticity, Gradient-Based Optimization, Adversarial Learning

1 Introduction

The non-life insurance sector is currently experiencing a significant transition, driven by intensified competition among insurers, regulations, and new technologies. These dynamics necessitate the agile development of new pricing strategies. Such new strategies must not only ensure profitability and foster customer acquisition and retention but also align with an ethical paradigm that emphasizes fairness and equity, as underscored by recent legislative initiatives (Parliament and of the European Union, 2016; COM, 2021; Ito and Fujimaki, 2017). The insurance industry has long been anchored by traditional pricing methodologies such as individual and Ratebook optimizations, crucial for determining both new-business commercial premiums and renewal premiums. These methodologies rely heavily on linear and semidefinite programming techniques, forming the analytical backbone of pricing strategies (De Larrard, 2016; Verschuren, 2022; Hashorva et al., 2018). These traditional approaches face increasing challenges in reconciling the threefold goals of profitability, customer retention/acquisition, and ethical considerations.

In real-world operations, a common limitation of current pricing strategies is their dependence on discrete optimized rates rather than exploring the full potential of continuous rate variations. This conventional approach is limited to a predefined set of options, narrowing the optimization scope and inevitably resulting in approximate estimations. A significant reason for this limitation is the extensive time necessary for an exhaustive exploration of continuous optimized rates. Furthermore, to address the challenge of infeasible solutions within this constrained optimization framework, constraints are frequently relaxed (De Larrard, 2016; Verschuren, 2022). This can inadvertently lead to an increased risk of errors accumulating in the pricing process.

On another dimension, the reliance on reverse-engineered predictive models in conventional Indirect Ratebook optimization complicates the enforcement of fairness. Traditionally, the prevailing fairness approach within insurance pricing (Lindholm et al., 2022a; Grari et al., 2022; Xin and Huang, 2023; Lindholm et al., 2022b, 2023) has emphasized ensuring fairness at the pure premium calculation stage only, where predictive models are designed to prevent claim frequency and cost predictions from being influenced by sensitive variables such as gender or race. However, we claim that fairness enforcement is not guaranteed at the commercial premium layer. In fact, adding a second layer, the commercial premium, could compromise or negate the fairness achieved at the pure premium level. This occurs because the commercial layer integrates an additional objective, balancing profit margins with conversion rates, which can inadvertently reintroduce biases. For instance, in our experiments, we observed that increasing conversion rates increases biases against certain sensitive groups. This new challenge requires new methods that enforce fairness directly at the final pricing layer.

This paper introduces OptiGrad, a novel approach that employs a gradient descent learning strategy specifically tailored for profit margin optimization in the non-life insurance sector. OptiGrad is designed with the explicit intent to achieve three core objectives: maximizing profit margins, ensuring a minimum conversion rate, and, most importantly, enforcing fairness through criteria such as demographic parity (DP). This framework employs an offline differentiable conversion model and a differentiable pure premium model. The key intuition of OptiGrad is to adopt neural network architecture principles, particularly the application of gradient descent, where the output of one model is optimized in conjunction with the input of other models. The optimization procedure leverages the chain rule to derive the optimal pricing rate by integrating inputs from both the conversion and pure premium models into the objective function, utilizing a differentiable model for estimation. The differentiability of these models plays a crucial role in incorporating traditional fairness mechanisms: it allows techniques such as fair adversarial networks, which rely on gradient-based optimization, to be directly integrated into the objective function. Through experimental testing, OptiGrad is shown to enhance the Global Written Margin (GWM) while keeping the same conversion rate and simultaneously improving fairness at the market premium level, shifting the focus away from the traditional approach of targeting the pure premium.

2 Problem Statement

Throughout this document, we consider X as the set of customer features and h_{w_{h}}(X) as the estimated actuarial risk. This estimation may represent frequency or average-cost models, or the pure premium derived from the product of these two sub-models, all trained using a differentiable supervised machine learning algorithm h_{w_{h}} with parameters w_{h}. Furthermore, we introduce f_{w_{f}}, a differentiable conversion model trained on a historical dataset X containing the actual binary conversion information Y. Among the features, S denotes a sensitive attribute (e.g., gender or race) that cannot be used at test time but must be observed to ensure fairness of the model. Depending on the context, the domain \Omega_{S} of the sensitive attribute S may be discrete or continuous. The training data consists of n examples (x_{i},s_{i},y_{i}) sampled from a training distribution p, where x_{i}\in\mathbb{R}^{d} is the d-dimensional feature vector of the i-th policyholder, s_{i}\in\Omega_{S} denotes the sensitive attribute's value, and y_{i}\in\Omega_{Y} is the binary conversion label.

2.1 Price Optimization Methods

In optimizing commercial pricing, several strategies are deployed with distinctive methodologies. Individual Optimization, for instance, calculates the optimal price for each customer based on their unique risk profile. Alternatively, Indirect Ratebook Optimization applies reverse-engineering on downstream optimized prices, which are frequently the outcomes of Individual Optimization. Each method has its own advantages and limitations, which can vary depending on whether the optimization is conducted in an online or offline environment. Individual Optimization, for example, is noted for its precision, eliminating the potential for compounded errors that may arise from the reverse-engineering process inherent in Indirect Ratebook Optimization. However, its practical application can be complex, with challenges in operationalizing rating prices due to the necessity of direct price calculations. On the other hand, Indirect Ratebook Optimization facilitates easier implementation within the insurance rating system for production, though its offline nature limits market responsiveness.

This study primarily focuses on Direct Ratebook Optimization, which seeks to combine the production deployment advantages of Indirect Optimization with a more straightforward method of price optimization, eliminating the need for reverse engineering. This method strives for an optimal balance between accuracy and operational feasibility.

Individual optimization

Individual pricing optimization under constraints makes it possible to maintain a consistent pricing strategy. Constraints may vary significantly by market; in some countries, regulatory frameworks enforce limits on price adjustments to prevent extremely high prices, with specific thresholds setting the limits. Furthermore, it is crucial to consider the impact on vulnerable segments to avoid large price increases that could lead to customer dissatisfaction and negatively affect the brand. Therefore, optimization strategies need to navigate between maintaining equity and leveraging market dynamics, ensuring that price adjustments are justifiable and align with both business objectives and customer expectations.

The optimization program can then be described as below:

\operatorname*{arg\,max}_{c_{1},\ldots,c_{n}}\quad\sum_{i=1}^{n}\left[(c_{i}*h_{w_{h}}(x_{i})-h_{w_{h}}(x_{i}))*f_{w_{f}}(x_{i},c_{i}*h_{w_{h}}(x_{i}))\right] (1)
s.t.\quad c_{i}\leq b,\quad i=1,\ldots,n.
c_{i}\geq a,\quad i=1,\ldots,n.
\frac{1}{n}\sum_{i=1}^{n}f_{w_{f}}(x_{i},c_{i}*h_{w_{h}}(x_{i}))\geq\gamma

The commercial price p_{i}=c_{i}*h_{w_{h}}(x_{i}) is defined via a commercial coefficient c_{i} for each individual customer i. The predictor model h_{w_{h}} takes as input the customer features X. The conversion probability f_{w_{f}} takes as input both the feature set X and the adjusted commercial price.

The primary objective is to maximize the Global Written Margin (GWM), defined as the total sum of the margin weighted by conversion. This serves as a significant indicator of profitability. In addition, two local constraints keep each individual coefficient within a specific range c_{i}\in[a,b]. These constraints ensure that pricing strategies adhere to a realistic and controllable strategy, mitigating potential risks associated with uncontrolled pricing, and are also designed to avoid penalizing vulnerable segments (i.e., low price-elasticity segments). The last constraint is crucial to maintain a sufficient level of conversion, forcing the average portfolio conversion rate above a specific threshold \gamma.
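As a minimal numerical illustration of this objective and its constraints (all pure premiums, coefficients, and conversion probabilities below are made-up toy values, not outputs of the models described in this paper):

```python
import numpy as np

h = np.array([300.0, 500.0, 700.0])   # pure premiums h_{w_h}(x_i) (toy values)
c = np.array([1.10, 0.95, 1.05])      # commercial coefficients c_i
conv = np.array([0.30, 0.55, 0.40])   # conversion probabilities f_{w_f} (toy values)

margin = c * h - h                    # expected margin per policy
gwm = float(np.sum(margin * conv))    # Global Written Margin: margin weighted by conversion

a, b, gamma = 0.8, 1.2, 0.40
in_range = bool(np.all((c >= a) & (c <= b)))   # local coefficient constraints
meets_floor = bool(conv.mean() >= gamma)       # portfolio conversion constraint
print(gwm, in_range, meets_floor)              # -> 9.25 True True
```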

For practical reasons, a relaxation is employed. The following equation corresponds to a Lagrangian (penalized) relaxation of formulation (1), where the conversion constraint is moved into the objective.

\operatorname*{arg\,max}_{c_{1},\ldots,c_{n}}\sum_{i=1}^{n}\left[(c_{i}*h_{w_{h}}(x_{i})-h_{w_{h}}(x_{i}))*f_{w_{f}}(x_{i},c_{i}*h_{w_{h}}(x_{i}))\right]+\lambda*\frac{1}{n}\sum_{i=1}^{n}f_{w_{f}}(x_{i},c_{i}*h_{w_{h}}(x_{i})) (2)

In this formulation, the specific threshold \gamma is no longer explicit. However, the hyperparameter \lambda\in\mathbb{R}^{+} allows for a trade-off without requiring precise information about the minimum average conversion rate. Higher values of \lambda lead to increased conversion rates but generally result in a lower Global Written Margin (GWM).

This problem is typically addressed using the Sequential Quadratic Programming (SQP) method. It is important to note that the local constraints can be preserved in practice by recasting the problem as a discrete optimization over a specific grid of pricing rate adjustments (e.g., from +10% to +30% in fixed 1-point increments). This methodology enhances the model's flexibility and practical utility in real-world applications, albeit potentially at the cost of optimum accuracy.
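A hedged sketch of the constrained program (1) solved directly in the continuous space follows. The pure premiums, per-customer elasticities, conversion model, and thresholds are all toy assumptions, and scipy's SLSQP routine (an SQP-family solver) stands in for whatever production solver is used in practice:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n = 50
h = rng.uniform(200.0, 800.0, n)       # toy pure premiums h_{w_h}(x_i)
elast = rng.uniform(3.0, 8.0, n)       # toy per-customer price sensitivities

def conv(c):
    # toy conversion model f_{w_f}: probability decreasing in the coefficient c
    return 1.0 / (1.0 + np.exp(elast * (c - 1.0)))

def neg_gwm(c):
    # negative Global Written Margin (SLSQP minimizes)
    return -np.sum((c * h - h) * conv(c))

a, b, gamma = 0.8, 1.2, 0.40
res = minimize(
    neg_gwm,
    x0=np.ones(n),                      # start from the pure premium (c_i = 1)
    method="SLSQP",
    bounds=[(a, b)] * n,                # local constraints a <= c_i <= b
    constraints=[{"type": "ineq",       # mean conversion rate >= gamma
                  "fun": lambda c: conv(c).mean() - gamma}],
)
c_opt = res.x
```

The conversion-floor constraint is active here: without it, the solver would push most coefficients to the upper bound and the portfolio conversion rate would fall well below gamma.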

2.2 Discrimination and Unfairness

The quest for fairness in AI systems requires the articulation of clearly defined fairness criteria. First, there is the concept of information sanitization, which restricts the use of sensitive data in predictive model training. A key instance from the European insurance sector is the implementation of the Gender Directive. Although nearly a decade passed before legal guidelines for its application to insurance activities were established, the directive was fundamentally aimed at upholding the principle of equal treatment between men and women in the access to and provision of goods and services, including insurance. The result was a prohibition on the use of gender as a rating variable in insurance pricing. The directive enforced gender equality in insurance pricing across the European Union from December 21, 2012, as detailed in (Schmeiser et al., 2014).

However, a major challenge for actuaries is that sensitive variables can be highly correlated with several other features. For example, the combination of the driver's occupation and the car's size and color could inadvertently introduce gender bias in the prediction of car insurance prices. Geographic information has also been noted as dependent on race or national origin in insurance (Saxena et al., 2024). An illustrative instance is given by (Angwin et al., 2017), which revealed that individuals residing in neighborhoods predominantly populated by racial minorities are subject to elevated insurance premiums compared to individuals with equivalent risk profiles residing in other neighborhoods. A significant aspect of fairness in the Fair Machine Learning community is statistical or group fairness. This partitions the world into groups defined by one or several sensitive attributes and requires that a specific relevant statistic about the classifier be equal across those groups. This is particularly challenging in the context of continuous sensitive attributes, where it is crucial to guarantee distributional independence instead of looking at average expectations between groups. In the following, we outline the most popular definition and mitigation strategy used in recent research.

2.2.1 Demographic Parity

The most common objective in fair machine learning is Demographic Parity (Dwork et al., 2011). Based on this definition, a model is considered fair if the output prediction \widehat{Y} from features X is independent of the sensitive attribute S: \widehat{Y}\perp\!\!\!\perp S.

Definition 1.

A machine learning algorithm achieves Demographic Parity if the associated prediction \widehat{Y} is independent of the sensitive attribute S (for the binary case, this is equivalent to \mathbb{P}(\widehat{Y}=y|S)=\mathbb{P}(\widehat{Y}=y)):

\mathbb{P}(\widehat{Y}\leq y|S=s)=\mathbb{P}(\widehat{Y}\leq y),\quad\forall s. (3)

The use of Demographic Parity was originally introduced in the context of binary scenarios (Dwork et al., 2011), where the underlying idea is that each demographic group has the same chance of a positive outcome.

Definition 2.

A classifier is considered fair according to the demographic parity principle if

\mathbb{P}(\widehat{Y}=1|S=0)=\mathbb{P}(\widehat{Y}=1|S=1).
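For a binary classifier, this criterion can be read off as the absolute difference between the two groups' acceptance rates (the predictions and sensitive attribute below are made-up illustrative values):

```python
import numpy as np

y_hat = np.array([1, 0, 1, 1, 0, 1, 0, 0])   # toy binary predictions
s = np.array([0, 0, 0, 0, 1, 1, 1, 1])       # toy binary sensitive attribute

p0 = y_hat[s == 0].mean()                    # estimate of P(Y_hat = 1 | S = 0)
p1 = y_hat[s == 1].mean()                    # estimate of P(Y_hat = 1 | S = 1)
dp_gap = abs(p0 - p1)                        # 0 under exact demographic parity
print(dp_gap)                                # -> 0.5
```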

Considering the evaluation of fairness between the sensitive attribute S and a continuous variable P=C\times h(X), which represents the final commercial premium offered to customers, we delve into the continuous scenario of Demographic Parity. Traditional methods for measuring dependence in continuous cases include Pearson's correlation, Kendall's tau, and Spearman's rank correlation. However, these measures predominantly capture specific types of association patterns, such as linear or monotonically increasing relationships. For example, two variables with a quadratic relationship would not exhibit correlation in the Pearson sense. To overcome the limitations of these conventional linearity-centric measures, the HGR (Hirschfeld-Gebelein-Rényi) coefficient offers a viable alternative. It is a normalized measure capable of correctly capturing linear and non-linear relationships; it can handle multi-dimensional random variables and is invariant with respect to changes in marginal distributions (Lopez-Paz et al., 2013).

Definition 3.

For two jointly distributed random variables U\in\mathcal{U} and V\in\mathcal{V}, the Hirschfeld-Gebelein-Rényi maximal correlation is defined as:

HGR(U,V)=\sup_{\begin{subarray}{c}\phi:\mathcal{U}\rightarrow\mathbb{R},\,\psi:\mathcal{V}\rightarrow\mathbb{R}\\ E(\phi(U))=E(\psi(V))=0\\ E(\phi^{2}(U))=E(\psi^{2}(V))=1\end{subarray}}\rho(\phi(U),\psi(V))=\sup_{\begin{subarray}{c}\phi:\mathcal{U}\rightarrow\mathbb{R},\,\psi:\mathcal{V}\rightarrow\mathbb{R}\\ E(\phi(U))=E(\psi(V))=0\\ E(\phi^{2}(U))=E(\psi^{2}(V))=1\end{subarray}}E(\phi(U)\psi(V)) (4)

where \rho is the Pearson linear correlation coefficient, \rho(U,V):=\frac{Cov(U,V)}{\sigma_{U}\sigma_{V}}, with Cov(U,V) the covariance between U and V, and \sigma_{U}, \sigma_{V} their respective standard deviations, for some measurable functions \phi and \psi.

The HGR coefficient is equal to 0 if the two random variables are independent; if they are strictly dependent, the value is 1. The function spaces for \phi and \psi are infinite-dimensional, which is why the HGR coefficient has proved difficult to compute. One way to approximate this coefficient is to require that \phi and \psi belong to Reproducing Kernel Hilbert Spaces (RKHS), taking the largest canonical correlation between two sets of copula random projections. This has been done efficiently under the name of Randomized Dependence Coefficient (RDC) (Lopez-Paz et al., 2013). We will make use of this approximated metric.
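A compact sketch of the RDC idea follows: rank-based copula transform, random sinusoidal projections, then the largest canonical correlation between the two feature sets. The feature count k, projection scale s, and ridge term below are illustrative hyperparameter choices for this sketch, not the exact values used by (Lopez-Paz et al., 2013):

```python
import numpy as np

def rdc(x, y, k=10, s=4.0, reg=1e-4, seed=0):
    """Randomized Dependence Coefficient: a copula / random-projection
    approximation of the HGR coefficient (Lopez-Paz et al., 2013)."""
    rng = np.random.default_rng(seed)
    n = len(x)

    def features(v):
        # 1) empirical copula transform (normalized ranks)
        c = (np.argsort(np.argsort(v)) + 1.0) / n
        # 2) random sinusoidal projections of the copula
        w = rng.normal(0.0, s, (1, k))
        b = rng.uniform(0.0, 2.0 * np.pi, k)
        f = np.sin(c[:, None] @ w + b)
        return f - f.mean(axis=0)

    fx, fy = features(x), features(y)
    # 3) largest canonical correlation between the two feature sets
    cxx = fx.T @ fx / n + reg * np.eye(k)
    cyy = fy.T @ fy / n + reg * np.eye(k)
    cxy = fx.T @ fy / n
    m = np.linalg.solve(cxx, cxy) @ np.linalg.solve(cyy, cxy.T)
    rho2 = np.max(np.linalg.eigvals(m).real)
    return float(np.sqrt(np.clip(rho2, 0.0, 1.0)))

rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, 1500)
z = rng.uniform(-1.0, 1.0, 1500)
rdc_quad = rdc(x, x**2)   # quadratic dependence: high RDC despite near-zero Pearson corr.
rdc_indep = rdc(x, z)     # independent noise: low RDC
```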

2.2.2 Using Adversarial Learning to Ensure Fairness

Machine learning fairness interventions can be broadly classified into three main categories: pre-processing, in-processing, and post-processing. Pre-processing techniques (Kamiran and Calders, 2012; Bellamy et al., 2018; Calmon et al., 2017) seek to adjust the input data to reduce bias prior to training. In contrast, post-processing methods (Hardt et al., 2016; Chen et al., 2019) aim to correct the outputs of already trained models. Meanwhile, in-processing strategies (Zafar et al., 2017; Zhang et al., 2018b; Wadsworth et al., 2018; Louppe et al., 2017) directly tackle bias during the model training phase. This paper places an emphasis on in-processing fairness, with a particular focus on adversarial learning as this approach is recognized as an effective framework for scenarios where intervention during the training process is viable (Louppe et al., 2017; Wadsworth et al., 2018; Zhang et al., 2018a; Grari et al., 2021b).

Among the notable fair adversarial methods, the approach developed by Zhang et al. (2018a) is highlighted:

\min_{w_{g}}\;\mathbb{E}_{(x,y,s)\sim p}\,\mathcal{L_{Y}}(g_{w_{g}}(x),y)\quad\textrm{s.t.}\quad\min_{w_{a}}\;\mathbb{E}_{(x,y,s)\sim p}\,\mathcal{L_{S}}(a_{w_{a}}(g_{w_{g}}(x)),s)>\epsilon^{\prime} (5)

In this adversarial setup, \mathcal{L}_{\mathcal{Y}} denotes the loss associated with the predictor objective and \mathcal{L}_{\mathcal{S}} denotes the loss associated with the sensitive attribute reconstruction, such as a log loss for a binary sensitive attribute. The objective is to train a model g_{w_{g}} that not only minimizes the conventional loss associated with the prediction task but also ensures that an adversarial model a_{w_{a}}, with parameters w_{a}, is unable to accurately infer the sensitive demographic groups from the predictor's output g_{w_{g}}(x). Specifically, the loss \mathcal{L_{S}}(a_{w_{a}}(g_{w_{g}}(x)),s) should exceed a predefined threshold \epsilon^{\prime}. To effectively balance the predictor's accuracy and the adversary's inability to determine sensitive attributes, a relaxed version of the formulation is adopted: \min_{w_{g}}\max_{w_{a}}\mathbb{E}_{(x,y,s)\sim p}\left[\mathcal{L_{Y}}(g_{w_{g}}(x),y)\right]-\lambda_{S}\mathbb{E}_{(x,y,s)\sim p}\left[\mathcal{L_{S}}(a_{w_{a}}(g_{w_{g}}(x)),s)\right]. Here, the coefficient \lambda_{S}\in\mathbb{R}^{+} modulates the balance between the predictor's accuracy and the adversary's performance in reconstructing the sensitive attribute. A higher \lambda_{S} value intensifies the focus on restricting the adversary's capability to predict S, while a lower value favors the enhancement of the predictor's efficiency in its main predictive task.
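A toy linear-model instance of this relaxed min-max can be sketched as follows. The synthetic data, the use of MSE for both \mathcal{L_Y} and \mathcal{L_S}, the linear predictor and adversary, and the hand-coded alternating gradient updates are all illustrative assumptions, not the architecture of the cited works:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4000
s = rng.integers(0, 2, n).astype(float)    # binary sensitive attribute S
x1 = rng.normal(size=n)                    # legitimate feature
x2 = s + 0.2 * rng.normal(size=n)          # proxy feature correlated with S
X = np.column_stack([x1, x2])
y = x1 + x2 + 0.1 * rng.normal(size=n)     # target leaks S through the proxy

def abs_corr(u, v):
    return abs(float(np.corrcoef(u, v)[0, 1]))

# biased baseline: plain least-squares predictor, no adversary
w_base = np.linalg.lstsq(X, y, rcond=None)[0]
corr_base = abs_corr(X @ w_base, s)

# relaxed min-max: linear predictor g(x) = Xw vs. linear adversary a(v) = u*v + c
w = np.zeros(2)
u = c_adv = 0.0
lam, lr = 2.0, 0.05
for _ in range(500):
    g = X @ w
    for _ in range(5):                     # adversary steps: minimize MSE of s from g
        r = u * g + c_adv - s
        u -= lr * 2.0 * float(np.mean(r * g))
        c_adv -= lr * 2.0 * float(np.mean(r))
    r_y = g - y                            # predictor step: minimize L_Y - lam * L_S
    r_s = u * g + c_adv - s
    w -= lr * (2.0 * X.T @ r_y / n - lam * 2.0 * u * (X.T @ r_s) / n)
corr_fair = abs_corr(X @ w, s)
```

After training, the correlation between the predictions and S should drop relative to the unconstrained baseline, at some cost in predictive accuracy governed by lam.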

However, our setting involves debiasing a commercial predictor model with continuous outputs. Applying these debiasing algorithms to regression tasks, such as those using a mean squared loss, fails to achieve the demographic parity objective: it only aligns the conditional expectation \mathbb{E}(S|\widehat{Y})=\mathbb{E}(S) rather than ensuring full distributional independence. For this continuous problem, the HGR approach proposed by (Grari et al., 2021b) uses an adversarial network taking the form of two inter-connected neural networks that approximate the optimal transformation functions \phi and \psi. The HGR neural estimation between two variables U and V, denoted \widehat{HGR}^{w_{\phi},w_{\psi}}_{U\sim\mathcal{D_{U}},V\sim\mathcal{D_{V}}}(U,V), is computed via two inter-connected neural networks \phi and \psi with parameters w_{\phi} and w_{\psi} (Grari et al., 2021b, a):

\widehat{HGR}^{w_{\phi},w_{\psi}}_{U\sim\mathcal{D_{U}},V\sim\mathcal{D_{V}}}(U,V)=\max_{w_{\phi},w_{\psi}}\mathbb{E}_{U\sim\mathcal{D_{U}},V\sim\mathcal{D_{V}}}\big(\widehat{\phi}_{w_{\phi}}(U)\,\widehat{\psi}_{w_{\psi}}(V)\big) (6)

where \mathcal{D_{U}} (resp. \mathcal{D_{V}}) is the distribution of U (resp. V), and \widehat{\phi} (resp. \widehat{\psi}) refers to the standardized outputs of network \phi (resp. \psi). The proposed mitigation approach uses an adversarial network to penalize this HGR estimation: \operatorname*{arg\,min}_{w_{g}}\max_{w_{\phi},w_{\psi}}\mathbb{E}_{(x,y,s)\sim p}\,\mathcal{L_{Y}}(g_{w_{g}}(x),y)+\widehat{HGR}^{w_{\phi},w_{\psi}}_{(x,s)\sim p}(g_{w_{g}}(x),s).

In this paper so far, we have primarily focused on mitigating biases of predictor models with conventional loss functions \mathcal{L}_{Y}, such as the log loss or mean squared error (MSE). It is worth noting the absence of a loss function directly tailored to commercial premiums. For example, Equation (1) has not been treated as a differentiable loss function in the current state of the art. Traditionally, due to the intricate nature of commercial premiums, most approaches concentrate on debiasing the average cost and/or frequency, i.e., the pure premium (Lindholm et al., 2022a; Grari et al., 2022; Xin and Huang, 2023; Lindholm et al., 2022b, 2023; Hu et al., 2023; Moriah et al., 2023), as this effectively mitigates biases inherent in traditional loss functions and can be readily applied at the pure premium layer.

However, complexities arise when considering commercial premium pricing. The introduction of a secondary commercial premium layer may compromise the fairness achieved at the pure premium level, since it introduces an additional objective balancing profit margins with conversion rates, which could inadvertently reintroduce biases.

2.3 Fairness Challenges in Commercial Premium

This section delves into the complexities associated with integrating fairness considerations into commercial premium optimization, with a specific focus on car insurance pricing. To empirically investigate these challenges, we conducted experiments utilizing the Atoti dataset (further details are provided in the Experimental Section), specifically designed for elasticity optimization purposes. In this section, we perform an Indirect Ratebook Optimization with an XGBoost model trained on individually optimized prices, wherein we adjusted the hyperparameter \lambda in Eq. 2 to find a balance between the Global Written Margin (GWM) and minimum conversion rates.

We explore various scenarios by incrementally adjusting the \lambda hyperparameter and evaluating the resultant fairness levels, measured with respect to a sensitive attribute, age, which was excluded from the model training and used solely for fairness assessment. Our findings, depicted in Figure 1, indicate that \lambda influences fairness. For example, the extreme settings, characterized by either low conversion rates (approximately 23%) or higher conversion rates (approximately 27%), exhibit the most pronounced biases. Interestingly, bias decreases at intermediate \lambda values, highlighting the intricate interplay between fairness considerations and premium pricing optimization. These results suggest that addressing fairness in commercial premiums introduces a layer of complexity extending beyond the simple management of profit or conversion rates.

Figure 1: Efficiency frontier analysis.

3 OptiGrad: Price Elasticity Optimization via Gradient Descent

Price elasticity optimization poses a challenge for traditional methods to enforce fairness due to the presence of non-differentiable elements. To overcome this obstacle, we propose a novel approach called OptiGrad. This method introduces a differentiable coefficient model, denoted c_{w_{c}} (with parameters w_{c}), which takes customer features X as input and serves to compute the commercial premium p_{i}=c_{w_{c}}(x_{i})*h_{w_{h}}(x_{i}).

In the following, we will explore two distinct implementations of OptiGrad. The first implementation, discussed in detail in Subsection 3.1, initially focuses on the optimization process without considering fairness enforcement. Subsection 3.2 incorporates fairness into the methodology.

3.1 Formalization without Fairness

In this section, we consider the actuarial pure premium model h_{w_{h}} and the conversion model f_{w_{f}} to be differentiable and already trained. Both models are typically represented using Generalized Linear Models (GLMs), which is common practice in the actuarial field, although it is worth noting that they can also be represented as neural networks.

OptiGrad Formulation

Our main proposition introduces a differentiable predictor model c_{w_{c}} designed to maximize profit margins while ensuring a minimum conversion rate above a level \gamma. This formulation is a Direct Ratebook optimization which determines the optimal weights w_{c} of the coefficient model c_{w_{c}}. The mathematical formulation is the following:

\max_{w_{c}}\quad\mathbb{E}_{x\sim p}\left[(c_{w_{c}}(x)*h_{w_{h}}(x)-h_{w_{h}}(x))*f_{w_{f}}(x,c_{w_{c}}(x)*h_{w_{h}}(x))\right] (7)
s.t.\quad c_{w_{c}}(x)\geq a
c_{w_{c}}(x)\leq b
\mathbb{E}_{x\sim p}\left[f_{w_{f}}(x,c_{w_{c}}(x)*h_{w_{h}}(x))\right]>\gamma

In the above equation, a and b denote the lower and upper bounds for the coefficient c_{w_{c}}(x), respectively, thus forming the interval [a,b]. \gamma represents the minimum expected conversion rate f_{w_{f}}(x,c_{w_{c}}(x)*h_{w_{h}}(x)) over the distribution p.

Bounded Interval Constraint

The primary challenge in this formulation arises from the necessity to confine the coefficient within the bounded interval [a,b]. To address this constraint while circumventing penalization inefficiencies, we employ a transformed sigmoid monotonic function, defined as \widehat{c}_{w_{c}}(x)=\sigma(c_{w_{c}}(x))\cdot(b-a)+a. This approach ensures monotonicity and differentiability while bounding the output within the interval [a,b], where \widehat{c}_{w_{c}}:\mathcal{X}\rightarrow[a,b].
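As a quick sanity check of this transform (the bounds a = 0.8 and b = 1.2 are arbitrary illustrative values):

```python
import numpy as np

def bounded_coeff(z, a=0.8, b=1.2):
    # sigmoid squashed into [a, b]: monotonic, differentiable, bounded
    return 1.0 / (1.0 + np.exp(-z)) * (b - a) + a

z = np.linspace(-6.0, 6.0, 13)
c = bounded_coeff(z)
print(c.min() > 0.8, c.max() < 1.2, bounded_coeff(0.0))   # -> True True 1.0
```

A raw score of zero maps to the midpoint of the interval, i.e., the unadjusted pure premium when the interval is centered on 1.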

To make Eq. 7 differentiable, we implement a Lagrangian relaxation, introducing the \lambda_{f} hyperparameter. The overall OptiGrad optimization problem can thus finally be formulated as:

\min_{w_{c}}\quad-\frac{1}{n}\sum_{i=1}^{n}\left[(\widehat{c}_{w_{c}}(x_{i})\cdot h_{w_{h}}(x_{i})-h_{w_{h}}(x_{i}))*f_{w_{f}}(x_{i},\widehat{c}_{w_{c}}(x_{i})\cdot h_{w_{h}}(x_{i}))\right] (8)
-\underbrace{\lambda_{f}\frac{1}{n}\sum_{i=1}^{n}f_{w_{f}}(x_{i},\widehat{c}_{w_{c}}(x_{i})\cdot h_{w_{h}}(x_{i}))}_{\text{maintaining a conversion rate}}

The optimization problem is therefore defined as minimizing the negative of Equation 7, which amounts to maximizing its two main components: profitability (GWM) and, through the \lambda_{f} term, the conversion rate to be maintained.

To address this optimization challenge, we introduce the OptiGrad algorithm (Algorithm 1). This algorithm employs gradient descent to iteratively update the weights w_{c}, aiming to minimize the objective function detailed in Equation 8. In each iteration, the coefficients \widehat{c}_{w_{c}} are updated on a batch of size b, based on the gradient of the objective function with respect to w_{c}. This procedure is repeated for a predetermined number of epochs or until convergence is achieved, resulting in optimized weights w_{c} that effectively balance profit maximization with adherence to minimum conversion rate requirements.

Algorithm 1 OptiGrad: Price Elasticity Optimization with Gradient Descent
0:  Training data $X$; initialized coefficients $w_c$; pre-trained parameters $w_h$, $w_f$; batch size $b$; number of epochs $n_e$; learning rate $\alpha_c$; regularization parameter $\lambda_f$
1:  for epoch $\in[1,\dots,n_e]$ do
2:     for each mini-batch $X_{\text{batch}}$ of size $b$ do
3:        Update $w_c$ using gradient descent:
4:            $w_c \leftarrow w_c - \alpha_c \nabla_{w_c}\Big(-\frac{1}{b}\sum_{i=1}^{b}\big[(\widehat{c}_{w_c}(x_i)\cdot h_{w_h}(x_i)-h_{w_h}(x_i))\cdot f_{w_f}(x_i,\widehat{c}_{w_c}(x_i)\cdot h_{w_h}(x_i))\big]$
5:            $\qquad\qquad -\lambda_f\frac{1}{b}\sum_{i=1}^{b} f_{w_f}(x_i,\widehat{c}_{w_c}(x_i)\cdot h_{w_h}(x_i))\Big)$
6:     end for
7:  end for

3.2 Formalization with Fairness

In this section, the OptiGrad formulation (referenced in Equation 7) is extended to integrate fairness criteria by leveraging the Hirschfeld-Gebelein-Rényi (HGR) Neural Network (HGR_NN) architecture, referenced in section 2.2.2. The choice of the HGR_NN architecture is motivated by its capability to effectively measure dependencies with continuous features, a crucial aspect given that commercial pricing is continuous (Grari et al., 2021b). Integrating an HGR-based differentiable estimation into the framework enables the optimization of three key objectives: maximizing profit margins, guaranteeing a minimum conversion rate, and reducing bias.

The HGR estimation, implemented via two interconnected networks $\phi_{w_\phi}:\mathcal{P}\rightarrow\mathbb{R}$ and $\psi_{w_\psi}:\mathcal{S}\rightarrow\mathbb{R}$, quantifies the nonlinear dependence between the adjusted commercial premium, denoted $c_{w_c}(x)\cdot h_{w_h}(x)$ (the input to $\phi_{w_\phi}$), and the sensitive attribute (the input to $\psi_{w_\psi}$). This framework enables the evaluation of fairness at each stage of the training process, facilitating corrective measures to mitigate bias. The goal is to ensure that the pricing strategy optimization does not inadvertently reinforce bias associated with the sensitive attribute $S$ (binary or continuous).

$$
\begin{aligned}
\max_{w_c}\quad & \mathbb{E}_{x\sim p}\big[(c_{w_c}(x)\cdot h_{w_h}(x)-h_{w_h}(x))\cdot f_{w_f}(x,\,c_{w_c}(x)\cdot h_{w_h}(x))\big] \quad (9)\\
\text{s.t.}\quad & c_{w_c}(x) \ge a, \qquad c_{w_c}(x) \le b \\
& \mathbb{E}_{x\sim p}\big[f_{w_f}(x,\,c_{w_c}(x)\cdot h_{w_h}(x))\big] > \gamma \\
& \underbrace{\mathbb{E}_{(x,s)\sim p}\big[\widehat{\phi}_{w_{\phi^*}}(c_{w_c}(x)\cdot h_{w_h}(x))\,\widehat{\psi}_{w_{\psi^*}}(s)\big]}_{\text{HGR component}} < \epsilon'
\end{aligned}
$$

Equation 9 introduces the HGR component, which ensures that the nonlinear dependence between the commercial price and the sensitive attribute $S$ remains below a threshold $\epsilon'\in\mathbb{R}$. This HGR estimation corresponds to the expectation of the product of the optimally standardized outputs of both networks, $\widehat{\phi}_{w_{\phi^*}}$ and $\widehat{\psi}_{w_{\psi^*}}$, where $w_\phi^*$ and $w_\psi^*$ are the optimal network parameters maximizing this expectation.
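The HGR component, i.e. the expectation of the product of the standardized adversary outputs, can be sketched on a batch of raw scores as follows (a minimal numpy sketch; in the paper the scores come from the trained networks $\phi_{w_\phi}$ and $\psi_{w_\psi}$, here they are plain arrays):

```python
import numpy as np

def hgr_component(phi_scores, psi_scores, eps=1e-8):
    """Expectation of the product of standardized adversary outputs.

    phi_scores: raw outputs of phi on the commercial premiums, shape (n,)
    psi_scores: raw outputs of psi on the sensitive attribute, shape (n,)
    Standardization (zero mean, unit variance on the batch) matches the
    constraint under which the HGR is defined; the result lies in [-1, 1].
    """
    phi_std = (phi_scores - phi_scores.mean()) / (phi_scores.std() + eps)
    psi_std = (psi_scores - psi_scores.mean()) / (psi_scores.std() + eps)
    return float(np.mean(phi_std * psi_std))
```

For linearly related scores the value approaches 1 (or -1); the adversarial training maximizes this quantity over the network parameters so that the estimate captures the strongest nonlinear dependence.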

By maintaining the same Lagrangian relaxation approach, the overall optimization problem of our Fair Price Elastic Optimization framework (Fair-OptiGrad) can thus be formulated as:

$$
\begin{aligned}
\min_{w_c}\max_{w_\phi,w_\psi}\; & -\frac{1}{n}\sum_{i=1}^{n}\Big[\big(\widehat{c}_{w_c}(x_i)\cdot h_{w_h}(x_i)-h_{w_h}(x_i)\big)\cdot f_{w_f}\big(x_i,\,\widehat{c}_{w_c}(x_i)\cdot h_{w_h}(x_i)\big)\Big] \\
& -\lambda_f\frac{1}{n}\sum_{i=1}^{n} f_{w_f}\big(x_i,\,\widehat{c}_{w_c}(x_i)\cdot h_{w_h}(x_i)\big) \quad (10)\\
& +\lambda_S\frac{1}{n}\sum_{i=1}^{n}\widehat{\phi}_{w_\phi}\big(\widehat{c}_{w_c}(x_i)\cdot h_{w_h}(x_i)\big)\cdot\widehat{\psi}_{w_\psi}(s_i)
\end{aligned}
$$

In this objective, the third term represents the fairness component, whose impact is controlled by the hyperparameter $\lambda_S$. The optimization is transformed into a min-max objective in order to recover, at each step, the optimal networks $\widehat{\psi}_{w_\psi}$ and $\widehat{\phi}_{w_\phi}$. The inner maximization of the HGR estimation can be run for multiple gradient-ascent steps per gradient-descent iteration on $w_c$, which yields a more accurate fairness estimate at each stage of the training process.
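The full batch objective of Eq. 10 can be sketched as follows, assuming the premiums, conversion probabilities, and adversary scores are given as precomputed arrays (a minimal numpy sketch; in practice each term is produced by a differentiable network so that gradients reach $w_c$, $w_\phi$, and $w_\psi$):

```python
import numpy as np

def fair_optigrad_loss(c_hat, h, f_at_price, phi_scores, psi_scores,
                       lambda_f, lambda_s):
    """Batch version of the Fair-OptiGrad objective (Eq. 10).

    c_hat      : bounded coefficients, shape (b,)
    h          : technical premiums, shape (b,)
    f_at_price : conversion probabilities at the adjusted price, shape (b,)
    phi_scores : standardized phi outputs on the commercial premiums, shape (b,)
    psi_scores : standardized psi outputs on the sensitive attribute, shape (b,)
    """
    margin_term = -np.mean((c_hat * h - h) * f_at_price)         # profit (negated)
    conversion_term = -lambda_f * np.mean(f_at_price)            # conversion floor
    fairness_term = lambda_s * np.mean(phi_scores * psi_scores)  # HGR penalty
    return margin_term + conversion_term + fairness_term
```

The outer player minimizes this loss over $w_c$, while the adversary maximizes the fairness term over $w_\phi, w_\psi$, which is what makes the penalty track the strongest detectable dependence.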

Algorithm 2 Fair-OptiGrad: Price Elasticity Optimization with Demographic Parity
0:  Training data $X$; initialized coefficients $w_c$; pre-trained parameters $w_h$, $w_f$; initialized networks $\widehat{\phi}_{w_\phi}$ and $\widehat{\psi}_{w_\psi}$; batch size $b$; number of epochs $n_e$; learning rates $\alpha_c$, $\alpha_\phi$, $\alpha_\psi$; regularization parameters $\lambda_f$, $\lambda_S$; number of ascent steps $n_a$
1:  for epoch $\in[1,\dots,n_e]$ do
2:     for each mini-batch $(X_{\text{batch}}, S_{\text{batch}})$ of size $b$ do
3:        Standardize the outputs of $\widehat{\phi}_{w_\phi}$ and $\widehat{\psi}_{w_\psi}$
4:        for $j=1$ to $n_a$ do
5:           Update $\widehat{\phi}_{w_\phi}$ and $\widehat{\psi}_{w_\psi}$ by gradient ascent to maximize the HGR estimation:
6:               $w_\phi \leftarrow w_\phi + \alpha_\phi \nabla_{w_\phi}\Big(\frac{1}{b}\sum_{i=1}^{b}\widehat{\phi}_{w_\phi}(\widehat{c}_{w_c}(x_i)\cdot h_{w_h}(x_i))\cdot\widehat{\psi}_{w_\psi}(s_i)\Big)$
7:               $w_\psi \leftarrow w_\psi + \alpha_\psi \nabla_{w_\psi}\Big(\frac{1}{b}\sum_{i=1}^{b}\widehat{\phi}_{w_\phi}(\widehat{c}_{w_c}(x_i)\cdot h_{w_h}(x_i))\cdot\widehat{\psi}_{w_\psi}(s_i)\Big)$
8:        end for
9:        Update $w_c$ using gradient descent to minimize the loss:
10:           $w_c \leftarrow w_c - \alpha_c \nabla_{w_c}\Big(-\frac{1}{b}\sum_{i=1}^{b}\big[(\widehat{c}_{w_c}(x_i)\cdot h_{w_h}(x_i)-h_{w_h}(x_i))\cdot f_{w_f}(x_i,\widehat{c}_{w_c}(x_i)\cdot h_{w_h}(x_i))\big]$
              $\qquad\qquad -\lambda_f\frac{1}{b}\sum_{i=1}^{b} f_{w_f}(x_i,\widehat{c}_{w_c}(x_i)\cdot h_{w_h}(x_i))$
              $\qquad\qquad +\lambda_S\frac{1}{b}\sum_{i=1}^{b}\widehat{\phi}_{w_\phi}(\widehat{c}_{w_c}(x_i)\cdot h_{w_h}(x_i))\cdot\widehat{\psi}_{w_\psi}(s_i)\Big)$
11:    end for
12: end for

The algorithm takes as input a training set from which it samples batches of size $b$ at each iteration. It first standardizes the output scores of the networks $\phi_{w_\phi}$ and $\psi_{w_\psi}$ to ensure zero mean and unit variance on the batch. It then maximizes the adversarial objective, computing the HGR neural estimate with $n_a$ gradient-ascent steps. At the end of each iteration, the algorithm updates the pricing parameters $w_c$ by one step of gradient descent.

4 Experiments

The dataset studied in this experiment comes from Atoti (https://data.atoti.io/notebooks/price-elasticity/data.csv and https://data.atoti.io/notebooks/price-elasticity/test_df.csv), based on the Kaggle automobile insurance quotes and sales dataset (https://www.kaggle.com/datasets/ranja7/vehicle-insurance-customer-data), enriched through synthetic data generation. It comprises 46,129 instances, divided into training (60%), development (25%), and test (20%) sets. Of the 18 variables present, the customer identifier (cust_id) has been excluded. The Sale variable is employed to train a logistic regression model, designated as $f_{w_f}(X)$. In line with the methodology adopted in De Larrard (2016), a logarithmic transformation is applied to the historically observed price. This transformation is critical for ensuring that the conversion probability model $f_{w_f}$ outputs a value of zero for infinitely high prices and reaches its maximum at a price of zero. As illustrated in Figure 2, the conversion function exhibits a substantial dependency on the price, with the probability decreasing as the price increases; when the price is significantly high, the demand probability approaches zero. In Figure 3, we observe that if the prices of the whole portfolio are increased by a fixed percentage, the conversion rate decreases from 0.3 to 0.18.

Figure 2: The conversion model exhibits a dependency on the price, with the probability decreasing as the price increases.

Figure 3: Conversion rate versus a fixed percentage increase of the price across the whole portfolio.

To assess our OptiGrad framework, two empirical scenarios have been designed. The first uses OptiGrad and compares it with conventional methods that do not include fairness considerations. The second focuses on enforcing demographic parity to ensure fairness. It should be noted that although the methodology discussed in Treetanthiploet et al. (2023) also considers a direct Ratebook optimization, it cannot be directly compared to our study. The difference stems from its reliance on market quantile prices from competitors, which is outside the scope of this paper, and from its omission of the local constraints that are central to our strategy.

OptiGrad - Without Fairness

In our first experiment, we compare the OptiGrad framework against conventional state-of-the-art methods, specifically Individual Optimization (as outlined in Section 2.1), which is constrained to the training data, and Indirect Ratebook Optimization, which employs an XGBoost model (300 trees with a maximum depth of 5) trained on the outputs of the Individual Optimization. This offline configuration permits the evaluation of the latter on both the test and dev datasets. Moreover, for our OptiGrad methodology we investigate the effect of model complexity on the coefficient $c_{w_c}$, comparing a simple regression against a deep neural network. For all scenarios we constrain the upper and lower limits to 1.6 and 1.2, respectively. Each algorithm has a trade-off hyperparameter that varies the minimum conversion rate; we therefore ensured a thorough exploration of these trade-offs by sweeping across hyperparameter values for each algorithm.
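The sweep over the trade-off hyperparameter can be sketched on a toy single-risk portfolio: for each $\lambda_f$ we grid-search the uniform coefficient in $[1.2, 1.6]$ that minimizes the Eq.-8-style loss under a hypothetical logistic demand curve (the demand weights `w0`, `w1` below are invented for illustration, not fitted values; the real sweep optimizes a full coefficient model per risk):

```python
import math

def demand(price, w0=5.0, w1=-1.0):
    # Toy logistic demand curve with a log-price feature (hypothetical weights).
    return 1.0 / (1.0 + math.exp(-(w0 + w1 * math.log(price))))

def best_uniform_coefficient(h, lambda_f, a=1.2, b=1.6, steps=81):
    """Grid-search the single coefficient in [a, b] minimizing the
    Eq.-8-style loss -(margin + lambda_f * conversion) for one risk."""
    best_c, best_loss = None, float("inf")
    for k in range(steps):
        c = a + (b - a) * k / (steps - 1)
        f = demand(c * h)                       # conversion at the adjusted price
        loss = -((c * h - h) * f + lambda_f * f)
        if loss < best_loss:
            best_c, best_loss = c, loss
    return best_c

# Sweeping lambda_f traces the margin/conversion trade-off: larger weights on
# conversion push the optimal coefficient toward the lower bound.
frontier = [(lam, best_uniform_coefficient(h=100.0, lambda_f=lam))
            for lam in (0.0, 100.0, 250.0, 500.0)]
```

With no conversion pressure the optimum sits at the upper bound (maximum margin); as $\lambda_f$ grows, the optimum migrates toward the lower bound, which is precisely the frontier behavior examined in Figure 4.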

Figure 4: Efficiency frontier analysis.

The results, depicted in Figure 4, indicate that our OptiGrad approach outperforms the traditional methodologies on the test and dev datasets. On the training dataset, Individual Optimization exhibits comparable effectiveness. The least effective model is the Indirect Optimization, which significantly underperforms due to the loss of performance incurred when reverse-engineering the downstream individually optimized prices. Furthermore, we observe that increasing the complexity of our coefficient model $c_{w_c}$ has no impact.

Figure 5: Fair-OptiGrad - Efficiency frontier analysis.

OptiGrad - With Fairness Enforcement

To analyse the enforcement of fairness in our OptiGrad framework, we investigated two distinct scenarios: (i) a continuous sensitive attribute, Age, and (ii) a binary sensitive attribute, Employment status. It is important to note that these sensitive attributes are used solely during the training phase and are not employed in the testing phase, which is dedicated to the assessment of fairness.

Our Fair-OptiGrad algorithm incorporates two hyperparameters to balance the trade-off between achieving a minimum conversion rate and enforcing fairness. Following a similar approach to our initial scenario, we conducted an exhaustive hyperparameter sweep to optimally balance these two components.

In Figure 5, we observe that adjusting the parameter $\lambda_S$ effectively promotes fairness across different demographics. Increasing $\lambda_S$ from 0 to 1250 results in a significant reduction of the HGR, from 0.15 to 0.07 for age and from 0.14 to 0.06 for employment status, indicating a robust enforcement of fairness as $\lambda_S$ increases. The most important observation is that the Pareto front for the different values of $\lambda_S$ remains similar, with the exception of the extreme value ($\lambda_S=1250$), while unfairness (HGR) is reduced on average.

5 Conclusion

To conclude, our research introduces the OptiGrad framework as an effective tool for navigating the complexities of (i) profit maximization, (ii) conversion rate optimization, and (iii) fairness in the context of commercial insurance premiums. This approach outperforms traditional discrete pricing methods by utilizing continuous rate optimization, resulting in improved precision and efficiency. Importantly, OptiGrad innovatively integrates fairness directly into the computation of commercial premiums, addressing a blind spot in previous approaches that focused solely on optimizing pure premiums.

The outcomes of this study underscore the effectiveness of standard gradient descent techniques in augmenting insurance pricing strategies, providing a foundation for achieving a balance between profitability and ethical considerations in insurance pricing. Nevertheless, this approach requires the differentiability of the different components.

Future research directions include better understanding extreme cases where fairness requirements may be unrealistic. Furthermore, we are also interested in exploring additional fairness criteria, such as equalized odds or equal opportunity, which could further refine the fairness and societal impact of insurance pricing algorithms.


References

  • Angwin et al. (2017) J. Angwin, J. Larson, L. Kirchner, and S. Mattu. Minority neighborhoods pay higher car insurance premiums than white areas with the same risk. ProPublica, April, 5:2017, 2017.
  • Bellamy et al. (2018) R. K. Bellamy, K. Dey, M. Hind, S. C. Hoffman, S. Houde, K. Kannan, P. Lohia, J. Martino, S. Mehta, A. Mojsilovic, et al. Ai fairness 360: An extensible toolkit for detecting, understanding, and mitigating unwanted algorithmic bias. arXiv, 1810.01943, 2018.
  • Calmon et al. (2017) F. P. Calmon, D. Wei, K. N. Ramamurthy, and K. R. Varshney. Optimized data pre-processing for discrimination prevention. arXiv, 1704.03354, 2017.
  • Chen et al. (2019) J. Chen, N. Kallus, X. Mao, G. Svacha, and M. Udell. Fairness under unawareness: Assessing disparity when protected class is unobserved. In Proceedings of the Conference on Fairness, Accountability, and Transparency, pages 339–348, 2019.
  • COM (2021) E. COM. Laying down harmonised rules on artificial intelligence (artificial intelligence act) and amending certain union legislative acts. Proposal for a regulation of the European parliament and of the council, 2021.
  • De Larrard (2016) A. De Larrard. Commercial price optimization strategies in car insurance. Insitut des actuaires, 36(2):614–645, 2016.
  • Dwork et al. (2011) C. Dwork, M. Hardt, T. Pitassi, O. Reingold, and R. Zemel. Fairness Through Awareness. arXiv, 1104.3913, 2011. ISSN 0039-6109. doi: 10.1145/2090236.2090255. URL http://arxiv.org/abs/1104.3913.
  • Grari et al. (2021a) V. Grari, O. E. Hajouji, S. Lamprier, and M. Detyniecki. Learning unbiased representations via rényi minimization. In Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part II 21, pages 749–764. Springer, 2021a.
  • Grari et al. (2021b) V. Grari, S. Lamprier, and M. Detyniecki. Fairness-aware neural rényi minimization for continuous features. In Proceedings of the Twenty-Ninth International Conference on International Joint Conferences on Artificial Intelligence, pages 2262–2268, 2021b.
  • Grari et al. (2022) V. Grari, A. Charpentier, and M. Detyniecki. A fair pricing model via adversarial learning. arXiv preprint arXiv:2202.12008, 2022.
  • Hardt et al. (2016) M. Hardt, E. Price, and N. Srebro. Equality of opportunity in supervised learning. In Advances in neural information processing systems, pages 3315–3323, 2016.
  • Hashorva et al. (2018) E. Hashorva, G. Ratovomirija, M. Tamraz, and Y. Bai. Some mathematical aspects of price optimisation. Scandinavian Actuarial Journal, 2018(5):379–403, 2018.
  • Hu et al. (2023) F. Hu, P. Ratz, and A. Charpentier. Fairness in multi-task learning via wasserstein barycenters. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pages 295–312. Springer, 2023.
  • Ito and Fujimaki (2017) S. Ito and R. Fujimaki. Optimization beyond prediction: Prescriptive price optimization. In Proceedings of the 23rd ACM SIGKDD international conference on knowledge discovery and data mining, pages 1833–1841, 2017.
  • Kamiran and Calders (2012) F. Kamiran and T. Calders. Data preprocessing techniques for classification without discrimination. Knowledge and Information Systems, 33(1):1–33, 2012.
  • Lindholm et al. (2022a) M. Lindholm, R. Richman, A. Tsanakas, and M. V. Wüthrich. Discrimination-free insurance pricing. ASTIN Bulletin: The Journal of the IAA, 52(1):55–89, 2022a.
  • Lindholm et al. (2022b) M. Lindholm, R. Richman, A. Tsanakas, and M. V. Wüthrich. A discussion of discrimination and fairness in insurance pricing, 2022b.
  • Lindholm et al. (2023) M. Lindholm, R. Richman, A. Tsanakas, and M. V. Wuthrich. What is fair? proxy discrimination vs. demographic disparities in insurance pricing. Proxy Discrimination vs. Demographic Disparities in Insurance Pricing (May 2, 2023), 2023.
  • Lopez-Paz et al. (2013) D. Lopez-Paz, P. Hennig, and B. Schölkopf. The randomized dependence coefficient. In Advances in neural information processing systems, pages 1–9, 2013.
  • Louppe et al. (2017) G. Louppe, M. Kagan, and K. Cranmer. Learning to pivot with adversarial networks. In Advances in neural information processing systems, pages 981–990, 2017.
  • Moriah et al. (2023) M. Moriah, F. Vermet, and A. Charpentier. Measuring and mitigating biases in motor insurance pricing. arXiv preprint arXiv:2311.11900, 2023.
  • Parliament and of the European Union (2016) E. Parliament and C. of the European Union. Regulation (eu) 2016/679 of the european parliament and of the council, 2016. URL https://data.europa.eu/eli/reg/2016/679/oj.
  • Saxena et al. (2024) N. A. Saxena, W. Zhang, and C. Shahabi. Spatial fairness: The case for its importance, limitations of existing work, and guidelines for future research. arXiv preprint arXiv:2403.14040, 2024.
  • Schmeiser et al. (2014) H. Schmeiser, T. Störmer, and J. Wagner. Unisex insurance pricing: Consumers’ perception and market implications. The Geneva Papers on Risk and Insurance - Issues and Practice, 39(2):322–350, 2014.
  • Treetanthiploet et al. (2023) T. Treetanthiploet, Y. Zhang, L. Szpruch, I. Bowers-Barnard, H. Ridley, J. Hickey, and C. Pearce. Insurance pricing on price comparison websites via reinforcement learning. arXiv preprint arXiv:2308.06935, 2023.
  • Verschuren (2022) R. M. Verschuren. Customer price sensitivities in competitive insurance markets. Expert Systems with Applications, 202:117133, 2022.
  • Wadsworth et al. (2018) C. Wadsworth, F. Vera, and C. Piech. Achieving fairness through adversarial learning: an application to recidivism prediction. arXiv:1807.00199, 2018.
  • Xin and Huang (2023) X. Xin and F. Huang. Antidiscrimination insurance pricing: Regulations, fairness criteria, and models. North American Actuarial Journal, pages 1–35, 2023.
  • Zafar et al. (2017) M. B. Zafar, I. Valera, M. G. Rogriguez, and K. P. Gummadi. Fairness Constraints: Mechanisms for Fair Classification. In AISTATS’17, pages 962–970, Fort Lauderdale, FL, USA, 20–22 Apr 2017.
  • Zhang et al. (2018a) B. H. Zhang, B. Lemoine, and M. Mitchell. Mitigating Unwanted Biases with Adversarial Learning. Association for the Advancement of Artificial Intelligence, jan 2018a. ISSN 15477401. doi: 10.1080/08827508.2012.738731.
  • Zhang et al. (2018b) B. H. Zhang, B. Lemoine, and M. Mitchell. Mitigating unwanted biases with adversarial learning. In AAAI’18, pages 335–340, 2018b.