Hierarchical Optimal Power Flow with Improved Gradient Evaluation

Heng Liang,¹ Xinyang Zhou,² and Changhong Zhao¹ This work was supported by Hong Kong Research Grants Council through ECS Award No. 24210220 and by CUHK through faculty startup fund.¹H. Liang and C. Zhao are with the Department of Information Engineering, the Chinese University of Hong Kong, New Territories, HKSAR, China. {lh021, chzhao}@ie.cuhk.edu.hk²X. Zhou is with the National Renewable Energy Laboratory, Golden, CO, USA. [email protected]©2022 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. This paper is accepted for presentation at the American Control Conference, Atlanta, GA, USA, 2022.

Abstract

Existing algorithms to solve alternating-current optimal power flow (AC-OPF) often exploit linear approximations to simplify system models and accelerate computations. In this paper, we improve a recent hierarchical OPF algorithm, which rested on primal-dual gradients evaluated in a linearized distribution power flow model. Specifically, we identify a risk of voltage violation arising from the model linearization, and propose a more accurate gradient evaluation method to eliminate that risk. We further develop a hierarchical primal-dual algorithm to solve OPF based on the proposed gradient evaluation method. Numerical results on IEEE networks show that our algorithm can enhance voltage safety with satisfactory computational efficiency.

I Introduction

Optimal power flow (OPF) is a fundamental optimization problem that aims to find a cost-minimizing operating point subject to the physical laws and safety limits of a power network. The power sector is witnessing a rising need for fast and scalable OPF solvers, which can instruct a large network of controllable units (smart appliances, electric vehicles, energy storage devices, etc.) to make timely response to the growing variations of wind and solar generations. Such a need is especially urgent in power distribution networks, where massive renewable energy sources and controllable units are being deployed. However, distribution networks are also where the speed and scalability requirements are most challenging, as the high resistance-to-reactance ratios of distribution lines necessitate nonlinear and nonconvex alternating-current (AC) OPF rather than its simple direct-current approximation.

Numerous efforts have been made to overcome the computational challenges to AC-OPF. Many of them conducted convex relaxations, convex inner approximations, or linearizations of AC power flow; comprehensive reviews were provided in [1, 2]. Meanwhile, distributed OPF algorithms were developed and shown to be more scalable in terms of computation and communication and more robust to single-point failure, compared to their centralized counterparts [3, 4, 5, 6]. To further reduce computational efforts associated with solving AC power flow, some OPF algorithms were implemented by iteratively actuating the power system with intermediate decisions and updating the decisions based on system feedback [7, 8, 9].

From the vast literature, we bring to attention a hierarchical distributed primal-dual gradient algorithm [10] (and its extension to multi-phase networks [11]). It leveraged the radial structure of distribution networks to avoid repetitive computation and communication, and thus significantly accelerated large-scale OPF computations. However, the algorithm in [10] derived approximate gradients from the linearized distribution power flow model in [12, 13]. Such linearization may cause the solver to optimistically estimate nodal voltages to be safe, while they actually already exceed safety limits. The consequent risk of voltage violation will be revealed later in this paper.

To prevent such violation, we develop an improved gradient evaluation method by approximating the partial derivatives of the quadratic terms associated with line currents and power losses. Our analysis shows that with moderate extra computations, the proposed method returns more accurate gradient estimations than [10]. Meanwhile, the proposed method preserves the structure in [10] for gradient evaluation, and thus enables us to develop an improved hierarchical OPF algorithm. Numerical experiments on IEEE 37-node and 123-node networks demonstrate enhanced safety of voltage regulation achieved by the proposed algorithm with a moderate increase in computation time, compared to the previous method based on the linearized model.

The rest of this paper is organized as follows. Section II introduces the distribution network model, the OPF problem, and a primal-dual algorithm to solve it. Section III motivates and elaborates the improved gradient evaluation method. Section IV presents the hierarchical OPF algorithm based on the improved gradient evaluation. Section V reports the numerical experiments. Section VI concludes this paper.

II Modeling and Preliminary Algorithm

II-A Power network model and OPF problem

We model a single-phase equivalent power distribution network as a directed tree graph $\mathcal{T}:=\{\mathcal{N}^{+},\mathcal{E}\}$ , where $\mathcal{N}^{+}=\mathcal{N}\cup\{0\}$ , with $0$ indexing the root node (the substation, also known as the slack bus), and $\mathcal{N}=\{1,...,N\}$ indexing other nodes. The set $\mathcal{E}$ collects the ordered pairs of nodes representing the lines in the network. We define the directions of lines as pointing away from the root, e.g., in line $(i,j)$ , node $i$ is closer to the root than node $j$ . Let $p_{i}$ and $q_{i}$ denote the net active and reactive power injections (i.e., power supply minus consumption) and $v_{i}$ denote the squared voltage magnitude at each node $i\in\mathcal{N}^{+}$ . Let $\ell_{ij}$ denote the squared current magnitude, $z_{ij}=r_{ij}+\boldsymbol{i}x_{ij}$ denote the series impedance, and $P_{ij}$ and $Q_{ij}$ denote the sending-end active and reactive power on each line $(i,j)\in\mathcal{E}$ .

We adopt the distribution power flow model [12, 13]:


$\displaystyle P_{ij}$	$\displaystyle=-p_{j}+\sum_{k:(j,k)\in\mathcal{E}}P_{jk}+r_{ij}\ell_{ij},~{}\forall j\in\mathcal{N}$	(1a)
$\displaystyle Q_{ij}$	$\displaystyle=-q_{j}+\sum_{k:(j,k)\in\mathcal{E}}Q_{jk}+x_{ij}\ell_{ij},~{}\forall j\in\mathcal{N}$	(1b)
$\displaystyle v_{j}$	$\displaystyle=v_{i}\!-\!2\left(r_{ij}P_{ij}\!+\!x_{ij}Q_{ij}\right)\!+\!\left\|z_{ij}\right\|^{2}\ell_{ij},\forall(i,j)\in\mathcal{E}$	(1c)
$\displaystyle\ell_{ij}v_{i}$	$\displaystyle=P_{ij}^{2}+Q_{ij}^{2},~{}\forall(i,j)\in\mathcal{E}.$	(1d)

Assume the power injections $(p_{i},q_{i})$ to be controllable within a given operating region:

\displaystyle\mathcal{Y}_{i}=\left\{\left(p_{i},q_{i}\right)\mid\underline{p}_{i}\leqslant p_{i}\leqslant\bar{p}_{i},\underline{q}_{i}\leqslant q_{i}\leqslant\bar{q}_{i}\right\},~{}\forall i\in\mathcal{N}.

(2)

Define vectors $\boldsymbol{p}:=\left[p_{1},\ldots,p_{N}\right]^{\top}$ , $\boldsymbol{q}:=\left[q_{1},\ldots,q_{N}\right]^{\top}$ , $\boldsymbol{v}:=\left[v_{1},\ldots,v_{N}\right]^{\top}\in\mathbb{R}^{N}$ . Following [8], we express the voltage vector as a function of power injections, i.e., $\boldsymbol{v}(\boldsymbol{p},\boldsymbol{q})$ , implicitly defined by (1). Consider the following OPF problem:


$\displaystyle\min_{\boldsymbol{p},\boldsymbol{q}}$	$\displaystyle\sum_{i\in\mathcal{N}}f_{i}\left(p_{i},q_{i}\right)$	(3a)
s.t.	$\displaystyle\underline{\boldsymbol{v}}\leqslant\boldsymbol{v}(\boldsymbol{p},\boldsymbol{q})\leqslant\overline{\boldsymbol{v}}$	(3b)
	$\displaystyle\left(p_{i},q_{i}\right)\in\mathcal{Y}_{i},~{}\forall i\in\mathcal{N}$	(3c)

where $f_{i}$ is a strongly convex cost function for the control of power injections at node $i\in\mathcal{N}$ , inequality (3b) is element-wise, given constant squared voltage limits $\underline{\boldsymbol{v}}$ and $\overline{\boldsymbol{v}}$ .

II-B Primal-dual gradient algorithm

Let $\underline{\boldsymbol{\mu}}$ and $\overline{\boldsymbol{\mu}}$ be the dual variables associated with (3b) to penalize the violation of voltage lower and upper bounds, respectively. To design a convergent primal-dual algorithm, we consider the regularized Lagrangian of (3):

	$\displaystyle\mathcal{L}_{\epsilon}(\boldsymbol{p},\boldsymbol{q};\overline{\boldsymbol{\mu}},\underline{\boldsymbol{\mu}})=\sum_{i\in\mathcal{N}}f_{i}\left(p_{i},q_{i}\right)\qquad\qquad$
	$\displaystyle+\underline{\boldsymbol{\mu}}^{\top}(\underline{\boldsymbol{v}}-\boldsymbol{v}(\boldsymbol{p},\boldsymbol{q}))+\overline{\boldsymbol{\mu}}^{\top}(\boldsymbol{v}(\boldsymbol{p},\boldsymbol{q})-\overline{\boldsymbol{v}})-\frac{\epsilon}{2}\\|\boldsymbol{\mu}\\|_{2}^{2}$		(4)

where $\boldsymbol{\mu}=\left(\underline{\boldsymbol{\mu}},\overline{\boldsymbol{\mu}}\right)$ and $\epsilon>0$ is a constant factor for the regularization term. Function (II-B) is strongly convex in primal variables $(\boldsymbol{p},\boldsymbol{q})$ and strongly concave in dual variables $\boldsymbol{\mu}$ . The saddle point of (II-B) serves as an approximate solution to (3), with a bounded error that is linear in $\epsilon$ [10, 14]. The following primal-dual gradient algorithm has been commonly used to iteratively approach such a saddle point:


	$\displaystyle\boldsymbol{u}(t+1)=\bigg{[}\boldsymbol{u}(t)-\sigma_{u}\ \left(\frac{\partial f\left(\boldsymbol{u}(t)\right)}{\partial\boldsymbol{u}}\right)^{\top}$
	$\displaystyle-\sigma_{u}\left(\frac{\partial\boldsymbol{v}\left(\boldsymbol{u}(t)\right)}{\partial\boldsymbol{u}}\right)^{\top}\left(\overline{\boldsymbol{\mu}}(t)-\underline{\boldsymbol{\mu}}(t)\right)\bigg{]}_{\mathcal{Y}}$		(5a)
	$\displaystyle\underline{\boldsymbol{\mu}}(t+1)=\left[\underline{\boldsymbol{\mu}}(t)+\sigma_{\mu}\left(\underline{\boldsymbol{v}}-\boldsymbol{v}(t)-\epsilon\underline{\boldsymbol{\mu}}(t)\right)\right]_{+}$		(5b)
	$\displaystyle\overline{\boldsymbol{\mu}}(t+1)=\left[\overline{\boldsymbol{\mu}}(t)+\sigma_{\mu}\left(\boldsymbol{v}(t)-\overline{\boldsymbol{v}}-\epsilon\overline{\boldsymbol{\mu}}(t)\right)\right]_{+}$		(5c)

where $\boldsymbol{u}:=(\boldsymbol{p},\boldsymbol{q})$ collects the controllable power injections; $f(\boldsymbol{u}):=\sum_{i\in\mathcal{N}}f_{i}(p_{i},q_{i})$ is the objective (3a); $\boldsymbol{v}(t)=\boldsymbol{v}\left(\boldsymbol{u}(t)\right)$ both denote the voltage profile under power injections $\boldsymbol{u}(t)$ at time step $t$ . Note that $\boldsymbol{v}\left(\boldsymbol{u}(t)\right)$ does not have to be calculated by (1) or any other mathematical model of power flow; instead, it can be measured from the power network after actuating the network with the intermediate decisions $\boldsymbol{u}(t)$ during the process (5). This feedback-based implementation has been widely adopted to compensate for likely modeling errors [7, 8, 9, 11]. The subscripts $\mathcal{Y}$ and $+$ represent the projections onto $\prod_{i\in\mathcal{N}}\mathcal{Y}_{i}$ and the nonnegative orthant, respectively. Without loss of generality, we use constant step sizes $\sigma_{u},\sigma_{\mu}>0$ to update the primal and dual variables, respectively.

The gradient $\partial{f}/\partial{\boldsymbol{u}}$ in (5a) can be easily computed, and the voltage $\boldsymbol{v}(t)$ in (5b)–(5c) can be conveniently measured, both locally at each node $i\in\mathcal{N}$ . Therefore, the remaining key challenge is to compute the gradient $\partial{\boldsymbol{v}}/\partial{\boldsymbol{u}}$ that potentially couples all the nodes in the power network. This will be addressed in the next section.

III Improved Gradient Evaluation

III-A Prior work and motivation for improvement

In power flow equations (1), the nonlinear and implicit dependence of voltage $\boldsymbol{v}$ on power injections $\boldsymbol{u}$ makes it difficult to quickly and precisely compute the gradient $\partial{\boldsymbol{v}}/\partial{\boldsymbol{u}}$ . Reference [8] proposed a backward-forward sweep method for gradient calculation, which is computationally inefficient in large networks. In contrast, prior work [10] considered the linearized power flow model in [12, 13]:


$\displaystyle\hat{P}_{ij}$	$\displaystyle=-p_{j}+\sum_{k:(j,k)\in\mathcal{E}}\hat{P}_{jk},~{}\forall j\in\mathcal{N}$	(6a)
$\displaystyle\hat{Q}_{ij}$	$\displaystyle=-q_{j}+\sum_{k:(j,k)\in\mathcal{E}}\hat{Q}_{jk},~{}\forall j\in\mathcal{N}$	(6b)
$\displaystyle\hat{v}_{j}$	$\displaystyle=\hat{v}_{i}-2\left(r_{ij}\hat{P}_{ij}+x_{ij}\hat{Q}_{ij}\right),~{}\forall(i,j)\in\mathcal{E}$	(6c)

which yields the following approximation of $\partial{\boldsymbol{v}}/\partial{\boldsymbol{u}}$ :


	$\displaystyle\frac{\partial\hat{v}_{j}(\boldsymbol{p},\boldsymbol{q})}{\partial{p_{h}}}=R_{jh}:=\sum_{(i,k)\in\mathbb{P}_{j\wedge h}}2\cdot r_{ik},~{}\forall j,h\in\mathcal{N}$		(7a)
	$\displaystyle\frac{\partial\hat{v}_{j}(\boldsymbol{p},\boldsymbol{q})}{\partial{q_{h}}}=X_{jh}:=\sum_{(i,k)\in\mathbb{P}_{j\wedge h}}2\cdot x_{ik},~{}\forall j,h\in\mathcal{N}$		(7b)

where $\mathbb{P}_{j\wedge h}$ denotes the common part of the unique paths from nodes $j$ and $h$ back to the root node.

We illustrate the error of linear model (6) by introducing the following Lemma.

Lemma 1

Let $g_{\xi}$ and $\eta_{\xi}$ denote the active and reactive power loss downstream of node $\xi\in\mathcal{N}$ . The voltage error induced by (6) for all $h\in\mathcal{N}$ is:

\displaystyle\hat{v}_{h}-v_{h}=\sum_{(\zeta,\xi)\in\mathbb{P}_{h}}\left(2\left(r_{\zeta\xi}g_{\xi}+x_{\zeta\xi}\eta_{\xi}\right)+(r_{\zeta\xi}^{2}+x_{\zeta\xi}^{2})\ell_{\zeta\xi}\right),

(8)

where the expressions of $g_{\xi}$ and $\eta_{\xi}$ are:

\displaystyle g_{\xi}=\sum_{(\alpha,\beta)\in\operatorname{down}(\xi)}r_{\alpha\beta}\ell_{\alpha\beta},\quad\eta_{\xi}=\sum_{(\alpha,\beta)\in\operatorname{down}(\xi)}x_{\alpha\beta}\ell_{\alpha\beta}.

Proof:

See Appendix.

Lemma 1 reveals that the voltage computed with the linear model (6) is higher than that under the accurate nonlinear model (1). Such errors would accumulate as we trace the nodes further away from the root. Therefore, under the power injections $(\boldsymbol{p},\boldsymbol{q})$ solved by (5) with the linear model (6) and the approximate gradient (7), even though the model optimistically estimates the voltages to be safe, the actual voltages may already drop below their lower bounds. The need to prevent such voltage violation motivates us to develop an improved gradient evaluation method, as elaborated below.

III-B Improved gradient evaluation

Consider an arbitrary node $h\in\mathcal{N}$ , and $u_{h}:=(p_{h},q_{h})$ . We take the partial derivatives of the variables in the nonlinear power flow model (1):


$\displaystyle\frac{\partial P_{ij}}{\partial u_{h}}=$	$\displaystyle-\frac{\partial p_{j}}{\partial u_{h}}+\sum_{k:(j,k)\in\mathcal{E}}\frac{\partial P_{jk}}{\partial u_{h}}+r_{ij}\frac{\partial\ell_{ij}}{\partial u_{h}},~{}\forall j\in\mathcal{N}$	(9a)
$\displaystyle\frac{\partial Q_{ij}}{\partial u_{h}}=$	$\displaystyle-\frac{\partial q_{j}}{\partial u_{h}}+\sum_{k:(j,k)\in\mathcal{E}}\frac{\partial Q_{jk}}{\partial u_{h}}+x_{ij}\frac{\partial\ell_{ij}}{\partial u_{h}},\forall j\in\mathcal{N}$	(9b)
$\displaystyle\frac{\partial v_{j}}{\partial u_{h}}=$	$\displaystyle\frac{\partial v_{i}}{\partial u_{h}}\!-\!2\left(r_{ij}\frac{\partial P_{ij}}{\partial u_{h}}\!+\!x_{ij}\frac{\partial Q_{ij}}{\partial u_{h}}\right)\!+\!\left\|z_{ij}\right\|^{2}\frac{\partial\ell_{ij}}{\partial u_{h}}$	(9c)
$\displaystyle\frac{\partial\ell_{ij}}{\partial u_{h}}=$	$\displaystyle\frac{2P_{ij}}{v_{i}}\frac{\partial P_{ij}}{\partial u_{h}}+\frac{2Q_{ij}}{v_{i}}\frac{\partial Q_{ij}}{\partial u_{h}}-\frac{\ell_{ij}}{v_{i}}\frac{\partial v_{i}}{\partial u_{h}},~{}\forall(i,j)\in\mathcal{E}.$	(9d)

The exact partial derivatives are hard to solve from (9) due to their complex interdependence. To facilitate computation, we first simplify (9d) by replacing the partial derivatives with their approximations under the linear model (6):

\displaystyle\frac{\partial\hat{\ell}_{ij}}{\partial u_{h}}

\displaystyle=\frac{2P_{ij}}{v_{i}}\frac{\partial\hat{P}_{ij}}{\partial u_{h}}+\frac{2Q_{ij}}{v_{i}}\frac{\partial\hat{Q}_{ij}}{\partial u_{h}}-\frac{\ell_{ij}}{v_{i}}\frac{\partial\hat{v}_{i}}{\partial u_{h}}.

(10)

We recall the following results from [8] for all the nodes $h\in\mathcal{N}$ and lines $(i,j)\in\mathcal{E}$ :


$\displaystyle\frac{\partial{\hat{P}_{ij}}}{\partial p_{h}}$	$\displaystyle=-\mathds{1}\left(j\in\mathbb{P}_{h}\right),\quad$	$\displaystyle\frac{\partial{\hat{P}_{ij}}}{\partial q_{h}}$	$\displaystyle=0$	(11a)
$\displaystyle\frac{\partial{\hat{Q}_{ij}}}{\partial q_{h}}$	$\displaystyle=-\mathds{1}\left(j\in\mathbb{P}_{h}\right),\quad$	$\displaystyle\frac{\partial{\hat{Q}_{ij}}}{\partial p_{h}}$	$\displaystyle=0$	(11b)

where $\mathds{1}(j\in\mathbb{P}_{h})$ is an indicator that equals 1 if node $j$ lies on the unique path from node $h$ to the root, and 0 otherwise. By (11) and (7), we can calculate $\partial\hat{\boldsymbol{\ell}}/\partial{\boldsymbol{u}}$ in (10) as follows:


$\displaystyle\frac{\partial\hat{\ell}_{ij}}{\partial p_{h}}$	$\displaystyle=-\frac{1}{v_{i}}\left(2P_{ij}\cdot\mathds{1}(j\in\mathbb{P}_{h})+\ell_{ij}R_{ih}\right)$	(12a)
$\displaystyle\frac{\partial\hat{\ell}_{ij}}{\partial q_{h}}$	$\displaystyle=-\frac{1}{v_{i}}\left(2Q_{ij}\cdot\mathds{1}(j\in\mathbb{P}_{h})+\ell_{ij}X_{ih}\right).$	(12b)

The results in (7), (11), (12) can replace the partial derivatives on the right-hand side (RHS) of (9c) to obtain an improved gradient evaluation $\partial{\boldsymbol{v}}/\partial{\boldsymbol{u}}$ as:


	$\displaystyle\frac{\partial v_{j}}{\partial p_{h}}=\frac{\partial\hat{v}_{i}}{\partial p_{h}}-2\left(r_{ij}\frac{\partial\hat{P}_{ij}}{\partial p_{h}}+x_{ij}\frac{\partial\hat{Q}_{ij}}{\partial p_{h}}\right)+\left\|z_{ij}\right\|^{2}\frac{\partial\hat{\ell}_{ij}}{\partial p_{h}}$
	$\displaystyle=\left(1\!-\!\frac{\left\|z_{ij}\right\|^{2}\ell_{ij}}{v_{i}}\right)R_{ih}+2\left(r_{ij}\!-\!\frac{\left\|z_{ij}\right\|^{2}P_{ij}}{v_{i}}\right)\cdot\mathds{1}(j\in\mathbb{P}_{h})$		(13a)
	$\displaystyle\frac{\partial v_{j}}{\partial q_{h}}=\frac{\partial\hat{v}_{i}}{\partial q_{h}}-2\left(r_{ij}\frac{\partial\hat{P}_{ij}}{\partial q_{h}}+x_{ij}\frac{\partial\hat{Q}_{ij}}{\partial q_{h}}\right)+\left\|z_{ij}\right\|^{2}\frac{\partial\hat{\ell}_{ij}}{\partial q_{h}}$
	$\displaystyle=\left(1\!-\!\frac{\left\|z_{ij}\right\|^{2}\ell_{ij}}{v_{i}}\right)X_{ih}+2\left(x_{ij}\!-\!\frac{\left\|z_{ij}\right\|^{2}Q_{ij}}{v_{i}}\right)\cdot\mathds{1}(j\in\mathbb{P}_{h})$		(13b)

where node $i$ is the unique parent of node $j$ in the directed tree network. By (7a), there is $R_{jh}=R_{ih}+2r_{ij}\mathds{1}(j\in\mathbb{P}_{h})$ , which converts the result in (13a) into:

\displaystyle\frac{\partial v_{j}}{\partial p_{h}}=R_{jh}-\frac{\left|z_{ij}\right|^{2}\ell_{ij}}{v_{i}}R_{ih}-\frac{2\left|z_{ij}\right|^{2}P_{ij}}{v_{i}}\mathds{1}(j\in\mathbb{P}_{h}).

(14)

The first term on the RHS of (14) is the same as (7a) derived from the linear model (6), while the second and third terms compensate for the effects of the quadratic terms that were neglected from (6). The improved partial derivatives (13b) with respect to reactive power injections $q_{h}$ have the same structure as (13a), and thus we will not repeat.

IV Hierarchical OPF Algorithm

Based on the primal-dual framework (5) and the improved gradient evaluation (13), we design a scalable hierarchical algorithm to solve the OPF problem (3). The proposed algorithm utilizes the subtree structure of radial distribution networks discovered in [11, 10], while achieving more accurate and safer voltage regulation.

The distribution tree network $\mathcal{T}:=\{\mathcal{N}^{+},\mathcal{E}\}$ is composed of subtrees $\mathcal{T}_{k}=\left\{\mathcal{N}_{k},\mathcal{E}_{k}\right\}$ indexed by $k\in\mathcal{K}=\{1,\ldots,K\}$ and a set of nodes $\mathcal{N}_{0}$ that are not clustered into any subtree. Let $n_{k}^{0}$ denote the root node of subtree $k$ , which is the node in $\mathcal{N}_{k}$ that is nearest to the root node of the whole network. Given a distribution network, the division of subtrees and unclustered nodes may not be unique, but we assume it always satisfies the following conditions.

Assumption 1

The subtrees are non-overlapping, i.e., $\mathcal{N}_{k_{1}}\cap\mathcal{N}_{k_{2}}=\emptyset$ for any $k_{1},k_{2}\in\mathcal{K}$ , $k_{1}\neq k_{2}$ .

Assumption 2

For any subtree root node $n_{k}^{0}$ , $k\in\mathcal{K}$ or any unclustered node $n\in\mathcal{N}_{0}$ , its path to the network root node only goes through a subset of nodes in $\mathcal{N}_{0}$ , but not any node in another subtree.

Refer to caption — Figure 1: The clustering of the IEEE 37-node network on the left satisfies Assumption 2. The one on the right does not, because the path from a subtree root node 730 to the network root 799 goes through another subtree.

The left subfigure of Figure 1 shows a clustering of the IEEE 37-node network that satisfies Assumption 2, while the one in the right subfigure does not.

The distribution system operator plays a role of central controller (CC). It is sufficient for the CC to know the structure and parameters of the backbone network, which only connects the unclustered nodes $\mathcal{N}_{0}$ and the subtree root nodes $\left\{n_{k}^{0},~{}k\in\mathcal{K}\right\}$ . The CC also maintains two-way communications to receive/send/relay information from/to/between the backbone network nodes. Each subtree network $k\in\mathcal{K}$ is known to and managed by the $k$ -th regional controller (RC $k$ ), which communicates with the nodes in subtree $k$ , the parent node of the subtree root $n_{k}^{0}$ , and the CC.

With the settings above, we now explain the hierarchical structure underlying the vector $\left(\frac{\partial\boldsymbol{v}}{\partial\boldsymbol{u}}\right)^{\top}\left(\overline{\boldsymbol{\mu}}-\underline{\boldsymbol{\mu}}\right)$ in (5a). Without loss of generality, we only consider the element of this vector corresponding to $p_{h}$ for a particular node $h\in\mathcal{N}$ , which can be discussed in two cases below. The element corresponding to $q_{h}$ can be calculated in the same manner.

Case 1: If $h\in\mathcal{N}_{k}$ is in a subtree $k\in\mathcal{K}$ , we have:

	$\displaystyle\quad\sum_{j\in\mathcal{N}}\frac{\partial v_{j}}{\partial p_{h}}\cdot\left(\overline{\mu}_{j}-\underline{\mu}_{j}\right)=\sum_{j\in\mathcal{N}_{k}}\frac{\partial v_{j}}{\partial p_{h}}\cdot\left(\overline{\mu}_{j}-\underline{\mu}_{j}\right)$
	$\displaystyle\qquad+\sum_{k^{\prime}\in\mathcal{K}\backslash\{k\}}~{}\sum_{j\in\mathcal{N}_{k^{\prime}}}\frac{\partial v_{j}}{\partial p_{h}}\cdot\left(\overline{\mu}_{j}-\underline{\mu}_{j}\right)$
	$\displaystyle\qquad+\sum_{j\in\mathcal{N}_{0}}\frac{\partial v_{j}}{\partial p_{h}}\cdot\left(\overline{\mu}_{j}-\underline{\mu}_{j}\right).$		(15)

The first term on the RHS of (15) sums over nodes $j$ in the same subtree $k$ as node $h$ . It can be calculated using (13a) by RC $k$ , and then sent to node $h$ which requires this information to compute (15) and carry out (5a). In particular, the calculation of $\partial v_{j}/\partial p_{h}$ requires RC $k$ to receive parameters $R_{ih}$ , $z_{ij}$ and measurements $\ell_{ij}$ , $P_{ij}$ , $v_{i}$ from line $(i,j)$ that connects node $j$ to its parent node $i$ .

The second term on the RHS of (15) sums over nodes $j$ in all the subtrees $k^{\prime}$ except subtree $k$ that hosts node $h$ . By Assumption 2, there must be $j\notin\mathbb{P}_{h}$ , which simplifies (13a) and leads to:

	$\displaystyle\sum_{k^{\prime}\in\mathcal{K}\backslash\{k\}}~{}\sum_{j\in\mathcal{N}_{k^{\prime}}}\frac{\partial v_{j}}{\partial p_{h}}\cdot\left(\overline{\mu}_{j}-\underline{\mu}_{j}\right)\qquad\qquad\qquad$
	$\displaystyle=\sum_{k^{\prime}\in\mathcal{K}\backslash\{k\}}R_{n_{k}^{0}n_{k^{\prime}}^{0}}\sum_{j\in\mathcal{N}_{k^{\prime}}}\left(1\!-\!\frac{\left\|z_{ij}\right\|^{2}\ell_{ij}}{v_{i}}\right)\cdot\left(\overline{\mu}_{j}\!-\!\underline{\mu}_{j}\right).$		(16)

In (16), the sum over $j\in\mathcal{N}_{k^{\prime}}$ can be calculated by RC $k^{\prime}$ . The CC receives such sums from all the RCs $k^{\prime}\in\mathcal{K}\backslash\{k\}$ , adds them up after weighting by $R_{n_{k}^{0}n_{k^{\prime}}^{0}}$ , and sends the result to RC $k$ . Upon receiving this result, RC $k$ broadcasts it to all the nodes in subtree $k$ , including node $h$ which requires this information to compute (15) and carry out (5a).

The third term on the RHS of (15) sums over all the unclustered nodes $j\in\mathcal{N}_{0}$ . It can be calculated using (13a) by the CC and then relayed by RC $k$ to node $h$ . In particular, $R_{ih}=R_{in_{k}^{0}}$ and $\mathds{1}(j\in\mathbb{P}_{h})=\mathds{1}(j\in\mathbb{P}_{n_{k}^{0}})$ are known to the CC for the calculation of (13a).

Case 2: If $h\in\mathcal{N}_{0}$ is an unclustered node, we have:

	$\displaystyle\sum_{j\in\mathcal{N}}\frac{\partial v_{j}}{\partial p_{h}}\cdot\left(\overline{\mu}_{j}-\underline{\mu}_{j}\right)$	$\displaystyle=$	$\displaystyle\sum_{k\in\mathcal{K}}~{}\sum_{j\in\mathcal{N}_{k}}\frac{\partial v_{j}}{\partial p_{h}}\cdot\left(\overline{\mu}_{j}-\underline{\mu}_{j}\right)$		(17)
			$\displaystyle+\sum_{j\in\mathcal{N}_{0}}\frac{\partial v_{j}}{\partial p_{h}}\cdot\left(\overline{\mu}_{j}-\underline{\mu}_{j}\right).$		(17)

The first term on the RHS of (17) sums over nodes $j$ in all the subtrees $k\in\mathcal{K}$ . By Assumption 2, there must be $j\notin\mathbb{P}_{h}$ , which simplifies (13a) and leads to:

	$\displaystyle\sum_{k\in\mathcal{K}}~{}\sum_{j\in\mathcal{N}_{k}}\frac{\partial v_{j}}{\partial p_{h}}\cdot\left(\overline{\mu}_{j}-\underline{\mu}_{j}\right)\qquad\qquad\qquad$
	$\displaystyle=\sum_{k\in\mathcal{K}}R_{hn_{k}^{0}}\sum_{j\in\mathcal{N}_{k}}\left(1-\frac{\left\|z_{ij}\right\|^{2}\ell_{ij}}{v_{i}}\right)\cdot\left(\overline{\mu}_{j}-\underline{\mu}_{j}\right).$		(18)

In (18), the sum over $j\in\mathcal{N}_{k}$ can be calculated by RC $k$ . The CC receives such sums from all the RCs $k\in\mathcal{K}$ , adds them up after weighting by $R_{hn_{k}^{0}}$ , and sends the result to node $h$ for subsequent computations of (17) and (5a).

The second term on the RHS of (17) sums over all the unclustered nodes $j\in\mathcal{N}_{0}$ . It can be calculated using (13a) by the CC and then sent to node $h$ . In particular, $R_{ih}$ and $\mathds{1}(j\in\mathbb{P}_{h})$ are known to the CC for the calculation of (13a), since node $j$ , its parent node $i$ , and node $h$ are all in the backbone network.

To summarize, the computation of the key term $\left(\frac{\partial\boldsymbol{v}}{\partial\boldsymbol{u}}\right)^{\top}\left(\overline{\boldsymbol{\mu}}-\underline{\boldsymbol{\mu}}\right)$ in (5a) can be performed through the coordination of the CC and the RCs in a hierarchical manner. This inspires our design of Algorithm 1 to solve the OPF problem (3). Compared to the conventional centralized primal-dual gradient method, the hierarchical Algorithm 1 can reduce computational complexity as analyzed in [11]. Due to space limitation, the convergence proof of Algorithm 1 is deferred to the journal version of this work.

Algorithm 1 Hierarchical OPF Algorithm

1:repeat

2: At time step

t

, every node

h\in\mathcal{N}

performs local update of primal variables:


	$\displaystyle p_{h}(t+1)=\left[p_{h}(t)-\sigma_{u}\left(\frac{\partial f_{h}\left({p}_{h}(t),{q}_{h}(t)\right)}{\partial p_{h}}+\alpha_{h}(t)\right)\right]_{\mathcal{Y}_{h}}$
	$\displaystyle q_{h}(t+1)=\left[q_{h}(t)-\sigma_{u}\left(\frac{\partial f_{h}\left({p}_{h}(t),{q}_{h}(t)\right)}{\partial q_{h}}+\beta_{h}(t)\right)\right]_{\mathcal{Y}_{h}}$

where

\alpha_{h}(t)=\quad\sum_{j\in\mathcal{N}}\frac{\partial v_{j}\left(\boldsymbol{u}(t)\right)}{\partial p_{h}}\cdot\left(\overline{\mu}_{j}(t)-\underline{\mu}_{j}(t)\right)

is given by (15) if

h\in\mathcal{N}_{k}

is in subtree

k

, or (17) if

h\in\mathcal{N}_{0}

is unclustered; and

\beta_{h}(t)=\quad\sum_{j\in\mathcal{N}}\frac{\partial v_{j}\left(\boldsymbol{u}(t)\right)}{\partial q_{h}}\cdot\left(\overline{\mu}_{j}(t)-\underline{\mu}_{j}(t)\right)

. The updated decisions

\boldsymbol{u}(t+1)=\left(\boldsymbol{p}(t+1),\boldsymbol{q}(t+1)\right)

are actuated by controllable devices.

3: Based on local voltage measurement

v_{h}(t)

, every node

h\in\mathcal{N}

updates its dual variables locally:

\displaystyle\underline{\mu}_{h}(t+1)=\left[\underline{\mu}_{h}(t)+\sigma_{\mu}\left(\underline{v}_{h}-v_{h}(t)-\epsilon\underline{\mu}_{h}(t)\right)\right]_{+},\quad\overline{\mu}_{h}(t+1)=\left[\overline{\mu}_{h}(t)+\sigma_{\mu}\left(v_{h}(t)-\overline{v}_{h}-\epsilon\overline{\mu}_{h}(t)\right)\right]_{+}.

4: Every RC

k\in\mathcal{K}

calculates the following weighted sum of dual variables and sends it to the CC:

\displaystyle\sum_{j\in\mathcal{N}_{k}}\left(1-\frac{\left|z_{ij}\right|^{2}\ell_{ij}}{v_{i}}\right)\cdot\left(\overline{\mu}_{j}(t+1)-\underline{\mu}_{j}(t+1)\right).

5: The CC computes the second and third terms on the RHS of (15) for each destination node

h\in\mathcal{N}_{k}

in a subtree

k

, and computes the first and second terms on the RHS of (17) for each unclustered destination node

h\in\mathcal{N}_{0}

. The CC adds up those terms and sends the result to the corresponding RC

k

that hosts the destination node

h\in\mathcal{N}_{k}

, or to

h\in\mathcal{N}_{0}

directly if it is unclustered. The terms needed for computing

\beta_{h}(t+1)

are calculated and sent by the CC similarly.

6: If

h\in\mathcal{N}_{0}

is unclustered, it already obtains

\alpha_{h}(t+1)

as (17). Otherwise if

h\in\mathcal{N}_{k}

, RC

k

computes the first term on the RHS of (15), adds it with the result received from the CC in step 5, and sends the result

\alpha_{h}(t+1)

to the destination node

h

. The term

\beta_{h}(t+1)

is obtained similarly.

7:until

\left\|\boldsymbol{u}(t+1)-\boldsymbol{u}(t)\right\|_{2}<\delta

for some preset threshold

\delta>0

, or a maximum number of iterations is reached.

8:Otherwise

t\leftarrow(t+1)

and go back to step 2.

V Numerical Experiments

We conduct numerical experiments to demonstrate our improvement over the previous method [10] based on the linearized power flow model (6).

V-A Experiment setup

We consider the single-phase equivalent models of the IEEE 37-node and 123-node test networks. They are clustered into subtrees as shown by Figure 1 (left panel) and Figure 2, respectively.

For the ease of illustration, we make the following modifications to the original network models on the IEEE PES website (https://cmte.ieee.org/pes-testfeeders/resources/):

1.

We model each multi-phase line as a single-phase line with the average impedance of multiple phases. We also convert the multi-phase wye- and delta-connected loads into single-phase loads by taking their average over multiple phases.
2.

We multiply the original load data of the 37-node network by six, and those of the 123-node network by two, to create scenarios with serious voltage issues.
3.

All the loads are treated as constant-power loads. Detailed models of capacitors, regulators, and breakers are not simulated.

For each network, let $\mathcal{N}_{L}$ denote the set of nodes that host nonzero loads. We denote the negative net power injections from the IEEE load data by $(\underline{p}_{h},\underline{q}_{h})$ for all load nodes $h\in\mathcal{N}_{L}$ , and treat them as the nominal injections or the most preferred injections. The feasible power injection regions in (2) are defined as $\mathcal{Y}_{h}=\left\{(p_{h},q_{h})\mid\underline{p}_{h}\leq p_{h}\leq 0.3\underline{p}_{h}<0,\underline{q}_{h}\leq q_{h}\leq 0.3\underline{q}_{h}<0\right\}$ . The OPF objective function in (3a) is defined as $f_{h}(p_{h},q_{h})=(p_{h}-\overline{p}_{h})^{2}+(q_{h}-\overline{q}_{h})^{2}$ for all $h\in\mathcal{N}_{L}$ , which aims to minimize the disutility caused by the deviation from the nominal loads.

For each network, the voltage magnitude of the root (slack) node is fixed at $1.05$ per unit (p.u.). The lower and upper limits for safe voltage are set at $0.95$ p.u. and $1.05$ p.u., respectively. The step sizes for primal and dual variable updates are empirically chosen as $\sigma_{u}=2\times 10^{-3}$ and $\sigma_{\mu}=1\times 10^{-3}$ , respectively.

For each given power injection $\boldsymbol{u}=\left(\boldsymbol{p},\boldsymbol{q}\right)$ , an OpenDSS power flow simulation is performed to obtain the corresponding voltage $\boldsymbol{v}(\boldsymbol{u})$ and the associated line-flow quantities $(\boldsymbol{P},\boldsymbol{Q},\boldsymbol{\ell})$ , which are treated as the feedback signals measured from the power network. The Python 3.7 programs for OPF algorithms and the OpenDSS simulations are run on a laptop equipped with Intel Core i7-9750H CPU @ 2.6 GHz, 16 GB RAM, and Windows 10 Professional OS.

V-B Numerical results

Due to space limitation, we only show the simulation results in the 123-node network model. The results in the 37-node network are similar.

Voltage safety: The voltage magnitudes at different nodes of the 123-node network are plotted in Figure 3, for three cases. The first case, referred to as “no control”, just takes the nominal power injections $(\underline{\boldsymbol{p}},\underline{\boldsymbol{q}})$ from the IEEE data, which causes severe violation of the voltage lower bound. The second case, referred to as “linear control”, applies the primal-dual gradient algorithm in [10] based on the linearized power flow model (6) to determine power injections, which enhances the overall voltage level but still leaves most nodes below the voltage lower bound. The third case, referred to as “improved control (ours)”, solves the OPF with the proposed Algorithm 1 based on our improved gradient evaluation method, which effectively lifts all the voltages above the lower bound. This result verifies our analysis in Section III that the improved gradient evaluation can prevent voltage violation caused by the linear model that optimistically estimates the voltages to be safe.

Computational efficiency: Figure 4 shows the change of voltage at a node of the 123-node network (similar trends are displayed for the voltages at other nodes). This result verifies again that the voltage converges to a safe level under the proposed Algorithm 1, compared to the previous algorithm [10] that leads to a violation of voltage lower bound. Moreover, Algorithm 1 converges in fewer iterations. Indeed, Algorithm 1 spends 205 seconds to complete 2,000 iterations, compared to the previous algorithm in [10] which spends 166 seconds. This would be an acceptable increase, considering the improved voltage safety achieved by Algorithm 1.

VI Conclusion

We proposed a hierarchical OPF algorithm based on a new gradient evaluation method that is more accurate than previous methods based on the linearized power flow model. The proposed method can prevent the previous voltage violation with moderate extra computations, as verified by numerical results on IEEE networks.

In future work, we will provide formal convergence proof and complexity analysis of the proposed algorithm. We will also extend the proposed algorithm to multi-phase networks, as what was done from [10] to [11].

[Proof of Lemma 1]

Proof: Take the difference between the linear model (6) and the actual nonlinear model (1):


$\displaystyle\hat{v}_{j}-v_{j}=$	$\displaystyle\hat{v}_{i}-v_{i}-2\left(r_{ij}\left(\hat{P}_{ij}-P_{ij}\right)+x_{ij}\left(\hat{Q}_{ij}-Q_{ij}\right)\right)$
	$\displaystyle-(r_{ij}^{2}+x_{ij}^{2})\ell_{ij},$	(20a)
$\displaystyle\hat{P}_{ij}-P_{ij}$	$\displaystyle=\sum_{k:(j,k)\in\mathcal{E}}(\hat{P}_{jk}-P_{jk})-r_{ij}\ell_{ij},$	(20b)
$\displaystyle\hat{Q}_{ij}-Q_{ij}$	$\displaystyle=\sum_{k:(j,k)\in\mathcal{E}}(\hat{Q}_{jk}-Q_{jk})-x_{ij}\ell_{ij}.$	(20c)

Substitute (20b)–(20c) into (20a), which yields:

	$\displaystyle\hat{v}_{j}-$	$\displaystyle v_{j}=\hat{v}_{i}-v_{i}-2\left(r_{ij}\sum_{k:(j,k)\in\mathcal{E}}\left(\hat{P}_{jk}-P_{jk}\right)\right.$
		$\displaystyle\left.+x_{ij}\sum_{k:(j,k)\in\mathcal{E}}\left(\hat{Q}_{jk}-Q_{jk}\right)\right)+\left(r_{ij}^{2}+x_{ij}^{2}\right)\ell_{ij}.$		(21)

We further calculate the active and reactive power loss terms $\sum_{k:(j,k)\in\mathcal{E}}(\hat{P}_{jk}-P_{jk})$ and $\sum_{k:(j,k)\in\mathcal{E}}(\hat{Q}_{jk}-Q_{jk})$ on the RHS of (VI). Following the tree structure of the network (i.e., a tree graph shown by Figure 5), we recursively calculate (20b)–(20c) until reaching leaves:


$\displaystyle\hat{P}_{jk}-P_{jk}$	$\displaystyle=\sum_{l:(k,l)\in\mathcal{E}}(\hat{P}_{kl}-P_{kl})-r_{jk}\ell_{jk},$	(22a)
$\displaystyle\hat{Q}_{jk}-Q_{jk}$	$\displaystyle=\sum_{l:(k,l)\in\mathcal{E}}(\hat{Q}_{kl}-Q_{kl})-x_{jk}\ell_{jk},$	(22b)
$\displaystyle\qquad\dots$
$\displaystyle\hat{P}_{th}-P_{th}$	$\displaystyle=-r_{th}\ell_{th},~{}\text{for leaf nodes $h$},$	(22c)
$\displaystyle\hat{Q}_{th}-Q_{th}$	$\displaystyle=-x_{th}\ell_{th},~{}\text{for leaf nodes $h$}.$	(22d)

We substitute the above equations layer-by-layer, i.e., (22c), $\dots$ , into (22a) for active power and (22d), $\dots$ , into (22b) for reactive power. This gives the expression of the active and reactive power loss terms as:


	$\displaystyle\sum_{k:(j,k)\in\mathcal{E}}(\hat{P}_{jk}-P_{jk})=$		(23a)
	$\displaystyle\qquad-\sum_{k:(j,k)\in\mathcal{E}}\sum_{l:(k,l)\in\mathcal{E}}\dots\sum_{h:(t,h)\in\mathcal{E}}r_{th}\ell_{th}-\dots-$
	$\displaystyle\qquad-\sum_{k:(j,k)\in\mathcal{E}}\sum_{l:(k,l)\in\mathcal{E}}r_{kl}\ell_{kl}-\sum_{k:(j,k)\in\mathcal{E}}r_{jk}\ell_{jk},$		(23b)
	$\displaystyle\sum_{k:(j,k)\in\mathcal{E}}(\hat{Q}_{jk}-Q_{jk})=$		(23c)
	$\displaystyle\qquad-\sum_{k:(j,k)\in\mathcal{E}}\sum_{l:(k,l)\in\mathcal{E}}\dots\sum_{h:(t,h)\in\mathcal{E}}x_{th}\ell_{th}-\dots-$
	$\displaystyle\qquad-\sum_{k:(j,k)\in\mathcal{E}}\sum_{l:(k,l)\in\mathcal{E}}x_{kl}\ell_{kl}-\sum_{k:(j,k)\in\mathcal{E}}x_{jk}\ell_{jk}.$		(23d)

We define $g_{j}$ and $\eta_{j}$ as the additive inverse of RHS of (23b) and (23d), respectively. In fact, $g_{j}$ and $\eta_{j}$ sum the active and reactive power loss of the lines downstream of node $j$ . We have more compact expressions for $g_{j}$ and $\eta_{j}$ :


	$\displaystyle g_{j}=\sum_{(\alpha,\beta)\in\operatorname{down}(j)}r_{\alpha\beta}\ell_{\alpha\beta},\quad\eta_{j}=\sum_{(\alpha,\beta)\in\operatorname{down}(j)}x_{\alpha\beta}\ell_{\alpha\beta},$

where $\operatorname{down}(j)$ denotes the set of lines downstream of node $j$ . From the tree structure, we have:


	$\displaystyle g_{j}=\sum_{k:(j,k)\in\mathcal{E}}r_{jk}\ell_{jk}+\sum_{k:(j,k)\in\mathcal{E}}g_{k},$		(25a)
	$\displaystyle\eta_{j}=\sum_{k:(j,k)\in\mathcal{E}}x_{jk}\ell_{jk}+\sum_{k:(j,k)\in\mathcal{E}}\eta_{k}.$		(25b)

The expressions of (25) imply that a backward sweep from the leaves to the root node in a power distribution network will give all the results of $(g_{j},\eta_{j}),\forall j\in\mathcal{N}$ . With $(g_{j},\eta_{j})$ , the voltage error (VI) becomes:

\displaystyle\hat{v}_{j}-v_{j}

\displaystyle=\hat{v}_{i}-v_{i}+2\left(r_{ij}g_{j}+x_{ij}\eta_{j}\right)+(r_{ij}^{2}+x_{ij}^{2})\ell_{ij}.

(26)

Equation (26) implies that the voltage error is accumulated from the root to the leaves. A forward sweep from the root to leaves will return voltage error for all nodes $h\in\mathcal{N}$ as:

\displaystyle\hat{v}_{h}-v_{h}=\sum_{(\zeta,\xi)\in\mathbb{P}_{h}}\left(2\left(r_{\zeta\xi}g_{\xi}+x_{\zeta\xi}\eta_{\xi}\right)+(r_{\zeta\xi}^{2}+x_{\zeta\xi}^{2})\ell_{\zeta\xi}\right).

This completes the proof of Lemma 1.

References

[1] S. H. Low, “Convex relaxation of optimal power flow—Part I: Formulations and equivalence,” IEEE Transactions on Control of Network Systems, vol. 1, no. 1, pp. 15–27, 2014.
[2] D. K. Molzahn and I. A. Hiskens, “A survey of relaxations and approximations of the power flow equations,” Foundations and Trends in Electric Energy Systems, pp. 1–221, 2019.
[3] E. Dall’Anese, H. Zhu, and G. B. Giannakis, “Distributed optimal power flow for smart microgrids,” IEEE Transactions on Smart Grid, vol. 4, no. 3, pp. 1464–1475, 2013.
[4] T. Erseghe, “Distributed optimal power flow using ADMM,” IEEE Transactions on Power Systems, vol. 29, no. 5, pp. 2370–2380, 2014.
[5] B. Zhang, A. Y. Lam, A. D. Domínguez-García, and D. Tse, “An optimal and distributed method for voltage regulation in power distribution systems,” IEEE Transactions on Power Systems, vol. 30, no. 4, pp. 1714–1726, 2015.
[6] Q. Peng and S. H. Low, “Distributed optimal power flow algorithm for radial networks, I: Balanced single phase case,” IEEE Transactions on Smart Grid, vol. 9, no. 1, pp. 111–121, 2018.
[7] S. Bolognani, R. Carli, G. Cavraro, and S. Zampieri, “Distributed reactive power feedback control for voltage regulation and loss minimization,” IEEE Transactions on Automatic Control, vol. 60, no. 4, pp. 966–981, 2015.
[8] L. Gan and S. H. Low, “An online gradient algorithm for optimal power flow on radial networks,” IEEE Journal on Selected Areas in Communications, vol. 34, no. 3, pp. 625–638, 2016.
[9] A. Bernstein and E. Dall’Anese, “Real-time feedback-based optimization of distribution grids: A unified approach,” IEEE Transactions on Control of Network Systems, vol. 6, no. 3, pp. 1197–1209, 2019.
[10] X. Zhou, Z. Liu, W. Wang, C. Zhao, F. Ding, and L. Chen, “Hierarchical distributed voltage regulation in networked autonomous grids,” in American Control Conference, pp. 5563–5569, 2019.
[11] X. Zhou, Z. Liu, C. Zhao, and L. Chen, “Accelerated voltage regulation in multi-phase distribution networks based on hierarchical distributed algorithm,” IEEE Transactions on Power Systems, vol. 35, no. 3, pp. 2047–2058, 2019.
[12] M. Baran and F. F. Wu, “Optimal sizing of capacitors placed on a radial distribution system,” IEEE Transactions on Power Delivery, vol. 4, no. 1, pp. 735–743, 1989.
[13] M. Farivar and S. H. Low, “Branch flow model: Relaxations and convexification—part I,” IEEE Transactions on Power Systems, vol. 28, no. 3, pp. 2554–2564, 2013.
[14] J. Koshal, A. Nedić, and U. V. Shanbhag, “Multiuser optimization: Distributed algorithms and error analysis,” SIAM Journal on Optimization, vol. 21, no. 3, pp. 1046–1081, 2011.


	$\displaystyle\frac{\partial v_{j}}{\partial p_{h}}=\frac{\partial\hat{v}_{i}}{\partial p_{h}}-2\left(r_{ij}\frac{\partial\hat{P}_{ij}}{\partial p_{h}}+x_{ij}\frac{\partial\hat{Q}_{ij}}{\partial p_{h}}\right)+\left\|z_{ij}\right\|^{2}\frac{\partial\hat{\ell}_{ij}}{\partial p_{h}}$
	$\displaystyle=\left(1\!-\!\frac{\left\|z_{ij}\right\|^{2}\ell_{ij}}{v_{i}}\right)R_{ih}+2\left(r_{ij}\!-\!\frac{\left\|z_{ij}\right\|^{2}P_{ij}}{v_{i}}\right)\cdot\mathds{1}(j\in\mathbb{P}_{h})$		(13a)
	$\displaystyle\frac{\partial v_{j}}{\partial q_{h}}=\frac{\partial\hat{v}_{i}}{\partial q_{h}}-2\left(r_{ij}\frac{\partial\hat{P}_{ij}}{\partial q_{h}}+x_{ij}\frac{\partial\hat{Q}_{ij}}{\partial q_{h}}\right)+\left\|z_{ij}\right\|^{2}\frac{\partial\hat{\ell}_{ij}}{\partial q_{h}}$
	$\displaystyle=\left(1\!-\!\frac{\left\|z_{ij}\right\|^{2}\ell_{ij}}{v_{i}}\right)X_{ih}+2\left(x_{ij}\!-\!\frac{\left\|z_{ij}\right\|^{2}Q_{ij}}{v_{i}}\right)\cdot\mathds{1}(j\in\mathbb{P}_{h})$		(13b)