Confidence Estimation Transformer for Long-term Renewable Energy Forecasting in Reinforcement Learning-based Power Grid Dispatching
Abstract
The expansion of renewable energy could help realize the goals of peaking carbon dioxide emissions and carbon neutralization. Some existing grid dispatching methods that integrate short-term renewable energy prediction and reinforcement learning (RL) have been shown to alleviate the adverse impact of energy fluctuation risk. However, these methods omit long-term output prediction, which leads to stability and security problems in the optimal power flow. This paper proposes a confidence estimation Transformer for long-term renewable energy forecasting in reinforcement learning-based power grid dispatching (Conformer-RLpatching). Conformer-RLpatching predicts the long-term active output of each renewable energy generator with an enhanced Transformer to boost the performance of hybrid energy grid dispatching. Furthermore, a confidence estimation method is proposed to reduce the prediction error of renewable energy. Meanwhile, a dispatching necessity evaluation mechanism is put forward to decide whether the active output of a generator needs to be adjusted. Experiments carried out on the SG-126 power grid simulator show that Conformer-RLpatching improves the security score by 25.8% over the second-best algorithm, DDPG, and achieves a higher total reward than the gold medal team in the power grid dispatching competition sponsored by State Grid Corporation of China under the same simulation environment. Code is open-sourced at https://github.com/buptlxh/Conformer-RLpatching.
Index Terms:
optimal power flow, reinforcement learning, renewable energy prediction, Conformer-RLpatching
I Introduction
The randomness and volatility of renewable energy sources (RESs) have highlighted the pressing need to address stability and security concerns in power grid dispatching [1, 2]. Meanwhile, the utilization rate of renewable energy (URRE), as another performance index in addition to security and economy, brings new challenges to security constrained economic dispatching (SCED) in the hybrid energy grid [3]. [4] presented a modified version of multi-objective differential evolution by incorporating wind power plants into the dynamic economic dispatch system. Aiming to minimize cost and restrict risks, a novel SCED model [5] and a chance-constrained economic dispatch model [6] were proposed for wind-integrated hybrid power systems. [7] implemented a short-term control algorithm to smooth power dispatching by improving the min-max dispatching method. [8] proposed a look-ahead stochastic unit commitment model for robust optimal dispatching. However, none of the above dispatching methods considered the volatility of RESs and URRE simultaneously.
Emerging artificial intelligence (AI) technology is increasingly applied to hybrid energy grid dispatching [9]. In August 2021, State Grid Corporation of China (SGCC) and Baidu jointly sponsored the State Grid Dispatching AI Innovation Competition, an authoritative competition in the smart grid field (https://aistudio.baidu.com/aistudio/competition/detail/111/0/introduction). SGCC provided the SG-126 power grid simulator for participating teams to realize multi-objective power grid dispatching and strive for the highest total reward. The first-ranked team achieved a total reward of 510.09 among nearly 100 teams. In addition, advanced AI algorithms were put forward to further improve the flexibility, controllability and observability of the hybrid grid [10]. An innovative dispatch optimization strategy with an uncertainty post-processing approach was proposed in [11], and [12] adopted a learning-based technique to search for the optimal joint control policy. [13] presented a manifold-learning-based Isomap algorithm to represent the low-dimensional hidden probabilistic structure of data. [14] proposed a learning-based decision-making framework for economic energy dispatching based on historical sequences. However, the optimization objectives of [11, 12, 13, 14] did not involve URRE. [15] modeled power dispatching as sequential decision-making and introduced Deep Reinforcement Learning (DRL). [16] applied deep deterministic policy gradient (DDPG) to microgrids with photovoltaic panels. [17] improved DRL to adaptively respond to renewable energy fluctuations. The improved decision tree [18, 19] was also able to provide feasible and optimal dispatch decisions for microgrids. [20] developed a new renewable energy management system with short-term forecasting for hourly dispatching. [21] proposed a data-driven RL approach to relieve branch overload in large power systems.
However, none of the above AI-based dispatching algorithms considered the long-term fluctuation of renewable energy, and they were only suitable for real-time or short-term dispatching.
To improve forecasting accuracy, several prediction algorithms have been proposed. [22, 23] adopted historical data mining to analyze data characteristics for regional wind power prediction and wind power ramp prediction, respectively. [24] introduced a small-world neural network to reduce wind power forecasting errors. [25, 26] both used hybrid models for high-precision day-ahead short-term photovoltaic output forecasting. Although Informer, proposed in [27], has been verified to accurately predict long-term electricity consumption load, it has not been applied to renewable energy prediction.
In summary, the existing dispatching methods fail to achieve multi-objective dispatching for the hybrid power system under the long-term fluctuation of renewable energy. To address this problem, a confidence estimation Transformer for long-term renewable energy forecasting in reinforcement learning-based power grid dispatching (Conformer-RLpatching) is proposed, as shown in Fig. 1. In Conformer, the confidence estimation method weights the prediction results from the enhanced Transformer to reduce the prediction error. RLpatching utilizes the long-term prediction from Conformer and the current power grid observations to provide an appropriate multi-objective dispatching strategy. The contributions of this paper are summarized as follows:
1. This paper proposes Conformer, an enhanced Transformer with a confidence estimation method, which provides accurate prediction to boost the performance of hybrid energy grid dispatching and abate the impact of renewable energy uncertainty.
2. This paper proposes Conformer-RLpatching to realize multi-objective dispatching under the long-term fluctuation of renewable energy. A dispatching necessity evaluation mechanism is put forward to reduce unnecessary dispatching, thereby improving the stability of the power grid.
3. This paper applies Conformer-RLpatching to the SG-126 power grid simulator provided by SGCC, bridging the ‘sim-to-real’ gap. Extensive experiments show that Conformer-RLpatching improves the security score by 25.8% over DDPG and achieves a total reward of 527.32, surpassing 510.09, the highest total reward in the State Grid Dispatching AI Innovation Competition (https://aistudio.baidu.com/aistudio/competition/detail/111/0/leaderboard).
The rest of this paper is organized as follows. Section II introduces the confidence estimation Transformer for long-term renewable energy forecasting. Section III presents the reinforcement learning-based dispatching framework and the dispatching process. Section IV gives the performance of proposed methods. Section V draws the conclusion.
II Confidence Estimation Transformer for Long-term Renewable Energy Forecasting
This section introduces the two main components of Conformer, the enhanced Transformer-based renewable energy prediction model and the confidence estimation method, which are responsible for the long-term prediction of renewable energy and the synthesis of prediction results, respectively. Section II-C then describes the overall process of Conformer.
II-A Enhanced Transformer for Renewable Energy Prediction
To deal in advance with renewable energy fluctuations caused by external factors, this paper adopts the enhanced Transformer [27] to efficiently predict the long-term maximum active output of every renewable energy generator. The enhanced Transformer-based renewable energy prediction model, shown in Fig. 2 (a), consists of an encoder and a decoder.
For all renewable energy generators, represents real maximum active power from to , and the sequence of the maximum active power predicted by the enhanced Transformer at time is noted as . represents the final prediction result for the next 10 days output by Conformer.
The encoder encodes the known sequences and extracts features. Suppose that the current time is . In the enhanced Transformer, the input of the encoder includes the real data in the past 56 days and the corresponding four-dimensional time code, whose sizes are and , respectively. ProbSparse self-attention [27] is adopted in the encoder, achieving $O(L \log L)$ time complexity and memory usage on dependency alignments for an input sequence of length $L$. For tuple inputs, i.e., query $Q$, key $K$ and value $V$, ProbSparse self-attention allows each key to attend only to the $u$ dominant queries:

$$\mathcal{A}(Q, K, V) = \mathrm{Softmax}\left(\frac{\bar{Q} K^{\top}}{\sqrt{d}}\right) V \qquad (1)$$

where $\bar{Q}$ is a sparse matrix of the same size as $Q$ that only contains the Top-$u$ queries under the sparsity measurement, and $d$ is the input dimension. Meanwhile, a self-attention distilling operation is introduced. Conv1D layers and max-pooling layers are added after each ProbSparse self-attention layer to privilege dominating attention scores and help receive long sequence input.
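The selection step behind Eq. (1) can be illustrated with a minimal pure-Python sketch. It follows the max-minus-mean sparsity measurement described for Informer [27], but it is a simplification under stated assumptions: the full score matrix is computed (the original samples keys to reach the reduced complexity), non-selected queries simply receive the mean of the values, and the function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def probsparse_attention(Q, K, V, u):
    """Attention in which only the u most 'active' queries attend to the keys.

    Q, K, V are lists of equal-width float vectors. Simplification: every
    query-key score is computed here, so this sketch is O(L^2), not O(L log L).
    """
    scale = 1.0 / math.sqrt(len(Q[0]))
    scores = [[scale * sum(a * b for a, b in zip(q, k)) for k in K] for q in Q]
    # Sparsity measurement: max score minus mean score for each query.
    sparsity = [max(row) - sum(row) / len(row) for row in scores]
    ranked = sorted(range(len(Q)), key=lambda i: sparsity[i], reverse=True)
    dominant = set(ranked[:u])
    mean_value = [sum(col) / len(V) for col in zip(*V)]
    output = []
    for i, row in enumerate(scores):
        if i in dominant:
            weights = softmax(row)
            output.append([sum(w * v[c] for w, v in zip(weights, V))
                           for c in range(len(V[0]))])
        else:
            # "Lazy" queries fall back to the average of the values.
            output.append(list(mean_value))
    return output
```

With `u` equal to the number of queries, the sketch reduces to ordinary scaled dot-product attention.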
The decoder is responsible for predicting. The input of the decoder includes historical data and corresponding time coding besides the sequence characteristics output by the encoder. The data section includes real data in the past 28 days and that in the next 15 days replaced by 0. Given the input, the decoder predicts the sequence in the next 15 days . Therefore, the sizes of the input data and time encoding are and respectively. A generative style decoder is adopted to acquire long sequence output with only one forward step needed, simultaneously avoiding cumulative error spreading during the inference.
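As a rough illustration of the decoder input described above, the sketch below concatenates the last 28 days of real data with 15 zero-filled placeholder days. It assumes one row per day and one column per generator, which is an assumption about the data layout; `build_decoder_input` is a hypothetical helper, not a function from the released code.

```python
def build_decoder_input(history, label_len=28, pred_len=15):
    """Return label_len known rows followed by pred_len zero rows.

    history: list of rows, one per day, each a list with one value per
    renewable energy generator (assumed layout).
    """
    if len(history) < label_len:
        raise ValueError("history is shorter than the known segment")
    n_units = len(history[0])
    known = [list(row) for row in history[-label_len:]]
    # The days to be predicted are masked with zeros, so the decoder can
    # emit the whole horizon in a single forward step.
    placeholder = [[0.0] * n_units for _ in range(pred_len)]
    return known + placeholder
```

Because the whole horizon is produced at once, no predicted value is fed back as input, which is why the generative decoder avoids cumulative error during inference.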
II-B Confidence Estimation Method
Combining the prediction results from different times can offset the contingency of a single prediction result. This paper designs a confidence estimation method that quantifies the effectiveness of a prediction sequence and uses it as the combining weight.
is the condition probability that the result in the prediction sequence is correct given the condition which means the accuracy of partial results is observed. can be derived as
(2) |
To facilitate the update of effectiveness, we use beta distribution to represent confidence of the prediction sequence according to the accuracy of historical prediction results, which is defined as
(3) | ||||
where represent the number of correct prediction results and incorrect prediction results, respectively. In this paper, the prediction result is judged to be correct, when its root mean square error is less than . is the Gamma function, which could be written as
(4) |
When the prediction model is initialized without prior knowledge, the confidence of the prediction sequence is expressed as a uniform distribution on :
(5) |
Suppose that and the prediction sequence at time is . As shown in Fig. 2 b, the confidence estimation method can evaluate the effectiveness of prediction sequence at time by comparing with . It is assumed that correct prediction results and incorrect prediction results are identified after comparison. The estimated confidence of the prediction sequence is defined as
(6) |
Finally, the confidence is normalized as
(7) |
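One plausible reading of the Beta-based update and the normalization in Eq. (7) is sketched below. The uniform prior Beta(1, 1) comes from the text; taking the posterior mean as the confidence value is an assumption made for illustration only, since the exact form of Eq. (6) may differ. Both function names are hypothetical.

```python
def beta_mean_confidence(correct, incorrect, alpha0=1.0, beta0=1.0):
    """Posterior mean of a Beta prior updated with observed counts.

    alpha0 = beta0 = 1 is the uniform prior on (0, 1). Using the posterior
    mean as the confidence is an assumption, not the paper's stated formula.
    """
    alpha = alpha0 + correct
    beta = beta0 + incorrect
    return alpha / (alpha + beta)


def normalize(confidences):
    """Scale confidences so that they sum to one and can serve as weights."""
    total = sum(confidences)
    return [c / total for c in confidences]
```

For example, a sequence with 3 correct and 1 incorrect result gets confidence 4/6 under this reading, while a sequence with no evaluated results keeps the prior value 0.5.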
II-C Process of Conformer
The proposed Conformer algorithm is shown in Algorithm 1. Firstly, the enhanced Transformer-based renewable energy prediction model predicts the sequence in the next 15 days on the basis of the real data in the past 56 days . And the prediction results are stored to the storage pool . If , is directly intercepted as . If , the past 5 days’ prediction sequences are taken out from the storage pool , and the confidence of the sequences is calculated via (6) and (7). Finally, the predicted active output results from to in the past 5 days’ prediction sequences are multiplied by the corresponding estimated confidence respectively and then accumulated to obtain the effective prediction .
In the prediction process of Conformer, this paper combines the enhanced Transformer prediction results from different times to obtain the final prediction, thereby reducing the prediction error. The final prediction is delivered to RLpatching to assist power grid dispatching and abate the impact of renewable energy uncertainty.
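The weighted combination can be sketched as follows for a single generator. The sketch assumes that the five stored sequences are the ones issued today and on the four previous days, and that the weights are already normalized; both points are assumptions, and `combine_predictions` is a hypothetical name.

```python
def combine_predictions(stored, weights, out_len=10):
    """Blend overlapping forecasts into one out_len-day prediction.

    stored[k] is the sequence issued k days ago (k = 0 is today), so its
    element k + i refers to the same calendar day as element i of today's
    sequence. weights[k] is the normalized confidence of stored[k].
    """
    if len(stored) != len(weights):
        raise ValueError("one weight is needed per stored sequence")
    for k, seq in enumerate(stored):
        if len(seq) < k + out_len:
            raise ValueError("sequence %d does not cover the horizon" % k)
    return [
        sum(w * seq[k + i] for k, (w, seq) in enumerate(zip(weights, stored)))
        for i in range(out_len)
    ]
```

Under this alignment, a 15-day forecast issued four days ago still covers the next 10 days (4 + 10 = 14 ≤ 15), which is consistent with predicting 15 days but delivering only 10.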
III Reinforcement Learning-based Dispatching for the Hybrid Power Grid
This section mainly introduces RL-based multi-objective dispatching framework. Firstly, the optimization objectives and constraints are illustrated in Section III-A. DDPG-based power flow optimization, the core algorithm of RLpatching, is described in Section III-B. Next, Section III-C presents the dispatching necessity evaluation mechanism. Finally, Section III-D introduces the decision-making and training process of RLpatching with the assistance of Conformer.
III-A Optimization Objectives and Constraint Conditions
The optimization objectives consist of security, economic and environmental indicators. The security objective considers branch overflow , reactive power output overrun and voltage overrun . The economic objective aims at minimizing the cumulative cost of the power system , which includes the operation cost and startup cost of the generators. And the environmental objective targets to maximize URRE . The formula of the optimization objectives is
(8) | |||
(8a) | |||
(8b) | |||
(8c) | |||
(8d) | |||
(8e) |
Here, is the maximum number of steps in which the dispatching strategy can make the simulator run safely under the constraints. is the weight coefficient greater than one to improve URRE. , , and represent the number of generators, branches, busbars and renewable energy generators, respectively. , and are the on/off status, reactive power output and active output of generator in period , respectively. and represent the maximum and minimum reactive power output of generator , and is the maximum active output of generator . and separately represent the current and thermal limits of branch . is the voltage of busbar in period , and represent the maximum and minimum voltage of busbar , respectively. , and are operation cost factors of generator . , the factor of startup cost, is a fixed value for generator . In addition, we use a normalization function to limit the ranges of , and to .
In this paper, three constraint conditions are set to limit optimization.
III-A1 Voltage constraints
The actual value of the voltage of any generator should not be greater than the upper limit of the voltage, nor less than the lower limit. Otherwise, the security objective will be affected negatively.
III-A2 Power balance constraints
The total power generation should cover the total power demand. Hence,
(15) |
where is the number of thermal generators, and is the number of loads.
III-A3 Ramp rate constraints
The active power adjustment value of each thermal power generator between any two continuous time steps must be smaller than the generator maximum adjustment value. Therefore, for generator in period , the ramp rate constraint is defined as
(16) |
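A ramp rate check of this kind can be sketched as below. The sketch assumes that the ramp rate of 0.05 listed in TABLE I is a fraction of the generator's maximum active output; that interpretation, and both helper names, are assumptions rather than details given in the text.

```python
def ramp_ok(p_prev, p_next, p_max, ramp_rate=0.05):
    """True if the change between two consecutive steps respects the limit."""
    # Small tolerance guards against floating-point noise at the boundary.
    return abs(p_next - p_prev) <= ramp_rate * p_max + 1e-9


def clip_to_ramp(p_prev, p_target, p_max, ramp_rate=0.05):
    """Move toward p_target without exceeding the per-step adjustment limit."""
    limit = ramp_rate * p_max
    return min(max(p_target, p_prev - limit), p_prev + limit)
```

For a unit with a maximum output of 110, the per-step adjustment under this assumption is capped at 5.5, so a requested jump from 50 to 70 would be clipped to 55.5.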
III-B DDPG-based Power Flow Optimization
RLpatching adopting DDPG determines the active power output of all generators for the next day according to the current power grid operation state and long-term prediction of renewable energy from Conformer. It includes and , which are used for decision-making and scoring, respectively, as shown in Fig. 3.
In RLpatching, the input of includes the current power grid observations and the final prediction output by Conformer, which are marked as the state space . consists of 13 selected groups of current power grid parameters to represent the operation state, including reactive power of all generators, voltage, active power and reactive power at the starts and ends of branches, load ratio and current of branches, active power and reactive power of the loads and the grid loss. outputs active power of all generators for the next day . takes , as its input and outputs the .
Since the grid operation control is modeled as a time-continuous decision-making process, the reward function is defined as the optimization objectives at each time step. The reward function at time is
(17) |
The and parameterized by , are used to represent the deterministic policy and the critic function . and are optimized by stochastic gradient method. The loss functions of and are
(18) |
(19) |
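For reference, the standard DDPG losses can be written in a few lines of plain Python. This is the textbook form (a mean squared temporal-difference error for the critic, the negated mean critic value for the actor); the discount factor `gamma`, the use of target-network values for `next_q_values`, and the function names are assumptions, and the notation of Eqs. (18) and (19) may differ.

```python
def td_targets(rewards, next_q_values, dones, gamma=0.99):
    """Bootstrapped targets y = r + gamma * Q'(s', mu'(s')) for non-terminal steps."""
    return [r + (0.0 if done else gamma * q)
            for r, q, done in zip(rewards, next_q_values, dones)]


def critic_loss(q_values, targets):
    """Mean squared error between critic outputs and TD targets."""
    return sum((q - y) ** 2 for q, y in zip(q_values, targets)) / len(q_values)


def actor_loss(q_of_policy_actions):
    """Negated mean critic value of the actions chosen by the actor."""
    return -sum(q_of_policy_actions) / len(q_of_policy_actions)
```

Minimizing `actor_loss` pushes the deterministic policy toward actions that the critic scores highly, while `critic_loss` keeps the critic consistent with the observed rewards.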
III-C Dispatching Necessity Evaluation Mechanism
In power grid dispatching, adjusting numerous generators would decrease the stability sharply. To deal with this problem, this paper designs a dispatching necessity evaluation mechanism to quantify the necessity of each generator and select generators to dispatch. For generator , three factors are considered to evaluate its necessity, including active power adjustment value for the next day , utilization rate , and maximum load rate of all branches around it . is normalized as by the min-max method. We select generators with the highest necessity to dispatch, and the active output of other generators is consistent with that of previous time step. The necessity formula of dispatching is defined as
(20) |
Equation (20) is a polynomial composed of three parts. measures the adjustment value, with which the dispatching necessity is positively correlated. is designed to control the utilization rate of generator within a reasonable range. When utilization rate of generator is less than , the necessity of adjustment increases with the increase of . When utilization rate of generator is greater than , the increase of active power of generator is restrained. maintains the load rate of branches around generator , featuring the same characteristics as .
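The selection step that follows the necessity scoring can be sketched as below. The scores are taken as given, because the exact polynomial of Eq. (20) is not reproduced here; `min_max` and `select_generators` are hypothetical helper names.

```python
def min_max(values):
    """Min-max normalization to [0, 1]; a constant input maps to zeros."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0] * len(values)
    return [(v - low) / (high - low) for v in values]


def select_generators(necessity, proposed, previous, n_select):
    """Apply the proposed output only to the n_select most necessary units.

    All other generators keep the active output of the previous time step.
    """
    ranked = sorted(range(len(necessity)), key=lambda g: necessity[g], reverse=True)
    chosen = set(ranked[:n_select])
    return [proposed[g] if g in chosen else previous[g]
            for g in range(len(necessity))]
```

Limiting the number of adjusted units in this way is what keeps the dispatching action sparse and, according to the text, the grid more stable.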
III-D RLpatching Decision-making and Training Process
The decision-making and training process of the proposed Conformer-RLpatching for the hybrid power grid is shown in Algorithm 2. At the beginning of each epoch, is obtained from the initialized simulator, together with which the prediction from Conformer initializes the state space . In each step of dispatching, active power output of all generators for the next step is determined by the , the dispatching necessity evaluation mechanism selects generators to dispatch, and then the simulator executes the dispatching strategy. Next, and consists . Afterwards is stored to . Finally, samples are selected from to update all networks. This epoch will end when the power flow calculation of the simulator fails to converge or the operation of the grid does not meet the constraints mentioned in Section III-A.
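The storage and sampling of transitions can be sketched with a small replay buffer. This is a generic DDPG-style buffer written for illustration; the class name, the transition layout and the capacity handling are assumptions and are not taken from Algorithm 2.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of (state, action, reward, next_state, done) tuples."""

    def __init__(self, capacity):
        # A deque with maxlen silently drops the oldest transition when full.
        self._data = deque(maxlen=capacity)

    def store(self, state, action, reward, next_state, done):
        self._data.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        """Draw a random mini-batch without replacement."""
        size = min(batch_size, len(self._data))
        return random.sample(list(self._data), size)

    def __len__(self):
        return len(self._data)
```

In a training loop, `store` would be called once per dispatching step and `sample` once per network update; `done` would be set when the power flow fails to converge or a constraint is violated.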
IV Simulation and Results
IV-A The Simulator and Settings
Conformer-RLpatching is evaluated on SG-126 power grid simulator. The simulator includes 54 generators and 117 branches, which conforms to the characteristics and operation mode of provincial grid. The simulation parameters and generator parameters are illustrated in TABLE I and TABLE II, respectively.
| Parameters | Value | Parameters | Value |
|---|---|---|---|
| Renewable energy units | 18 | Thermal power units | 36 |
| Branches | 185 | Loads | 91 |
| Transformers | 9 | Ramp rate | 0.05 |
| | 2 | | 5 |
| Parameters | U1-U18 | U19-U30 | U30-U40 | U40-U54 |
|---|---|---|---|---|
| Type | | | | |
| | - | 110 | 128 | 140 |
| | 0 | 15 | 25 | 28 |
| | 105 | 105 | 105 | 105 |
| | 95 | 95 | 95 | 95 |
| | 0.0696 | 0.0285 | 0.0109 | 0.0097 |
| | 26.2438 | 17.82 | 22.9423 | 12.8875 |
| | 31.67 | 10.15 | 32.96 | 58.81 |
| | 80 | 100 | 200 | 880 |
Model (input length / output length) | 48 / 15 | 48 / 20 | 48 / 25 | 48 / 30 | 56 / 15 | 56 / 20 | 56 / 25 | 56 / 30 | 64 / 15 | 64 / 20 | 64 / 25 | 64 / 30 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Conformer | 5.0648 | 5.3919 | 5.8599 | 6.2192 | 4.9995 | 5.2718 | 5.9732 | 6.1780 | 5.0076 | 5.3985 | 6.0351 | 6.1866 |
Informer | 5.9370 | 6.7637 | 7.2799 | 7.8346 | 5.9677 | 6.6830 | 7.3198 | 7.7981 | 6.0773 | 6.7428 | 7.3137 | 7.7993 |
LSTMa | 10.9698 | 11.3034 | 12.1275 | 12.2250 | 10.8640 | 12.7792 | 13.0704 | 12.5671 | 11.1334 | 12.8136 | 13.0573 | 12.9230 |
Prophet | 8.9671 | 9.7796 | 10.8414 | 11.8916 | 9.4211 | 10.3752 | 11.2745 | 12.3669 | 9.5111 | 10.3964 | 11.2522 | 12.5742 |
CNN | 6.6045 | 7.5408 | 8.0617 | 8.5013 | 6.7521 | 7.4893 | 8.0038 | 8.5720 | 6.8096 | 7.5571 | 7.9928 | 8.6852 |
CNN-LSTM | 7.8669 | 8.6973 | 9.2094 | 9.5646 | 8.2419 | 9.3553 | 9.6920 | 9.7102 | 8.3122 | 9.4028 | 9.4484 | 9.9086 |
IV-B Comparison of Renewable Energy Prediction Models
This subsection compares the predictive effectiveness of Conformer with that of state-of-the-art methods, including Informer [27], LSTMa [28], Prophet [29], convolutional neural network (CNN), and CNN-LSTM [26]. The data set used in the experiment is the maximum active power output of 18 renewable energy units in 106820 days, which is provided by China Electric Power Research Institute. This paper selects 90% of the data as the training set and 10% as the testing set. The root mean square error (RMSE) is used to evaluate the models:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2} \qquad (21)$$

where $y_i$ and $\hat{y}_i$ denote the real and predicted values, respectively, and $n$ is the number of predicted points.
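The RMSE metric can be computed with a few lines of standard-library Python. This is a generic sketch of the usual definition; the function name and argument names are illustrative.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equally long sequences."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))
```

A perfect forecast gives an RMSE of zero, and larger deviations are penalized quadratically before the square root is taken.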
We set up a series of comparative experiments with variable input and output lengths to predict the long-term maximum active output of all renewable energy generators. In these experiments, historical data of the past 48 days, 56 days, 64 days, 72 days, 80 days, 88 days, and 96 days are fed into the models separately, and the length of the predicted values is set as 15, 20, 25 and 30. Only the crucial results are shown in TABLE III; the complete results are given in APPENDIX A.
Conformer achieves much better performance than all compared methods across the experiments, which demonstrates its superiority. Specifically, the average RMSE of Conformer over the 12 groups of comparative experiments is 19.07% lower than that of Informer, the second-best method, which means Conformer has less prediction error. The results show that the confidence estimation method, by offsetting the contingency of a single prediction result, improves predictive effectiveness. In addition, the prediction accuracy of Conformer is 39.65% higher than that of the other algorithms on average.
When the input length is fixed, all methods show a consistent trend: the gap between the prediction and the real data widens slightly as the output length increases. For example, when the input length is 56 and the output length rises from 15 to 30, the RMSE of Conformer and CNN increases by 23.57% and 26.95%, respectively.
Given a fixed output length, the prediction accuracy of all methods tends to increase as the input length rises, and reaches a peak when the input length is approximately 56. This phenomenon illustrates that more historical data endows the model with better predictive ability, while excessive input brings more interference from invalid data, thus reducing the prediction accuracy. When the output length is fixed at 20, although the prediction error of Conformer increases by 2.40% as the input length rises from 56 to 64, Conformer still maintains its superiority, with an RMSE 19.94% and 28.56% lower than that of Informer and CNN, respectively.
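The relative figures quoted in this subsection can be reproduced from the RMSE values in TABLE III. The sketch below reads them as relative changes in percent, an interpretation that matches the quoted numbers; the helper names are illustrative.

```python
# RMSE values copied from TABLE III, ordered by input length (48, 56, 64)
# and, within each input length, by output length (15, 20, 25, 30).
CONFORMER = [5.0648, 5.3919, 5.8599, 6.2192,
             4.9995, 5.2718, 5.9732, 6.1780,
             5.0076, 5.3985, 6.0351, 6.1866]
INFORMER = [5.9370, 6.7637, 7.2799, 7.8346,
            5.9677, 6.6830, 7.3198, 7.7981,
            6.0773, 6.7428, 7.3137, 7.7993]


def mean(values):
    return sum(values) / len(values)


def relative_change_percent(new, reference):
    """Signed change of new with respect to reference, in percent."""
    return (new - reference) / reference * 100.0


# Average RMSE reduction of Conformer relative to Informer (12 experiments).
avg_reduction = -relative_change_percent(mean(CONFORMER), mean(INFORMER))
# RMSE growth at input length 56 when the output length goes from 15 to 30.
growth_conformer = relative_change_percent(6.1780, 4.9995)
growth_cnn = relative_change_percent(8.5720, 6.7521)
# Conformer at output length 20: input length 56 versus 64.
growth_input = relative_change_percent(5.3985, 5.2718)
```

Rounded to two decimals, these evaluate to 19.07, 23.57, 26.95 and 2.40, in line with the text.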
This paper further examines the robustness of Conformer under different input and output lengths, compared with Informer and CNN. Short-term prediction with more historical data is much easier for the models. Therefore, three scenarios of varying difficulty are set, as shown in Fig. 4. It is clear that CNN has the worst performance in all scenarios. As depicted in Fig. 4 (a), both Conformer and Informer are able to describe the correct trend of the real data; in particular, the prediction from day 9 to day 13 output by Conformer achieves surprisingly little error. In the moderate scenario, Fig. 4 (b), Conformer performs more stably than Informer and achieves more accurate long-term prediction. Conformer significantly outperforms Informer in the hard scenario, Fig. 4 (c), where Informer deviates from the real data as the output length rises, whereas Conformer describes the correct trend of the real data accurately. In summary, Conformer still achieves the best performance even though the prediction accuracy decreases as the difficulty rises, which demonstrates its robustness.
Method | Utilization-rate threshold | Branch load-rate threshold | Dispatched generators | Steps | Security score | Cost | URRE | Total reward |
---|---|---|---|---|---|---|---|---|
Conformer-RLpatching I | 0.8 | 0.8 | 40 | 426 | 261.145 | 65251.618 | 81.579 | 527.318 |
Conformer-RLpatching II | 0.8 | 0.8 | 40 | 309 | 202.073 | 56499.215 | 73.826 | 346.117 |
Conformer-RLpatching III | 0.7 | 0.8 | 40 | 417 | 250.544 | 65552.791 | 79.798 | 498.990 |
Conformer-RLpatching IV | 0.9 | 0.8 | 40 | 415 | 254.921 | 65399.495 | 77.909 | 485.560 |
Conformer-RLpatching V | 0.8 | 0.7 | 40 | 412 | 252.695 | 65651.850 | 81.822 | 510.402 |
Conformer-RLpatching VI | 0.8 | 0.9 | 40 | 395 | 241.744 | 66163.359 | 82.754 | 497.888 |
Conformer-RLpatching VII | 0.8 | 0.8 | 35 | 408 | 252.815 | 65538.547 | 80.055 | 498.064 |
Conformer-RLpatching VIII | 0.8 | 0.8 | 45 | 417 | 245.887 | 66081.338 | 82.638 | 513.476 |
A+B | - | - | - | 411 | 250.824 | 65512.642 | 79.198 | 490.008 |
B+C | 0.8 | 0.8 | 40 | 206 | 129.021 | 68862.785 | 78.333 | 241.563 |
DDPG | - | - | - | 325 | 207.656 | 63973.590 | 55.871 | 243.303 |
DCR-TD3 | - | - | - | 250 | 159.899 | 59964.867 | 78.787 | 299.359 |
PPO | - | - | - | 56 | 30.961 | 71204.500 | 99.451 | 82.139 |

Note: A: Conformer, B: DDPG-based Power Flow Optimization Algorithm, C: Dispatching Necessity Evaluation Mechanism
IV-C Performance of Conformer-RLpatching With Different Hyper Parameters and Ablation Experiments
According to Section IV-B, the length of historical data input to Conformer-RLpatching is set as 56 with the lowest prediction error. Conformer outputs the prediction for the next 10 days in all experiments except for Conformer-RLpatching II, which only provides prediction for the next day.
To compare the impact of long-term and short-term renewable energy prediction on dispatching, this paper designs a comparative experiment between Conformer-RLpatching I and II. As shown in TABLE IV, Conformer-RLpatching I enables the simulator to run 117 more steps than Conformer-RLpatching II whilst satisfying the constraints, and the total reward of Conformer-RLpatching I is 52.352% higher.
Conformer-RLpatching I and III-VIII test the impact of different parameters in the dispatching necessity evaluation mechanism, including , , , on active power flow dispatching of the hybrid power grid. According to TABLE IV, overall, when and are both 0.8 and is 40, Conformer-RLpatching performs best, and its total reward reaches 527. When changes, the security score and renewable energy utilization rate show a downward trend, but the average cost of each step changes little. For example, when increases from 0.8 to 0.9, the utilization rate of renewable energy decreases by 4.499%. When the critical value of branch current increases, the security of the power grid is seriously affected. As increases from 0.8 to 0.9, the security score decreases by 7.429%. As rises to 40, the stability of the power grid reaches a peak. When increases from 40 to 45, the number of operation steps and the security score decrease by 2.113% and 5.843%, respectively, despite the utilization rate of renewable energy increasing by 1.348%.
In addition, ablation experiments are carried out to explore the separate contributions of Conformer and the dispatching necessity evaluation mechanism. The security score decreases by 3.952% without C, which indicates that the dispatching necessity evaluation mechanism helps ensure the stable operation of the power grid. Besides, the total reward drops significantly, by 54.190%, when A is discarded, which illustrates that Conformer can assist grid dispatching and abate the negative impact of renewable energy fluctuations.
IV-D Comparison With Other Dispatching Methods
This paper compares Conformer-RLpatching with DDPG, distributed classification replay twin delayed deep deterministic policy gradient (DCR-TD3) [30] and proximal policy optimization (PPO) to further verify the dispatching effect of Conformer-RLpatching. The simulation results are shown in TABLE IV. The security score of Conformer-RLpatching is 25.758% and 63.319% higher than that of DDPG and DCR-TD3, respectively. The total reward is improved by 76.149% compared with DCR-TD3, which has the second-best total reward. The renewable energy utilization rate of Conformer-RLpatching is 4.540% higher than that of the other methods on average.
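The comparison figures above can be recomputed from the corresponding rows of TABLE IV. The sketch below treats each figure as a relative improvement in percent over the baseline value, an interpretation that reproduces the quoted numbers; the dictionary keys and helper name are illustrative.

```python
# Security score, renewable utilization rate and total reward from TABLE IV.
TABLE_IV = {
    "Conformer-RLpatching I": {"security": 261.145, "urre": 81.579, "reward": 527.318},
    "DDPG": {"security": 207.656, "urre": 55.871, "reward": 243.303},
    "DCR-TD3": {"security": 159.899, "urre": 78.787, "reward": 299.359},
    "PPO": {"security": 30.961, "urre": 99.451, "reward": 82.139},
}


def improvement_percent(ours, baseline):
    """Relative improvement of ours over baseline, in percent."""
    return (ours - baseline) / baseline * 100.0


ours = TABLE_IV["Conformer-RLpatching I"]
security_vs_ddpg = improvement_percent(ours["security"], TABLE_IV["DDPG"]["security"])
security_vs_dcr = improvement_percent(ours["security"], TABLE_IV["DCR-TD3"]["security"])
reward_vs_dcr = improvement_percent(ours["reward"], TABLE_IV["DCR-TD3"]["reward"])

# Utilization rate compared with the mean of the three baseline methods.
baselines = ["DDPG", "DCR-TD3", "PPO"]
avg_urre = sum(TABLE_IV[name]["urre"] for name in baselines) / len(baselines)
urre_vs_avg = improvement_percent(ours["urre"], avg_urre)
```

Note that the utilization comparison is against the average of the baselines; PPO alone reaches a higher utilization rate but runs for only 56 steps.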
More intuitively, Fig. 5 depicts the average renewable energy utilization rate, the average cost, and the cumulative reward at each step of the above dispatching methods in an epoch. As shown in Fig. 5 (a), although PPO reaches a relatively high renewable energy utilization rate in the beginning, its dispatching is highly unstable and ends at only about step 50. In contrast, Conformer-RLpatching, which combines superior stability with an efficient renewable energy utilization rate, is noticeably the best performer on renewable energy utilization, albeit at a slightly higher cost, as presented in Fig. 5 (b). Fig. 5 (c) shows the overall dispatching performance. The rewards obtained by PPO and DCR-TD3 improve fast initially but come to an end quickly, which indicates their instability. DDPG runs more steps than PPO and DCR-TD3, but it obtains the lowest reward per step among them in the epoch. To sum up, Conformer-RLpatching achieves the most running steps and the highest cumulative reward, verifying its superiority among the state-of-the-art dispatching methods.
Method | Total Reward |
---|---|
Conformer-RLpatching I | 527.318 |
The first ranked team | 510.091 |
The second ranked team | 509.206 |
The third ranked team | 507.239 |
TABLE V compares the total reward of Conformer-RLpatching and the top three teams in State Grid Dispatching AI Innovation Competition. The results show that Conformer-RLpatching achieves a remarkable total reward up to 527.318 superior to 510.091 obtained by the best team.
V Conclusion
Emerging AI technology is gradually being integrated into the security constrained economic dispatching of the hybrid energy grid. This paper proposes Conformer-RLpatching to achieve multi-objective dispatching under the long-term fluctuations of renewable energy. Equipped with the confidence estimation method, Conformer can provide accurate long-term renewable energy prediction to reduce the impact of renewable energy uncertainty on dispatching. Moreover, this paper designs RLpatching to realize active power flow optimization and puts forward a dispatching necessity evaluation mechanism to reduce unnecessary dispatching and improve the stability of the power grid. Extensive experiments based on the SG-126 power grid simulator demonstrate that the proposed Conformer-RLpatching is superior to other state-of-the-art methods. Future work will consider the impact of additional factors, such as weather and social activities, on renewable energy to improve the prediction accuracy, and will further study the multi-objective dispatching of the hybrid power grid under source-load side uncertainty.
Appendix A Prediction Results
As for prediction models, historical data of the past 48 days, 56 days, 64 days, 72 days, 80 days, 88 days, and 96 days are fed into models separately, and the length of prediction is set as 15, 20, 25 and 30. All experimental results are shown in the following table.
Input length | Output length | Conformer | Informer | LSTMa | Prophet | CNN | CNN-LSTM |
---|---|---|---|---|---|---|---|
48 | 15 | 5.0648 | 5.937 | 10.9698 | 8.9671 | 6.6045 | 7.8669 |
48 | 20 | 5.3919 | 6.7637 | 11.3034 | 9.7796 | 7.5408 | 8.6973 |
48 | 25 | 5.8599 | 7.2799 | 12.1275 | 10.8414 | 8.0617 | 9.2094 |
48 | 30 | 6.2192 | 7.8346 | 12.225 | 11.8916 | 8.5013 | 9.5646 |
56 | 15 | 4.9995 | 5.9677 | 10.864 | 9.4211 | 6.7521 | 8.2419 |
56 | 20 | 5.2718 | 6.683 | 12.7792 | 10.3752 | 7.4893 | 9.3553 |
56 | 25 | 5.9732 | 7.3198 | 13.0704 | 11.2745 | 8.0038 | 9.692 |
56 | 30 | 6.178 | 7.7981 | 12.5671 | 12.3669 | 8.572 | 9.7102 |
64 | 15 | 5.0076 | 6.0773 | 11.1334 | 9.5111 | 6.8096 | 8.3122 |
64 | 20 | 5.3985 | 6.7428 | 12.8136 | 10.3964 | 7.5571 | 9.4028 |
64 | 25 | 6.0351 | 7.3137 | 13.0573 | 11.2522 | 7.9928 | 9.4484 |
64 | 30 | 6.1866 | 7.7993 | 12.923 | 12.5742 | 8.6852 | 9.9086 |
72 | 15 | 5.0214 | 6.081 | 11.1267 | 9.5745 | 6.8373 | 8.3324 |
72 | 20 | 5.4412 | 6.7233 | 12.9052 | 10.4459 | 7.5659 | 9.3964 |
72 | 25 | 5.906 | 7.3482 | 13.1027 | 11.3085 | 8.096 | 9.5547 |
72 | 30 | 6.2055 | 7.8026 | 12.9616 | 12.5302 | 8.6734 | 9.9164 |
80 | 15 | 5.1835 | 6.0975 | 11.3142 | 9.6998 | 6.8945 | 8.4128 |
80 | 20 | 5.5034 | 6.7255 | 12.8648 | 10.4943 | 7.5203 | 9.4645 |
80 | 25 | 5.9607 | 7.3541 | 12.9361 | 11.3496 | 8.1734 | 9.6077 |
80 | 30 | 6.2565 | 7.844 | 13.1125 | 12.5463 | 8.7174 | 9.8437 |
|
|
Conformer | Informer | LSTMa | Prophet | CNN |
|
||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
88 | 15 | 4.9972 | 6.1014 | 11.3667 | 9.6289 | 6.8981 | 8.3959 | ||||||
20 | 5.4511 | 6.6902 | 12.8955 | 10.5025 | 7.6697 | 9.773 | |||||||
25 | 5.9611 | 7.3878 | 13.0564 | 11.5041 | 8.3747 | 9.7416 | |||||||
30 | 6.2532 | 7.8635 | 13.2063 | 12.5195 | 8.7748 | 9.9718 | |||||||
96 | 15 | 5.1985 | 6.1397 | 11.4291 | 9.8016 | 6.9034 | 8.4526 | ||||||
20 | 5.5323 | 6.7316 | 13.0103 | 10.6395 | 7.6748 | 9.8253 | |||||||
25 | 5.9236 | 7.3796 | 13.1256 | 11.8777 | 8.4523 | 9.8562 | |||||||
30 | 6.3164 | 7.8826 | 13.1813 | 12.6051 | 8.8516 | 9.9868 |
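The experimental grid described in this appendix (seven history lengths crossed with four prediction lengths) can be sketched as follows. This is a minimal illustration rather than the authors' data pipeline: the function `make_windows`, the stride of one step, the toy series, and the assumption that one list element corresponds to one unit of history are all hypothetical choices made here for clarity.

```python
from itertools import product

# History and prediction lengths as listed in this appendix.
INPUT_LENGTHS = [48, 56, 64, 72, 80, 88, 96]
PRED_LENGTHS = [15, 20, 25, 30]


def make_windows(series, input_len, pred_len):
    """Slice a 1-D sequence into (history, target) pairs using a stride of one step.

    The stride and the flat-list representation are assumptions of this sketch.
    """
    pairs = []
    last_start = len(series) - input_len - pred_len
    for start in range(last_start + 1):
        history = series[start:start + input_len]
        target = series[start + input_len:start + input_len + pred_len]
        pairs.append((history, target))
    return pairs


# Toy series standing in for one generator's active output (hypothetical data).
toy_series = [float(t % 24) for t in range(200)]

# One set of training pairs per (history length, prediction length) setting.
grid = {
    (input_len, pred_len): make_windows(toy_series, input_len, pred_len)
    for input_len, pred_len in product(INPUT_LENGTHS, PRED_LENGTHS)
}

print(len(grid))            # number of experimental settings
print(len(grid[(48, 15)]))  # number of windows for the shortest setting
```

Each of the 28 settings would then be used to train and evaluate every prediction model separately, which is how a table with one row per setting arises.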
References
- [1] B. Khorramdel, A. Zare, C. Chung, and P. Gavriliadis, “A generic convex model for a chance-constrained look-ahead economic dispatch problem incorporating an efficient wind power distribution modeling,” IEEE Transactions on Power Systems, vol. 35, no. 2, pp. 873–886, 2019.
- [2] C. Tang, J. Xu, Y. Sun, J. Liu, X. Li, D. Ke, J. Yang, and X. Peng, “Look-ahead economic dispatch with adjustable confidence interval based on a truncated versatile distribution model for wind power,” IEEE Transactions on Power Systems, vol. 33, no. 2, pp. 1755–1767, 2017.
- [3] C. Yang, W. Sun, J. Yang, and D. Han, “Risk-averse two-stage distributionally robust economic dispatch model under uncertain renewable energy,” CSEE Journal of Power and Energy Systems, pp. 1–10, 2021.
- [4] B. Qu, J. J. Liang, Y. Zhu, and P. N. Suganthan, “Solving dynamic economic emission dispatch problem considering wind power by multi-objective differential evolution with ensemble of selection method,” Natural Computing, vol. 18, no. 4, pp. 695–703, 2019.
- [5] H. Huang, M. Zhou, S. Zhang, L. Zhang, G. Li, and Y. Sun, “Exploiting the operational flexibility of wind integrated hybrid ac/dc power systems,” IEEE Transactions on Power Systems, vol. 36, no. 1, pp. 818–826, 2020.
- [6] Y. Yang, W. Wu, B. Wang, and M. Li, “Chance-constrained economic dispatch considering curtailment strategy of renewable energy,” IEEE Transactions on Power Systems, vol. 36, no. 6, pp. 5792–5802, 2021.
- [7] W. Wang, B. Sun, H. Li, Q. Sun, and R. Wennersten, “An improved min-max power dispatching method for integration of variable renewable energy,” Applied Energy, vol. 276, p. 115430, 2020.
- [8] E. Du, N. Zhang, B.-M. Hodge, Q. Wang, Z. Lu, C. Kang, B. Kroposki, and Q. Xia, “Operation of a high renewable penetrated power system with csp plants: A look-ahead stochastic unit commitment model,” IEEE Transactions on Power Systems, vol. 34, no. 1, pp. 140–151, 2018.
- [9] L. Yin, Q. Gao, L. Zhao, and T. Wang, “Expandable deep learning for real-time economic generation dispatch and control of three-state energies based future smart grids,” Energy, vol. 191, p. 116561, 2020.
- [10] Y. Gao and Q. Ai, “A novel optimal dispatch method for multiple energy sources in regional integrated energy systems considering wind curtailment,” CSEE Journal of Power and Energy Systems, pp. 1–10, 2022.
- [11] A. C. do Amaral Burghi, T. Hirsch, and R. Pitz-Paal, “Artificial learning dispatch planning for flexible renewable-energy systems,” Energies, vol. 13, no. 6, p. 1517, 2020.
- [12] K. Lv, H. Tang, Y. Li, and X. Li, “A learning-based optimization of active power dispatch for a grid-connected microgrid with uncertain multi-type loads,” Journal of Renewable and Sustainable Energy, vol. 9, no. 6, p. 065901, 2017.
- [13] Z. Hu, Y. Xu, M. Korkali, X. Chen, L. Mili, and J. Valinejad, “A bayesian approach for estimating uncertainty in stochastic economic dispatch considering wind power penetration,” IEEE Transactions on Sustainable Energy, vol. 12, no. 1, pp. 671–681, 2020.
- [14] W. Dong, Q. Yang, W. Li, and A. Y. Zomaya, “Machine-learning-based real-time economic dispatch in islanding microgrids in a cloud-edge computing environment,” IEEE Internet of Things Journal, vol. 8, no. 17, pp. 13 703–13 711, 2021.
- [15] J. Guan, H. Tang, K. Wang, J. Yao, and S. Yang, “A parallel multi-scenario learning method for near-real-time power dispatch optimization,” Energy, vol. 202, p. 117708, 2020.
- [16] L. Lei, Y. Tan, G. Dahlenburg, W. Xiang, and K. Zheng, “Dynamic energy dispatch based on deep reinforcement learning in iot-driven smart isolated microgrids,” IEEE Internet of Things Journal, vol. 8, no. 10, pp. 7938–7953, 2020.
- [17] T. Yang, L. Zhao, W. Li, and A. Y. Zomaya, “Dynamic energy dispatch strategy for integrated energy system based on improved deep reinforcement learning,” Energy, vol. 235, p. 121377, 2021.
- [18] R. Lu, T. Ding, B. Qin, J. Ma, X. Fang, and Z. Dong, “Multi-stage stochastic programming to joint economic dispatch for energy and reserve with uncertain renewable energy,” IEEE Transactions on Sustainable Energy, vol. 11, no. 3, pp. 1140–1151, 2019.
- [19] Y. Huo, F. Bouffard, and G. Joós, “Decision tree-based optimization for flexibility management for sustainable energy microgrids,” Applied Energy, vol. 290, p. 116772, 2021.
- [20] B. Mohandes, M. Wahbah, M. S. El Moursi, and T. H. El-Fouly, “Renewable energy management system: Optimum design and hourly dispatch,” IEEE Transactions on Sustainable Energy, vol. 12, no. 3, pp. 1615–1628, 2021.
- [21] M. Kamel, R. Dai, Y. Wang, F. Li, and G. Liu, “Data-driven and model-based hybrid reinforcement learning to reduce stress on power systems branches,” CSEE Journal of Power and Energy Systems, vol. 7, no. 3, pp. 433–442, 2021.
- [22] X. Peng, Y. Chen, K. Cheng, H. Wang, Y. Zhao, B. Wang, J. Che, C. Liu, J. Wen, C. Lu et al., “Wind power prediction for wind farm clusters based on the multifeature similarity matching method,” IEEE Transactions on Industry Applications, vol. 56, no. 5, pp. 4679–4688, 2020.
- [23] M. Cui, J. Zhang, Q. Wang, V. Krishnan, and B.-M. Hodge, “A data-driven methodology for probabilistic wind power ramp forecasting,” IEEE Transactions on Smart Grid, vol. 10, no. 2, pp. 1326–1338, 2017.
- [24] S. Wang, X. Zhao, H. Wang, and M. Li, “Small-world neural network and its performance for wind power forecasting,” CSEE Journal of Power and Energy Systems, vol. 6, no. 2, pp. 362–373, 2020.
- [25] L. Ge, Y. Xian, J. Yan, B. Wang, and Z. Wang, “A hybrid model for short-term pv output forecasting based on pca-gwo-grnn,” Journal of Modern Power Systems and Clean Energy, vol. 8, no. 6, pp. 1268–1275, 2020.
- [26] G. Li, S. Xie, B. Wang, J. Xin, Y. Li, and S. Du, “Photovoltaic power forecasting with a hybrid deep learning approach,” IEEE Access, vol. 8, pp. 175 871–175 880, 2020.
- [27] H. Zhou, S. Zhang, J. Peng, S. Zhang, J. Li, H. Xiong, and W. Zhang, “Informer: Beyond efficient transformer for long sequence time-series forecasting,” in Proceedings of AAAI, 2021.
- [28] D. Bahdanau, K. Cho, and Y. Bengio, “Neural machine translation by jointly learning to align and translate,” arXiv preprint arXiv:1409.0473, 2014.
- [29] S. J. Taylor and B. Letham, “Forecasting at scale,” The American Statistician, vol. 72, no. 1, pp. 37–45, 2018.
- [30] J. Li and T. Yu, “Deep reinforcement learning based multi-objective integrated automatic generation control for multiple continuous power disturbances,” IEEE Access, vol. 8, pp. 156 839–156 850, 2020.
References
- [1] B. Khorramdel, A. Zare, C. Chung, and P. Gavriliadis, “A generic convex model for a chance-constrained look-ahead economic dispatch problem incorporating an efficient wind power distribution modeling,” IEEE Transactions on Power Systems, vol. 35, no. 2, pp. 873–886, 2019.
- [2] C. Tang, J. Xu, Y. Sun, J. Liu, X. Li, D. Ke, J. Yang, and X. Peng, “Look-ahead economic dispatch with adjustable confidence interval based on a truncated versatile distribution model for wind power,” IEEE Transactions on Power Systems, vol. 33, no. 2, pp. 1755–1767, 2017.
- [3] C. Yang, W. Sun, J. Yang, and D. Han, “Risk-averse two-stage distributionally robust economic dispatch model under uncertain renewable energy,” CSEE Journal of Power and Energy Systems, pp. 1–10, 2021.
- [4] B. Qu, J. J. Liang, Y. Zhu, and P. N. Suganthan, “Solving dynamic economic emission dispatch problem considering wind power by multi-objective differential evolution with ensemble of selection method,” Natural Computing, vol. 18, no. 4, pp. 695–703, 2019.
- [5] H. Huang, M. Zhou, S. Zhang, L. Zhang, G. Li, and Y. Sun, “Exploiting the operational flexibility of wind integrated hybrid ac/dc power systems,” IEEE Transactions on Power Systems, vol. 36, no. 1, pp. 818–826, 2020.
- [6] Y. Yang, W. Wu, B. Wang, and M. Li, “Chance-constrained economic dispatch considering curtailment strategy of renewable energy,” IEEE Transactions on Power Systems, vol. 36, no. 6, pp. 5792–5802, 2021.
- [7] W. Wang, B. Sun, H. Li, Q. Sun, and R. Wennersten, “An improved min-max power dispatching method for integration of variable renewable energy,” Applied Energy, vol. 276, p. 115430, 2020.
- [8] E. Du, N. Zhang, B.-M. Hodge, Q. Wang, Z. Lu, C. Kang, B. Kroposki, and Q. Xia, “Operation of a high renewable penetrated power system with csp plants: A look-ahead stochastic unit commitment model,” IEEE Transactions on power systems, vol. 34, no. 1, pp. 140–151, 2018.
- [9] L. Yin, Q. Gao, L. Zhao, and T. Wang, “Expandable deep learning for real-time economic generation dispatch and control of three-state energies based future smart grids,” Energy, vol. 191, p. 116561, 2020.
- [10] Y. Gao and Q. Ai, “A novel optimal dispatch method for multiple energy sources in regional integrated energy systems considering wind curtailment,” CSEE Journal of Power and Energy Systems, pp. 1–10, 2022.
- [11] A. C. do Amaral Burghi, T. Hirsch, and R. Pitz-Paal, “Artificial learning dispatch planning for flexible renewable-energy systems,” Energies, vol. 13, no. 6, p. 1517, 2020.
- [12] K. Lv, H. Tang, Y. Li, and X. Li, “A learning-based optimization of active power dispatch for a grid-connected microgrid with uncertain multi-type loads,” Journal of Renewable and Sustainable Energy, vol. 9, no. 6, p. 065901, 2017.
- [13] Z. Hu, Y. Xu, M. Korkali, X. Chen, L. Mili, and J. Valinejad, “A bayesian approach for estimating uncertainty in stochastic economic dispatch considering wind power penetration,” IEEE Transactions on Sustainable Energy, vol. 12, no. 1, pp. 671–681, 2020.
- [14] W. Dong, Q. Yang, W. Li, and A. Y. Zomaya, “Machine-learning-based real-time economic dispatch in islanding microgrids in a cloud-edge computing environment,” IEEE Internet of Things Journal, vol. 8, no. 17, pp. 13 703–13 711, 2021.
- [15] J. Guan, H. Tang, K. Wang, J. Yao, and S. Yang, “A parallel multi-scenario learning method for near-real-time power dispatch optimization,” Energy, vol. 202, p. 117708, 2020.
- [16] L. Lei, Y. Tan, G. Dahlenburg, W. Xiang, and K. Zheng, “Dynamic energy dispatch based on deep reinforcement learning in iot-driven smart isolated microgrids,” IEEE Internet of Things Journal, vol. 8, no. 10, pp. 7938–7953, 2020.
- [17] T. Yang, L. Zhao, W. Li, and A. Y. Zomaya, “Dynamic energy dispatch strategy for integrated energy system based on improved deep reinforcement learning,” Energy, vol. 235, p. 121377, 2021.
- [18] R. Lu, T. Ding, B. Qin, J. Ma, X. Fang, and Z. Dong, “Multi-stage stochastic programming to joint economic dispatch for energy and reserve with uncertain renewable energy,” IEEE Transactions on Sustainable Energy, vol. 11, no. 3, pp. 1140–1151, 2019.
- [19] Y. Huo, F. Bouffard, and G. Joós, “Decision tree-based optimization for flexibility management for sustainable energy microgrids,” Applied Energy, vol. 290, p. 116772, 2021.
- [20] B. Mohandes, M. Wahbah, M. S. El Moursi, and T. H. El-Fouly, “Renewable energy management system: Optimum design and hourly dispatch,” IEEE Transactions on Sustainable Energy, vol. 12, no. 3, pp. 1615–1628, 2021.
- [21] M. Kamel, R. Dai, Y. Wang, F. Li, and G. Liu, “Data-driven and model-based hybrid reinforcement learning to reduce stress on power systems branches,” CSEE Journal of Power and Energy Systems, vol. 7, no. 3, pp. 433–442, 2021.
- [22] X. Peng, Y. Chen, K. Cheng, H. Wang, Y. Zhao, B. Wang, J. Che, C. Liu, J. Wen, C. Lu et al., “Wind power prediction for wind farm clusters based on the multifeature similarity matching method,” IEEE Transactions on Industry Applications, vol. 56, no. 5, pp. 4679–4688, 2020.
- [23] M. Cui, J. Zhang, Q. Wang, V. Krishnan, and B.-M. Hodge, “A data-driven methodology for probabilistic wind power ramp forecasting,” IEEE Transactions on Smart Grid, vol. 10, no. 2, pp. 1326–1338, 2017.
- [24] S. Wang, X. Zhao, H. Wang, and M. Li, “Small-world neural network and its performance for wind power forecasting,” CSEE Journal of Power and Energy Systems, vol. 6, no. 2, pp. 362–373, 2020.
- [25] L. Ge, Y. Xian, J. Yan, B. Wang, and Z. Wang, “A hybrid model for short-term pv output forecasting based on pca-gwo-grnn,” Journal of Modern Power Systems and Clean Energy, vol. 8, no. 6, pp. 1268–1275, 2020.
- [26] G. Li, S. Xie, B. Wang, J. Xin, Y. Li, and S. Du, “Photovoltaic power forecasting with a hybrid deep learning approach,” IEEE Access, vol. 8, pp. 175 871–175 880, 2020.
- [27] H. Zhou, S. Zhang, J. Peng, S. Zhang, J. Li, H. Xiong, and W. Zhang, “Informer: Beyond efficient transformer for long sequence time-series forecasting,” in Proceedings of AAAI, 2021.
- [28] D. Bahdanau, K. Cho, and Y. Bengio, “Neural machine translation by jointly learning to align and translate,” arXiv preprint arXiv:1409.0473, 2014.
- [29] S. J. Taylor and B. Letham, “Forecasting at scale,” The American Statistician, vol. 72, no. 1, pp. 37–45, 2018.
- [30] J. Li and T. Yu, “Deep reinforcement learning based multi-objective integrated automatic generation control for multiple continuous power disturbances,” IEEE Access, vol. 8, pp. 156 839–156 850, 2020.
Xinhang Li received the B.E. degree in communication engineering from Beijing University of Posts and Telecommunications (BUPT), Beijing, China, in 2021. He is currently pursuing the Ph.D. degree in information and communication engineering from the School of Artificial Intelligence, BUPT. His research interests include deep reinforcement learning, optimal power flow and intelligent information processing.
Zihao Li is currently pursuing the B.E. degree in telecommunications engineering with management with Beijing University of Posts and Telecommunications, Beijing, China. His current research interests include machine learning and artificial intelligence.
Nan Yang received the B.S. and M.S. degrees in electrical engineering from Beijing Institute of Technology (BIT), Beijing, China, in 2015 and 2018, respectively. She works for China Electric Power Research Institute, and her research interests include big data analysis and artificial intelligence application in the field of power dispatching automation.
Zheng Yuan received the B.E. degree in information engineering from Beijing University of Posts and Telecommunications (BUPT), Beijing, China, in 2021. He is currently pursuing an M.S. degree from the School of Artificial Intelligence, BUPT. His research interests are reinforcement learning, power dispatching and intelligent information processing.
Qinwen Wang received the B.E. degree in digital media technology from Communication University of China (CUC), Beijing, China, in 2020. She is currently pursuing an M.S. degree from the School of Artificial Intelligence, Beijing University of Posts and Telecommunications (BUPT), Beijing, China. Her research interests include reinforcement learning, smart grid and cooperative intelligent transportation systems.
Yiying Yang received the B.E. degree in communication engineering from Beijing University of Posts and Telecommunications (BUPT), Beijing, China, in 2021. She is currently pursuing the M.S. degree in information and communication engineering with BUPT. Her research interests include reinforcement learning, power dispatching and cooperative connected vehicles control.
Yupeng Huang received the B.S. degree in electrical engineering from Tsinghua University (THU), Beijing, China, in 2016, and received the M.S. degree in electric power system and automation from China Electric Power Research Institute (CEPRI), Beijing, China, in 2019. He works for China Electric Power Research Institute and his research interests include power dispatching and automation.
Xuri Song received the B.E. and M.S. degrees in electrical engineering from China Agricultural University, Beijing, China, in 2009 and 2011, respectively. He works for China Electric Power Research Institute. His research interests include power grid analysis and artificial intelligence application.
Lei Li is currently an Associate Professor with the School of Artificial Intelligence, Beijing University of Posts and Telecommunications, China. Her research interests include intelligent information processing, deep learning, machine learning, and natural language processing.
Lin Zhang (Member, IEEE) received the B.S. and Ph.D. degrees from the Beijing University of Posts and Telecommunications (BUPT), Beijing, China, in 1996 and 2001, respectively. He is currently the Director of Beijing Bigdata Center and also a Professor of BUPT. He was a Postdoctoral Researcher with Information and Communications University, South Korea. He used to hold a Research Fellow position with Nanyang Technological University, Singapore. In 2004, he joined BUPT as a Lecturer, then became an Associate Professor in 2005 and a Professor in 2011. He has authored more than 120 papers in refereed journals and international conferences. His research interests include intelligent information processing, deep learning, mobile cloud computing and Internet of Things.