Comparison and Evaluation of Methods for a Predict+Optimize Problem in Renewable Energy
Abstract
Algorithms that involve both forecasting and optimization are at the core of solutions to many difficult real-world problems, such as supply chains (inventory optimization), traffic, and battery/load/production scheduling in sustainable energy systems as part of the transition towards carbon-free energy generation. Typically, in these scenarios we want to solve an optimization problem that depends on unknown future values, which therefore need to be forecast. As both forecasting and optimization are difficult problems in their own right, relatively little research has been done in this area. This paper presents the findings of the “IEEE-CIS Technical Challenge on Predict+Optimize for Renewable Energy Scheduling,” held in 2021. We present a comparison and evaluation of the seven highest-ranked solutions in the competition, to provide researchers with a benchmark problem and to establish the state of the art for this benchmark, with the aim to foster and facilitate research in this area. The competition used data from the Monash Microgrid, as well as weather data and energy market data. It then focused on two main challenges: forecasting renewable energy production and demand, and obtaining an optimal schedule for the activities (lectures) and on-site batteries that lead to the lowest cost of energy. The most accurate forecasts were obtained by gradient-boosted tree and random forest models, and optimization was mostly performed using mixed integer linear and quadratic programming. The winning method predicted different scenarios and optimized over all scenarios jointly using a sample average approximation method.
Index Terms:
Energy Forecasting, Optimization, Time Series, Scheduling, Predict and Optimize

I Introduction
Optimization problems to be solved over an unknown future are at the core of many complex real-world operations. For example, supply chains, inventories and staffing rosters all need to be planned based on assumptions of future customer demand. This type of optimization will also play a vital role in the global transition to reduce CO2 emissions. Renewable energy production is characterized by variability over time, and the inability to readily vary production based on demand. Therefore, demand needs to be scheduled to make best use of supply where possible, with energy storage systems such as batteries scheduled optimally to make up the shortfall, all based on unknown future production and demand. The common approach to solve these problems is to forecast the future, and use this as the “true” input for the optimization. Although this is expedient, it pays little regard to the uncertainty of the forecast. One way to address uncertainty is to use robust optimization [1] or stochastic optimization [2], with probabilistic forecasts as inputs instead of point forecasts. Some applications along these lines are presented by Dehghani et al. [3] for trans-shipment and Jung et al. [4] for supply chain management.
More recently, researchers have started to address these types of problem more holistically in an emerging research field known as Predict+Optimize, where forecasting and optimization are not treated as isolated tasks, but their interaction is taken into account. Using this approach, forecasts are chosen or evaluated through their contribution to the actual downstream cost of the optimization problem, in preference to arbitrary measures of forecast quality, such as MAE, RMSE, or CRPS. Kotary et al. [5] give an overview of methods of this type. Elmachtoub et al. [6] develop a specialized algorithm to build decision trees directly for the true optimization target, and Elmachtoub and Grigas [7] develop a differentiable surrogate for the true optimization target. Mandi et al. [8] extend this framework to discrete optimization and Demirovic et al. [9] develop an alternative solution for dynamic programming. Other approaches aim to build end-to-end systems where no intermediary forecasting is needed. For example, Donti et al. [10] build an end-to-end model in the form of a neural network that optimizes for the loss of a stochastic optimization problem, and Gao et al. [11] build an end-to-end system using imitation learning for scheduling of a microgrid.
A lot of research in these areas has focused on energy production and consumption. In addition to classical work on forecasting, e.g., energy demand [12], renewable power production [13, 14, 15, 16], or energy price [17], there is an increasing body of work on machine learning systems that integrate prediction and optimization via customized loss functions, e.g., using boosted regression trees and neural networks [18, 19, 20, 21, 22]. Other research has used meta-optimization, in which an outer optimization loop is added to a Predict+Optimize pipeline, such as in Carriere and Kariniotakis [23], and with methods to keep computational cost manageable, such as Lagrangian relaxation [24] or simplified linear programs [25].
Because optimization and forecasting are both difficult problems in their own right, Predict+Optimize models in the literature are often not applicable to real-world problems: the combined problem may be tractable only for simply specified problem instances, or the computation time may be prohibitively long. Realistic problems require both a real-world data set and complex optimization. In this regard, we see a lack of problems from which to systematically determine the state of the art in this field of research. Competitions are a good way to establish standard benchmark problems. However, we are aware of only one competition in this area, the “ICON Challenge on Forecasting and Scheduling,” hosted in 2016. This challenge required a single time series (energy price) to be forecast, for the subsequent scheduling of server jobs to minimize energy cost. With a relatively simple prediction problem and a difficult optimization problem, this challenge leaned heavily towards optimization. The competition winner [26] implemented heuristics for generating an initial solution, which was then improved using a hill climbing algorithm.
Inspired by the ICON Challenge, we organized the “IEEE-CIS Technical Challenge on Predict+Optimize for Renewable Energy Scheduling,” [27], as part of a series of yearly Technical Challenges hosted and sponsored by the IEEE Computational Intelligence Society. The goal of our challenge was to provide a relevant real-world dataset and optimization benchmark problem along with strong baseline solutions from which to establish a state of the art in the area for the research community. We hope that this will enable more standardized and streamlined evaluation of future research in the field. A particular aim of the competition was to balance the requirements of the problem so that the competition could not be won by focusing on either forecasting or optimization alone.
The problem to be solved comes from the Monash Net Zero Initiative, consisting of a campus-wide microgrid having rooftop solar photo-voltaic installations and a battery for energy storage. This operates across the entire university campus, including buildings, grounds and electric vehicles (EVs), with the goal of achieving net zero emissions by 2030. In particular, it aims to: 1) maximize self-consumption of electricity, 2) participate in energy demand response programs, and 3) keep track of electricity price and yearly peak load tariffs, to manage costs.
From a technical point of view, the data provided presents an interesting time series prediction problem. The demand and production data has complex seasonality; external data (weather, electricity price) are factors in the problem. There is also the opportunity for cross-learning between time series for the energy demand and solar production problems. From an optimization point of view, the uncertainty in the inputs presents a mismatch between the forecast (production and demand) and that which actually eventuates. This needs to be addressed, along with various constraints to achieve a competitive solution. The goal of the optimization is to develop a battery charge and discharge schedule along with a lecture theatre use schedule that results in the lowest energy cost. Battery use is constrained by capacity. Lecture theatre use is determined by the university timetable, with some activities being regular, and others one-off.
We hope that addressing this problem will contribute to the fight against climate change through the development of strategies to make renewable energy more reliable and consequently more affordable. We also hope that the approaches and methods presented in the technical challenge will be applicable to many other fields facing similar problems of optimal decision-making in the presence of uncertainty.
The remainder of the paper is structured as follows. Section II presents the competition setup. Section III discusses the submitted solutions, presents the final rankings, and gives an overall summary of the results. Section IV describes the best-performing solutions, and Section V concludes the paper.
II Competition setup
The competition setup aimed to be as close to a typical real-world situation as possible. However, some adjustments needed to be made, due to the nature of the competition. In real life, battery scheduling would typically be performed using historic data such as building demand, solar power production, weather or specialized solar forecasts, and electricity price forecasts (from external providers) as input variables. Based on this, the battery schedule would then be optimized for approximately 1-3 days in advance, re-running the optimizer periodically (e.g., every 15 minutes). In real life, lecture times and locations would be planned well in advance of the academic year, and without regard to the power schedule. Building energy use would ideally comprise the building base load plus the energy use from scheduled demand.
For the competition we had to decide between concealing the exact time and location of energy use data to prevent participants matching this against publicly available weather and energy market data, or assuming that this information was known. We opted for the latter, which meant that the competition effectively assumed that perfect weather and energy price forecasts were available. In addition, we disclosed the exact time and location (the Monash Clayton campus) from where data was gathered, and information on how participants could obtain weather and energy pricing data. Because Melbourne was in lockdown due to the COVID-19 pandemic for large parts of 2020, participants were able to make good approximations of building base load, since no on-campus lectures were held over this time. A timeline of events, with our rough estimate of the capacity at which the campus was operating at each point in time, was provided to the participants and is shown in Figure 1.
We asked the participants to schedule activities (resulting in energy use) and battery charging over the full month. This differs from a normal production system, where planning is performed over a shorter horizon and constantly updated. However, our competition setup provided perfect weather and energy price data, which offsets the information disadvantage of having to plan over a longer time frame, so the competition results were comparable to those obtainable by planning over a shorter horizon with constant updating.

The competition was then conducted in two phases. Phase 1 ran for 3 months, from July to October 2021. Phase 2 ran for approximately 3 weeks during October 2021. The goal of Phase 1 was to optimally schedule batteries and timetabled activities (lectures) for the month of October 2020. Participants could submit forecasts and/or optimal schedules to a leaderboard during this phase. These were then evaluated, with the results visible to all participants. During Phase 2 of the competition, data for October 2020 was released to the participants, who were asked to perform the same forecasting and optimization exercise as Phase 1, but for November 2020. Several problems in the competition setup were addressed at this time. Notably, time zones for forecasting and optimization were aligned, and we ensured that no large outliers equivalent to those identified in the Phase 1 test data set (which made forecasting less important) were present in the Phase 2 test set. During this phase, only minimal feedback was provided to the participants about the quality of their submissions in the form of whether the solution was valid, and whether it was better, equal, or worse than the sample submission.
Phase 2 of the competition determined the winners and prizes awarded. The majority of prizes (USD $18k from a total of USD $20k) were awarded based on optimal energy cost. Teams could choose any methodology for optimization. This included the freedom not to perform forecasting if this was deemed unnecessary. However, a separate forecasting prize of USD $2k was awarded to the best forecasting solution, to encourage the participants to consider forecasting as part of their solution, and to promote a competition that integrates forecasting and optimization.
The competition was set up in line with best practice from the research literature and competition platforms such as Kaggle [28]. In particular, Athanasopoulos and Hyndman [29] argue that feedback in competitions leads to better outcomes, which is why Phase 1 of the competition presented results transparently. This enabled participants to gain a deep understanding of the problem, and also gave the organizers an opportunity to identify and address problems in the competition setup. The independent test set and minimal feedback in Phase 2 ensured that participants had no means to overfit their energy forecasts, but still had a mechanism to ensure their solution was valid.
Unlike many competitions where a single solution determines its ranking, we assembled a scientific committee consisting of 8 scholars who were asked to rank the submissions according to certain criteria (see Section II-E), based on a 4-page report of the methodology submitted by each of the shortlisted teams. This additional score was then combined with the optimization scores to determine a final score. The aim of this exercise was to ensure the scientific rigor and benefit to the research community of the winning solutions by promoting those with more general applicability in practice, over those that were very tailored to the competition data and the evaluation metrics.
Once winners were determined and prizes awarded, we released the final test set of November 2020, so that the solutions of participants who published their code could be reproduced.
II-A Data description
The following energy consumption, solar production and weather data was made available to participants from the competition web page [27].
• Energy consumption data recorded at 15-minute intervals was obtained from 6 buildings on the Monash Clayton campus over varying time periods, up to September 2020 (for Phase 1) and October 2020 (for Phase 2). Time series of about 5 years, commencing in 2016, were obtained from Buildings 0 and 3, whereas shorter time series of about one year were obtained from the other buildings. The dataset does not contain a building numbered 2, as the data for this building was scarce and it was excluded before the competition started. Figure 2 shows an excerpt of these time series in TSF file format [30] (which is based on Weka’s ARFF file format [31]), including the commencement date of each series; missing data is indicated by “?”. Figure 3 shows the series over their full duration graphically.
• Solar production data from 6 rooftop solar installations on the Monash Clayton campus was recorded at 15-minute intervals over approximately one year until September/October 2020 (for Phases 1 and 2 of the competition, respectively). These data are also shown in Figures 2 and 3. One participant noted that the data for Solar 1 appear to be cumulative over some parts of the series.
• Weather data (ERA5) was generously provided by Oikolab [32] midway through Phase 1 of the competition. It contains hourly measurements of temperature (°C), dewpoint temperature (°C), wind speed (m/s), mean sea level pressure (Pa), relative humidity (0–1), surface solar radiation (W/m²), surface thermal radiation (W/m²), and total cloud cover (0–1), from 2010 to 2021.

Participants were also encouraged to use the following data from external sources:
• Weather data (BOM) from the Australian Bureau of Meteorology (BOM) included the daily minimum temperature (°C), maximum temperature (°C), rainfall (mm) and solar exposure (MJ/m²) at three weather stations near the Monash Clayton campus: Olympic Park, Moorabbin Airport and Oakleigh (Metropolitan Golf Club) [33]. Each data series commenced on the 1st of January 2016 and concluded at the date of download by the participant.
• Electricity price data from the Australian Energy Market Operator (AEMO) consisted of half-hourly electricity price and demand data at the state level [34]. For Phase 1 of the competition, the relevant data was that of Victoria during October 2020, available from [35]. Though the intended use of this data source was the price data, some participants also used the demand data, which they found helpful. The competition policy required permission from the organizers for any external datasets; however, as the demand data was (unintentionally) provided by the organizers, this was a grey area: teams using the dataset did not request permission, and in consequence not all teams were aware of the demand data, and that it could potentially be used.

II-B Forecasting
Forecasting was optional in the competition, but encouraged through a small prize for the most accurate forecast. During Phase 1 of the competition, participants were expected to predict the power demand of the 6 buildings, and the power production of the 6 arrays of solar panels, over 15-minute intervals for each day in October 2020. This amounted to 2976 15-minute intervals (96 per day × 31 days) for each of the 12 series. At the end of Phase 1, the actual energy demand of each building and power production of each array of solar panels was released. For Phase 2, participants were then expected to provide the 15-minute energy demand/production forecasts for the same buildings and solar panels over the 30 days of November 2020.
II-C Optimization
The optimization problem was to timetable a set of activities over the coming month, across the set of six buildings, with the objective of minimizing the total electricity cost. Power was provided at no cost by the 6 arrays of solar panels as well as being available at market price from the energy grid. Batteries were available, enabling the storage of excess solar energy for later use, or charging from the grid during periods of low cost.
Each building was assumed to have a constant base load that could be determined from the data provided. For each building $b$, the net power draw (in kW) at time $t$ is denoted by $l^b_t$. The buildings collectively placed a cumulative load of $L_t = \sum_b l^b_t$ on the energy supply (solar, batteries and grid). It was assumed that the cumulative load must always be positive, that is, there was no feed-in; thus $L_t \geq 0$. It was also assumed that the university was exposed to a peak load tariff, meaning that it must pay an energy cost based on $\max_t L_t$ over the whole month.
A set of activities to schedule over the month was given. These activities included recurring activities (such as lectures scheduled at the same time each week), and once-off activities (for example, experiments). Some activities had precedence constraints, where $a \prec b$ meant that activity $a$ must happen on a weekday before activity $b$. In the case of recurring activities, these precedence constraints applied to each week (that is, lecture before lab), whereas for once-off activities, the precedence constraint applied only to the single instance. Each activity $a$ had its own power draw $w_a$ and duration $d_a$, resulting in a ‘rectangular’ load. Scheduling an activity meant associating that activity to a room in one of the buildings at a given time slot. The activity’s load requirement would then contribute to that of the assigned building. Rooms were treated as a finite resource that could not be double-booked. All instances of a recurring activity had to be scheduled within normal business hours (commencing on or after 9:00 and finishing before 17:00), whereas once-off activities could be removed from the schedule. Each “deferrable” once-off activity attracted a discount for performing the activity during office hours. Deferrable activities could also be scheduled after hours, which incurred an additional cost.
Participants were given 5 large, and 5 small scenarios to schedule. Each scenario consisted of a number of activities to be scheduled over the month, any precedence requirements, and whether they were one-off or recurring. Each building was specified by the number of large and small rooms available. Solar production attributable to each building varied between scenarios by assigning one of the six solar time series (see Figure 3) to the building in each scenario. Battery storage remained constant between scenarios. Activities were specified by the number of rooms required, whether these rooms were large or small, the duration of the activity, energy load of that activity, and whether it was recurrent or one-off. Furthermore certain pairs of activities were also assigned precedence.
Conceptually, the activity scheduling problem thus defined is an instance of the Resource-Constrained Project Scheduling Problem (RCPSP) with time windows [36]. Here the activities are project tasks, and the rooms and electricity use are the resource limits. Because recurring activities with precedences must be scheduled on different days, the problem also exhibits minimum time lags, as well as maximum lags due to the limited time window of ‘daytime on weekdays’ during which all recurring activities must be scheduled. Bartusch et al. [36] proved that even just testing whether such a problem has a feasible solution is an NP-hard problem, meaning that no efficient algorithm exists to solve it optimally (assuming the widely held expectation that P $\neq$ NP holds).
Compared to the typical RCPSP with time windows, the problem also has a number of additional considerations: access to a battery means that some of the resource limits can be altered, and the once-off activities are optional, while in RCPSP all activities must be scheduled. Furthermore, our objective is not the ‘shortest makespan’ schedule, but the minimum cost of energy imports. The objective consists of three parts: 1) the wholesale energy cost of all energy imported, 2) a peak load demand charge, and 3) the additional profit of scheduled once-off activities. Given the cumulative power draw $L_t$ (in kW) at time $t$, and the wholesale energy price $p_t$ in \$/MWh (a time series which was provided to the participants), we convert power usage to energy consumption by assuming constant load during each 15-minute time step. The demand charge applies a fixed rate of $5 \times 10^{-3}$ \$/kW² to the squared monthly peak, independent of when the peak load occurs during the month. Finally, the value $v_a$ of every scheduled ($x_a = 1$) once-off activity is earned, minus a penalty $q_a$ if it is scheduled out of office hours ($y_a = 1$), thus:

$$\min \;\; \underbrace{\sum_t \frac{0.25\,L_t}{1000}\, p_t}_{\text{Energy cost}} \;+\; \underbrace{0.005 \Big(\max_t L_t\Big)^2}_{\text{Demand charge}} \;-\; \underbrace{\sum_a x_a \big(v_a - y_a\, q_a\big)}_{\text{Once-off profit}}$$
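For concreteness, the following minimal Python sketch (our naming; the unit conversions follow the definitions above) evaluates this objective for a candidate schedule, given its resulting 15-minute load profile:

```python
import numpy as np

def schedule_cost(load_kw, price_mwh, once_off_values, out_of_hours_penalties):
    """Evaluate the three-part competition objective for a candidate schedule.

    load_kw:                cumulative load L_t per 15-minute step (kW)
    price_mwh:              wholesale price p_t per step ($/MWh)
    once_off_values:        value v_a of each scheduled once-off activity ($)
    out_of_hours_penalties: penalty q_a per scheduled activity ($; 0 if in office hours)
    """
    load_kw = np.asarray(load_kw)
    # Energy cost: constant load over each 15-minute step, so kW -> MWh is 0.25/1000.
    energy_cost = np.sum(load_kw * 0.25 / 1000.0 * np.asarray(price_mwh))
    demand_charge = 0.005 * np.max(load_kw) ** 2          # quadratic peak charge
    once_off_profit = sum(once_off_values) - sum(out_of_hours_penalties)
    return energy_cost + demand_charge - once_off_profit
```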
Despite the worst-case hardness of the RCPSP, it is known that randomly generated instances may exhibit shallow hardness characteristics. Vanhoucke et al. [37] propose six topological indicators of precedence graph connectedness, and perform a regression analysis on instance hardness in terms of branch-and-bound search tree depth as a function of these indicators. We used the insights from this work to construct an instance generation algorithm tuned to generate instances which sit in the hardness sweet spot between over- and under-constrained. This meant having instances with relatively few precedences, to ensure that the instance is close to parallel, and having a mix between long chains of activities and free activities.
Feasible activity schedules were created in three stages: 1) sample and schedule activities, 2) assign precedence constraints, 3) set resource limits. In the first stage, a number of activities are sampled, with durations of one to five 30-minute steps (i.e., from half an hour to two and a half hours long) and a required number of rooms, each room being small-sized with a fixed probability. Each activity is assigned a power consumption proportional to the maximum base power consumption observed in the time series, sampled as a fraction of the maximum base load. Once-off activities were given a value proportional to the average cost of the energy they require, sampled as a multiple of that average cost. Sampled activities were then assigned a day of the week and an in-office-hours time of day, constructing a tentatively valid schedule (meeting all the time-window constraints) without any precedence constraints. In the second stage, we sampled precedence constraints between scheduled activities such that they were already satisfied by the tentative schedule: each activity considers the set of all activities on previous days, and samples without replacement from this set a number of preceding activities drawn from a Binomial distribution (with different parameters for recurring and once-off activities). Thus, by construction we end up with five bins of activities: those tentatively scheduled on Monday having no precedences, and those tentatively scheduled on Friday having many, with potentially ‘long’ arcs. Finally, in the third stage, the number of rooms was determined as the maximum required by the tentative schedule of recurring activities only, meaning that once-offs have to fit into ‘gaps’ different from the tentative schedule by construction. The battery specifications were matched to be close to the actual scale and performance of the two batteries currently installed in the Monash microgrid.
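A simplified sketch of the first two generation stages is given below; the distribution parameters shown (room probability, load fractions, Binomial parameters) are illustrative placeholders, as the exact values of the competition generator are not reproduced here.

```python
import random
import numpy as np

def sample_activities(n_activities, n_recurring, max_base_load_kw,
                      small_room_prob=0.5, load_frac=(0.1, 0.5)):  # illustrative values
    """Stage 1: sample activities and a tentative in-office-hours schedule."""
    activities = []
    for i in range(n_activities):
        dur = random.randint(1, 5)                    # 30-minute steps: 0.5h to 2.5h
        activities.append({
            "id": i,
            "recurring": i < n_recurring,
            "duration": dur,
            "small_room": random.random() < small_room_prob,
            "load_kw": max_base_load_kw * random.uniform(*load_frac),
            "day": random.randrange(5),               # Monday..Friday
            "start": random.randrange(18, 35 - dur),  # 9:00 (slot 18) to finish by 17:00
        })
    return activities

def sample_precedences(activities, n=3, p=0.1):       # illustrative Binomial parameters
    """Stage 2: sample precedence arcs already satisfied by the tentative schedule."""
    arcs = []
    for act in activities:
        earlier = [b for b in activities if b["day"] < act["day"]]
        k = min(np.random.binomial(n, p), len(earlier))
        arcs.extend((b["id"], act["id"]) for b in random.sample(earlier, k))
    return arcs
```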
We generated two sizes of instances. Small instances had 50 recurring and 20 once-off activities, which is about middle of the road for the now easily solvable psplib set of benchmark instances [38]. Large instances had 200 recurring and 100 once-off activities, nearly three times the largest psplib instance and unlikely to be solvable to optimality using ‘brute force’. For each of the two phases we generated 5 small and 5 large instances, regenerating them to ensure that the competitors were required to solve the Phase 2 instances from scratch (i.e., without opportunity to use warm starts or learned statistics about the Phase 1 instances).
II-D Evaluation of forecast accuracy and total energy cost
II-D1 Evaluation of forecasts
The forecasts of the 12 time series (energy demand of 6 buildings and power production from the 6 arrays of solar panels) were evaluated using Mean Absolute Scaled Error (MASE) [39], defined as
$$\text{MASE} = \frac{\frac{1}{h} \sum_{t=n+1}^{n+h} |y_t - \hat{y}_t|}{\frac{1}{n-S} \sum_{t=S+1}^{n} |y_t - y_{t-S}|},$$

where $n$ is the number of instances in the training series, $S$ is the length of the seasonal cycle of the dataset, $h$ is the forecast horizon, $\hat{y}_t$ are the forecasts and $y_t$ are the actual values. MASE was calculated individually for each time series and averaged for the final error used to rank submissions.
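For concreteness, a direct Python translation of this measure could look as follows (our naming; a season length of 96 corresponds to daily seasonality at 15-minute resolution):

```python
import numpy as np

def mase(y_train, y_actual, y_forecast, season=96):
    """MASE: out-of-sample forecast MAE scaled by the in-sample seasonal-naive MAE."""
    forecast_mae = np.mean(np.abs(np.asarray(y_actual) - np.asarray(y_forecast)))
    y_train = np.asarray(y_train)
    naive_mae = np.mean(np.abs(y_train[season:] - y_train[:-season]))
    return forecast_mae / naive_mae

def overall_error(series_triples):
    """Final competition error: average per-series MASE.
    series_triples: iterable of (train, actual, forecast) arrays, one per series."""
    return np.mean([mase(tr, act, fc) for tr, act, fc in series_triples])
```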
II-D2 Evaluation of optimal schedule and total energy cost
Schedules were first checked for feasibility, after which the energy cost was computed for feasible schedules.
Feasibility
Schedules were required to assign a time period to every recurring activity while adhering to the following constraints:

• The starting period must be during the week containing the first Monday of the month,

• Start time at or after 9:00,

• Finish time (start time plus activity duration) before 17:00,

• Activity precedence had to be observed.
Every battery schedule had to respect the capacity of the battery, such that the State-of-Charge (SoC) of the battery stays within $[0, \text{capacity}]$ for all time periods $t$.
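A minimal validity check for the battery part of a submission might look as follows (a simplified sketch: the exact efficiency handling and battery parameters follow the competition specification, which is not reproduced in full here):

```python
def battery_feasible(power_kw, capacity_kwh, efficiency=0.9, dt_h=0.25):
    """Check that a charge(+)/discharge(-) schedule keeps the SoC in [0, capacity].
    The efficiency handling here is a simplifying assumption for illustration."""
    soc = 0.0
    for p in power_kw:
        if p >= 0:                       # charging draws p kW for dt_h hours
            soc += p * dt_h * efficiency
        else:                            # discharging delivers |p| kW for dt_h hours
            soc += p * dt_h / efficiency
        if not (0.0 <= soc <= capacity_kwh):
            return False
    return True
```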
Objective
For a feasible schedule, we compute the objective value in terms of the cost of the schedule, which is to be minimized, using the objective function given in the previous section.
II-E Evaluation by the scientific committee and calculation of final scores
We asked the 8 members of the scientific committee (SC) to rank the solutions using a form inspired by peer review forms from machine learning conferences. There were free-text criteria, such as listing 3 advantages and 3 disadvantages of the solution, and commenting on the robustness of the optimization model, the potential generalizability of the approach, and potential overfitting in the approach. The SC was also asked to rate the solutions on a scale of excellent/good/ok/poor for each of: scientific contribution, soundness, clarity, and reproducibility. Finally, we asked the jurors to give an overall evaluation of the submission on a scale of excellent/very good/good/acceptable/ok/poor. These scales were translated to numerical values using a simple linear scale to produce a numerical score for each participant, which was averaged over the SC members and ranked. The final ranking of participants was calculated as a weighted sum of the energy optimization ranking and the SC ranking.
As well as submission of the 4-page report for the SC evaluation, participants were required to submit their source code for verification by the organizers that the code produced the reported solution. Participants were also required to present their solution at a special session of the 2021 IEEE Symposium Series on Computational Intelligence for further questions and checking by the panel and audience. All shortlisted teams passed these hurdles without any problems.
III Solutions submitted, final rankings, and summary of results
In this section, we give an overview of all submissions, the leaderboard timeline, an overview of the best-performing solutions, and a more detailed evaluation of the shortlisted solutions by the scientific committee.
III-A Submitted solutions
In total, 49 individuals/teams participated in either Phase 1 or Phase 2 of the competition: 36 submitted to Phase 1, 36 (not identical) submitted to Phase 2, and 23 submitted to both phases. Many participants submitted multiple times for evaluation, with 522 submissions during Phase 1 and 220 during Phase 2.
As it was not required to have submitted to Phase 1 in order to submit to Phase 2, several new teams entered the competition for Phase 2 only, while a number of teams that were not competitive in Phase 1 dropped out before Phase 2. Due to the challenging nature of the optimization problem, approximately 50% of teams attempted only the forecasting task. Because the Phase 2 leaderboard gave only minimal feedback, participants made fewer submissions during Phase 2 than during Phase 1, where submissions could be used for guidance in the development of solutions. Table I shows the development of the leaderboard over time for Phases 1 and 2, and Table II shows the top positions of the leaderboards at the conclusion of each phase.
The relationship between forecasting and optimization performance, for solutions in Phase 2 that outperformed the organizer-supplied baseline MASE and energy cost, is shown in Figure 4. This analysis has to be taken with caution, as participants were not required to submit the forecast actually used during optimization, so the reported forecast may not be the one that was used. Furthermore, the linear fits shown as lines in the plot do not represent the data well and serve only as overall guidance. However, it is immediately apparent that there is very little correlation between solar forecast accuracy and energy cost. This is because solar power generation is approximately an order of magnitude smaller than the building load, meaning that solar forecast errors have only a small effect on total energy costs. Forecasting building energy demand was much more important to total cost, hence the higher correlation between the two compared to solar. The figure also shows higher correlations for MAE than for MASE, since MAE more closely resembles the optimization objective by retaining the scale of the series.
| Date | Best Cost | Best MASE |
|---|---|---|
| Phase 1 | | |
| 19/07/2021 | 453,317 | 1.1365 |
| 16/08/2021 | 453,317 | 0.8776 |
| 30/08/2021 | 445,218 | 0.8776 |
| 13/09/2021 | 444,858 | 0.8106 |
| 27/09/2021 | 444,858 | 0.6625 |
| 11/10/2021 | 439,071 | 0.6320 |
| Phase 2 | | |
| 14/10/2021 | 339,160 | 0.8030 |
| 18/10/2021 | 339,160 | 0.8030 |
| 23/10/2021 | 337,625 | 0.6927 |
| 27/10/2021 | 337,625 | 0.6927 |
| 30/10/2021 | 329,441 | 0.6927 |
| 02/11/2021 | 328,359 | 0.6460 |
| Team | MASE | Energy cost ($) |
|---|---|---|
| Phase 1 | | |
| MA&RE | 0.982255 | 439,071 |
| HRI | 0.658880 | 439,936 |
| RB | 0.632086 | 446,416 |
| FRESNO | 0.777158 | 482,870 |
| AS | 0.695587 | 483,643 |
| QSZU-PolyU | – | 485,733 |
| EVERGi | – | 710,227 |
| Phase 2 | | |
| MA&RE | 0.744052 | 328,359 |
| RB | 0.646022 | 335,107 |
| HRI | 0.855737 | 339,160 |
| EVERGi | 0.807299 | 340,726 |
| QSZU-PolyU | 0.774996 | 342,810 |
| FRESNO | 1.870326 | 357,210 |
| AS | 0.847391 | 363,168 |
III-B Overview of best-performing solutions
Optimization methodology

| Team | EC ($) | Algorithm | Software | Comments |
|---|---|---|---|---|
| MA&RE | 328,359 | MIP/MIQP | Gurobi | Sample Average Approximation Method (SAAM) is employed, in which the optimization model minimizes the average cost of a solution over multiple scenarios |
| RB | 335,107 | MIP/MIQP | Gurobi | Two-staged process |
| HRI | 339,160 | MIP/MIQP | Gurobi | Split into three sub-problems, use linearization technique |
| EVERGi | 340,726 | CMA-ES or GA with subsequent LS for the schedule; MIP for batteries | Gurobi for MIP, pygmo for CMA-ES, PyGAD for GA | Evolutionary algorithms for activity scheduling, MIP for battery scheduling |
| QSZU-PolyU | 342,810 | LS (Local Search) | – | Custom method to generate feasible solutions, which are then randomly modified |
| FRESNO | 357,210 | MIP | Gurobi | Linearization, did not schedule once-off activities |
| AS | 363,168 | MIP | Gurobi | Large Neighborhood Search coupled with scenario-based robust optimization, fix-and-optimize approach |
Building demand forecasting methodology

| Team | MASE | MAE | RMSE | Algorithm/Software | Input features | Comments |
|---|---|---|---|---|---|---|
| MA&RE | 0.841 | 18.294 | 27.265 | Ensemble of LightGBM | Calendar features, daily/hourly weather data, rolling statistics | Ensemble over models that use daily, weekly, and daily & weekly weather features |
| RB | 0.807 | 17.441 | 25.263 | Quantile regression forest from R package “ranger” | Calendar features, Fourier terms, BOM data, ERA5 data, lagging and leading features | Groups of buildings trained together as they were observed to be closely correlated over time |
| HRI | 1.089 | 21.522 | 31.029 | Seasonal median forecast over last 8 weeks | No external inputs | Eight weeks of historical data as input |
| EVERGi | 0.959 | 18.790 | 26.594 | LightGBM | Calendar features, weather data, occupancy rates, lags, seasonality and trend features | Log transform as preprocessing, Prophet for feature engineering; Building 4 is treated as a multi-class classification |
| FRESNO | 0.921 | 20.608 | 28.840 | STL decomposition, then ARIMA, RF, LightGBM, and SVM | Calendar features, occupancy, hourly weather data | 2 months of historical data as input |
| QSZU-PolyU | 0.835 | 16.460 | 22.751 | Different ML models, including neural networks | Calendar features (hour, minute, weekday), total energy demand of Victoria | Models trained across buildings, preprocessing different for each building |
| AS | 0.945 | 21.164 | 29.965 | Random Forest, Quantile Regression Forest | Weather data, calendar effects, impact of COVID-19 restrictions, exam periods, and others | – |

Solar production forecasting methodology

| Team | MASE | MAE | RMSE | Algorithm/Software | Input features | Comments |
|---|---|---|---|---|---|---|
| MA&RE | 0.647 | 1.312 | 2.309 | Ensemble of LightGBM | Calendar features, daily weather data, hourly weather data, various rolling statistics | Ensemble over models that use daily, weekly, and daily & weekly weather features |
| RB | 0.485 | 0.950 | 1.855 | Quantile regression forest | Weather data, leading and lagging features | All of the solar instances were trained together |
| HRI | 0.623 | 1.279 | 2.397 | Random forest from scikit-learn | Weather data, leading and lagging features | – |
| EVERGi | 0.656 | 1.364 | 2.412 | LightGBM | Calendar features, weather data, mean value at similar times | – |
| FRESNO | 2.820 | 5.639 | 9.249 | ResNet, Refined Motif (RM) | Solar generation data, weather, date time | Team submission was erroneous, error would be lower |
| QSZU-PolyU | 0.715 | 1.441 | 2.555 | Ensemble of various types of neural networks, SVR, Prophet | Surface solar radiation most important feature | – |
| AS | 0.750 | 1.504 | 2.617 | Ensemble of Random Forest, Gradient Boosting Machines, Ridge Regression, and Local Learning Regression | Weather data, calendar features | – |
Tables III and IV present an overview of the optimization and forecasting methods used by the shortlisted teams. It is evident in Table III that most teams used mixed integer programming (MIP) or mixed integer quadratic programming (MIQP) with linear relaxations for the optimization. Only the teams ranked in 1st and 7th place considered forecast uncertainty, by predicting scenarios and employing stochastic or robust optimization; the other teams relied on point forecasts and deterministic optimization. Regarding software, most teams used Gurobi for optimization via a Python interface. Notable exceptions were the EVERGi team, who used evolutionary algorithms, in particular CMA-ES, combined with a subsequent local search for the activity schedule, and team QSZU-PolyU, who used a simple heuristic approach for the scheduling that was then further optimized with a local search; the latter provides an excellent benchmark from which to assess possible gains obtained by more complex methodologies.
Table IV shows that most of the top-performing teams used tree-based algorithms, namely LightGBM and Random Forest (RF), to forecast building energy demand. A notable exception was the team from the Honda Research Institute (HRI, consisting of Steffen Limmer and Nils Einecke), who were able to achieve good results with a very simple technique: a seasonal median of demand over the previous 8 weeks. Several teams observed that Building 4 had very low demand, which led the EVERGi team to model this building as a multi-class classification problem. Other teams employed simple techniques, e.g., Richard Bean (RB) used a median forecast for this building. There is a considerable number of missing values in the data, and we assume that this is a reason why tree-based methods performed well. Most teams used the weather data provided (daily and hourly), together with calendar features and/or Fourier terms. Weather data was used with both lagging and leading features, since the competition assumed the availability of a perfect weather forecast. Some teams used other features, such as the total energy demand for the state of Victoria, and occupancy rates as estimated from COVID-19 restriction information and the academic calendar (e.g., exam periods).
Tree-based algorithms such as LightGBM and Random Forests were employed by most of the top solutions for solar forecasting. In particular, the two best solutions in terms of forecasting accuracy are based on these. Other approaches were neural networks such as ResNet (FRESNO team), and ensembles that included support vector regression (SVR), Prophet, Ridge Regression, and other algorithms. The features used by the participants were again lagging and leading weather features (solar irradiation in particular), and calendar features.
III-C Evaluation of results by the scientific committee
Figure 5 shows the overall evaluation of the shortlisted teams by the scientific committee (SC). The average score in each subcategory for these teams is shown in Table V. Results show that, generally speaking, the highest-ranked submissions were those that gave the best solutions in terms of energy cost. However, when the SC evaluations were incorporated with the energy cost rankings, the 5th and 6th ranked teams swapped places, and the 3rd and 4th ranked teams were ranked equally at 3rd place. Further details of the evaluation of each team by the scientific committee are given in the appendix of the paper, in the supplementary material.
| | MA&RE | RB | HRI | EVERGi | QSZU-PolyU | FRESNO | AS |
|---|---|---|---|---|---|---|---|
| Sc. Contrib. | 2.12 | 2.12 | 2.62 | 2.50 | 2.14 | 2.14 | 2.29 |
| Soundness | 1.50 | 1.75 | 2.00 | 1.75 | 2.14 | 2.00 | 2.00 |
| Clarity | 1.62 | 2.25 | 2.00 | 1.88 | 2.43 | 1.86 | 1.71 |
| Reprod. | 2.12 | 2.50 | 2.12 | 2.12 | 2.29 | 2.29 | 2.29 |
| Overall | 2.12 | 3.00 | 3.12 | 2.75 | 3.29 | 3.00 | 3.14 |

IV Descriptions of best-performing solutions
In the following, we present summaries of the 7 shortlisted solutions, in the order of their final score, i.e., the winning solution is presented first. For more details, we refer to the appendix of the paper, in the supplementary material.
IV-A Mahdi Abolghasemi and Rasul Esmaeilbeigi’s solution (MA&RE)
This solution ranked 1st in the optimization and 2nd in the forecasting challenge of the competition. An extensive exploratory data analysis was conducted to look at trends, seasonality, and intermittency patterns of the data. The impact of COVID-19 on the buildings’ power demand was explored, since part of the provided data and the forecasting horizon fell during the pandemic. Since both solar power and building demand are highly dependent on weather conditions, the hourly weather data provided by the organizers of the competition and daily data downloaded from the BOM website [33] were used. Various statistical and machine learning models, including seasonal ARIMA, RF, LightGBM, and SVR, were explored for building the predictive models. While the forecasts generated with RF and SVR in particular were fairly accurate and competitive with LightGBM, LightGBM was chosen since it is significantly faster and returns reliable forecasts. The forecasting models were all trained with LightGBM, using calendar features, daily weather data, hourly weather data and various rolling statistics of these features as input variables. Hyperparameters were optimized and the most significant features for each model were selected. Several forecasts were generated with different models to provide a larger pool of scenarios for the optimization part of the competition. The final submitted forecasts were an ensemble of two LightGBM models for each series, using daily and hourly features, with optimized hyperparameters.
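As a rough illustration of this type of model (not the team’s exact feature set or hyperparameters), a single LightGBM demand model on calendar and weather features could be set up as follows, assuming a DataFrame `df` indexed by timestamp with `temperature` and `demand` columns:

```python
import lightgbm as lgb

def fit_demand_model(df):
    """Sketch of one LightGBM demand model; features are illustrative."""
    df = df.copy()
    df["hour"] = df.index.hour               # calendar features
    df["dayofweek"] = df.index.dayofweek
    df["month"] = df.index.month
    df["temp_roll_day"] = df["temperature"].rolling(96).mean()  # rolling statistic
    features = ["hour", "dayofweek", "month", "temperature", "temp_roll_day"]
    data = df[features + ["demand"]].dropna()
    model = lgb.LGBMRegressor(n_estimators=500, learning_rate=0.05)
    model.fit(data[features], data["demand"])
    return model, features
```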
The objective function of the optimization part minimizes the total energy cost, which includes the square of the maximum load, i.e., a quadratic term. A linearization technique was used to linearize this objective function and develop a mixed integer linear program that captures all constraints of the problem. One of the input parameters of the optimization model is the net base load, which is the difference between the total predicted base load of the buildings and the generation of their solar panels, per time slot. The proposed optimization approach does not rely on a single forecast. Decision making under uncertainty plays a crucial role in managing energy systems, and uncertainty should be addressed properly, since these systems are highly reliant on predetermined energy prices and policies as well as predictable energy loads and demands [40]. In this solution, the net base load of each time slot is considered a random variable in the optimization model, and the so-called Sample Average Approximation Method (SAAM) is employed, in which the optimization model minimizes the average cost of a solution over multiple scenarios (predictive outcomes) rather than just one. The final submission employs 6 forecasting scenarios. This approach generally prescribes a solution with the least expected cost that is also less sensitive to forecasting errors. See Esmaeilbeigi et al. [41] for more details of SAAM. The code is publicly available on GitHub, for forecasting (https://github.com/mahdiabolghasemi/IEEE-predict_optimise_technical_challenge) and optimization (https://github.com/resmaeilbeigi/IEEE_CIS_3rd_Technical_Challenge_Optimiser). For more details of this solution see Appendix A in the supplementary material.
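The core idea of SAAM, minimizing the average objective over several forecast scenarios with one shared set of decisions, can be sketched for a battery-only simplification of the problem (illustrative sizes, lossless battery, no activity scheduling; gurobipy is used, as by most teams):

```python
import gurobipy as gp
from gurobipy import GRB

def saam_battery_schedule(scenarios, prices, capacity_kwh=900.0, max_kw=300.0):
    """One battery schedule, evaluated against all net-load scenarios."""
    T = len(prices)
    m = gp.Model("saam")
    charge = m.addVars(T, lb=-max_kw, ub=max_kw, name="charge")  # +charge/-discharge (kW)
    soc = m.addVars(T, lb=0.0, ub=capacity_kwh, name="soc")      # state of charge (kWh)
    m.addConstrs((soc[t] == (soc[t - 1] if t else 0.0) + 0.25 * charge[t]
                  for t in range(T)))
    avg_cost = gp.QuadExpr()
    for i, load in enumerate(scenarios):                         # load[t]: net load (kW)
        peak = m.addVar(name=f"peak_{i}")
        for t in range(T):
            m.addConstr(load[t] + charge[t] >= 0.0)              # no feed-in
            m.addConstr(peak >= load[t] + charge[t])
            avg_cost += (0.25 * prices[t] / 1000.0) * (load[t] + charge[t])
        avg_cost += 0.005 * peak * peak                          # quadratic peak charge
    m.setObjective(avg_cost * (1.0 / len(scenarios)), GRB.MINIMIZE)
    m.optimize()
    return [charge[t].X for t in range(T)]
```

In the competition problem the shared decisions would also include the activity schedule; here only the battery dispatch is shared across scenarios.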
IV-B Richard Bean’s solution (RB)
Energy consumption by buildings and solar energy production were forecast using random forests. Inputs to these models consisted of historical building use and weather data provided by the BOM and the European Centre for Medium-Range Weather Forecasts (ECMWF). Forecast accuracy was improved by thresholding energy consumption for buildings where values appeared to be outliers. The forecast window was varied to determine the window length that maximized forecast accuracy. Manual feature selection was used to identify the most important features for each type of forecast and reduce the likelihood of over-fitting. Optimization was performed using a MIP solver to minimize the peak load due to recurring activities. One-off activities were then incorporated into the schedule, and a MIQP was used to shift activities to minimize total cost, using a no-forced-discharge battery management policy. The source code of this solution is available online (https://github.com/RichardBean/IEEE-Predict-Optimize-Challenge). For more details of this solution see Bean [42], and for the evaluation of the approach by the scientific committee see Appendix B in the supplementary material.
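The team’s forecasts used quantile regression forests, fit with the R package “ranger” (see Table IV). As a rough Python stand-in, the per-tree predictions of a scikit-learn random forest can be used to obtain empirical quantiles; this is a coarser approximation than a true quantile regression forest, which uses the training targets within each leaf:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def forest_quantiles(X_train, y_train, X_test, quantiles=(0.1, 0.5, 0.9)):
    """Approximate quantile forecasts from per-tree random forest predictions."""
    rf = RandomForestRegressor(n_estimators=500, min_samples_leaf=5)
    rf.fit(X_train, y_train)
    per_tree = np.stack([tree.predict(X_test) for tree in rf.estimators_])
    return {q: np.quantile(per_tree, q, axis=0) for q in quantiles}
```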
IV-C HRI Team’s solution
For the load prediction, a simple statistical approach was used, which predicts the load at a certain time step of a week as the median over the load values at the corresponding time steps of eight weeks of historical data. The PV production was predicted with a machine learning approach based on an RF with 14 input features (mainly weather data). For the optimization, a combination of mixed integer linear programming and mixed integer quadratic programming together with the Gurobi solver was employed. In order to accelerate the optimization, different measures were applied:
1. The activities were assigned to buildings in a step performed separately from the main optimization.

2. The number of decision variables was reduced by excluding, from the problem formulation, start times of activities that are infeasible with respect to precedence constraints.

3. The settings of the parameters of the employed solver were tuned.

4. The problem was decomposed into three easier subproblems.

5. The objective function was linearized.
A more detailed description of the approach as well as a thorough evaluation of it on the data and problems of the second phase of the challenge can be found in Limmer and Einecke [43]. For the evaluation of the approach by the scientific committee see Appendix C in the supplementary material.
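The seasonal median load forecast described at the start of this subsection is simple enough to sketch in full (our implementation; 672 is the number of 15-minute slots per week, and the forecast is assumed to start immediately after the history ends):

```python
import numpy as np

def seasonal_median_forecast(history_kw, horizon, season=672, n_weeks=8):
    """Predict each 15-minute slot of the week as the median over the
    corresponding slots of the last n_weeks weeks."""
    recent = np.asarray(history_kw, dtype=float)[-n_weeks * season:]
    weekly = recent.reshape(n_weeks, season)   # row k = week k, slots aligned
    profile = np.median(weekly, axis=0)
    # Because 'recent' spans a whole number of weeks, the first forecast
    # step shares its weekly slot with recent[0].
    return np.tile(profile, horizon // season + 1)[:horizon]
```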
IV-D EVERGi Team’s solution
The optimization methodology consisted of the following elements: Multi-dimensional time series forecasting using LightGBM [44] was performed to predict both future energy production and consumption from historical data. Features used for modelling included seasonality and trend, weather data, calendar features, time of day, academic calendar, and proportional occupancy of buildings. A log transform was used to reduce variability in consumption. The optimal schedule of room usage and battery storage was created by first creating a base schedule of recurrent and one-off activities that satisfied precedence and room availability constraints. This was performed using both a Genetic Algorithm and the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) [45] independently. The base schedule was then improved by modifying the times of activities one by one when this would reduce costs. Recurrent activities were evaluated first, followed by one-off activities, in order of precedence for each group. The optimal battery schedule was determined using MIP implemented in the Gurobi solver. The final submission contained an error such that only one battery was used instead of two; after fixing this error, the method would have obtained the second-lowest cost in the competition. The source code of this solution is available online (https://github.com/ujohn33/EVERGI_predict_optimize). A more detailed description of the approach can be found in Ruddick et al. [46]. For the evaluation of the approach by the scientific committee see Appendix D in the supplementary material.
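The one-by-one improvement step can be sketched as follows (hypothetical data structures; `cost_fn` and `candidate_starts` stand in for the team’s cost evaluation and feasibility-aware move generation):

```python
def improve_schedule(schedule, cost_fn, candidate_starts):
    """Move one activity at a time to the candidate start that lowers total
    cost, keeping a move only if it helps (feasibility checks omitted)."""
    best_cost = cost_fn(schedule)
    for activity in sorted(schedule, key=lambda a: a["precedence_rank"]):
        best_start = activity["start"]
        for start in candidate_starts(activity, schedule):
            activity["start"] = start
            cost = cost_fn(schedule)
            if cost < best_cost:
                best_cost, best_start = cost, start
        activity["start"] = best_start   # keep the best start found
    return schedule, best_cost
```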
IV-E FRESNO Team’s solution
As the building and solar patterns were completely different, two different sets of forecasting models were developed. Research on forecasting techniques shows that ensemble methods outperform individual ones in many cases. Therefore, we used the voting regressor from the Python sklearn package to forecast the buildings’ load. This regression model fits several estimators on the same dataset and then averages their predictions. It was found that tree-based methods like RF and gradient-boosted trees gave the highest accuracy for this dataset. Also, to capture the cyclic and seasonal variations of the buildings’ load, STL decomposition was incorporated with the above methods to improve prediction accuracy.
Solar generation, by its seasonal nature, tends to have repeating patterns. As a result, it can be useful to extract the most-repeated pattern from the solar time series data and account for variances from this baseline using exogenous variables such as weather data. This repeating pattern is discovered by a Refined Motif (RM) method developed by the competition participants in Yuan et al. [47]. The discovered repeating patterns, along with other exogenous variables, were fed to a 1D convolutional neural network (1D-CNN) during Phase 1 to make predictions. Over-parameterization of CNNs can yield better performance, but training is costly in terms of computation time [48]. Thus, we implemented Residual Networks (ResNet) as an option for a neural network that is deep but also has comparably low computational cost. The performance of the ResNet model was generally better compared with the 1D-CNN. Note that the submission of the solar forecasts for Phase 2 was erroneous; a corrected calculation for Phase 2 should give comparable MASE values to Phase 1 (https://gitlab.com/ryuan/ieee-cis-data-challenge-fresno/-/blob/main/Solar_prediction.ipynb).
For the optimization part of the competition, binary variables are necessary to capture the constraints of the scheduling problem, for example whether a particular task is active in a given interval. Thus, MIP was used to model this problem. From the problem description, the following challenges were identified:
1. Scheduling for one month at 15-minute granularity means vectors of size 2880. Hence, adding more activities leads to exponentially increasing complexity.

2. The peak power cost involves a square term, making the problem a MIQP.

3. For best economic benefit, it was necessary to schedule all activities within working hours. This also contributes to the peak power term, which is a sizeable portion of the energy cost.
These challenges heavily influenced the tractability of the problem. We also found that the maximum value obtainable by scheduling non-recurring activities was approximately $16,000, which may not be worth the extra cost and computation required to schedule these tasks. Accounting for this, the following two steps were used to simplify the problem:
1. Only recurring activities were modelled in the problem.

2. The problem was converted to a mixed integer linear program by setting a limit on the peak power term over the month and removing it from the objective.
The methodology was hence divided into four parts: data pre-processing, building load forecasts, solar generation forecasts, and the optimal scheduling problem. The code of this solution is available online (https://gitlab.com/ryuan/ieee-cis-data-challenge-fresno). For more details see Appendix E in the supplementary material and Kumar et al. [49].
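Step 2 of this simplification can be sketched as follows: with the peak fixed as a cap, the model becomes a linear MIP minimizing energy cost only, and a few candidate caps can be solved and compared (a simplified sketch using gurobipy; room and precedence constraints are omitted):

```python
import gurobipy as gp
from gurobipy import GRB

def cost_with_peak_cap(base_load, prices, act_load, act_dur, peak_cap_kw):
    """Minimize energy cost subject to a fixed peak cap, then add back the
    (now constant) quadratic peak charge."""
    T, A = len(base_load), len(act_load)
    m = gp.Model("peak_cap")
    start = m.addVars(A, T, vtype=GRB.BINARY, name="start")  # activity a starts at t
    m.addConstrs((start.sum(a, "*") == 1 for a in range(A)))
    m.addConstrs((start[a, s] == 0                           # must finish within horizon
                  for a in range(A) for s in range(T - act_dur[a] + 1, T)))
    total_cost = gp.LinExpr()
    for t in range(T):
        load_t = base_load[t] + gp.quicksum(
            act_load[a] * start[a, s]
            for a in range(A)
            for s in range(max(0, t - act_dur[a] + 1), t + 1))
        m.addConstr(load_t <= peak_cap_kw)                   # peak as a hard cap
        total_cost += (0.25 * prices[t] / 1000.0) * load_t
    m.setObjective(total_cost, GRB.MINIMIZE)
    m.optimize()
    return m.ObjVal + 0.005 * peak_cap_kw ** 2
```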
IV-F QSZU-PolyU-Team’s solution
Optimization was performed by first determining an optimal timetable, after which battery use was scheduled. Since the activities to be timetabled had precedence relationships, a feasible set of activities that could be performed each day was constructed first; local search was then applied to this feasible set to determine the optimal schedule. Batteries were assumed to be in one of three states at each time slot: hold, discharge or charge, and the optimal battery state at each time slot was again determined using local search. Weather forecasts were made at 15-minute intervals from historical data. Total energy use across all buildings was found to correlate with total Victorian energy use. The approach for predicting energy use varied between buildings: consumption data for some buildings contained many missing values, so their consumption was set at constant levels, while for other buildings consumption was predicted by time of day, and by weekday or weekend. Energy production was forecast from surface solar radiation data.
The source code of this solution is available online (https://github.com/xuyaojian123/IEEE-Predict-Optimize-Challenge). A more detailed description of the approach can be found in Zhu et al. [50].
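A minimal sketch of this kind of randomized three-state battery search is given below (our naming; `cost_fn` stands in for a cost evaluation that must reject infeasible state sequences, e.g., by returning infinity on SoC violations):

```python
import random

STATES = ("hold", "charge", "discharge")

def local_search_battery(cost_fn, T, n_iters=10000):
    """Start from all-hold and randomly change one slot at a time,
    keeping only improving moves."""
    schedule = ["hold"] * T
    best = cost_fn(schedule)
    for _ in range(n_iters):
        t, new_state = random.randrange(T), random.choice(STATES)
        old_state = schedule[t]
        schedule[t] = new_state
        cost = cost_fn(schedule)
        if cost < best:
            best = cost
        else:
            schedule[t] = old_state     # revert non-improving move
    return schedule, best
```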
IV-G Akylas Stratigakos’ solution (AS)
The proposed solution was guided by several challenges that revealed themselves during the early stages of the competition. First, the limited computational resources did not allow solving the (multiple) problem instances to optimality. Second, the computational cost also hindered the ability to explore different strategies during the validation phase, e.g., how to tackle the parameter uncertainty. Lastly, as the time that could be allocated to this challenge was also limited, the decision was made to focus on the optimization component at the expense of the prediction component. Considering the above, the proposed solution adheres to the following: (i) it can be implemented on a standard machine, (ii) it provides competitive results relatively fast, and (iii) it provides hedging against large forecast errors.
To this end, the solution was based on a fix-and-optimize heuristic search to iteratively improve an initial solution of the MIP solver (a matheuristic). The problem was formulated as a large MIP, and the proposed solution combines Large Neighborhood Search with scenario-based robust optimization for handling uncertainty in the objective function. The uncertainty in problem parameters (i.e., renewable production and electricity demand) was modeled with scenarios based on marginal predictive intervals. A robust objective was then formulated to minimize the worst-case cost within the set of scenarios, thus offering protection against miscalibrated forecasting models. The solution methodology then considered the following steps. First, an adequate feasible schedule was derived considering only hard problem constraints, in this case the scheduling of recurring lecture activities. Next, the solution was improved iteratively with a fix-and-optimize heuristic search. At each iteration, the MIP solver explored a large neighborhood by fixing a subset of variables and optimizing over the remaining free variables. The process was repeated several times until a stopping criterion was met. Code for this solution is publicly available online (https://git.persee.mines-paristech.fr/akylas.stratigakos/ieee-cis-ppo). For more details of this solution see Appendix G in the supplementary material.
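The fix-and-optimize loop can be sketched generically with gurobipy (illustrative parameters; assumes the initial solve produces an incumbent):

```python
import random
import gurobipy as gp  # noqa: F401  (model and vars are built elsewhere)

def fix_and_optimize(model, binary_vars, n_iters=20, free_frac=0.2, time_limit=60):
    """Repeatedly fix a random subset of binaries at their incumbent values
    and let the MIP solver optimize the remaining free neighborhood.
    binary_vars: list of the model's binary variables."""
    model.Params.TimeLimit = time_limit
    model.optimize()                                   # initial incumbent
    for _ in range(n_iters):
        incumbent = {v: round(v.X) for v in binary_vars}
        free = set(random.sample(binary_vars, int(free_frac * len(binary_vars))))
        fixed = []
        for v, val in incumbent.items():
            if v not in free:                          # fix at incumbent value
                v.LB, v.UB = val, val
                fixed.append(v)
        model.optimize()                               # explore the neighborhood
        for v in fixed:                                # release for next iteration
            v.LB, v.UB = 0, 1
    return model
```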
V Conclusions
This work has presented the results of the “IEEE-CIS Technical Challenge on Predict+Optimize for Renewable Energy Scheduling,” which was held to establish a benchmark dataset/problem, together with the state of the art in terms of performance on it, in a highly relevant research space that is currently lacking such a standard test bed. Out of 49 participating individuals/teams, the 7 shortlisted solutions have been presented here. Most top solutions converged to similar methodologies, namely tree-based forecasting models and MIP optimization, with some notable exceptions (one team used an evolutionary algorithm for optimization, another a simple heuristic, and others used different forecasting methodologies).
In general, most of the forecasting solutions involved extensive manual data cleaning, such as outlier identification and removal, which may be somewhat indicative of the problems one would face in such a challenge in the real world, where data quality issues are common. However, this makes them less transferable to an automated real-world production system.
A central aim of the competition was to design a problem in which both forecasting and optimization are important tasks to perform well. While this objective was achieved to some degree, and all shortlisted participants but one had competitive forecasting methodologies (indicated by a MASE below 1), designing such a problem is difficult, and the interplay between forecasting and optimization in such an integrated system still has ample room for further research.
While the number of time series was, to the best of our knowledge, larger than in any similar undertaking before, it is still a relatively small number of series, and drawing very fine-grained conclusions seems inadequate. Some solutions incorporated “robustness” by taking worst-case scenarios into consideration. Given the small number of time series and the short testing period, it is difficult to know whether this paid off, as the benefits of robustness are more likely to show in the long run.
All participants but the 1st and 7th place solutions fed a single forecast into the optimizer, and thus did not consider forecast uncertainty. The winning team employed stochastic optimization to minimize the expected cost over a number of forecast scenarios.

There are discrepancies between the rankings of the methods by forecast accuracy and by cost. While the MASE is a standard measure to evaluate forecasts, the choice of this measure had certain implications for the competition: the MASE weighs all series equally, while most of the cost was concentrated in one time series (Building 3). Thus, for best performance across both tasks, participants could have produced one forecast to use in the optimization and another forecast to submit as their forecast. This illustrates the challenges in the predict+optimize space. Error measures other than the MASE would likely have led to different forecasting methodologies, but presumably to similar overall outcomes in terms of energy cost. In our work, we found only a weak correlation between overall forecast accuracy as evaluated in the competition and optimization cost; using a scale-dependent measure like the MAE, and/or focusing on the time series with the largest values, shows a higher correlation. The participants also did not find strong correlations during the competition, and one participant hypothesized that having a commercial solver such as Gurobi and access to high-performance computing facilities were more important factors. In contrast, the forecasting task could be performed on a single computer in minutes.

Another participant noted that the validation data (Phase 1) included an extremely large demand outlier, which affected the peak demand and the respective peak tariff; in turn, this mitigated the impact of the objective formulation (deterministic versus robust). Further, examining the results on the validation data (Phase 1) showed that, at least for the large instances, the peak demand tariff comprised the biggest part of the total energy cost. However, the magnitude of the load to be scheduled during Phase 2 (relating to the respective activities) was significantly smaller. If the problem instances are viewed as data points from a problem distribution to be learned [51], this could be considered a shift in the underlying distribution. Overall, the peak tariff became less important during Phase 2, which somewhat obscured the impact of forecast accuracy on total costs.
Thus, we found in the competition that increased predictive accuracy does not directly or always translate into improved optimization performance; the relationship depends, among other things, on the forecast error measure used.
In a follow-up work, Abolghasemi and Bean [52] further explored certain aspects of the competition results and generated several scenarios to investigate the association between forecasting accuracy and optimization costs. They considered the participants’ forecasts and also generated several consistent over-forecast and under-forecast (perturbed) scenarios and computed their corresponding costs. That study shows a Pearson correlation of 0.81 for the synthetic over-forecasts and under-forecasts, and 0.9 for the competition participants, suggesting a strong association between forecast accuracy and optimization costs. Furthermore, the association may change depending on the metrics in use. However, this correlation is asymmetric: it is not entirely clear what the respective impacts of over-forecasting and under-forecasting are, and any given forecast accuracy metric may not be the most appropriate metric to minimize complex optimization costs.
As such, the predict+optimize field warrants further research. One avenue might be to use better error measures for forecasting that correlate more strongly with the final objective. Other interesting avenues would be to train models to directly minimize downstream optimization costs, and to explore further strategies in the predict+optimize space.
Acknowledgements
We are grateful to the IEEE Computational Intelligence Society for organizing the technical challenge series and for sponsoring and assisting with the organization of this edition in particular. We also thank the Department of Data Science and Artificial Intelligence of Monash University for their sponsorship. We would like to thank Prof Ariel Liebman for helpful discussions, and Prof Mark Wallace for forming part of the scientific committee. OikoLab kindly provided us with ERA5 weather data. Support roles for this competition received funding from ARENA as part of ARENA’s Advancing Renewables Program. Finally, we would like to thank all other participants in the competition, in particular the members of the QSZU-PolyU-Team, namely Qingling Zhu, Minhui Dong, Wu Lin, Yaojian Xu, Junchuang Cai, Junkai Ji, Qiuzhen Lin, and Kay Chen Tan.
[Detailed description of participants’ solutions]
-A Mahdi Abolghasemi and Rasul Esmaeilbeigi’s solution
-A1 Methodology
There are many missing values in the data. At first, the missing values were replaced with the average power across the same hours and days for each time slot, but training the forecasting models on such data significantly reduced the accuracy of the forecasts. Therefore, missing values were instead removed from the data to provide better-quality data for training the models.
Solar and building power series exhibit multiple seasonal patterns. Therefore, several calendar features, such as minute (for the 15-minutely data only), hour, day, week, and day-of-week, were created to capture the multiple seasonality inherent in the solar and building data. Daily weather data, including maximum temperature, minimum temperature, solar exposure, and rainfall, was gathered from the BOM website, and various static and dynamic features such as lags, means, and standard deviations were created for this daily information in a rolling-window fashion. The hourly weather data provided by the competition organizers (temperature, pressure, wind speed, humidity, surface thermal radiation, surface solar radiation, and total cloud coverage) was used along with the raw features, and comprehensive feature engineering was conducted on this data to create dynamic features that capture the stochastic behaviour of power over time. This includes lags up to lag three, as well as means and standard deviations over windows of three to six hours.
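A sketch of this kind of dynamic feature construction (pandas, assuming an hourly DatetimeIndex; the column name is hypothetical):

```python
import pandas as pd

def add_dynamic_features(df: pd.DataFrame, col: str = "temperature") -> pd.DataFrame:
    """Lags up to three steps plus rolling means/standard deviations
    over 3- and 6-hour windows, built in a rolling-window fashion."""
    out = df.copy()
    for lag in (1, 2, 3):                       # lags up to lag three
        out[f"{col}_lag{lag}"] = out[col].shift(lag)
    for hours in (3, 6):                        # rolling statistics (hourly rows)
        w = out[col].rolling(hours)
        out[f"{col}_mean{hours}h"] = w.mean()
        out[f"{col}_std{hours}h"] = w.std()
    # calendar features to capture multiple seasonality
    out["hour"] = out.index.hour
    out["dayofweek"] = out.index.dayofweek
    return out
```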
Three sets of models were developed, using daily weather data, hourly weather data, and a combination of both, alongside calendar features, as model inputs. The first attempt comprised one LightGBM model per series with calendar and daily weather features (12 models in total), and one LightGBM model per series with calendar and hourly weather features (12 models in total). The hyperparameters of the 24 LightGBM models were optimized with grid search: num-leaves and min-data-in-leaf were varied between 50 and 300 in steps of 20, and max-depth between 7 and 13 in steps of 1. All LightGBM models were trained with a learning rate of 0.1 and the MAE loss function; the RMSE loss function was also examined but significantly reduced accuracy, as predictions were evaluated using the MASE. To avoid overfitting, L1 regularization with early stopping was used. We conducted time series cross-validation, with September and October data used as evaluation sets. The length of the training set differs for each building and solar series, selected based on our observations and experiments. The final submission was an ensemble of two LightGBM models per series, one with daily and one with hourly features.
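A hedged sketch of one such model and its grid (using LightGBM’s sklearn-style API; the grid mirrors the stated ranges, while the cross-validation splitter and regularization strength are assumptions):

```python
import lightgbm as lgb
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit

param_grid = {
    "num_leaves": list(range(50, 301, 20)),         # 50..300, step 20
    "min_child_samples": list(range(50, 301, 20)),  # min-data-in-leaf
    "max_depth": list(range(7, 14)),                # 7..13
}
base = lgb.LGBMRegressor(objective="mae",           # MAE loss, as in the text
                         learning_rate=0.1,
                         reg_alpha=1.0)              # L1 regularization (assumed value)
search = GridSearchCV(base, param_grid,              # large grid, shown for illustration
                      cv=TimeSeriesSplit(n_splits=3),
                      scoring="neg_mean_absolute_error")
# search.fit(X_train, y_train)   # one model per series, per the text
```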
In terms of variable importance, calendar features, and especially the hour of the day, were consistently ranked as important for the solar data. For the building data, various features such as temperature, day, and hour were top-ranked, depending on the series. We observed similarities in which features were important within the building and solar data, but their rankings differed.
The optimization problem was formulated as a mixed-integer linear program in which the net base load is assumed to be a given deterministic parameter. The sets and parameters used to develop the optimization model are defined as follows: $R$ is the set of all recurring activities; $O$ is the set of all once-off activities; $A$ is the set of all activities ($A = R \cup O$); $B$ is the set of all batteries; $T$ is the set of all time slots in the planning horizon; $T_a$ is the set of time slots when activity $a$ can be in progress (for $a \in R$, this set only corresponds to the time slots of the first week); $S_a$ is the set of time slots when activity $a$ can start (for $a \in R$, this set only corresponds to the time slots of the first week); $Q_a$ is the set of starting time slots for activity $a$ that result in a penalty; $\Pi_a$ is the set of prerequisites of activity $a$; $\rho_a$ is the penalty of scheduling activity $a$ outside of working hours; $r_a$ is the revenue of scheduling activity $a$; $d_a$ is the duration of activity $a$; $n^s_a$ and $n^l_a$ are respectively the numbers of small and large rooms required by activity $a$; $w_a$ is the load required by activity $a$ per room; $N^s$ is the number of small rooms available in total; $N^l$ is the number of large rooms available in total; $C_b$ is the capacity of battery $b$; $E^0_b$ is the energy of battery $b$ at the beginning of planning; $P_b$ is the maximum power of battery $b$; $\eta_b$ is the efficiency of battery $b$; $\pi_t$ is the wholesale price at time $t$; $\beta_t$ is the net base load at time $t$; $M$ is an integer upper bound on the absolute value of the maximum load; $\delta$ is the number of time slots in a day; $H$ is the number of time slots in the planning horizon ($H = |T|$); and $\mu(t)$ is a function that maps time slot $t$ to the corresponding time slot in the first week.
The decision variables are defined as follows: $c_{bt}$ is a binary variable equal to 1 if and only if (iff) battery $b$ is charging at time $t$; $f_{bt}$ is a binary variable equal to 1 iff battery $b$ is discharging at time $t$; $x_{at}$ is a binary variable equal to 1 iff activity $a$ begins at time $t$; $y_{at}$ is a binary variable equal to 1 iff activity $a$ is in progress at time $t$; $e_{bt}$ is a non-negative variable representing the state of battery $b$ at the end of time slot $t$; $u_a$ is a binary variable equal to 1 iff activity $a$ is scheduled; $v_a$ is a binary variable equal to 1 iff activity $a$ is scheduled outside working hours; $g_a$ is a non-negative variable representing the day index at which activity $a$ begins, if it is scheduled; $L_t$ is an unrestricted variable representing the aggregate/total load at time $t$; $L^{\max}$ is a non-negative variable representing the absolute value of the maximum load; and $z_t$ is a binary variable equal to 1 iff $L_t$ is equal to $L^{\max}$.
Objective (1) approximates the problem’s quadratic cost function as a linear function. Constraints (2)–(4) ensure that each activity starts in exactly one time slot and that the activity continues without interruption until it finishes. According to Constraint (5), the auxiliary binary variable $v_a$ equals one iff activity $a$ is scheduled outside working hours. Constraints (6)–(8) ensure that there is at least one day between the start times of an activity and its prerequisites; furthermore, an activity is not scheduled unless its prerequisites have all been scheduled. Constraints (9)–(10) capture the capacity of each battery at a time slot, depending on its initial state and the charging/discharging actions performed in that time slot. Constraint (11) states that a battery cannot charge and discharge at the same time. Constraints (12)–(13) guarantee that the total numbers of small and large rooms used do not exceed their respective capacities. Constraint (14) captures the value of the net load based on the predicted load, the load from the batteries, and the load required by the scheduled activities. Constraints (15)–(18) capture the value of the maximum load $L^{\max}$ and linearize its contribution in conjunction with the objective function, which provides a reasonable approximation of the quadratic objective. Constraint (19) guarantees that recurring activities are all scheduled. Constraint (20) ensures that the energy of a battery is always non-negative and never exceeds its capacity.
[Objective (1) and Constraints (2)–(20) of the formulation are omitted here.]
The model does not consider the specific allocation of building rooms to activities, to avoid symmetry and to reduce the problem size. This allocation is instead performed in a post-processing step, where a feasibility problem is modeled and solved; due to space limitations, this model is not presented here. The decision variables corresponding to the recurring activities are only defined for the first week, to reduce the size of the formulation; these decision variables are used throughout the planning horizon by employing the mapping $\mu$. Some once-off activities would result in zero or negative revenue if scheduled outside of working hours, and there exists an optimal solution in which these activities are not allocated to time slots outside working hours. Therefore, pre-processing is performed to avoid generating the additional decision variables corresponding to such allocations.
It is worth mentioning that Constraints (6)–(7) adhere to the feasibility checks of the competition. They can be generalized to capture prerequisites also for the case in which activities can be scheduled on the same day; this can be done by defining appropriate coefficients (see Section 6.3.2 of Esmaeilbeigi et al. [53] for details).
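As an illustrative sketch of how two of the constraint groups described above might look in code (using the notation defined earlier, with toy set sizes and capacities; this is not the authors’ implementation):

```python
import gurobipy as gp
from gurobipy import GRB

m = gp.Model("schedule")
B, T = range(2), range(96)            # toy battery and time-slot sets
cap = {b: 420.0 for b in B}           # battery capacities C_b (toy values)

c = m.addVars(B, T, vtype=GRB.BINARY, name="charge")     # c_bt
f = m.addVars(B, T, vtype=GRB.BINARY, name="discharge")  # f_bt
e = m.addVars(B, T, lb=0.0, name="energy")               # e_bt >= 0

# Constraint (11): a battery cannot charge and discharge simultaneously.
m.addConstrs((c[b, t] + f[b, t] <= 1 for b in B for t in T), name="c11")

# Constraint (20): battery energy stays within [0, C_b].
m.addConstrs((e[b, t] <= cap[b] for b in B for t in T), name="c20")
```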
The deterministic formulation assumes that the true realization of the net base load is given ($\beta_t$ for $t \in T$). We consider the uncertainty of forecasts within the optimization problem and minimize the average cost over multiple uncertain forecast scenarios. Let $\Omega$ denote the set of scenarios and $\beta^{\omega}_t$ be the realization of the net base load at time $t$ in scenario $\omega$. Accordingly, we define variables $L^{\omega}_t$, $L^{\max,\omega}$, and $z^{\omega}_t$ corresponding to scenario $\omega$. The updated formulation has the objective function (21), which minimizes the average cost incurred across the different scenarios. Constraints (14)–(18) must hold in every scenario, and are therefore respectively updated to the scenario-indexed Constraints (22)–(26).
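As an illustrative sketch (the notation here follows the definitions above but is assumed, not necessarily the authors’ exact formulation), a sample average approximation objective of this kind takes the form

$$\min_{x \in X} \; \frac{1}{|\Omega|} \sum_{\omega \in \Omega} J\!\left(x, \beta^{\omega}\right),$$

where $x$ collects the scheduling and battery decisions, $X$ is the feasible set defined by the scenario-independent constraints, and $J(x, \beta^{\omega})$ is the cost of the schedule under the net base load realization $\beta^{\omega}$.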
In order to solve the resulting mixed-integer linear program, the Gurobi optimization solver (version 9.1.2) is employed, with an optimizer engine developed in Python. Although the auxiliary variables $v_a$ and $z_t$ were defined as binary variables, they can be treated as non-negative continuous variables when solving the problem. This relaxation significantly improved the run time of the solver while providing a very good bound.
To further speed up solving the problem, different fix-and-optimize heuristics were designed to obtain good feasible solutions in a reasonable amount of time. A collection of 12 algorithms has been included in the optimizer engine; however, only the default algorithm is explained here. The optimizer engine can start with a good initial feasible solution (provided by the user in the form of a solution file) to warm-start the solver. If this setting is activated (i.e., parameter setstart is set to True), the engine takes a two-phase approach: in the first phase, the integrality constraints of all decision variables corresponding to the batteries are relaxed (e.g., allowing a battery to be partially charged and discharged at the same time) while ensuring feasibility of the solution with regard to the activities. Once a solution for Phase 1 is obtained, we fix the decision variables corresponding to the activities and optimize the scheduling of the batteries with their integrality constraints restored. If the setstart parameter is False, the engine takes another two-phase approach: in the first phase, no batteries and no penalized activities are scheduled, and the start times of all activities are restricted to even time indices. In the second phase, we fix the activities that were scheduled in Phase 1 (with some degree of flexibility) and schedule the batteries to improve the solution (peak-shaving). For more details, please see the implementation of the optimizer engine.
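A minimal sketch of the default two-phase approach described above, assuming a gurobipy model with battery variables batt_vars and activity variables act_vars already built (the names and the solved warm-start state are assumptions):

```python
import gurobipy as gp
from gurobipy import GRB

def two_phase_solve(model, batt_vars, act_vars):
    """Sketch: relax battery integrality, solve, then fix activities
    and re-solve the battery schedule with integrality restored."""
    # Phase 1: relax integrality of all battery decision variables.
    for v in batt_vars:
        v.VType = GRB.CONTINUOUS
    model.optimize()

    # Phase 2: fix the activity schedule found in Phase 1 ...
    for v in act_vars:
        val = round(v.X)
        v.LB = val
        v.UB = val
    # ... and restore integrality on the battery variables.
    for v in batt_vars:
        v.VType = GRB.BINARY
    model.optimize()
    return model
```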
-A2 Experiments & Results
To solve the forecasting problem, we trained different LightGBM models using calendar features and various forms of weather data as inputs. We initially set the hyperparameters manually but then optimized them with grid search. We started by using daily or hourly weather data as input features, and then developed a model with the combination of daily and hourly features, which significantly improved forecast accuracy. An ensemble of the daily, hourly, and daily-hourly models resulted in an accuracy of 0.58 MASE over the October test set. In our final submission, we did not train the combined daily-hourly model due to time limitations.
To solve the optimization problem, we explored various heuristics and examined various solver parameters. We gradually improved the algorithms by tuning these parameters and advancing the heuristics. The current default values of the settings in the optimizer engine proved to be promising. For the last submission, we employed the default algorithm and used solutions of the previous submission as the initial solution.
In terms of the association between forecast accuracy and energy cost in our experiments, our submissions on the November test set, with MASE of 0.84 and 0.74, resulted in actual energy costs of 337625 and 328359, respectively. This indicates a positive correlation between forecast accuracy and scheduling costs: reducing the MASE by 11% resulted in a 3% reduction in scheduling costs in this case. Note that this correlation may not be linear or exact, since the optimality gap of the solutions was nearly 2.5%. Furthermore, such a comparison can only be made on the basis of actual energy costs, not the costs reported by an optimization algorithm.
-A3 Summary of the Scientific Committee’s Evaluation
An ensemble of LightGBM models with calendar features and dynamic features (lags, means, standard deviations), ensembling over models that use daily, hourly, and combined daily-and-hourly weather features. Optimization is performed over multiple scenarios, with a sample average approximation that minimizes the average cost across scenarios. The approach seems reproducible and solid, although classic. It is a fast and accurate forecasting and optimization approach.
Advantages of the methodology
An easy, systematic, robust, and reproducible methodology. A rigorous problem formulation, so that the model is not unduly complex. A good exploration of alternative forecasting techniques. Furthermore, a stochastic optimization approach that uses multiple forecasts, as well as a large neighbourhood search and a good decomposition technique.
Disadvantages of the methodology
Somewhat ad-hoc hyperparameter tuning, and a focus on “local” forecasting models that are fit to each series separately. The algorithm choice is not clearly motivated, and the forecasting methodology contains some manual and unclear steps.
Robustness
Though some of the forecasting seems ad-hoc (e.g., the choice of training-set size per series), and optimization is typically not directly transferable, the general methodology is highly generalizable and robust: it can be applied to similar problems with minor adjustments. Overfitting is adequately addressed with cross-validation, L1 regularization, and early stopping.
-B Richard Bean’s solution
-B1 Methodology
A detailed description of the methodology can be found in Bean [42].
-B2 Summary of the Scientific Committee’s Evaluation
The submission uses a quantile regression forest with weather data from BOM and ERA5 for forecasting. Models are built on groups of series (buildings) rather than per series. Various different approaches are used for optimization. Both MIP and MIQP are used in a two-stage approach to tackle the quadratic objective. The forecasting uses a good selection of covariates, with justifications. The general methodology is very specific with great understanding of the data, but also seems sometimes ad-hoc, to the point of forecasting manually chosen constants in some instances. It achieves very accurate forecasting results, which are the best in the competition.
Advantages of the methodology
Highly effective in forecasting, achieving high accuracy. Performs global modelling across series. Optimizes the start date for the training data. Deals with outliers (although manually), and has a good rationale for variables used, with careful analysis of the data. The components seem well integrated.
Disadvantages of the methodology
Some steps are ad-hoc, and it is not clear why some of them are carried out; some decisions are not well justified, which may lead to a lack of reproducibility. Examples include manual thresholding for outlier filtering and the grouping of buildings based on observation. Parameter optimization was performed fully against Phase 1, and no time series cross-validation was performed. Some parts of the optimization approach are very heuristic.
Robustness
As some steps seem ad-hoc, extensions to other scenarios may require adaptations. Most design choices have been made based on the particular data. As such, it is a quite specific solution. Though the particular approach will not be easy to generalize, the main idea can be generalized.
-C HRI Team’s solution
-C1 Methodology
A detailed description of the methodology can be found in Limmer and Einecke [43].
-C2 Summary of the Scientific Committee’s Evaluation
The solution is based on simple median values for the load prediction, Random Forest for solar power production, and (an efficient) MILP for the optimization task. It thereby assumes that the influence of the weather forecast on the load is only marginal. It performs an elegant decomposition of the optimization problem, with a variety of different settings explored to determine the final “best” setting. It is overall a simple, straightforward application of existing technologies that achieves results close to the best-performing methods in both forecasting and optimization.
Advantages of the methodology
The approach is designed to achieve a solution in an acceptable time frame, and Random Forest is robust for prediction. The participants experimented with different levels of decomposition for the optimization, with a good problem formulation: linearization by using the peak load instead of the quadratic function to schedule recurring activities, and separation of the optimization into the same two steps as the competition winners, namely assigning buildings to activities as a second step.
Disadvantages of the methodology
There were some manual decisions on what data to use, and the load prediction seems a bit too simplistic, although it achieves decent performance. No comparison with other forecasting methods was given, and there is no discussion of why the linearization of the objective does not lead to a decrease in solution quality. Overall, the results are good but not excellent.
Robustness
The methodology seems robust, and only minor adaptation would be required for other scenarios, although outlier filtering is done manually. The simplicity of the methods and the use of standard procedures make the method relatively easy to generalize.
-D EVERGi Team’s solution
-D1 Methodology
A detailed description of the methodology can be found in Ruddick et al. [46].
-D2 Summary of the Scientific Committee’s Evaluation
The approach uses an evolutionary algorithm for initial scheduling of activities, followed by a local search, and MIP for the batteries. Seasonal and trend decomposition (STL, Prophet) followed by LightGBM is used for forecasting.
Advantages of the methodology
The methodology uses good preprocessing that, for example, found drift in the demand of Building 5. It uses a good forecasting methodology that applies transformations and decompositions. The optimization is a combination of heuristic and complete solvers, and as such a novel optimization idea that seems to work well.
Disadvantages of the methodology
Different buildings are treated differently in the forecasting. The computation time will presumably be high, and computing the activity schedule might be slow. The evolutionary algorithm seems somewhat ad-hoc, and the schedule improvement with Gurobi adds complexity to the model.
Robustness
The method is a relatively general and robust approach that seems applicable in other settings with some minor adjustments.
-E FRESNO Team’s solution
For additional information see also Kumar et al. [49].
-E1 Methodology
The proposed optimization solution uses the property that the optimal objective value of the linear programming (LP) relaxation of a minimizing MIP problem provides a lower bound on the original objective value [54], i.e.,

$$z^{*}_{\mathrm{LP}} \leq z^{*}_{\mathrm{MIP}}, \qquad (27)$$

where $z^{*}_{\mathrm{MIP}}$ is the optimal objective value of the minimizing MIP problem and $z^{*}_{\mathrm{LP}}$ is the optimal objective value of its LP relaxation. This property was used to approximate one of the stages of the optimization model to speed up computation.
Data pre-processing
The data pre-processing step mostly revolved around handling missing values in all the energy profiles. During inspection, multiple abnormal zero-generation periods were found for the solar systems, along with many missing points and outliers for some buildings. These were classified as invalid datapoints acting as noise for time series prediction. To process these invalid datapoints, three actions were taken:
1. Long periods (96 consecutive points) of missing/invalid data were removed from the dataset completely.
2. Short periods of missing data were approximated using a 96-point moving-average window, when most of the data in the window was valid.
3. When more than half of the window consisted of invalid datapoints, the values were approximated using the annual average at that time instant.
These methods were not used for Buildings 4 and 5 because of their peculiar patterns. In Building 4, we noticed that the data occurred in discrete steps of 1, 2, 3, and 4 kW, and their appearance seemed to be unrelated to the exogenous variables. Therefore, a flat prediction of 1 kW was applied for the prediction period, after checking the performance of the other discrete values. In Building 5, a certain triangular wave pattern with seasonal peaks was noticed; thus, missing points in September and October 2020 were filled using their corresponding time-instant values in 2019. Additionally, in Solar 3, we noticed an increase in the overall generation. This was accounted for by using Solar 2, which had a similar generation pattern after the increase: specifically, historical data before May 20th, 2020 of the Solar 2 profile was used to replace that of Solar 3. After filling the missing values, the data pre-processing was completed by adding exogenous variables such as weather information, occupancy, seasonal variation using half-sinusoidal waves for the solar patterns, climate information (summer or winter) for the building load data, and day type (weekday or weekend).
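A minimal sketch of the three imputation rules above (assuming a pandas Series at 15-minute resolution with a DatetimeIndex; thresholds follow the text):

```python
import pandas as pd

def impute_profile(s: pd.Series, window: int = 96) -> pd.Series:
    """Sketch of the three-rule imputation: drop long gaps, fill short
    gaps with a moving average, fall back to the annual mean otherwise."""
    out = s.copy()
    invalid = out.isna()

    # Rule 1: drop runs of >= `window` consecutive invalid points entirely.
    run_id = (invalid != invalid.shift()).cumsum()
    run_len = invalid.groupby(run_id).transform("size")
    out = out[~(invalid & (run_len >= window))]

    # Rule 2: fill short gaps with a centred 96-point moving average,
    # but only where most of the window is valid.
    ma = out.rolling(window, min_periods=window // 2, center=True).mean()
    out = out.fillna(ma)

    # Rule 3: remaining gaps get the annual average at that time of day.
    annual_avg = out.groupby(out.index.time).transform("mean")
    return out.fillna(annual_avg)
```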
Building load forecasting model
Figure 6 shows the block diagram of the proposed building forecasting method. Because the demand profiles exhibited cyclic and seasonal behavior, the concept of seasonal decomposition was applied to improve the predictions of the tree-based methods. For Buildings 0 and 5, however, the tree-based techniques, Random Forests and Gradient Boosted Trees, were used directly on the original profiles, as these had a clear and repetitive pattern. For the other buildings, an extra decomposition step was necessary to capture the seasonality, trend, and stochasticity of their time series. Specifically, the STL decomposition technique [55] was employed to decompose the original time series with a 7-day period (96 x 7 intervals). Different regression techniques, including Support Vector Machines, Random Forests, Gradient Boosted Trees, and ARIMA, were then trained on these components, depending on their performance on the validation data in October. Moreover, from the observation of the real data, it could be seen that the demand profiles on Oct 23rd for Buildings 1, 3, and 6 were significantly lower than their expected mean. Since this was the only holiday in the training set, it was not worth having a separate label for this date; instead, the profiles were replaced with their forecasted values and later treated as training data for the prediction of November. There was one more holiday in the forecasting period, on Nov 3rd; since there was no holiday label, this was accounted for by halving the prediction at mid-day. Similar to the building data, the solar profiles had repeated patterns, which were extracted prior to prediction, as explained in the following section.
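A sketch of this decomposition step (using statsmodels’ STL with a weekly period of 96 x 7 slots; the per-component regressor choice below is an assumption, as the text reports that different regressors were selected by validation performance):

```python
import pandas as pd
from statsmodels.tsa.seasonal import STL
from sklearn.ensemble import RandomForestRegressor

def stl_component_models(series: pd.Series, features: pd.DataFrame) -> dict:
    """Decompose a 15-minute demand series with a weekly period and
    fit one regressor per component, as a rough illustration."""
    res = STL(series, period=96 * 7).fit()
    components = {"trend": res.trend, "seasonal": res.seasonal,
                  "resid": res.resid}
    models = {}
    for name, comp in components.items():
        m = RandomForestRegressor(n_estimators=300, random_state=0)
        m.fit(features, comp)   # forecast each component separately
        models[name] = m
    return models  # component predictions are summed back at inference time
```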
Solar generation forecasting model
Figure 7 shows the block diagram of the proposed solar prediction method, where the pre-processed data provides daily input features including surface radiation, cloud coverage, temperature, and a monthly cycle representing the seasonality. A Refined Motif was extracted from the generation data by calculating the pairwise similarities of all sub-patterns, and the input features were further processed to obtain their differences to the Refined Motif external features before being reshaped into a daily cycle. Hence, the inputs are a representative solar generation pattern and the features relative to this representative pattern, and the output is the solar generation in daily cycles, with a total prediction length of one month. A ResNet was used as the prediction method, allowing the model to learn how to reshape the representative solar generation based on changes in the external features. The hyper-parameters and shortcut connections of the ResNet were chosen based on the classic ResNet-18 to ResNet-101 structures, which were originally designed for 2D CNNs. Adam and MSE were used as the optimizer and error metric, respectively, for model training.
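A minimal sketch of a 1D residual block of the kind such a model might stack (all layer sizes and channel counts here are assumptions, not the team’s exact architecture):

```python
import torch
import torch.nn as nn

class ResBlock1D(nn.Module):
    """Basic 1D residual block: two convolutions plus a skip connection."""
    def __init__(self, channels: int, kernel_size: int = 3):
        super().__init__()
        pad = kernel_size // 2
        self.conv1 = nn.Conv1d(channels, channels, kernel_size, padding=pad)
        self.bn1 = nn.BatchNorm1d(channels)
        self.conv2 = nn.Conv1d(channels, channels, kernel_size, padding=pad)
        self.bn2 = nn.BatchNorm1d(channels)
        self.act = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.act(self.bn1(self.conv1(x)))
        h = self.bn2(self.conv2(h))
        return self.act(h + x)  # skip connection

# Example: map daily feature sequences (batch, channels, 96 slots)
# to a daily generation profile.
model = nn.Sequential(
    nn.Conv1d(8, 32, 3, padding=1),   # 8 assumed input features
    ResBlock1D(32), ResBlock1D(32),
    nn.Conv1d(32, 1, 1),              # one output channel: generation
)
out = model(torch.randn(4, 8, 96))    # -> shape (4, 1, 96)
```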
Optimal scheduling problem
The scheduling problem was divided into two sub-problems: the first determines a limit on peak power, and the second schedules the activities. The constraints related to the activities were modeled using a day list, which specifies the day of the week at a given time index; a non-start list, which specifies time indices outside office hours; and two binary decision-variable lists mapping the start time index and the active time indices of each task. The battery constraints were modeled using two binary decision variables, the charge status and the discharge status of a battery at a given time index; the battery power was then calculated within the optimization as a linear combination of these two binary variables. The baseload was calculated from the forecasts obtained with the above methods and used to compute the power scheduled at a given instant. Sub-problem 1 involves all the above constraints while minimizing the peak power consumed throughout the month, whereas sub-problem 2 aims to minimize energy costs while limiting the peak power using the objective value from sub-problem 1, scheduling batteries and activities accordingly. The idea is to obtain a sub-optimal solution in a reasonably short time.
Figure 8 shows the sequence of the scheduling problem. First, all the dependencies, such as the AEMO electricity price, the load and solar generation forecasts (combined into a baseload), the month information (day list, non-start list), and the instance data, were obtained. Then the LP relaxation of sub-problem 1 was solved, which gives a lower bound on the peak power for the given instance. The reason for doing so is to reduce computation time, since only the objective value, and not the schedule itself, is of interest in this sub-problem. By Eq. (27), this is a lower bound on the peak-power limit; it is hence multiplied by a sufficiently large factor to make sub-problem 2 feasible. This multiplication factor inherently trades off speed against objective quality, because the tighter the limit on peak power, the slower the optimization.
The optimization problem was run on a Windows machine using Python 3.8.8 with an Intel i7 8-core processor and 8 GB RAM. The Gurobi solver was used to run the optimization; its advantages include built-in functions to relax a MIP to an LP and a well-documented Python API. Suitable multiplication factors were found by trial and error, leading to acceptable average solution times per instance for both small and large instances. The activity and battery schedules were then passed through a room-assignment algorithm, and finally the solution was saved in the expected format.
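A sketch of this two-stage flow (using gurobipy’s built-in Model.relax(); the multiplication factor alpha is a placeholder, as the tuned value is not recoverable from the text, and peak_var is assumed to be the peak-power variable of sub-problem 2):

```python
import gurobipy as gp
from gurobipy import GRB

def peak_bound_then_schedule(sub1: gp.Model, sub2: gp.Model,
                             peak_var: gp.Var, alpha: float = 1.2):
    """Solve the LP relaxation of sub-problem 1 for a peak-power lower
    bound, then cap the peak in sub-problem 2 at alpha times that bound."""
    lp = sub1.relax()          # built-in MIP -> LP relaxation
    lp.optimize()
    peak_lb = lp.ObjVal        # lower bound on the true minimum peak

    # Sub-problem 2: minimize energy cost subject to the scaled peak cap.
    sub2.addConstr(peak_var <= alpha * peak_lb, name="peak_cap")
    sub2.optimize()
    return sub2
```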
Figure 9 shows an example day, power scheduled for a small and a large instance, along with the baseload (forecasts) and AEMO price. It can be seen that the power profile has a flat peak because of the constraints. It can also be seen that at intervals 72 to 80 when the AEMO price spikes, the scheduled power is less than the baseload. This indicates the batteries are discharged at those times to lower the cost of energy.
This method also leads to solutions that are robust with respect to forecast accuracy, since the problem focuses on minimizing the peak power consumed. The resulting schedule aims to pack all events as tightly as possible and produces good results even for relatively poor forecasts: if the forecasts capture the general trend of the baseload, the method fills valleys to reach a flat profile. This can be verified from the submitted Phase 2 results, where forecasts with MASE 1.00 and 1.87 produced schedules with very close objective values.
-E2 Experiments & Results
At the beginning, since most of the training profiles displayed cyclic behaviour with some noise and outliers, tree-based methods like Random Forest and Gradient Boosted Trees were chosen to provide a rough guideline on general performance, due to their resistance to outliers. Indeed, we achieved a MASE of 0.78 in our very first complete submission in Phase 1. From this base case, various modifications were made to the inputs, training periods, and forecasting techniques for all the profiles, as explained above, which then gave us a MASE of 0.67 at the end of Phase 1. In Phase 2, since there was no feedback from the competition, we decided to build a simplified evaluation function using 30-day-shifted data to validate our improvements. This simplified evaluation function produced values slightly different from the MASE adopted by the competition, but reflected the same changes in prediction error. Under our evaluation function, our models improved in accuracy from 0.82 to 0.70 relative to the Phase 1 calibration (from the simple Random Forest to the new adjustments in Phase 2).
For solar prediction, we first used a two-layer 1D-CNN as the regression model in Phase 1, which provided a MASE outperforming Random Forest and the other candidates. We used MAE and MSE as the neural network error metrics; with more experiments, we found that MSE led to better results than MAE, so the former was picked as the final error metric. Some recent research shows that CNN-based structures can benefit from increasing the number of adjustable parameters, called over-parameterization, and one recent work shows that over-fitting can eventually yield a more accurate model given a sufficient training process. This phenomenon is known as grokking [56] and is also discussed as deep double descent in other ResNet-related research. To increase the available parameters, we replaced the 1D-CNN with a ResNet for better performance. During testing on October's consumption, this new structure was observed to have slightly better accuracy for some solar systems. However, due to the lack of November data and the poor prediction accuracy of our solution in Phase 2, using ResNet as the regression model remained an unverified improvement. For the sake of training speed, only ResNet-18 and ResNet-34 were tested, with early stopping set to a patience of 1000, which might not be enough to reach the grokking phenomenon. The proposed solar prediction method showed better accuracy than Random Forest and other state-of-the-art methods in Phase 1 and Phase 2. However, we did not apply it to building consumption prediction for two reasons: first, the Refined-Motif-based prediction model has a daily cycle, and solar generation is relatively independent between days; second, learning the building features requires a deep neural network, which is computationally expensive and requires significant time for thorough testing.
The prediction and optimization models were developed and implemented in Python. The scheduling algorithm was found to be robust with respect to forecasting error. The scheduling of batteries and activities was also visualized to demonstrate the performance of the algorithm.
-E3 Summary of the Scientific Committee’s Evaluation
The solution uses STL decomposition, followed by a separate forecast of each time series with ARIMA, Random Forest, Gradient Boosting, or SVM. For the solar panels, a ResNet is trained using the Refined Motif approach, proposed by the participants in another paper. The optimization is then done using MIP with a sensible decomposition that solves a Linear Programming relaxation first, then the scheduling problem.
Advantages of the methodology
The approach uses a systematic forecasting methodology. It uses a good Linear Programming relaxation to bound the optimization problem, focusing on the peak demand. This appears to make the optimization more robust with respect to the forecast quality, as the forecasts are not very accurate.
Disadvantages of the methodology
Some buildings are not STL-decomposed and are treated differently, without justification. ResNet might need more data than is available here. In general, the forecasting is not as accurate as the methodologies of other participants.
Robustness
Besides the complexity of the procedure and the many methods involved, the methodology seems to be able to be used on other datasets.
-F QSZU-PolyU-Team’s solution
-F1 Methodology
A detailed description of the methodology can be found in Zhu et al. [50].
-F2 Summary of the Scientific Committee’s Evaluation
Forecasting is performed using a variety of ML models (including neural networks). Single models are trained for all buildings, but the preprocessing is different for each building. This team then uses a bi-level optimization: first, an optimal timetable is identified using local search with relaxation; after this, battery scheduling is optimized, again with a local search method, based on the optimal timetable. This is an ad-hoc and very simple optimization algorithm, which provides a useful benchmark of the effectiveness of sophisticated MIP methods against a simple alternative.
Advantages of the methodology
A single forecasting model is built for all buildings, using weather data. The optimization is very simple. The observation that forecasting accuracy may not have a large impact on quality of optimization is interesting and useful.
Disadvantages of the methodology
The method presumably needs expensive training, and no hyperparameter tuning is discussed. Only two months (August and September) are used for training. There is different ad-hoc preprocessing for different buildings. The forecasting accuracy is weak overall, and local search does not allow one to judge the solution quality.
Robustness
The preprocessing seems not generalizable, but the rest of the methodology can be adapted to other scenarios with minor changes.
-G Akylas Stratigakos’ solution
-G1 Methodology
Problem formulation
Recurring activities (lectures) were considered “hard” constraints, as they are required for a feasible schedule. Once-off activities (corresponding to lab experiments) were considered “soft” constraints, as they are not required for a feasible solution. Recurring activities are repeated on a weekly basis, thus a single set of variables per week suffices; for once-off activities, the defined variables span the whole scheduling period.
In short, a binary variable was assigned to each activity and each required time period. Further, an integer variable was defined for each activity, time period, building, and room type, to determine the number of rooms assigned. Note that this formulation is somewhat excessive; however, modern MIP solvers remove redundant constraints and variables during a presolve phase, leading to significantly smaller problems during the branch-and-bound search. For reference, for a large instance, the presolve phase substantially decreased the numbers of continuous and integer (binary) variables of the initial problem. In our opinion, the requirement that activities be scheduled in consecutive time slots posed the greatest modeling challenge; it was handled by introducing auxiliary variables and applying a big-M reformulation of the constraints.
Forecasting
The uncertain electricity demand and solar production must be predicted for the whole scheduling period; alternatively one could consider the aggregated net load. A separate forecasting model was trained for each series provided. Regarding solar panels, several base models, namely Random Forest, Gradient Boosting Machines, Ridge Regression, and Local Learning Regression, were developed and aggregated via averaging for the final prediction.
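A compact sketch of this model averaging (base learners shown for three of the named models; Local Learning Regression is omitted, and all hyperparameters are assumptions):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.linear_model import Ridge

def fit_solar_ensemble(X, y):
    """Fit several base models and average their predictions."""
    models = [RandomForestRegressor(n_estimators=300, random_state=0),
              GradientBoostingRegressor(random_state=0),
              Ridge(alpha=1.0)]
    for m in models:
        m.fit(X, y)
    # Final prediction: simple average over the base models.
    return lambda X_new: np.mean([m.predict(X_new) for m in models], axis=0)
```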
Regarding the buildings’ demand, the prediction task is more challenging due to missing data and restrictions related to the Covid-19 pandemic. Overall, Buildings 1 and 4-6 show relatively low demand, which in turn reduces their impact on the final energy cost. The decision was made to discard all missing data, as most of it relates to Buildings 4-6. All demand forecasts were derived with a Random Forest model. The features included the weather data provided by the organizers, alongside categorical variables modeling calendar effects, the impact of Covid-19 restrictions, exam periods, etc. For Buildings 0 and 3, a pre-processing step was included to remove possible outliers. Lastly, a Quantile Regression Forest model was employed to generate probabilistic forecasts for Building 3, which were subsequently used to derive a high- and a low-impact scenario.
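A simplified approximation of this quantile step (taking per-tree quantiles from a standard Random Forest rather than a full Quantile Regression Forest; the quantile levels are placeholders, not the author’s choices):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def forest_quantiles(X_train, y_train, X_new, q_low=0.05, q_high=0.95):
    """Approximate low/high demand scenarios from per-tree predictions."""
    rf = RandomForestRegressor(n_estimators=500, random_state=0)
    rf.fit(X_train, y_train)
    per_tree = np.stack([t.predict(X_new) for t in rf.estimators_])
    return (np.quantile(per_tree, q_low, axis=0),    # low-demand scenario
            np.quantile(per_tree, q_high, axis=0))   # high-demand scenario
```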
Objective
The solutions were evaluated based on aggregate energy cost, which includes a peak-load tariff. A scenario-based robust objective was derived to hedge against large forecast errors. Specifically, two scenarios for the aggregated net load were derived based on the marginal predictive density of the demand in Building 3. The objective was then reformulated in standard epigraph form, and the worst-case cost over all scenarios was minimized. Ideally, the scenarios should incorporate the correlation between the different sources of uncertainty; due to limited time, and after a trial-and-error period, it was decided to derive scenarios solely from the predictive density of Building 3, which represented the largest load. Figure 10 presents an indicative set of forecasts for a single day. Effectively, this approach creates a small box uncertainty set that aims to protect against large forecast errors.
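In epigraph form, with $\Omega$ denoting the scenario set and $\beta^{\omega}$ the aggregated net-load realization in scenario $\omega$ (notation assumed here, not the author’s), this worst-case objective can be sketched as

$$\min_{x,\,\tau} \; \tau \quad \text{s.t.} \quad \tau \geq c\!\left(x, \beta^{\omega}\right) \;\; \forall \omega \in \Omega, \quad x \in X,$$

where $c(x, \beta^{\omega})$ is the total energy cost of schedule $x$ under scenario $\omega$ and $X$ is the feasible set defined by the scheduling constraints.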
Optimization
A fix-and-optimize matheuristic is proposed to solve the MIP problem [57], which belongs to the class of Large Neighborhood Search (LNS) heuristics [58]. The approach is model-based, meaning that the underlying model remains the same while the local neighborhood searched for an improved solution changes at each iteration. Note that modern optimization solvers (e.g., Gurobi) only require the model to be built once; each iteration is warm-started from the previous solution by adding or removing constraints, which significantly speeds up computations.
The solution process is described in the flowchart shown in Figure 11. The forecasts derived using the methodology described earlier are used as inputs to this process. First, the problem was solved once considering only “hard” constraints, providing a minimal feasible solution; this translates into optimizing only over recurring activities. The sample solutions provided by the organizers were used to warm-start the solver. Next, the iterative search began by sampling a subset of activities, considering both recurring and once-off ones. Variables related to non-selected activities were fixed to their previous solutions, and the subsequent problem was solved to optimize over the remaining free variables. The process was repeated until the maximum number of iterations was reached (max-iter) or until the solution stopped improving for a number of iterations (patience), subject to a pre-defined tolerance (tol). Variables related to storage remained free during all iterations, as the computational burden increased only marginally. The search algorithm is detailed in Algorithm 1. Note that this search is greedy and random. Due to precedence constraints, subsequent problems with fixed constraints are generally easier to solve close to optimality, owing to the reduced feasible space. In all cases, soft time limits based on the current optimality gap were imposed by tracking the progress of the branch-and-bound search within the MIP solver through callback functions.
Input: built Gurobi model, initial objective value, search hyperparameters (max-iter, patience, tol)
Output: improved solution
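A minimal sketch of this fix-and-optimize loop (gurobipy; the hyperparameter names follow the text, but their values are placeholders, and the model is assumed to carry an initial feasible solution from the warm start):

```python
import random
import gurobipy as gp
from gurobipy import GRB

def fix_and_optimize(model, activity_vars, n_sample=20,
                     max_iter=50, patience=5, tol=1e-3):
    """LNS sketch: repeatedly free a random subset of activity variables,
    fix the rest at their incumbent values, and re-optimize."""
    best = model.ObjVal          # assumes an initial feasible solution exists
    stall = 0
    for _ in range(max_iter):
        free = set(random.sample(activity_vars, n_sample))
        fixed = [v for v in activity_vars if v not in free]
        saved = [(v, v.LB, v.UB) for v in fixed]
        for v in fixed:                      # fix to incumbent values
            v.LB = v.UB = round(v.X)
        model.Params.TimeLimit = 1200        # hard per-iteration limit (placeholder)
        model.optimize()
        for v, lb, ub in saved:              # unfix for the next iteration
            v.LB, v.UB = lb, ub
        if best - model.ObjVal > tol:        # minimization: improvement found
            best, stall = model.ObjVal, 0
        else:
            stall += 1
            if stall >= patience:
                break
    return model
```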
-G2 Experiments & Results
During Phase 1 (validation), different strategies were explored, mainly concerning how to deal with the uncertainty in the objective function. Figure 12 provides a comparison between a deterministic formulation (i.e., using point forecasts) and the scenario-based robust approach on the Phase 1 data. Note that the validation data includes a large demand outlier, which actually obscures the impact of predictive accuracy on the final cost. Regardless of the objective formulation, Figure 12 highlights that the fix-and-optimize search significantly reduces costs, with the effect being more pronounced for larger problem instances.
Regarding Phase 2 (the testing phase), all experiments were conducted on a standard machine with an Intel CPU and 32GB of RAM. The problems were modeled and solved with Gurobi using the Python API. For reference, the time required to build a model is approximately 200-250 sec for small instances and 900-950 sec for large ones. To create the aggregated net-load scenarios, a low and a high quantile of the predictive marginal density of Building 3 were selected. For the initial feasible solution, a hard time limit of 2 hours was set, although a smaller time limit would also suffice. Subsequent iterative solutions were solved with a hard time limit of 20 min; a soft time limit of 5 min applied once the optimality gap fell below a preset threshold. The total number of iterations, the patience, and the improvement tolerance were fixed in advance, and at each iteration a fixed number of recurring and once-off activities was sampled, the same for both small and large problem instances. For reference, Figure 13 shows the evolution of the objective value for a large instance in the testing phase.
Regarding the performance of the final submission, several improvements could be made. First, considering better-informed scenarios could improve the solution: the decision to base scenarios solely on the predictive density of Building 3 is somewhat arbitrary, and scenarios that account for the correlation between the different sources of uncertainty could improve final performance. Second, improving the formulation of the timetabling problem could reduce computational costs. Third, note that the performance of the iterative search procedure is bounded by the solution of the full MIP problem; smaller problem instances could probably be solved to optimality, given adequate time and computational resources. Moreover, the search is greedy and random; considering precedence constraints and sampling blocks of activities instead of individual activities could also improve performance. Overall, we expect the proposed iterative search to offer greater utility for larger problem instances. Lastly, the optimization window changed during Phase 2 to match the forecasting window; as a result, additional effort was required to adapt the optimization code to the new setting, which somewhat hindered the ability to properly evaluate results on the validation data. Avoiding modeling errors also proved important, as such errors hindered the ability to explore and evaluate different solutions within the timeframe of this challenge.
-G3 Summary of the Scientific Committee’s Evaluation
This participant used ML forecasting methods such as Random Forest, Gradient Boosting Machines, and regression variants to predict PV power generation. Building demand forecasts were created using Random Forest models and quantile regression, with calendar and weather features. The optimization problem was solved using MIP via Gurobi. The novelty in the proposed method is the use of a fix-and-optimize approach, whereby sections of a feasible search space are “fixed” while the solver explores the remaining free variables.
Advantages of the methodology
The method uses well established ML models for forecasting both power production and building demand. The “fix and optimize” nature of the solver solution has the potential to increase performance speed. Combining these elements creates an effective solution tool with a straightforward data flow/solution path. Thus, the optimization approach is robust and easy to generalize. Minimization of the worst case expected cost helps hedge against large forecast errors.
Disadvantages of the methodology
The forecasting seems not to have received as much attention as in other solutions, and this may have had some influence on the results. The search in the optimization is somewhat greedy, and there will be some degradation of the solver solution using “fix and optimize” compared to a more exhaustive search.
Robustness
The scenarios are based only on Building 3, which makes the approach less generalizable. Apart from this, very few assumptions are made about the input data, so this approach seems highly applicable to other settings, and the solution delivers many insights that can help in adapting it.
References
- Ben-Tal et al. [2009] A. Ben-Tal, L. El Ghaoui, and A. Nemirovski, “Robust optimization,” in Robust Optimization. Princeton University Press, 2009.
- Juan et al. [2015] A. A. Juan, J. Faulin, S. E. Grasman, M. Rabe, and G. Figueira, “A review of simheuristics: Extending metaheuristics to deal with stochastic combinatorial optimization problems,” Operations Research Perspectives, vol. 2, pp. 62–72, 2015.
- Dehghani et al. [2021] M. Dehghani, B. Abbasi, and F. Oliveira, “Proactive transshipment in the blood supply chain: A stochastic programming approach,” Omega, vol. 98, 2021.
- Jung et al. [2004] J. Y. Jung, G. Blau, J. F. Pekny, G. V. Reklaitis, and D. Eversdyk, “A simulation based optimization approach to supply chain management under demand uncertainty,” Computers & Chemical Engineering, vol. 28, no. 10, pp. 2087–2106, 2004.
- Kotary et al. [2021] J. Kotary, F. Fioretto, P. Van Hentenryck, and B. Wilder, “End-to-end constrained optimization learning: A survey,” in Proceedings IJCAI-21, 8 2021, pp. 4475–4482.
- Elmachtoub et al. [2020] A. Elmachtoub, J. C. N. Liang, and R. McNellis, “Decision trees for decision-making under the predict-then-optimize framework,” in International Conference on Machine Learning (ICML), 2020, pp. 2858–2867.
- Elmachtoub and Grigas [2022] A. N. Elmachtoub and P. Grigas, “Smart ‘predict, then optimize’,” Management Science, vol. 68, no. 1, pp. 9–26, 2022.
- Mandi et al. [2020] J. Mandi, P. J. Stuckey, T. Guns et al., “Smart predict-and-optimize for hard combinatorial optimization problems,” in Proceedings of the AAAI Conference, vol. 34, no. 02, 2020, pp. 1603–1610.
- Demirovic et al. [2020] E. Demirovic, P. J. Stuckey, T. Guns, J. Bailey, C. Leckie, K. Ramamohanarao, and J. Chan, “Dynamic programming for predict+optimise,” in The Thirty-Fourth AAAI Conference, 2020, pp. 1444–1451.
- Donti et al. [2017] P. Donti, B. Amos, and J. Z. Kolter, “Task-based end-to-end model learning in stochastic optimization,” Advances in Neural Information Processing Systems, vol. 30, 2017.
- Gao et al. [2021] S. Gao, C. Xiang, M. Yu, K. T. Tan, and T. H. Lee, “Online optimal power scheduling of a microgrid via imitation learning,” IEEE Trans. on Smart Grid, 2021.
- Dudek et al. [2022] G. Dudek, P. Pelka, and S. Smyl, “A hybrid residual dilated LSTM and exponential smoothing model for midterm electric load forecasting,” IEEE Trans. on Neural Networks and Learning Systems, vol. 33, no. 7, pp. 2879–2891, 2022.
- Khodayar et al. [2021] M. Khodayar, G. Liu, J. Wang, O. Kaynak, and M. Khodayar, “Spatiotemporal behind-the-meter load and PV power forecasting via deep graph dictionary learning,” IEEE Trans. on Neural Networks and Learning Systems, vol. 32, no. 10, pp. 4713–4727, 2021.
- Wen et al. [2020] Y. Wen, D. AlHakeem, P. Mandal, S. Chakraborty, Y.-K. Wu, T. Senjyu, S. Paudyal, and T.-L. Tseng, “Performance evaluation of probabilistic methods based on bootstrap and quantile regression to quantify PV power point forecast uncertainty,” IEEE Trans. on Neural Networks and Learning Systems, vol. 31, no. 4, pp. 1134–1144, 2020.
- Ak et al. [2016] R. Ak, O. Fink, and E. Zio, “Two machine learning approaches for short-term wind speed time-series prediction,” IEEE Transactions on Neural Networks and Learning Systems, vol. 27, no. 8, pp. 1734–1747, 2016.
- Quan et al. [2014] H. Quan, D. Srinivasan, and A. Khosravi, “Short-term load and wind power forecasting using neural network-based prediction intervals,” IEEE Transactions on Neural Networks and Learning Systems, vol. 25, no. 2, pp. 303–315, 2014.
- Hu et al. [2020] T. Hu, Q. Guo, Z. Li, X. Shen, and H. Sun, “Distribution-free probability density forecast through deep neural networks,” IEEE Transactions on Neural Networks and Learning Systems, vol. 31, no. 2, pp. 612–625, 2020.
- Li and Chiang [2018] G. Li and H.-D. Chiang, “Toward cost-oriented forecasting of wind power generation,” IEEE Transactions on Smart Grid, vol. 9, no. 4, pp. 2508–2517, 2018.
- Han et al. [2021] J. Han, L. Yan, and Z. Li, “A task-based day-ahead load forecasting model for stochastic economic dispatch,” IEEE Transactions on Power Systems, vol. 36, no. 6, pp. 5294–5304, 2021.
- Stratigakos et al. [2021] A. Stratigakos, A. Michiorri, and G. Kariniotakis, “A value-oriented price forecasting approach to optimize trading of renewable generation,” in 2021 IEEE Madrid PowerTech - Conference Proceedings, 2021.
- Zhang et al. [2022] J. Zhang, Y. Wang, and G. Hug, “Cost-oriented load forecasting,” Electric Power Systems Research, vol. 205, no. 107723, 2022.
- Stratigakos et al. [2022] A. C. Stratigakos, S. Camal, A. Michiorri, and G. Kariniotakis, “Prescriptive trees for integrated forecasting and optimization applied in trading of renewable energy,” IEEE Transactions on Power Systems, 2022.
- Carriere and Kariniotakis [2019] T. Carriere and G. Kariniotakis, “An integrated approach for value-oriented energy forecasting and data-driven decision-making application to renewable energy trading,” IEEE Transactions on Smart Grid, vol. 10, no. 6, pp. 6933–6944, 2019.
- Chen et al. [2022] X. Chen, Y. Yang, Y. Liu, and L. Wu, “Feature-driven economic improvement for network-constrained unit commitment: A closed-loop predict-and-optimize framework,” IEEE Transactions on Power Systems, vol. 37, no. 4, pp. 3104–3118, 2022.
- Munoz et al. [2020] M. Munoz, J. Morales, and S. Pineda, “Feature-driven improvement of renewable energy forecasting and trading,” IEEE Transactions on Power Systems, vol. 35, no. 5, pp. 3753–3763, 2020.
- Van Den Dooren et al. [2017] D. Van Den Dooren, T. Sys, T. A. Toffolo, T. Wauters, and G. V. Berghe, “Multi-machine energy-aware scheduling,” EURO Journal on Computational Optimization, vol. 5, no. 1-2, pp. 285–307, 2017.
- Bergmeir et al. [2021] C. Bergmeir, F. de Nijs, S. Ferraro, L. Magdalena, P. Stuckey, Q. Bui, R. Godahewa, A. Sriramulu, P. Galketiya, and R. Glasgow, “IEEE-CIS technical challenge on predict+optimize for renewable energy scheduling,” 2021. [Online]. Available: https://dx.doi.org/10.21227/1x9c-0161
- [28] “Kaggle: Your machine learning and data science community,” https://www.kaggle.com, accessed: 2022-06-27.
- [29] G. Athanasopoulos and R. J. Hyndman, “The value of feedback in forecasting competitions,” International Journal of Forecasting, vol. 27, no. 3, pp. 845–849, 2011.
- [30] R. Godahewa, C. Bergmeir, G. I. Webb, R. J. Hyndman, and P. Montero-Manso, “Monash time series forecasting archive,” in NeurIPS 2021 Datasets and Benchmarks Track, 2021. [Online]. Available: https://forecastingdata.org
- [31] M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, and I. H. Witten, “The WEKA data mining software: An update,” ACM SIGKDD Explorations Newsletter, vol. 11, no. 1, pp. 10–18, 2009.
- [32] “oikolab: Weather and climate data for analysts,” https://oikolab.com/, accessed: 2022-06-27.
- [33] Australian Government—Bureau of Meteorology, “Climate data online,” http://www.bom.gov.au/climate/data/, accessed: 2022-06-27.
- [34] Australian Energy Market Operator (AEMO), “Aggregated price and demand data,” https://aemo.com.au/en/energy-systems/electricity/national-electricity-market-nem/data-nem/aggregated-data, accessed: 2022-06-27.
- [35] Australian Energy Market Operator (AEMO), price and demand data for Victoria, October 2020, https://www.aemo.com.au/aemo/data/nem/priceanddemand/PRICE_AND_DEMAND_202010_VIC1.csv, accessed: 2022-06-27.
- [36] M. Bartusch, R. H. Möhring, and F. J. Radermacher, “Scheduling project networks with resource constraints and time windows,” Annals of Operations Research, vol. 16, pp. 199–240, 1988.
- [37] M. Vanhoucke, J. Coelho, D. Debels, B. Maenhout, and L. V. Tavares, “An evaluation of the adequacy of project network generators with systematically sampled networks,” European Journal of Operational Research, vol. 187, no. 2, pp. 511–524, 2008.
- [38] R. Kolisch and A. Sprecher, “PSPLIB - a project scheduling problem library: OR software - ORSEP operations research software exchange program,” European Journal of Operational Research, vol. 96, no. 1, pp. 205–216, 1997.
- [39] R. J. Hyndman and A. B. Koehler, “Another look at measures of forecast accuracy,” International Journal of Forecasting, vol. 22, no. 4, pp. 679–688, 2006.
- [40] R. Dai, R. Esmaeilbeigi, and H. Charkhgard, “The utilization of shared energy storage in energy systems: A comprehensive review,” IEEE Transactions on Smart Grid, 2021.
- [41] R. Esmaeilbeigi, R. Middleton, R. García-Flores, and M. Heydar, “Benders decomposition for a reverse logistics network design problem in the dairy industry,” Annals of Operations Research, pp. 1–52, 2021.
- [42] R. Bean, “Forecasting and optimizing a microgrid for the IEEE-CIS technical challenge,” in Proceedings of the Australasian Universities Power Engineering Conference (AUPEC 2022), 2022. [Online]. Available: http://elvumgar.fea.st.user.fm/papers/AUPEC2022_paper_8768.pdf
- [43] S. Limmer and N. Einecke, “An efficient approach for peak-load-aware scheduling of energy-intensive tasks in the context of a public IEEE challenge,” Energies, vol. 15, no. 10, pp. 1–23, 2022.
- [44] G. Ke, Q. Meng, T. Finley, T. Wang, W. Chen, W. Ma, Q. Ye, and T.-Y. Liu, “LightGBM: A highly efficient gradient boosting decision tree,” in Advances in Neural Information Processing Systems, vol. 30, 2017, pp. 3146–3154.
- [45] N. Hansen, S. D. Müller, and P. Koumoutsakos, “Reducing the time complexity of the derandomized evolution strategy with covariance matrix adaptation (CMA-ES),” Evolutionary Computation, vol. 11, no. 1, pp. 1–18, 2003.
- [46] J. Ruddick, E. Genov, L. R. Camargo, T. Coosemans, and M. Messagie, “Evolutionary scheduling of university activities based on consumption forecasts to minimise electricity costs,” in 2022 IEEE Congress on Evolutionary Computation (CEC), 2022, pp. 1–8.
- [47] R. Yuan, S. A. Pourmousavi, W. L. Soong, G. Nguyen, and J. A. Liisberg, “IRMAC: Interpretable refined motifs and binary classification for rooftops PV owners,” arXiv preprint arXiv:2109.13732, 2021.
- [48] G. Zerveas, S. Jayaraman, D. Patel, A. Bhamidipaty, and C. Eickhoff, “A transformer-based framework for multivariate time series representation learning,” arXiv preprint arXiv:2010.02803, 2020.
- [49] Y. P. S. Kumar, R. Yuan, N. T. Dinh, and S. A. Pourmousavi, “Optimal activity and battery scheduling algorithm using load and solar generation forecasts,” in Proceedings of the 32nd Australasian Universities Power Engineering Conference (AUPEC 2022), 2022. [Online]. Available: https://arxiv.org/abs/2210.12990
- [50] Q. Zhu, Y. Xu, M. Dong, W. Lin, J. Cai, J. Ji, Q. Lin, and K. C. Tan, “A local search method for solving a bi-level timetabling and battery scheduling problem,” 2021. [Online]. Available: https://github.com/xuyaojian123/IEEE-Predict-Optimize-Challenge/blob/master/IEEE_Conference_Template.pdf
- [51] Y. Bengio, A. Lodi, and A. Prouvost, “Machine learning for combinatorial optimization: A methodological tour d’horizon,” European Journal of Operational Research, vol. 290, no. 2, pp. 405–421, 2021.
- [52] M. Abolghasemi and R. Bean, “How to predict and optimise with asymmetric error metrics,” arXiv preprint arXiv:2211.13586, 2022.
- [53] R. Esmaeilbeigi, V. Mak-Hau, J. Yearwood, and V. Nguyen, “The multiphase course timetabling problem,” European Journal of Operational Research, 2021.
- [54] J. P. Vielma, “Mixed integer linear programming formulation techniques,” SIAM Review, vol. 57, no. 1, pp. 3–57, 2015.
- [55] R. B. Cleveland, W. S. Cleveland, J. E. McRae, and I. Terpenning, “STL: A seasonal-trend decomposition procedure based on loess,” Journal of Official Statistics, vol. 6, no. 1, pp. 3–73, 1990.
- [56] A. Power, Y. Burda, H. Edwards, I. Babuschkin, and V. Misra, “Grokking: Generalization beyond overfitting on small algorithmic datasets,” arXiv preprint arXiv:2201.02177, 2022.
- [57] M. Lindahl, M. Sørensen, and T. R. Stidsen, “A fix-and-optimize matheuristic for university timetabling,” Journal of Heuristics, vol. 24, no. 4, pp. 645–665, 2018.
- [58] D. Pisinger and S. Ropke, “Large neighborhood search,” in Handbook of Metaheuristics. Springer, 2010, pp. 399–419.