
Prediction of Space Weather Events through Analysis of Active Region Magnetograms using Convolutional Neural Network

Shlesh Sakpal
Freedom High School
Abstract

Although space weather events may not directly affect human life, they have the potential to inflict significant harm upon our communities. Harmful space weather events can trigger atmospheric changes that result in physical and economic damage on a global scale. In 1989, Earth experienced the effects of a powerful geomagnetic storm that caused satellites to malfunction, triggered power blackouts in Canada, and disrupted electricity in the United States and Europe. With the solar cycle peak rapidly approaching, there is an ever-increasing need to prepare for and prevent such damage, especially to modern-day technology, motivating a comprehensive prediction system. This study leverages machine learning techniques to predict instances of space weather (solar flares, coronal mass ejections, geomagnetic storms) based on active region magnetograms of the Sun. We used the NASA DONKI service to determine when these solar events occurred, then used data from the NASA Solar Dynamics Observatory to compile a dataset of magnetograms of the Sun's active regions 24 hours before each event. Given a magnetogram as input, a convolutional neural network (CNN) trained on this dataset predicts whether a space weather event will occur and what type of event it will be. The model was designed with a custom CNN architecture and returned an accuracy of 90.27%, a precision of 85.83%, a recall of 91.78%, and an average F1 score of 92.14% across the three classes (solar flare [Flare], geomagnetic storm [GMS], coronal mass ejection [CME]). Our results show that using magnetogram data as input to a CNN is a viable method for space weather prediction. Future work can involve predicting the magnitude of solar events.

I Introduction

I-A Background and Context

Earth’s atmospheric conditions are heavily influenced by its surroundings. The conditions in the environment surrounding Earth, referred to as space weather, play a crucial role in determining the atmosphere's composition and balance. Phenomena such as solar flares, coronal mass ejections (CMEs), and geomagnetic storms have the potential to inflict great harm on infrastructure and communication systems on Earth.

In 1989, Earth was struck by a devastating geomagnetic storm that disrupted electricity grids in the United States and Europe, while cities in Canada experienced complete blackouts [1]. To mitigate the impact of such an event in the modern day, it is necessary to implement a system that forecasts space weather and its effects on the atmosphere, allowing greater preparation for hazardous space weather events.

Figure 1: Magnetogram of the Sun. Courtesy of NASA/SDO and the AIA, EVE and HMI science teams.

The focus of this project is to develop a system that predicts instances of space weather, with specific focus on solar flares, geomagnetic storms, and coronal mass ejections. The Sun experiences a phenomenon called differential rotation, in which its poles rotate at a different rate than its equatorial regions [2]. Due to this property, the magnetic field lines on the Sun become “tangled”, producing regions of heightened magnetic activity known as ‘active regions’. These active regions, shown in fig. 1, are where solar flares and coronal mass ejections originate; coronal mass ejections can in turn drive charged particles from the Sun into Earth’s magnetosphere, creating geomagnetic storms [3].

Because solar events are associated with concentrations of charged particles, magnetogram analysis of the Sun has become an increasingly prevalent method of predicting instances of space weather. Magnetograms portray the Sun's magnetically active regions, where solar events occur [4].

II Current State of Space Weather Prediction

II-A Geomagnetic Storm

Geomagnetic storms occur when charged particles emitted from the Sun interact with Earth’s magnetosphere [5]. Current work in predicting geomagnetic storms relies on measurements from ground-based magnetometers and real-time measurement of the solar wind [5]. A ground-based magnetometer measures alterations in Earth’s magnetic field, allowing research using these instruments to provide insight into when geomagnetic storms occur [5].

Domico et al. utilized computer vision techniques, including Canny edge detection [6] and topological structure analysis [7], to identify sunspots in images of the Sun obtained from the NASA Solar Dynamics Observatory [8] [5]. Analysis of past and present sunspot images allowed for accurate prediction of geomagnetic storms using a Gaussian kernel support vector machine (G-SVM) [9].

II-B Solar Flare

Solar flares are typically detected through active regions (ARs) on the surface of the Sun [10]. Recent studies have applied deep learning [11] to solar flare prediction, a shift from earlier statistical analysis. Current work has utilized data collected by the Helioseismic and Magnetic Imager (HMI) of the NASA Solar Dynamics Observatory [12]. The HMI instrument provides data in the form of magnetograms, capturing how the magnetic fields on the Sun’s surface transform over time [13]. Work involving the HMI instrument utilized a convolutional neural network based on the VGGNet model, which extracts features from magnetograms to distinguish between flare and no-flare cases, as well as the associated flare class (C, M, X) [14].

II-C Coronal Mass Ejection

Coronal mass ejections (CMEs), generally associated with solar flares, occur when the Sun's magnetic field realigns explosively, sending a cloud of charged particles into space [15]. Current work explores the application of machine learning to the prediction of CMEs [15]. This work focused on determining the relationship between CMEs and solar flares. The study leveraged data from the HMI instrument, using images of the Sun collected since 2010 and focusing specifically on the photospheric magnetic field to generate predictions. After classifying the images as CME or non-CME cases, the data was fed into a support vector machine, a machine learning model capable of identifying nonlinear relationships between variables [16].

III Methods and Materials

Figure 2: Visualization of the overall methodology. Dates of solar flare and CME events were first matched using the DONKI database. Labels and images were then automatically uploaded into a dataframe using a Google Apps Script, which retrieved the corresponding magnetograms from the SDO.

III-A Sources

We use the NASA Solar Dynamics Observatory (SDO) and the NASA Space Weather Database Of Notifications, Knowledge, Information (DONKI) [17] as the main sources of magnetograms and event dates. The project was completed entirely digitally on a personal computer.

III-B Data Compilation

We aggregate solar magnetograms from the SDO, and geomagnetic storm (GMS), solar flare, and coronal mass ejection (CME) arrival times from DONKI [17]. The SDO provides magnetograms for specific dates, which visualize the Sun’s active regions on each day [4], and DONKI provides the timings of all recorded solar events [17].

Figure 3: Visualization of the matching process for the CME and solar flare classes. A Google Sheets process determines whether a date in the CME column matches a solar flare date.

To compile a dataset, we shift the arrival dates of solar flares and CMEs back by one day, then select dates that match between the solar flare and CME lists, as shown in fig. 3. We then use a Google Apps Script [18] to synthesize a dataset by extracting the magnetogram from each matched date and uploading a label of ‘1’ for both CME and solar flare, indicating that both events occurred on that date. Next, we identify dates with an observed CME but no solar flare, extract the magnetograms for these dates, and upload a label of ‘1’ for CME and ‘0’ for solar flare.

Figure 4: Dataframe example. Contains the image file along with the three classes (Flare, CME, and GMS) with binary labels, where 1 represents an event and 0 represents no event.

Then, we identify dates with an observed solar flare and no observed CME, extract the magnetograms for these dates, and upload a label of ‘0’ for CME and ‘1’ for solar flare, as shown in fig. 4. We then manually input dates on which a GMS was observed, entering ‘1’ for the GMS label. Following this, we extract magnetograms from dates (shifted back one day) with no observed solar event, uploading a ‘0’ for CME, solar flare, and GMS. The overall methodology is presented in fig. 2, and a sketch of the labeling logic is shown below. This original dataset is denoted $D_{\text{data}}$. We apply an 80/20 train/test split to this data, then resample the training data.
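The following is a minimal Python sketch of the dataset assembly, assuming the DONKI event dates have already been exported to CSV files; all file names and column names are illustrative, as the actual pipeline used a Google Apps Script [18].

```python
# Minimal sketch of the dataset assembly; file names and column names are
# illustrative (the actual pipeline used a Google Apps Script [18]).
import pandas as pd
from sklearn.model_selection import train_test_split

flares = pd.read_csv("donki_flares.csv", parse_dates=["date"])
cmes = pd.read_csv("donki_cmes.csv", parse_dates=["date"])
storms = pd.read_csv("donki_gms.csv", parse_dates=["date"])

# Shift each event date back one day so the magnetogram precedes the event.
flare_dates = set(flares["date"] - pd.Timedelta(days=1))
cme_dates = set(cmes["date"] - pd.Timedelta(days=1))
gms_dates = set(storms["date"] - pd.Timedelta(days=1))

rows = [{
    "image": f"magnetograms/{d:%Y%m%d}.jpg",  # SDO magnetogram for that day
    "Flare": int(d in flare_dates),
    "CME": int(d in cme_dates),
    "GMS": int(d in gms_dates),
} for d in sorted(flare_dates | cme_dates | gms_dates)]
# Quiet days (no observed event) are appended with all-zero labels.

df = pd.DataFrame(rows)
train_df, test_df = train_test_split(df, test_size=0.2, random_state=42)
```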

III-C Data Resampling

Data resampling was necessary due to imbalance between the classes in $D_{\text{data}}$: the data collection procedure produced a disproportionate number of ‘1’ cases for each class.

Figure 5: Visualization of the SMOTE algorithm. The SMOTE algorithm [19] synthesizes data using real data from the minority class of the dataset. Courtesy of Analytics Vidhya.

To alleviate this, we apply the Synthetic Minority Over-sampling Technique (SMOTE) algorithm, visualized in fig. 5, to synthesize samples from the minority class, which are the ‘0’ cases for the Flare and CME classes. This was done separately per class, as the GMS class had significantly more ‘0’ cases than ‘1’ cases. We further resample $D_{\text{data}}$ using SMOTE to address the imbalance between the number of GMS observations and the solar flare and CME cases; a sketch of this per-class resampling follows.
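The sketch below illustrates per-class resampling on the training split, assuming the magnetograms have been flattened into feature vectors (SMOTE operates on vectors rather than images); the imblearn library and array names are assumptions, not part of the original pipeline.

```python
# Sketch of per-class SMOTE resampling on the training split; the
# array path and the train_df dataframe (from the earlier sketch) are
# illustrative.
import numpy as np
from imblearn.over_sampling import SMOTE

X_train = np.load("train_images.npy")        # illustrative path
X_flat = X_train.reshape(len(X_train), -1)   # one row per magnetogram

balanced = {}
for label in ["Flare", "CME", "GMS"]:        # resample each class separately
    y = train_df[label].to_numpy()
    X_res, y_res = SMOTE(random_state=42).fit_resample(X_flat, y)
    balanced[label] = (X_res, y_res)
```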

III-D Model Architecture

For this image classification problem, we utilize a custom architecture convolutional neural network (CNN) due to its ability to identify spatial patterns in image inputs [20].

A CNN is composed of multiple layers, including the input layer, convolutional layer, pooling layer, rectified linear unit layer, and fully connected layer [20]. The input layer receives the original image, then processes and resizes it to be passed to the following layers [20]. Each pixel in the image is assigned a value from 0 to 255 [20]. The convolutional layer applies a filter to the image, where a filter is a matrix of values within this range [20]. This filter extracts features from the image, creating a ‘feature map’ of patterns in the image [20]. Feature maps are calculated with the formula:

G[m,n]=(f*h)[m,n]=\sum_{i}\sum_{j}h[i,j]\,f[m-i,n-j]

where $f$ represents the input image, $h$ represents the kernel, $m$ indexes the columns, and $n$ indexes the rows of the resulting matrix [21].
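As a worked example, the feature-map formula above matches SciPy's 2-D convolution, which applies the same flipped-kernel sum; the toy image patch and kernel values below are illustrative.

```python
# Worked example of the feature-map formula: SciPy's convolve2d implements
# G[m,n] = sum_ij h[i,j] * f[m-i, n-j]. Values are illustrative.
import numpy as np
from scipy.signal import convolve2d

f = np.array([[0,  50,   0],
              [50, 255, 50],
              [0,  50,   0]])   # 3x3 image patch, pixel values 0-255
h = np.array([[1,  0],
              [0, -1]])         # 2x2 kernel (a simple diagonal-edge filter)

G = convolve2d(f, h)            # 4x4 feature map
print(G)
```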

The pooling layer downsamples the image while preserving its salient features [20]. The rectified linear unit (ReLU) layer is an activation function that introduces nonlinearity into the model, which is essential for the multi-label binary classification problem we target [20]. The fully connected layer converts the extracted patterns into predictions [20].

Figure 6: Visualization of the convolutional neural network. Contains four convolutional layers and four max pooling layers, a configuration which provided strong results.

We propose a model using four convolutional layers and four max pooling layers with the ReLU activation function, as shown in fig. 6; a sketch of this architecture follows.
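The sketch below expresses the architecture in Keras. The framework choice, filter counts, input size, and dense-layer width are assumptions, since the text specifies only the layer counts, ReLU activations, and three binary (multi-label) outputs.

```python
# Sketch of the described architecture in Keras; filter counts, input
# size, and dense-layer width are illustrative.
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(128, 128, 1)),        # resized grayscale magnetogram
    layers.Conv2D(16, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(128, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(3, activation="sigmoid"),    # Flare, CME, GMS probabilities
])
```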

III-E Model Training and Testing

After implementing the custom model architecture, we trained the neural network for 20 epochs, with an early stopping mechanism implemented to prevent overfitting resulting from the resampling of the training data, as shown in fig. 7 [22].

Figure 7: Visualization of early stopping mechanism. After the model converges, the early stopping mechanism halts the training process to prevent overfitting. Courtesy of Ramazan Gençay.

The network used a learning rate of 0.0001 with the Adam [23] optimization framework; a sketch of this training setup follows.
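The following is a minimal sketch of the training configuration under the stated hyperparameters (Adam, learning rate 0.0001, 20 epochs, early stopping), assuming the Keras model sketched above; the loss function, patience value, validation split, and training arrays are assumptions.

```python
# Sketch of the training setup; the loss choice, patience, validation
# split, and the X_train/y_train arrays are illustrative.
import tensorflow as tf

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
    loss="binary_crossentropy",      # independent binary targets per class
    metrics=["accuracy"],
)
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=3, restore_best_weights=True
)
model.fit(X_train, y_train, validation_split=0.2,
          epochs=20, callbacks=[early_stop])
```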

IV Results

IV-A Metrics

We evaluate the model on the metrics of precision, accuracy, recall, and F1 score. These metrics are computed as follows:

\text{accuracy}=\dfrac{\text{True Positives}+\text{True Negatives}}{\text{True Positives}+\text{False Positives}+\text{True Negatives}+\text{False Negatives}},

\text{precision}=\dfrac{\text{True Positives}}{\text{True Positives}+\text{False Positives}},

\text{recall}=\dfrac{\text{True Positives}}{\text{True Positives}+\text{False Negatives}},

\text{F1}=\dfrac{2\cdot\text{precision}\cdot\text{recall}}{\text{precision}+\text{recall}}.
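For reference, these metrics can be computed with scikit-learn; the binary label arrays below are illustrative.

```python
# Sketch of the metric computations for one class; label arrays are
# illustrative.
from sklearn.metrics import (accuracy_score, f1_score,
                             precision_score, recall_score)

y_true = [1, 0, 1, 1, 0, 1]   # actual labels for one class
y_pred = [1, 0, 1, 0, 0, 1]   # model predictions

print(accuracy_score(y_true, y_pred))   # (TP + TN) / all predictions
print(precision_score(y_true, y_pred))  # TP / (TP + FP)
print(recall_score(y_true, y_pred))     # TP / (TP + FN)
print(f1_score(y_true, y_pred))         # harmonic mean of precision, recall
```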

IV-B Evaluation

Figure 8: Table presenting accuracy, precision, and recall of model. The model performed strongly in each of these classes.
Figure 9: Table presenting F1 Scores of the model in each class. The model performed strongly in each of these categories.

The model returned an accuracy of 90.27%, a precision of 85.83%, a recall of 91.78%, and an F1 score of 92.14%, along with strong per-class F1 scores, as shown in fig. 8 and fig. 9.

We present confusion matrices to further visualize the results; these show the proportions of true positive, false positive, true negative, and false negative predictions. We use normalized confusion matrices to show the true proportion of classes predicted, because there was a strong imbalance between classes in the dataset.
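A minimal sketch of computing and plotting such a normalized matrix for one class with scikit-learn follows; the labels are illustrative.

```python
# Sketch of a row-normalized confusion matrix for a single class;
# normalize="true" scales each actual-class row to sum to 1, so the
# diagonal reads as per-class recall. Labels are illustrative.
import matplotlib.pyplot as plt
from sklearn.metrics import ConfusionMatrixDisplay, confusion_matrix

y_true = [1, 0, 1, 1, 0, 1, 0, 0]   # actual Flare / No Flare labels
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]   # model predictions

cm = confusion_matrix(y_true, y_pred, normalize="true")
ConfusionMatrixDisplay(cm, display_labels=["No Flare", "Flare"]).plot()
plt.show()
```

The confusion matrices in figs. 10 through 12 provide insight into the model's performance per class: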

Figure 10: Confusion matrix visualizing model performance in predicting solar flare cases.

For the solar flare class, the network performs well, as characterized by the strong diagonal in fig. 10.

Figure 11: Confusion matrix visualizing model performance in predicting geomagnetic storm cases.

For the GMS class, the network also performs relatively well, as characterized by the defined diagonal in fig. 11. However, this class had a major imbalance between event and no-event labels; as a result, data synthesis may have caused overfitting to the ‘Not GMS’ class, yielding an accuracy of 1.00 for that case during testing.

Figure 12: Confusion matrix visualizing model performance in predicting coronal mass ejection cases.

For the CME class, the network performs comparatively poorly, as characterized by the vertical bar in fig. 12. This is likely due to overfitting during training, a result of the major disparity between event and no-event cases in the training data. Although resampling was applied, it relied heavily on the available data, which was insufficient and likely led to overfitting for this class.

With strong performance in the solar flare and GMS classes, the model shows promising results in the field of space weather prediction.

V Discussion and Conclusions

We propose a method of space weather prediction through the use of active region solar magnetograms. We extend the research of Domico et al., Zhang et al., and Bobra et al. to develop a system of space weather prediction capable of predicting multiple classes of space weather from magnetogram inputs to a custom architecture CNN [5] [12] [15]. Future work can involve predicting the strength of events to generate more relevant forecasts of space weather, and utilizing time-series data to generate more reliable forecasts based on the solar cycle. We find that the proposed model accurately predicts instances of solar flares and geomagnetic storms, while it requires further testing and data aggregation to more accurately predict instances of coronal mass ejections. The model returned an accuracy of 90.27%, a precision of 85.83%, and a recall of 91.78%. These statistics show that the model is capable of making accurate and relevant predictions of when a solar event may occur, 24 hours in advance. A more uniform distribution of classes in the data would enable the model to better identify patterns in magnetograms, resulting in higher quality predictions and stronger performance. The proposed model and dataset aggregation method can be leveraged in further studies exploring the prediction of space weather events using artificial intelligence.

References

  • [1] D. H. Boteler, “A 21st century view of the march 1989 magnetic storm,” Space Weather, vol. 17, no. 10, pp. 1427–1441, 2019. [Online]. Available: https://agupubs.onlinelibrary.wiley.com/doi/abs/10.1029/2019SW002278
  • [2] S. A. Balbus, J. Bonart, H. N. Latter, and N. O. Weiss, “Differential rotation and convection in the Sun,” Monthly Notices of the Royal Astronomical Society, vol. 400, no. 1, pp. 176–182, 11 2009. [Online]. Available: https://doi.org/10.1111/j.1365-2966.2009.15464.x
  • [3] H. Lundstedt, H. Gleisner, and P. Wintoft, “Operational forecasts of the geomagnetic dst index,” Geophysical Research Letters, vol. 29, no. 24, pp. 34–1–34–4, 2002. [Online]. Available: https://agupubs.onlinelibrary.wiley.com/doi/abs/10.1029/2002GL016151
  • [4] K. Florios, I. Kontogiannis, S.-H. Park, J. A. Guerra, F. Benvenuto, D. S. Bloomfield, and M. K. Georgoulis, “Forecasting solar flares using magnetogram-based predictors and machine learning,” Solar Physics, vol. 293, no. 2, p. 28, Jan 2018. [Online]. Available: https://doi.org/10.1007/s11207-018-1250-4
  • [5] K. Domico, R. Sheatsley, Y. Beugin, Q. Burke, and P. McDaniel, “A machine learning and computer vision approach to geomagnetic storm forecasting,” 2022.
  • [6] W. Rong, Z. Li, W. Zhang, and L. Sun, “An improved canny edge detection algorithm,” in 2014 IEEE International Conference on Mechatronics and Automation, 2014, pp. 577–582.
  • [7] M. B. Fuchs, “Topological structural analysis,” Structural optimization, vol. 13, no. 2, pp. 104–111, Apr 1997. [Online]. Available: https://doi.org/10.1007/BF01199228
  • [8] W. D. Pesnell, B. J. Thompson, and P. C. Chamberlin, The Solar Dynamics Observatory (SDO).   New York, NY: Springer US, 2012, pp. 3–15. [Online]. Available: https://doi.org/10.1007/978-1-4614-3673-7_2
  • [9] C.-A. Tsai and Y.-J. Chang, “Efficient selection of gaussian kernel svm parameters for imbalanced data,” Genes, vol. 14, no. 3, 2023. [Online]. Available: https://www.mdpi.com/2073-4425/14/3/583
  • [10] J. Wang, S. Liu, X. Ao, Y. Zhang, T. Wang, and Y. Liu, “Parameters derived from the sdo/hmi vector magnetic field data: Potential to improve machine-learning-based solar flare prediction models,” The Astrophysical Journal, vol. 884, no. 2, p. 175, oct 2019. [Online]. Available: https://dx.doi.org/10.3847/1538-4357/ab441b
  • [11] Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol. 521, no. 7553, pp. 436–444, May 2015. [Online]. Available: https://doi.org/10.1038/nature14539
  • [12] H. Zhang, Q. Li, Y. Yang, J. Jing, J. T. L. Wang, H. Wang, and Z. Shang, “Solar flare index prediction using sdo/hmi vector magnetic data products with statistical and machine-learning methods,” The Astrophysical Journal Supplement Series, vol. 263, no. 2, p. 28, dec 2022. [Online]. Available: https://dx.doi.org/10.3847/1538-4365/ac9b17
  • [13] P. H. Scherrer, J. Schou, R. I. Bush, A. G. Kosovichev, R. S. Bogart, J. T. Hoeksema, Y. Liu, T. L. Duvall, J. Zhao, A. M. Title, C. J. Schrijver, T. D. Tarbell, and S. Tomczyk, “The helioseismic and magnetic imager (hmi) investigation for the solar dynamics observatory (sdo),” Solar Physics, vol. 275, no. 1, pp. 207–227, Jan 2012. [Online]. Available: https://doi.org/10.1007/s11207-011-9834-2
  • [14] T. Bai and P. A. Sturrock, “Classification of solar flares,” Annual Review of Astronomy and Astrophysics, vol. 27, pp. 421–467, 1989. [Online]. Available: https://www.annualreviews.org/content/journals/10.1146/annurev.aa.27.090189.002225
  • [15] M. G. Bobra and S. Ilonidis, “Predicting coronal mass ejections using machine learning methods,” The Astrophysical Journal, vol. 821, no. 2, p. 127, apr 2016. [Online]. Available: https://dx.doi.org/10.3847/0004-637X/821/2/127
  • [16] S. Suthaharan, Support Vector Machine.   Boston, MA: Springer US, 2016, pp. 207–235. [Online]. Available: https://doi.org/10.1007/978-1-4899-7641-3_9
  • [17] A. M. Wold, M. L. Mays, A. Taktakishvili, L. K. Jian, D. Odstrcil, and P. MacNeice, “Verification of real-time wsa-enlil+cone simulations of cme arrival-time at the ccmc from 2010 to 2016,” J. Space Weather Space Clim., vol. 8, p. A17, 2018. [Online]. Available: https://doi.org/10.1051/swsc/2018005
  • [18] D. Airinei and D. Homocianu, “Cloud computing based web applications. examples and considerations on google apps script,” Proceedings of the IE 2017 International Conference, pp. 64–69, 2017. [Online]. Available: https://ssrn.com/abstract=2964756
  • [19] J.-B. Wang, C.-A. Zou, and G.-H. Fu, “Awsmote: An svm-based adaptive weighted smote for class-imbalance learning,” Scientific Programming, vol. 2021, p. 9947621, May 2021. [Online]. Available: https://doi.org/10.1155/2021/9947621
  • [20] N. Sharma, V. Jain, and A. Mishra, “An analysis of convolutional neural networks for image classification,” Procedia Computer Science, vol. 132, pp. 377–384, 2018, international Conference on Computational Intelligence and Data Science. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S1877050918309335
  • [21] Raff, Mathematics Behind Convolutional Neural Networks.   Manning Publications Co., 2022. [Online]. Available: https://books.google.com/books?hl=en&lr=&id=pAhsEAAAQBAJ&oi=fnd&pg=PR7&dq=math+behind+cnn&ots=kWeCtVAGrZ&sig=4QZT9qcNQZKLBrD36MIDQufDRZU
  • [22] L. Prechelt, Early Stopping - But When?   Berlin, Heidelberg: Springer Berlin Heidelberg, 1998, pp. 55–69. [Online]. Available: https://doi.org/10.1007/3-540-49430-8_3
  • [23] Z. Zhang, “Improved adam optimizer for deep neural networks,” in 2018 IEEE/ACM 26th International Symposium on Quality of Service (IWQoS), 2018, pp. 1–2.