Multiscale graph neural networks with adaptive mesh refinement for accelerating mesh-based simulations
Abstract
Mesh-based Graph Neural Networks (GNNs) have recently shown capabilities to simulate complex multiphysics problems with accelerated performance times. However, mesh-based GNNs require a large number of message-passing (MP) steps and suffer from over-smoothing for problems involving very fine meshes. In this work, we develop a multiscale mesh-based GNN framework mimicking a conventional iterative multigrid solver, coupled with adaptive mesh refinement (AMR), to mitigate challenges with conventional mesh-based GNNs. We use the framework to accelerate phase field (PF) fracture problems involving coupled partial differential equations with a near-singular operator due to near-zero modulus inside the crack. We define the initial graph representation using all mesh resolution levels. We perform a series of downsampling steps using Transformer MP GNNs to reach the coarsest graph, followed by upsampling steps to reach the original graph. We use skip connectors from the generated embedding during coarsening to prevent over-smoothing. We use Transfer Learning (TL) to significantly reduce the size of training datasets needed to simulate different crack configurations and loading conditions. The trained framework showed accelerated simulation times while maintaining high accuracy for all cases compared to the physics-based PF fracture model. Finally, this work provides a new approach to accelerate a variety of mesh-based engineering multiphysics problems.
keywords:
Machine Learning; Phase Field Model; Multiscale; Mesh-based; Graph Neural Network; Algebraic Multigrid Scheme; Transfer Learning; Crack Propagation; Displacement Fields
1 Introduction
Computational models play an important role across many engineering fields to simulate the behavior of complex physics phenomena without performing experiments. These models commonly rely on solving coupled systems of partial differential equations (PDEs) that define the underlying physics of the problem. A favored approach to solve these PDEs and propagate the physics in time has been to discretize the problem domain into a mesh, where the solutions of the PDEs are then approximated. While these methods have proven to generate accurate and reliable results in the past, they quickly become computationally expensive as problem complexity increases.
Fracture is one of the common means of failure in engineered materials and is therefore widely studied through computational techniques, including the phase field (PF) method [1, 2, 3, 4, 5]. The PF approach to fracture regularizes the discontinuous crack using a continuous damage field and uses it to formulate an energy functional [6, 7, 8, 9]. The PF fracture approach has been used to study crack propagation, nucleation, and branching in brittle and ductile materials, composite materials with anisotropy, and even biological systems [10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]. Despite their robustness and ease of implementation, PF methods require fine mesh resolution near the crack tip to adequately capture the crack interface, increasing computational requirements. As such, PF fracture methods typically employ adaptive mesh refinement (AMR) to use fine mesh only near the crack surfaces. Despite AMR, PF fracture methods are still computationally demanding, often requiring many CPU hours on high-performance computing clusters.
Machine Learning (ML) techniques provide a reduced-order modeling approach to mitigate the computational costs of current models. The rapid increase in popularity of ML has led to the development of various works for accelerating simulation models [21, 22, 23, 24, 25, 26, 27, 28]. Graph neural networks (GNNs) combine graph theory with neural networks and represent the model input as a graph using connecting nodes and edges (i.e., similar to a mesh). While neural networks update the model’s weights through stochastic backward gradient descent [29, 30, 31], GNN models define the weights using the resulting graph representation of nodes and edges. Recently, GNNs have been used to predict finite element (FE) convergence of elastostatic problems [32], model molecular dynamics of translationally-invariant and rotationally-covariant local atomic environments [33], predict the graph structures of 3D boundaries for granular flow processes [34], and predict the dynamics of various solid mechanics and fluid mechanics problems [35, 36, 37, 38, 39].
The graph representation approach is also ideal for mesh-based problems where the simulation mesh can be used as the model’s graph representation. Recently, mesh-based GNNs have been used for engineering problems, such as simulating FE stresses and displacements [40, 41, 42], crack propagation in PF models with AMR [43], and flow over cylinders and airfoils [44, 45]. However, mesh-based GNNs struggle when working with problems involving very fine mesh. GNN models use message-passing (MP) blocks [46] to transfer information and learn relations between the nodes and edges [47, 48]. For fine meshes, GNN-based frameworks require a large number of MP blocks to transfer and capture the relations between nodes that are far apart from each other. This is unfavorable for most mesh-based problems where information must be shared between nodes beyond the local neighborhoods to achieve high accuracies. Increasing the MP blocks also results in over-smoothing [49, 50, 51]. This challenge has led to the development of Multiscale GNNs [52], which mimic the conventional iterative multigrid scheme. Multiscale GNNs perform multiple graph coarsening operations obtaining smaller meshes at each level, which are passed through MP blocks [53, 54, 55]. The smaller mesh levels provide new connections between distant nodes, thus, transferring information beyond the previous local neighborhood and reducing the required number of MP steps. This technique has shown higher accuracies compared to conventional mesh-based GNNs in dynamic problems involving flow field predictions [56], time-independent PDEs in unstructured meshes and sparse linear systems [57, 58, 59], and fluid motion over rigid bodies of varying shape [60, 61]. While these techniques provide a new avenue for simulating various physics problems while avoiding over-smoothing and errors due to long-range interactions, there are some limitations. 
First, these techniques were designed for a fixed initial graph and employed different graph-pooling techniques for downsampling steps [62, 60, 63, 64]. These techniques can add to the overall computational cost and have yet to be adapted to problems involving AMR.
In this work, we develop a new multiscale GNN coupled with block-structured AMR (BSAMR) to study time-evolving problems with near-singular operators, such as PF fracture. The framework introduces a new simple and computationally efficient downsampling/upscaling approach for mesh-based AMR problems by exploiting the various mesh resolution levels. In the developed downsampling/upscaling approach the framework reduces the size of the mesh by eliminating the highest level of mesh refinement at each coarsening operation, as shown in Figure 1. We study PF fracture problems with five AMR levels and compare the accuracy and computational time of three coarsening approaches from the finest to the coarsest mesh: (i) four coarsening/downscale operations, (ii) two coarsening/downscale operations, and (iii) one coarsening/downscale operation. We begin by developing the framework for Mode-I fracture problems with a single crack on the left edge of the domain. We employ transfer learning (TL) [65, 66] to extend the framework for other configurations such as center cracks, right-edge cracks, and shear loading. We then analyze the framework’s accuracy for each problem configuration. Lastly, we compare the computational time of the developed multiscale and adaptive GNN framework against the high-fidelity PF model.

2 Methods
2.1 Physics based PF fracture model
In this work, we develop the multiscale mesh-based GNN framework with BSAMR for PF fracture problems. PF fracture problems are computationally demanding due to (a) a near-singular operator caused by the near-zero modulus inside the crack and (b) the requirement of high mesh resolution near the crack tip. In the PF formulation of fracture, the sharp crack is regularized using a smooth scalar field ranging from 0 to 1. An energy functional is constructed using this field and then minimized to obtain a set of coupled partial differential equations: a vector equation corresponding to elastic equilibrium and a scalar equation for crack equilibrium. The reader is referred to the extensive literature on PF fracture for details on the formulation [67]. In this work, we use the second-order energy functional as below [4].
(1) |
Here, is the displacement field, is the strain, is the strain energy density, is the fracture energy constant, and is the regularization length scale of the crack. Here, takes the value inside the crack, and the value in the bulk material. In this work, we use the PF open-source high-fidelity model with BSAMR capabilities from [4] to simulate various crack propagation problems. Following [43], we chose a linear elastic isotropic brittle material , with Young’s modulus , Poisson’s ratio , fracture energy density , and . We fixed the bottom edge and applied displacement to the top edge in -direction for tensile load and -direction for shear load. We varied initial crack length, , edge position, , and crack angle, to gather the datasets of unique initial conditions.
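For orientation, a widely used second-order phase-field energy functional (the AT2 form common in the PF fracture literature) is sketched below. The symbols used here ($\mathbf{u}$ for displacement, $d$ for the crack field, $\psi_0$ for the strain energy density, $G_c$ for the fracture energy, $\ell$ for the regularization length) and the convention $d = 1$ inside the crack are assumptions made for this sketch; the notation, degradation function, and sign convention of Equation (1) may differ.

```latex
\begin{equation*}
E(\mathbf{u}, d) = \int_{\Omega} \left[ (1 - d)^2 \, \psi_0\big(\boldsymbol{\varepsilon}(\mathbf{u})\big)
  + G_c \left( \frac{d^2}{2\ell} + \frac{\ell}{2} \, |\nabla d|^2 \right) \right] \mathrm{d}\Omega
\end{equation*}
```

In this form, the degradation factor $(1 - d)^2$ drives the effective modulus towards zero inside the crack, which is the origin of the near-singular operator, and only first derivatives of $d$ appear in the functional, so the resulting crack equation is second order.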
2.2 Graph Neural Network



As shown in Figure 2, we formulated the graph representation using the instantaneous refined mesh, , where includes the mesh vertices, and the resulting mesh edges. The node features, , include the positions of the mesh vertices, , their x- and y-displacement values, , their crack field values, , and the applied displacement loading along the x- and y-directions, .
(2) |
The applied displacement loading node feature, , allows the framework to distinguish between cases subjected to tensile loading versus cases under shear loading.
The edge features, , are defined using the binary connectivity value, , where and indicate the “sender” and “receiver” nodes, respectively. This means that for nodes that share an edge and for cases where .
(3) |
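The graph construction described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `MeshGraph` container, the `build_graph` function, and the seven-column feature ordering are hypothetical names and choices. The binary connectivity edge feature is represented implicitly, since only node pairs that share a mesh edge are stored.

```python
from dataclasses import dataclass


@dataclass
class MeshGraph:
    node_features: list  # one row per vertex: [x, y, ux, uy, c, Ux, Uy] (assumed order)
    senders: list        # sender node index of each directed edge
    receivers: list      # receiver node index of each directed edge


def build_graph(positions, displacements, crack_field, applied_load, mesh_edges):
    """Assemble node features and directed edges from one refined mesh.

    positions:     list of (x, y) vertex coordinates
    displacements: list of (ux, uy) nodal displacements
    crack_field:   list of nodal crack-field values
    applied_load:  (Ux, Uy) applied displacement, copied onto every node
    mesh_edges:    iterable of undirected (i, j) vertex pairs
    """
    node_features = [
        [x, y, ux, uy, c, applied_load[0], applied_load[1]]
        for (x, y), (ux, uy), c in zip(positions, displacements, crack_field)
    ]
    senders, receivers = [], []
    for i, j in mesh_edges:
        # Store both directions so that messages can flow i -> j and j -> i.
        senders += [i, j]
        receivers += [j, i]
    return MeshGraph(node_features, senders, receivers)


# Example: a single square cell under a small y-direction (tensile) load.
# The crack-field entries are placeholder values.
graph = build_graph(
    positions=[(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)],
    displacements=[(0.0, 0.0)] * 4,
    crack_field=[0.0, 0.0, 0.0, 0.0],
    applied_load=(0.0, 1e-3),
    mesh_edges=[(0, 1), (1, 2), (2, 3), (3, 0)],
)
```

Copying the applied load onto every node is what lets a purely local MP block see whether the case is tensile or shear without any global input.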
Lastly, we note that for each time-step in a simulation, we construct the initial graph from the instantaneous refined mesh containing all five levels of mesh resolution, , shown in Figure 2(a). However, as shown in Figure 1, the framework removes each refinement level iteratively until obtaining the coarsest mesh at level 0 resolution. We define the corresponding graphs as (i) the initial instantaneous refined mesh with resolution levels 0-4 (shown in Figure 2(a)), (ii) the first downscaled mesh with resolution levels 0-3, (iii) the second downscaled mesh with resolution levels 0-2 (shown in Figure 2(b)), (iv) the third downscaled mesh with resolution levels 0-1, and (v) the coarsest downscaled mesh with resolution level 0 only (shown in Figure 2(c)).
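The nested sequence of graphs (i)-(v) can be illustrated with a short sketch. It assumes, hypothetically, that every vertex carries a tag `node_level` giving the refinement level that introduced it, so that removing a level amounts to filtering on that tag. The function name is an illustration only, and the edges of each coarser graph (which come from the coarser mesh itself) are not shown.

```python
def nested_node_sets(node_level, max_level):
    """Return node-index lists for the graphs containing levels
    0..max_level, 0..max_level-1, ..., down to level 0 only."""
    return [
        [i for i, lvl in enumerate(node_level) if lvl <= top]
        for top in range(max_level, -1, -1)
    ]


# Example: seven vertices tagged with the level that introduced them.
node_level = [0, 0, 1, 2, 3, 4, 4]
hierarchy = nested_node_sets(node_level, max_level=4)
sizes = [len(nodes) for nodes in hierarchy]
```

Each successive list is a subset of the previous one, which is why no separate graph-pooling step is needed to build the coarser graphs.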
2.3 Multiscale GNN framework
The developed GNN framework integrates AMR into the multigrid approach, where the initial refined mesh is coarsened by one mesh level at each step. The resulting coarsened mesh resolutions provide new connections between distant nodes to transfer information beyond the previous local neighborhoods. This approach reduces the required number of MP steps, thus avoiding over-smoothing and significantly reducing computational costs while maintaining high accuracy.
The architecture of the developed multiscale GNN framework is depicted in Figure 1. The first step of the framework is to input the graph representation defined in Section 2.2 and shown in Figure 2(a), at a given time “”, into an encoder network. We use an MLP model as the encoder network, denoted by .
(4) |
We input the generated latent-space embedding, , to an MP network, , to learn relations within the local neighborhoods. We employ Graph Transformers as our MP models in this work [68]. Transformers were first introduced for the tasks of sequence modeling and language translation [69]. Recently, this technique was extended to graph-based tasks by applying multi-head self-attention to the vertex and edge features of neighboring vertices, thus creating additional attention embeddings [68]. The Transformer MP models in the developed framework follow the same architecture, involving four attention heads and a hidden node dimension of 128.
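To make the attention-based aggregation concrete, the sketch below performs one round of single-head, softmax-weighted message passing over incoming edges. It is a deliberately simplified illustration and not the Graph Transformer of [68]: the learned query, key, and value projections, the edge features, and the multiple heads are all omitted, and the function name is hypothetical.

```python
import math


def attention_message_passing(h, senders, receivers):
    """One round of softmax-attention aggregation over incoming edges.

    h: list of node embeddings (lists of floats), all of the same length.
    Each node attends to the nodes that send to it and adds the weighted
    average of their embeddings to its own (a residual update).
    """
    dim = len(h[0])
    incoming = {}
    for s, r in zip(senders, receivers):
        incoming.setdefault(r, []).append(s)

    updated = []
    for r, h_r in enumerate(h):
        nbrs = incoming.get(r, [])
        if not nbrs:
            updated.append(list(h_r))  # isolated node: embedding unchanged
            continue
        # Scaled dot-product scores between the receiver and each sender.
        scores = [
            sum(a * b for a, b in zip(h_r, h[s])) / math.sqrt(dim) for s in nbrs
        ]
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(sc - top) for sc in scores]
        total = sum(weights)
        message = [
            sum(w * h[s][k] for w, s in zip(weights, nbrs)) / total
            for k in range(dim)
        ]
        updated.append([a + b for a, b in zip(h_r, message)])
    return updated


# Example: a three-node chain 0 - 1 - 2 with edges stored in both directions.
h0 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
h1 = attention_message_passing(h0, senders=[0, 1, 1, 2], receivers=[1, 0, 2, 1])
```

After one round, node 0 has only seen node 1. Reaching node 2 takes a second round, which is the locality limitation that motivates the coarsening steps on fine meshes.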
The “GNND1” block depicted in Figure 1 includes the MP network, , and the first Downscale operation, . Each Downscale operation removes the current finest level of mesh refinement. For instance, the first Downscale operation in Figure 1 removes refinement level 4 from graph (Figure 2(a)), resulting in graph .
(5) |
Here, is the resulting node embedding in refined mesh from the MP block GNN1, is the edge feature vector for the downscaled mesh , and is the resulting downscaled node embedding from to . Similarly, “GNND2” from Figure 1 includes a MP network, , and the second Downscale operation, , which removes refinement level 3 from graph , resulting in graph shown in Figure 2(b).
(6) |
where is the edge feature vector for the downscaled mesh , and is the resulting downscaled node embedding from to . We repeat this process using two additional MP blocks each followed by their coarsening operations to obtain the downscaled node embedding (from to ) for the coarsest mesh, , shown in Figure 2(c). The framework then includes an additional MP network (denoted by “GNNn” in Figure 1) which operates on the resulting coarsest mesh node embedding.
(7) |
Here, is the edge feature vector for the downscaled mesh shown in Figure 2(c), is an aggregation operation for the skip connectors (shown as dashed green lines in Figure 1), and is the resulting node embedding for mesh from the aggregation of the MP block GNNn and the skip connector.
Next, we perform a series of upscale steps to reconstruct the original refined mesh. For instance, “GNNU1” includes the first Upscale operation, , to reconstruct mesh level 1 (), followed by the skip connector from to the MP GNN, .
(8) |
In equation (8), defines the resulting reconstructed node embedding from the upscale operation, , (i.e., from to ), and denotes the resulting node embedding for mesh from the aggregation of the MP block and the skip connector from . Next, “” applies the second Upscale operation, , to regenerate mesh level 2 ( from Figure 2(b)), followed by the skip connector from to the MP GNN, .
(9) |
where defines the resulting reconstructed node embedding from the upscale operation, , (i.e., from to ), and denotes the resulting node embedding for mesh from the aggregation of the MP block and the skip connector from . We repeat this process using two additional upscale blocks (i.e., and in Figure 1) until we reconstruct the initial refined mesh with mesh levels 0-4 and obtain the resulting aggregated node embedding . Lastly, we use the final generated reconstructed embedding, , as input to a Decoder MLP network, , to transfer the embedding from the latent-space to the real-space and predict the displacements and scalar damage field at the next time-step, “”.
(10) |
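The overall downscale, coarsest-level, and upscale flow of this section can be sketched as below. Several choices are assumptions made only to keep the sketch runnable: the MP block is an identity placeholder, nodes re-added during upscaling are zero-filled, the skip connector is aggregated by summation after the MP block, and the `node_level` tag (the level that introduced each vertex) is hypothetical. The encoder and decoder MLPs are omitted.

```python
def message_passing(h):
    # Placeholder for a Transformer MP block; identity keeps the sketch runnable.
    return [list(row) for row in h]


def downscale(h, node_level, finest_level):
    """Drop every node introduced at `finest_level`; return the kept
    embeddings and their indices in the finer graph."""
    kept = [i for i, lvl in enumerate(node_level) if lvl < finest_level]
    return [list(h[i]) for i in kept], kept


def upscale(h_coarse, kept, num_fine_nodes):
    """Scatter coarse embeddings back to their fine-graph positions;
    re-added nodes start from zeros (an assumption)."""
    dim = len(h_coarse[0])
    fine = [[0.0] * dim for _ in range(num_fine_nodes)]
    for row, i in zip(h_coarse, kept):
        fine[i] = list(row)
    return fine


def aggregate(a, b):
    # Skip-connector aggregation, assumed here to be an element-wise sum.
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def v_cycle(h, node_level, max_level):
    skips = []  # (embedding before coarsening, kept indices) per level
    levels = list(node_level)
    for finest in range(max_level, 0, -1):  # downscale blocks
        h = message_passing(h)
        h_coarse, kept = downscale(h, levels, finest)
        skips.append((h, kept))
        h, levels = h_coarse, [levels[i] for i in kept]
    h = message_passing(h)  # block on the coarsest graph
    for h_skip, kept in reversed(skips):  # upscale blocks
        h = upscale(h, kept, len(h_skip))
        h = aggregate(message_passing(h), h_skip)
    return h


# Example: three nodes, one per refinement level, each with embedding [1.0].
out = v_cycle([[1.0], [1.0], [1.0]], node_level=[0, 1, 2], max_level=2)
```

With the identity placeholder, the output simply counts how many skip connections reach each node, which makes the data flow easy to check by hand before swapping in a learned MP block.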
2.4 Single-Stage, Two-Stage, and Four-Stage Refinement GNNs
While a higher number of downscale/upscale operations and MP GNNs may provide the framework with higher prediction accuracy, it comes with an increased computational cost. We evaluated the prediction accuracy and computational cost of three different architectures following the procedure described in the previous Section 2.3. For each architecture, we reduced the number of MP GNNs, upscale operations, and downscale operations. The first framework was a Four-Stage Refinement (FSR) GNN involving four downscale and four upscale operations as shown in Figure 1. The FSR framework architecture is described in detail in Section 2.3.
The second framework was a Two-Stage Refinement (TSR) GNN involving two downscale and two upscale operations. In the TSR framework, a single downscale/upscale operation accounts for two mesh resolution levels. For instance, given an initial instantaneous refined mesh (Figure 2(a)), the first downscale operation in TSR, , removes mesh resolution levels 4 and 3 to obtain the new mesh graph (i.e., Figure 2(b)), along with the resulting downscaled node embedding . By removing four MP GNNs, and two downscale and upscale operations, we expect the TSR framework to be faster than FSR.
Lastly, the third framework was a Single-Stage Refinement (SSR) GNN involving a single downscale/upscale operation. In the SSR framework, we removed six MP GNNs and three downscale and upscale operations. For instance, SSR involves a single downscale operation, , which removes mesh resolution levels 1-4 directly from the initial node embedding, , to obtain the coarsest mesh level shown in Figure 2(c). The SSR framework is significantly less computationally expensive compared to FSR and TSR as it does not require computing new graphs for intermediate mesh resolution levels (1-4), nor additional MP Transformer GNNs.
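The three architectures differ only in how the four refinement levels are grouped into downscale operations. The sketch below enumerates those groupings and the implied number of MP blocks. The function names are hypothetical, and the block count of 2 × stages + 1 (one per downscale, one per upscale, one on the coarsest graph) is inferred from the text, where it reproduces the stated reductions of four and six MP GNNs.

```python
def refinement_schedule(max_level, stages):
    """Split refinement levels max_level..1 into `stages` equal groups,
    finest first; each group is removed by one downscale operation."""
    levels = list(range(max_level, 0, -1))
    if len(levels) % stages != 0:
        raise ValueError("stages must divide the number of refinement levels")
    size = len(levels) // stages
    return [levels[k:k + size] for k in range(0, len(levels), size)]


def num_mp_blocks(stages):
    # One MP block per downscale, one per upscale, plus one on the coarsest graph.
    return 2 * stages + 1


schedules = {
    "FSR": refinement_schedule(4, 4),  # [[4], [3], [2], [1]]
    "TSR": refinement_schedule(4, 2),  # [[4, 3], [2, 1]]
    "SSR": refinement_schedule(4, 1),  # [[4, 3, 2, 1]]
}
mp_blocks = {name: num_mp_blocks(len(s)) for name, s in schedules.items()}
```

Under this count, FSR uses nine MP blocks, TSR five, and SSR three.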
2.5 Transfer learning




First, we trained the multiscale GNN framework for the left-edge notched system under tensile loading, shown in Figure 3(a). We obtained a dataset of 1100 PF fracture simulations involving single-edge notched systems under tensile loading from [43]. Each of these simulations consisted of a unique crack configuration where the initial length , edge position , and orientation of the crack were varied. Additional details for the simulation set-up can be found in [43].
To perform TL, we first trained the multiscale GNN framework on the large dataset of 1100 single-edge notched simulations gathered from [43]. From the trained GNN framework, we transferred the weights of the encoder MLP (i.e., ), and the first MP model (i.e., GNND1) to a new GNN framework with similar architecture. Using this approach, we performed a series of sequential TL update steps. For each TL update step, we transferred the resulting pretrained weights to the next case study. We considered the following 4 case studies.
1. Case 1: Left-edge cracks subjected to Mode I loading as shown in Figure 3(a).
2. Case 2: Center cracks subjected to Mode I loading as shown in Figure 3(b).
3. Case 3: Left-edge cracks subjected to Mode II loading as shown in Figure 3(c).
4. Case 4: Right-edge cracks subjected to Mode I loading as shown in Figure 3(d).
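The weight transfer used at each TL update step, in which only the encoder MLP and the first MP model are carried over, can be sketched with plain dictionaries standing in for model parameters. The parameter names (`encoder.`, `gnn_d1.`, `decoder.`) and the `transfer_weights` helper are hypothetical; a real implementation would operate on the state of whichever deep learning library is used.

```python
import copy


def transfer_weights(pretrained, target, prefixes=("encoder.", "gnn_d1.")):
    """Return a copy of `target` in which every parameter whose name starts
    with one of `prefixes` is replaced by the pretrained value."""
    updated = copy.deepcopy(target)
    for name, value in pretrained.items():
        if name.startswith(prefixes) and name in updated:
            updated[name] = copy.deepcopy(value)
    return updated


# Example: only the encoder and first MP block are transferred;
# the decoder keeps its fresh initialization.
pretrained = {"encoder.w": [1.0, 2.0], "gnn_d1.w": [3.0], "decoder.w": [4.0]}
fresh = {"encoder.w": [0.0, 0.0], "gnn_d1.w": [0.0], "decoder.w": [0.0]}
initialized = transfer_weights(pretrained, fresh)
```

For sequential updates, the parameters obtained after training on one case become `pretrained` for the next case.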
Unlike Case 1, where left-edge cracks can only propagate towards the right, center cracks in Case 2 can propagate in both the left and right directions, which exposes the framework to bidirectional crack growth. For Cases 2-3, we leveraged the PF model from [4] to generate new datasets for center cracks and shear loading scenarios. For each of these cases, we generated a dataset of 30 simulations (i.e., 15 for training and 15 for testing). For Case 4, we leveraged the symmetry of the left-edge crack case study (i.e., Case 1) to mirror 30 randomly chosen simulations from the training dataset used in Case 1. We emphasize the significant decrease in the required size of the training dataset from 1100 (Case 1) to 15 simulations using TL (i.e., approximately 70 times smaller). The implementation of TL allows the multiscale GNN framework to be extended to other crack problems with a fast simulation time and decreased computational cost.
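The mirroring used to obtain right-edge crack data from left-edge simulations can be sketched as a reflection about the vertical centerline of the domain. How the mirroring was actually implemented is not described, so the following is an assumption: positions are reflected as x → W − x, x-components of vector quantities change sign, and scalar and y-components are unchanged. The function name and the feature ordering `[x, y, ux, uy, c, Ux, Uy]` are hypothetical.

```python
def mirror_left_to_right(node_features, domain_width):
    """Reflect node features about the line x = domain_width / 2.

    Each row is assumed to be [x, y, ux, uy, c, Ux, Uy].
    """
    mirrored = []
    for x, y, ux, uy, c, load_x, load_y in node_features:
        mirrored.append([domain_width - x, y, -ux, uy, c, -load_x, load_y])
    return mirrored


# Example: one node near the left edge of a unit-width domain under tension.
left_node = [[0.25, 0.5, 0.01, 0.02, 1.0, 0.0, 0.001]]
right_node = mirror_left_to_right(left_node, domain_width=1.0)
```

The mesh connectivity is unchanged by the reflection, so the edge lists can be reused as they are.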
3 Results and discussion
3.1 Prediction and error analysis for FSR, TSR and SSR



Here, we compare the performance of FSR, TSR, and SSR architectures of the multiscale GNN framework. We obtained the predictions of the crack variable , -displacement, and -displacement for the left-edge crack cases. Figures 4(a) - 4(c) depict the predicted values, -displacement, and -displacement for a randomly chosen simulation from the test dataset of left-edge crack cases. The simulation chosen shows a crack of positive orientation, positioned towards the top of the left edge of the domain. The crack can also be seen approaching the right edge of the domain for complete material failure. From Figure 4(a) all three architectures predicted with nearly identical results to the ground truth (i.e., left-most case). We obtained a similar result for x- and y-displacements shown in Figures 4(b) - 4(c), where qualitatively the FSR, TSR, and SSR architectures predicted displacements with high accuracy. This qualitative analysis demonstrates that the TSR and SSR were able to maintain their prediction accuracy despite the reduced downsampling and upsampling steps.
Next, we computed the errors corresponding to each simulation from the test dataset. For each test simulation, we first computed the average percent error across all mesh points in for each time step. We then computed the average of the resulting percent error across all time steps of the simulations. For instance, we computed the error for the crack field as
. | (11) |
where and denote the predicted and ground truth crack field value at each mesh point , respectively, is the total number of mesh points in , is the first predicted time-step, and the final time-step. Using equation (11), we gathered the and displacement errors corresponding to each test simulation for the FSR, TSR, and SSR architectures. Figures 6(a) - 6(c) show the resulting crack field and displacement errors for the FSR architecture. Figures 6(d) - 6(f) depict the errors for the TSR architecture. Similarly, Figures 6(g) - 6(i) show the resulting average errors for the SSR architecture.
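The two-stage averaging described above (first over mesh points within a time step, then over time steps) can be sketched as follows. The per-point normalization is an assumption: here the absolute error is scaled by the largest ground-truth magnitude in that time step to avoid dividing by zero at individual points, and the normalization in equation (11) may differ. The function name is hypothetical.

```python
def average_percent_error(pred, truth):
    """Average percent error over mesh points, then over time steps.

    pred, truth: lists over time steps; each entry is a list of nodal values.
    The number of mesh points may change between time steps because of AMR.
    """
    per_step = []
    for p_t, g_t in zip(pred, truth):
        scale = max(abs(g) for g in g_t) or 1.0  # assumed normalization
        mean_abs = sum(abs(p - g) for p, g in zip(p_t, g_t)) / len(g_t)
        per_step.append(100.0 * mean_abs / scale)
    return sum(per_step) / len(per_step)


# Example: two time steps, two mesh points each; only one value is off by 0.02.
truth = [[0.0, 1.0], [0.0, 1.0]]
pred = [[0.0, 0.98], [0.0, 1.0]]
error = average_percent_error(pred, truth)  # about 0.5 (percent)
```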
Comparing predictions (i.e., left-most), we note that the FSR, TSR, and SSR architectures resulted in average percent errors below 0.3. The errors in -displacement and -displacement also remained below 0.35 for all architectures. These results confirm our previous observations from the qualitative analysis which indicated that TSR and SSR were capable of maintaining high prediction accuracy despite their reduced downsampling and upsampling steps.
Next, we computed the average errors across all testing simulations for each architecture, as shown in Figure 7(a). From Figure 7(a), we note that the FSR architecture demonstrated the highest accuracy when predicting the crack field . However, the SSR architecture showed lower errors in crack field predictions than the TSR architecture. The TSR architecture resulted in the lowest error for -displacement prediction, while the FSR and SSR architectures showed similar errors at approximately 0.08. For -displacement predictions, the FSR and TSR architectures showed similar low errors at approximately 0.08, while the SSR architecture showed the highest error close to 0.12. Finally, Figure 7(a) shows that reducing the number of refinement operations for TSR and SSR did not significantly increase prediction error. The highest prediction error of 0.12 for SSR on -displacement predictions is still considerably low compared to previous work [43].









3.2 Simulation time analysis for FSR, TSR, and SSR
We compared the computational cost for FSR, TSR, and SSR architectures by calculating the time required to generate 30 simulations. In Section 3.1, we showed that the prediction errors did not significantly increase for the TSR and SSR architectures despite their lower number of downscaling and upscaling operations. Each operation increases computational costs because it requires storing the resulting mesh configurations and their node embeddings. Each operation also comes with added MP GNNs. Figure 7(b) depicts the total time in hours required for each framework to generate 30 randomly chosen simulations from the training dataset. As shown in Figure 7(b), the FSR GNN architecture required the longest simulation time of 6.13 hours. The TSR architecture was the second most computationally demanding, requiring 4.75 hours to simulate 30 cases. As expected, the fastest architecture with the lowest computational costs was the SSR, which required only 3.91 hours. In contrast, we note from [43] that the high-fidelity PF model required approximately 43.5 hours to generate 30 cases. Due to its lower simulation time and high prediction accuracy, we chose the SSR architecture for subsequent TL steps.
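The reported timings translate directly into speedup factors relative to the high-fidelity PF model. The short calculation below uses only the hours quoted above; the variable names are illustrative.

```python
PF_HOURS = 43.5  # high-fidelity PF model, 30 cases (from [43])
gnn_hours = {"FSR": 6.13, "TSR": 4.75, "SSR": 3.91}  # GNN frameworks, 30 cases

# Speedup of each architecture relative to the PF model.
speedup = {name: PF_HOURS / hours for name, hours in gnn_hours.items()}

for name, factor in speedup.items():
    print(f"{name}: {factor:.1f}x faster than the PF model")
```

This gives roughly 7.1x for FSR, 9.2x for TSR, and 11.1x for SSR, so moving from FSR to SSR cuts the simulation time by about 36%.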


3.3 Center crack cases






In Sections 3.1 - 3.2, we determined that the SSR architecture provided high prediction accuracy at significantly lower computational costs. We implemented TL to the multiscale GNN framework with SSR architecture for cases with center cracks as shown in Figure 3(b). We tested the extended SSR-based framework to predict the crack and displacement fields for a randomly chosen case from the testing data. For this analysis, we first obtained the initial time-step prediction () as shown in Figure 9(a) for the ground truth versus predicted crack field, Figure 9(c) for the ground truth versus predicted -displacement, and Figure 9(e) for the ground truth versus predicted -displacement. Next, we propagated the simulation in time and obtained the corresponding predictions of crack and displacement fields for a time-step approaching complete material failure ().
Figures 9(b), 9(d), and 9(f) depict the ground truth versus predicted crack field, -displacement, and -displacement, respectively, at . These results depict similar accuracies compared to the left-edge crack cases. During both time steps (i.e., and ) the SSR-based framework’s predictions are virtually indistinguishable from the high-fidelity PF model. These qualitative results demonstrate that the SSR-based framework was capable of predicting crack propagation for center crack cases through TL.
We also performed a quantitative analysis to verify these qualitative observations by generating the errors of each test simulation. Figure 10 depicts the computed average percent errors for center crack test cases obtained using equation (11). For crack field errors, the extended SSR framework maintained high prediction accuracy (less than 0.125 error) across all test cases. Similarly, the -displacement errors remained below 0.25 error, and y-displacement errors below 0.20. These results emphasize the high prediction accuracy of the SSR framework by implementing TL using significantly smaller training datasets. Additionally, extending the SSR framework from left-edge cracks to center cracks increases the framework’s capabilities in predicting cases where cracks propagate in both the left and right directions.



3.4 Shear load cases






In this step, we apply a TL update to the extended SSR-based framework obtained in Section 3.3. As mentioned in Section 2.2, the node features allow the framework to distinguish between tensile loading (e.g., and ) and shear loading (e.g., and ) cases. Following a similar approach as for center cracks, we evaluated the resulting framework for shear load cases to predict the crack field and displacement fields at an initial time-step, , and at a later time-step approaching material failure, .
Figures 12(a), 12(c), and 12(e) compare GNN predictions and PF predictions for the crack field, -displacement, and -displacement fields, respectively, for the initial time step. Figures 12(b), 12(d), and 12(f) compare GNN and PF predictions for the crack field, -displacement, and -displacement fields, respectively, for the final time step. These figures show that the new extended SSR framework was able to capture cases involving shear loads and generate accurate predictions at both time steps with virtually identical results compared to the ground truth. For quantitative analysis, we computed the average errors for each test simulation following the approach described in Section 3.1. Figure 13 shows the resulting average percent errors for all test cases under shear loads. As shown, the SSR also predicted shear cases with high accuracy. Average percent errors for the crack field, -displacement, and -displacement remained below 0.25, 1.20, and 0.25, respectively.



3.5 Right-edge crack cases
Lastly, we extended the framework for cases involving right-edge cracks. For this TL update step, we mirrored the dataset involving left-edge cracks subjected to tension as described in Section 2.5. Following the same approach as in previous sections, we first tested the resulting trained SSR framework to simulate a randomly chosen simulation from the test dataset. Figures 15(a) - 15(b) depict the prediction history of the crack field parameter from to for the chosen test simulation. The prediction history depicts qualitatively identical results compared to the PF predictions. This high prediction accuracy is verified in Figure 16(a), where the average error across all simulations remains below 0.10. For the x-displacement and y-displacement predictions, we show the resulting SSR simulation histories in Figures 15(c) - 15(f). The SSR framework also shows high prediction accuracy compared to the PF simulations. The obtained average percent errors for x-displacement are shown in Figure 16(b). The errors across all testing samples remained below 0.25. Similarly, the average percent errors for y-displacement, shown in Figure 16(c), remained below 0.30. Ultimately, these results show that by implementing a series of sequential TL update steps, the SSR-based framework was successful in simulating multiple problem configurations with high accuracy.









4 Conclusion
Complex multiphysics phenomena are often modeled using computationally expensive approaches involving solving coupled multiphysics equations on a mesh. Recent ML models such as mesh-based GNNs have emerged as promising tools to simulate multiphysics problems at a reduced cost. However, conventional mesh-based GNNs suffer from over-smoothing when working with fine meshes due to a high number of required MP steps. This work introduces a mesh-based multiscale GNN framework with AMR for simulating multiphysics problems with a reduced number of MP steps, high prediction accuracy, and accelerated performance times. The developed formulation implements sequential coarsening and upscaling operations by removing/adding the highest mesh refinement resolution level at each step. This approach results in new graphs with fewer mesh resolution levels providing new distant connections and larger local neighborhoods. The framework employs a state-of-the-art Graph Transformer for the MP networks at each coarsening/upscaling operation, and skip-connectors from the coarsening operations to the upscaling operations to avoid information loss.
Due to the high complexity and computational requirements of multiphysics PF models, this work tested the multiscale GNN on PF crack problems with a near-singular operator and coupled equations. First, we studied single-edge notched systems (i.e., left-edge cracks) subjected to tension. We obtained a large dataset for these cases using an open-source PF fracture code. We developed and compared three architectures (FSR, TSR, and SSR) for simulating single-edge notched systems using different numbers of MP GNN blocks and coarsening/upscaling operations. The SSR architecture, with the smallest number of operations, demonstrated the fastest prediction times while maintaining high prediction accuracy.
We then used TL to extend the SSR-based framework to simulate different PF crack propagation problems. We implemented TL to the SSR framework for simulating problems involving (i) center cracks, (ii) left-edge cracks subjected to shear, and (iii) right-edge cracks using only 15 training samples. For all cases (i-iii), the proposed accelerated framework predicted the crack field and displacement field evolution with high accuracy above 98%. These results demonstrated that by using TL, the multiscale SSR-based GNN framework can be extended to different problem configurations with training datasets nearly two orders of magnitude smaller (15 versus 1100 simulations).
Ultimately, this work introduced a new mesh-based multiscale formulation that benefits from the computational efficiencies of the AMR method, mimicking the conventional iterative multigrid scheme, and the TL approach. The resulting AMR mesh-based multiscale GNN provides a framework for simulating additional complex AMR mesh-based engineering and multiphysics problems with high accuracy and accelerated performance.
5 Supplementary information
Additional information, including (i) the maximum error analysis for the entire test datasets of the left-edge crack, center crack, shear load, and right-edge crack cases, and (ii) generated sample simulations for each case, is available at https://github.com/rperera12/Adaptive-mesh-based-Multiscale-Graph-Neural-Network.
6 Acknowledgements
The authors are grateful for the financial support provided by the U.S. Department of Defense in conjunction with the Naval Air Warfare Center/Weapons Division through the SMART scholarship Program (SMART ID: 2021-17978).
References
- [1] M. Ambati, T. Gerasimov, and L. De Lorenzis, “A review on phase-field models of brittle fracture and a new fast hybrid formulation,” Computational Mechanics, vol. 55, pp. 383–405, 2015.
- [2] M. Ambati, R. Kruse, and L. De Lorenzis, “A phase-field model for ductile fracture at finite strains and its experimental verification,” Computational Mechanics, vol. 57, pp. 149–167, 2016.
- [3] F. Ernesti, M. Schneider, and T. Böhlke, “Fast implicit solvers for phase-field fracture problems on heterogeneous microstructures,” Computer Methods in Applied Mechanics and Engineering, vol. 363, p. 112793, 2020.
- [4] S. Goswami, C. Anitescu, and T. Rabczuk, “Adaptive fourth-order phase field analysis for brittle fracture,” Computer Methods in Applied Mechanics and Engineering, vol. 361, p. 112808, 2020.
- [5] G. Zhang, T. F. Guo, K. I. Elkhodary, S. Tang, and X. Guo, “Mixed graph-fem phase field modeling of fracture in plates and shells with nonlinearly elastic solids,” Computer Methods in Applied Mechanics and Engineering, vol. 389, p. 114282, 2022.
- [6] G. A. Francfort and J.-J. Marigo, “Revisiting brittle fracture as an energy minimization problem,” Journal of the Mechanics and Physics of Solids, vol. 46, no. 8, pp. 1319–1342, 1998.
- [7] A. Egger, U. Pillai, K. Agathos, E. Kakouris, E. Chatzi, I. A. Aschroft, and S. P. Triantafyllou, “Discrete and phase field methods for linear elastic fracture mechanics: a comparative study and state-of-the-art review,” Applied Sciences, vol. 9, no. 12, p. 2436, 2019.
- [8] J. G. Ribot, V. Agrawal, and B. Runnels, “A new approach for phase field modeling of grain boundaries with strongly nonconvex energy,” Modelling and Simulation in Materials Science and Engineering, vol. 27, no. 8, p. 084007, 2019.
- [9] B. Runnels and V. Agrawal, “Phase field disconnections: A continuum method for disconnection-mediated grain boundary motion,” Scripta Materialia, vol. 186, pp. 6–10, 2020.
- [10] W. Xu, H. Yu, J. Zhang, C. Lyu, Q. Wang, M. Micheal, and H. Wu, “Phase-field method of crack branching during sc-co2 fracturing: A new energy release rate criterion coupling pore pressure gradient,” Computer Methods in Applied Mechanics and Engineering, vol. 399, p. 115366, 2022.
- [11] S. A. Vajari, M. Neuner, P. K. Arunachala, A. Ziccarelli, G. Deierlein, and C. Linder, “A thermodynamically consistent finite strain phase field approach to ductile fracture considering multi-axial stress states,” Computer Methods in Applied Mechanics and Engineering, vol. 400, p. 115467, 2022.
- [12] W. Li, M. Ambati, N. Nguyen-Thanh, H. Du, and K. Zhou, “Adaptive fourth-order phase-field modeling of ductile fracture using an isogeometric-meshfree approach,” Computer Methods in Applied Mechanics and Engineering, vol. 406, p. 115861, 2023.
- [13] J. Han, S. Matsubara, S. Moriguchi, and K. Terada, “Variational crack phase-field model for ductile fracture with elastic and plastic damage variables,” Computer Methods in Applied Mechanics and Engineering, vol. 400, p. 115577, 2022.
- [14] V. Agrawal and B. Runnels, “Robust, strong form mechanics on an adaptive structured grid: efficiently solving variable-geometry near-singular problems with diffuse interfaces,” Computational Mechanics, pp. 1–19, 2023.
- [15] V. Agrawal and B. Runnels, “Block structured adaptive mesh refinement and strong form elasticity approach to phase field fracture with applications to delamination, crack branching and crack deflection,” Computer Methods in Applied Mechanics and Engineering, vol. 385, p. 114011, 2021.
- [16] Y. Chen and Y. Shen, “A “parallel universe” scheme for crack nucleation in the phase field approach to fracture,” Computer Methods in Applied Mechanics and Engineering, vol. 403, p. 115708, 2023.
- [17] S. Brach, E. Tanné, B. Bourdin, and K. Bhattacharya, “Phase-field study of crack nucleation and propagation in elastic–perfectly plastic bodies,” Computer Methods in Applied Mechanics and Engineering, vol. 353, pp. 44–65, 2019.
- [18] O. Gültekin, H. Dal, and G. A. Holzapfel, “Numerical aspects of anisotropic failure in soft biological tissues favor energy-based criteria: A rate-dependent anisotropic crack phase-field model,” Computer methods in applied mechanics and engineering, vol. 331, pp. 23–52, 2018.
- [19] M. Marulli, A. Valverde-González, A. Quintanas-Corominas, M. Paggi, and J. Reinoso, “A combined phase-field and cohesive zone model approach for crack propagation in layered structures made of nonlinear rubber-like materials,” Computer Methods in Applied Mechanics and Engineering, vol. 395, p. 115007, 2022.
- [20] B. Yin and M. Kaliske, “An anisotropic phase-field model based on the equivalent crack surface energy density at finite strain,” Computer Methods in Applied Mechanics and Engineering, vol. 369, p. 113202, 2020.
- [21] A. Hunter, B. A. Moore, M. Mudunuru, V. Chau, R. Tchoua, C. Nyshadham, S. Karra, D. O’Malley, E. Rougier, H. Viswanathan, et al., “Reduced-order modeling through machine learning and graph-theoretic approaches for brittle fracture applications,” Computational Materials Science, vol. 157, pp. 87–98, 2019.
- [22] A. J. Lew, C.-H. Yu, Y.-C. Hsu, and M. J. Buehler, “Deep learning model to predict fracture mechanisms of graphene,” npj 2D Materials and Applications, vol. 5, no. 1, p. 48, 2021.
- [23] B. Euser, E. Rougier, Z. Lei, E. E. Knight, L. P. Frash, J. W. Carey, H. Viswanathan, and A. Munjiza, “Simulation of fracture coalescence in granite via the combined finite–discrete element method,” Rock Mechanics and Rock Engineering, vol. 52, pp. 3213–3227, 2019.
- [24] D. Montes de Oca Zapiain, J. A. Stewart, and R. Dingreville, “Accelerating phase-field-based microstructure evolution predictions via surrogate models trained by machine learning methods,” npj Computational Materials, vol. 7, no. 1, p. 3, 2021.
- [25] Z. Yang, C.-H. Yu, and M. J. Buehler, “Deep learning model to predict complex stress and strain fields in hierarchical composites,” Science Advances, vol. 7, no. 15, p. eabd7416, 2021.
- [26] D. Sharma, V. Pandey, I. V. Singh, S. Natarajan, J. Kumar, and S. Ahmad, “A polygonal fem and continuum damage mechanics based framework for stochastic simulation of fatigue life scatter in duplex microstructure titanium alloys,” Mechanics of Materials, vol. 163, p. 104071, 2021.
- [27] L. Zhang and X. Wei, “Prediction of fatigue crack growth under variable amplitude loading by artificial neural network-based lagrange interpolation,” Mechanics of Materials, vol. 171, p. 104309, 2022.
- [28] Y. Wang, D. Oyen, W. Guo, A. Mehta, C. Scott, N. Panda, M. Fernández-Godino, G. Srinivasan, and X. Yue, “Stressnet—deep learning to predict stress with fracture propagation in brittle materials. npj mater,” Degrad, vol. 5, no. 1, pp. 1–10, 2021.
- [29] S.-i. Amari, “Backpropagation and stochastic gradient descent method,” Neurocomputing, vol. 5, no. 4-5, pp. 185–196, 1993.
- [30] L. Bottou, “Large-scale machine learning with stochastic gradient descent,” in Proceedings of COMPSTAT’2010: 19th International Conference on Computational StatisticsParis France, August 22-27, 2010 Keynote, Invited and Contributed Papers, pp. 177–186, Springer, 2010.
- [31] N. Ketkar and N. Ketkar, “Stochastic gradient descent,” Deep learning with Python: A hands-on introduction, pp. 113–132, 2017.
- [32] N. Black and A. R. Najafi, “Learning finite element convergence with the multi-fidelity graph neural network,” Computer Methods in Applied Mechanics and Engineering, vol. 397, p. 115120, 2022.
- [33] C. W. Park, M. Kornbluth, J. Vandermause, C. Wolverton, B. Kozinsky, and J. P. Mailoa, “Accurate and scalable graph neural network force field and molecular dynamics with direct force architecture,” npj Computational Materials, vol. 7, no. 1, p. 73, 2021.
- [34] A. Mayr, S. Lehner, A. Mayrhofer, C. Kloss, S. Hochreiter, and J. Brandstetter, “Boundary graph neural networks for 3d simulations,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 37, pp. 9099–9107, 2023.
- [35] N. N. Vlassis, R. Ma, and W. Sun, “Geometric deep learning for computational mechanics part i: Anisotropic hyperelasticity,” Computer Methods in Applied Mechanics and Engineering, vol. 371, p. 113299, 2020.
- [36] N. N. Vlassis and W. Sun, “Geometric deep learning for computational mechanics part ii: Graph embedding for interpretable multiscale plasticity,” arXiv preprint arXiv:2208.00246, 2022.
- [37] R. Perera, D. Guzzetti, and V. Agrawal, “Graph neural networks for simulating crack coalescence and propagation in brittle materials,” Computer Methods in Applied Mechanics and Engineering, vol. 395, p. 115021, 2022.
- [38] Z. Li and A. B. Farimani, “Graph neural network-accelerated lagrangian fluid simulation,” Computers & Graphics, vol. 103, pp. 201–211, 2022.
- [39] R. Bhattoo, S. Ranu, and N. Krishnan, “Learning articulated rigid body dynamics with lagrangian graph neural network,” Advances in Neural Information Processing Systems, vol. 35, pp. 29789–29800, 2022.
- [40] Z. Jin, B. Zheng, C. Kim, and G. X. Gu, “Leveraging graph neural networks and neural operator techniques for high-fidelity mesh-based physics simulations,” APL Machine Learning, vol. 1, p. 046109, 11 2023.
- [41] C. Jiang and N.-Z. Chen, “Graph neural networks (gnns) based accelerated numerical simulation,” Engineering Applications of Artificial Intelligence, vol. 123, p. 106370, 2023.
- [42] J. C. Wong, C. C. Ooi, J. Chattoraj, L. Lestandi, G. Dong, U. Kizhakkinan, D. W. Rosen, M. H. Jhon, and M. H. Dao, “Graph neural network based surrogate model of physics simulations for geometry design,” in 2022 IEEE Symposium Series on Computational Intelligence (SSCI), pp. 1469–1475, 2022.
- [43] R. Perera and V. Agrawal, “Dynamic and adaptive mesh-based graph neural network framework for simulating displacement and crack fields in phase field models,” Mechanics of Materials, vol. 186, p. 104789, 2023.
- [44] X. Shao, Z. Liu, S. Zhang, Z. Zhao, and C. Hu, “Pignn-cfd: A physics-informed graph neural network for rapid predicting urban wind field defined on unstructured mesh,” Building and Environment, vol. 232, p. 110056, 2023.
- [45] T. Pfaff, M. Fortunato, A. Sanchez-Gonzalez, and P. W. Battaglia, “Learning mesh-based simulation with graph networks,” arXiv preprint arXiv:2010.03409, 2020.
- [46] J. Gasteiger, J. Groß, and S. Günnemann, “Directional message passing for molecular graphs,” arXiv preprint arXiv:2003.03123, 2020.
- [47] F. Scarselli, M. Gori, A. C. Tsoi, M. Hagenbuchner, and G. Monfardini, “The graph neural network model,” IEEE transactions on neural networks, vol. 20, no. 1, pp. 61–80, 2008.
- [48] J. Gilmer, S. S. Schoenholz, P. F. Riley, O. Vinyals, and G. E. Dahl, “Neural message passing for quantum chemistry,” in International conference on machine learning, pp. 1263–1272, PMLR, 2017.
- [49] Q. Li, Z. Han, and X.-m. Wu, “Deeper insights into graph convolutional networks for semi-supervised learning,” Proceedings of the AAAI Conference on Artificial Intelligence, vol. 32, Apr. 2018.
- [50] T. K. Rusch, M. M. Bronstein, and S. Mishra, “A survey on oversmoothing in graph neural networks,” arXiv preprint arXiv:2303.10993, 2023.
- [51] C. Cai and Y. Wang, “A note on over-smoothing for graph neural networks,” arXiv preprint arXiv:2006.13318, 2020.
- [52] M. Fortunato, T. Pfaff, P. Wirnsberger, A. Pritzel, and P. Battaglia, “Multiscale meshgraphnets,” arXiv preprint arXiv:2210.00612, 2022.
- [53] K. Stüben et al., “An introduction to algebraic multigrid,” Multigrid, pp. 413–532, 2001.
- [54] J. Xu and L. Zikatanov, “Algebraic multigrid methods,” Acta Numerica, vol. 26, pp. 591–721, 2017.
- [55] M. Eliasof and E. Treister, “Diffgcn: Graph convolutional networks via differential operators and algebraic multigrid pooling,” Advances in neural information processing systems, vol. 33, pp. 18016–18027, 2020.
- [56] Z. Yang, Y. Dong, X. Deng, and L. Zhang, “Amgnet: Multi-scale graph neural networks for flow field prediction,” Connection Science, vol. 34, no. 1, pp. 2500–2519, 2022.
- [57] R. J. Gladstone, H. Rahmani, V. Suryakumar, H. Meidani, M. D’Elia, and A. Zareei, “Gnn-based physics solver for time-independent pdes,” arXiv preprint arXiv:2303.15681, 2023.
- [58] W. Liu, M. Yagoubi, and M. Schoenauer, “Multi-resolution graph neural networks for pde approximation,” in Artificial Neural Networks and Machine Learning–ICANN 2021: 30th International Conference on Artificial Neural Networks, Bratislava, Slovakia, September 14–17, 2021, Proceedings, Part III 30, pp. 151–163, Springer, 2021.
- [59] I. Luz, M. Galun, H. Maron, R. Basri, and I. Yavneh, “Learning algebraic multigrid using graph neural networks,” in International Conference on Machine Learning, pp. 6489–6499, PMLR, 2020.
- [60] M. Lino, C. Cantwell, A. A. Bharath, and S. Fotiadis, “Simulating continuum mechanics with multi-scale graph neural networks,” arXiv preprint arXiv:2106.04900, 2021.
- [61] M. Lino, S. Fotiadis, A. A. Bharath, and C. Cantwell, “Towards fast simulation of environmental fluid mechanics with multi-scale graph neural networks,” arXiv preprint arXiv:2205.02637, 2022.
- [62] Y. Cao, M. Chai, M. Li, and C. Jiang, “Efficient learning of mesh-based physical simulation with bi-stride multi-scale graph neural network,” in International Conference on Machine Learning, pp. 3541–3558, PMLR, 2023.
- [63] H. Gao and S. Ji, “Graph u-nets,” in international conference on machine learning, pp. 2083–2092, PMLR, 2019.
- [64] S. Barwey, V. Shankar, V. Viswanathan, and R. Maulik, “Multiscale graph neural network autoencoders for interpretable scientific machine learning,” Journal of Computational Physics, vol. 495, p. 112537, 2023.
- [65] J. Yosinski, J. Clune, Y. Bengio, and H. Lipson, “How transferable are features in deep neural networks?,” Advances in neural information processing systems, vol. 27, 2014.
- [66] R. Perera and V. Agrawal, “A generalized machine learning framework for brittle crack problems using transfer learning and graph neural networks,” Mechanics of Materials, vol. 181, p. 104639, 2023.
- [67] G. Francfort, “Variational fracture: twenty years after,” International Journal of Fracture, vol. 237, no. 1-2, pp. 3–13, 2022.
- [68] Y. Shi, Z. Huang, S. Feng, H. Zhong, W. Wang, and Y. Sun, “Masked label prediction: Unified message passing model for semi-supervised classification,” arXiv preprint arXiv:2009.03509, 2020.
- [69] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention is all you need,” Advances in neural information processing systems, vol. 30, 2017.