Efficient neural topology optimization via active learning for enhancing turbulent mass transfer in fluid channels
Abstract
The design of fluid channel structures of reactors or separators of chemical processes is key to enhancing the mass transfer processes inside the devices. However, the systematic design of channel topological structures is difficult for complex turbulent flows. Here, we address this challenge by developing a machine learning framework to efficiently perform topology optimization of channel structures for turbulent mass transfer. We represent a topological structure using a neural network (referred to as “neural topology”), which is optimized by employing pre-trained neural operators combined with a fine-tuning strategy with active data augmentation. The optimization is performed with two objectives: maximization of mass transfer efficiency and minimization of energy consumption, for the possible considerations of compromise between the two in real-world designs. The developed neural operator with active learning is data efficient in network training and demonstrates superior computational efficiency compared with traditional methods in obtaining optimal structures across a large design space. The optimization results are validated through experiments, proving that the optimized channel improves concentration uniformity by 37% compared with the original channel. We also demonstrate the variation of the optimal structures with changes in inlet velocity conditions, providing a reference for designing turbulent mass-transfer devices under different operating conditions.
keywords:
computational fluid dynamics; mass transfer enhancement; topology optimization; neural operator; active learning; neural topology
1 Introduction
Turbulent mass transfer is a fundamental phenomenon that directly influences the efficiency of both separation [1] and reaction [2, 3], the two major operations in chemical and many other process industries [4]. Enhancing mass transfer is an effective route to process intensification, aiming to inherently and significantly increase process efficiency [5]. Experimental studies have shown that mass transfer is strongly affected by the internal geometric structure of the flow channel, since, in addition to molecular diffusion, the convection and turbulent fluctuation that drive the movement of species within the fluid mixture are determined mainly by this structure [6, 7, 8, 9].
Efforts have been made to alter the structures of fluid channels in process equipment so as to intensify convection- and fluctuation-driven mass transfer in turbulent processes. These include adding baffles [10] and porous media [11], and installing elements that improve flow patterns or induce disturbances [12, 13, 14]. However, in most previous research, the improved structures were designed from the designers' imagination or inspiration combined with experimental and/or numerical validation, and therefore the structure giving the most effective turbulent mass transfer cannot be guaranteed.
Topology optimization (TO) is a systematic way of finding the best structure for given objectives, such as maximizing mass transfer flux or minimizing flow resistance [15, 16]. Since mass transfer in chemical equipment typically occurs in turbulent flows and the optimal topology is sensitive to velocity boundary conditions, optimizing the fluid channel structure by TO to improve turbulent mass transfer efficiency is challenging [17, 18]. In recent years, researchers have applied TO to mass transfer processes such as reactors [19, 20] and micro-mixing [21, 22]. These studies typically formulated the optimization problem as an augmented Lagrangian combining a mechanism model, objective functions, and process constraints, and then solved it using algorithms such as the method of moving asymptotes (MMA) or genetic algorithms. Because the turbulent mass transfer process is complex, these methods, which require repeatedly solving the mechanism model and the Lagrange multiplier equations, demand significant computational resources [23, 24]. Moreover, they are limited to optimization under a single boundary condition and cannot account for varying boundary conditions, such as inlet velocities, which affect the optimal topological structure.
The recent development of scientific machine learning methods [25], such as physics-informed neural networks [26, 27, 28, 29, 30] and neural operators [31, 32, 33, 34, 35, 36], has provided effective approaches for predicting physical fields under different operating conditions. Researchers have applied these models to fluid optimization, including streamline design of flow reactors [37], airfoils [38, 39, 40], artificial catheters [41], solid location optimization in nanoscale heat transfer [42], and solid shape optimization within laminar flow channels [43, 44]. However, these studies primarily optimize specific geometric parameters within fluid channels, and large training datasets are typically required for flow channel structure optimization. To find optimal topological structures within high-dimensional function spaces (e.g., optimizing fluid channel structures to enhance turbulent mass transfer), the volume, position, and shape of solid structures within the channel must all be treated as variables, and the impact of varying velocity boundaries on the optimal channel design should also be considered. Such a method is not yet available in the open literature.
In this study, we represent a topological structure using a neural network (“neural topology”), which is integrated with pre-trained neural operators to realize topology optimization under different inlet velocities for turbulent mass transfer. The recently proposed Fourier-enhanced DeepONet (Fourier-DeepONet) [45, 46, 47], known for its robust generalization ability and training efficiency, is used to construct neural operators, which can efficiently predict concentration and pressure distributions under different topological structures and inlet velocities. Process indicators are defined based on the predicted concentration and pressure distributions, serving as objectives for the gradient-based optimization of the neural topology under varying inlet velocities. Through two rounds of active data augmentation based on the optimization results, both the neural operators and the optimization results are improved, enabling data-efficient optimization within a larger and more complex design space. Compared with the traditional mechanism model-based optimization approach, the proposed method demonstrates superior computational efficiency under varying boundary conditions. We also design turbulent mass transfer experiments on the optimized channel to validate the effectiveness of the proposed TO method. The results under different inlet velocities are quantitatively compared and analyzed to guide the channel design of actual turbulent mass transfer processes.
2 Results
This study considers the turbulent mass transfer process for a water (solvent)–methylene blue (solute) system in a rectangular channel [24] (Fig. 1a). The topology optimization is performed in a rectangular design domain in the channel. The effect of the inlet velocity $u_{in}$ in the range [0.1, 0.9] m/s on the process efficiency and the optimal topological structure is also investigated. We suppose that the design domain is filled with a porous solid and that the channel's structure is altered by changing its porosity distribution $\gamma$. To ensure the effectiveness of the generated topological structure, the following solid volume constraint is imposed:

$$\int_{\Omega} (1 - \gamma)\, d\Omega \ge \theta_{\min} V, \qquad (1)$$

where $\theta_{\min}$ is the minimum solid volume fraction in the design domain, and $V$ is the total volume of the design domain.
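As a concrete illustration of the constraint in Eq. (1), the following sketch checks whether a discretized porosity field contains enough solid; the minimum solid fraction and the uniform grid are assumptions for this example, not values from the paper.

```python
import numpy as np

def satisfies_volume_constraint(gamma, theta_min=0.1, cell_area=1.0):
    """Check the solid volume constraint of Eq. (1).

    gamma: 2D array of porosity values (1 = fluid, 0 = solid).
    theta_min: minimum solid volume fraction (hypothetical value).
    cell_area: area of one grid cell (uniform grid assumed).
    """
    V = gamma.size * cell_area                 # total volume of the design domain
    solid_volume = np.sum(1.0 - gamma) * cell_area
    return solid_volume >= theta_min * V

# A structure with 20% solid satisfies a 10% minimum solid fraction.
gamma = np.ones((10, 10))
gamma[:2, :] = 0.0                             # 20 solid cells out of 100
print(satisfies_volume_constraint(gamma))      # True
```

In the optimization framework itself this inequality is enforced as a penalty term in the loss rather than a hard check.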

Most topology optimization research focuses on enhancing mass transfer as the primary objective. However, channels with better mass transfer may require higher external energy consumption, which could lead to a decline in the overall performance of the process. Therefore, our optimization process employs multi-objective optimization, with the process efficiency indicated by two criteria:
$$\sigma_c = \frac{1}{V}\int_{\Omega} \left(c - \bar{c}\right)^2\, d\Omega, \qquad (2)$$

$$\Delta P = \bar{P}_{in} - \bar{P}_{out}, \qquad (3)$$

where $\sigma_c$ is the variance of the concentration $c$ and reflects the uniformity of the concentration of the methylene blue, indicating the effectiveness of mass transfer; $\Delta P$ denotes the drop of the pressure $P$ from the inlet to the outlet of the system, representing the external energy consumption of the process; and $\bar{c}$ represents the average concentration of methylene blue in the fluid channel.
We aim to identify the optimal topological structure of the channel that balances mass transfer efficiency and external energy consumption for the process with different velocity boundaries. The total objective function of the optimization is defined as the weighted sum of $\sigma_c$ and $\Delta P$:

$$J = \sigma_c + \beta\, \Delta P, \qquad (4)$$

where $\beta$ represents the weight. The best channel structure can thus be either mass-transfer-effectiveness dominant or energy-efficiency dominant, depending on the weight value given by the decision-making designers.
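The two criteria and their weighted sum can be sketched as follows; the discrete variance in place of the integral of Eq. (2) and the argument names are illustrative assumptions.

```python
import numpy as np

def objectives(c, p_in, p_out, beta=1.0):
    """Compute sigma_c (Eq. 2), delta_P (Eq. 3), and J (Eq. 4).

    c: concentration field over the fluid domain (2D array).
    p_in, p_out: area-averaged pressures at the inlet and outlet.
    beta: weight trading off energy consumption against mixing.
    """
    c_bar = c.mean()                        # average concentration
    sigma_c = np.mean((c - c_bar) ** 2)     # discrete concentration variance
    delta_p = p_in - p_out                  # pressure drop
    return sigma_c, delta_p, sigma_c + beta * delta_p
```

A perfectly mixed field (uniform $c$) gives $\sigma_c = 0$, so the objective then reduces to the weighted pressure drop alone.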
2.1 Effects of topology and inlet velocity on turbulent mass transfer
We first investigate the effects of topology and inlet velocity on turbulent mass transfer. The concentration and pressure distributions for the smooth channel and for channels with randomly generated topological structures (Figs. 1b and c) are numerically simulated based on the mechanism model presented in Section 4.1. Comparisons between the three fluid channels (Fig. 1c) show that modifying the flow channel by adding solid blocks generally yields a more uniform concentration distribution (smaller $\sigma_c$), at the cost of an increased system pressure drop (larger $\Delta P$), compared to the smooth channel. Structure ii achieves smaller values of both $\sigma_c$ and $\Delta P$ than Structure i and is therefore more favorable for the mass transfer process. This demonstrates that different channel structures behave differently with respect to the two criteria, and optimization is necessary to find the topological structure of the channel that minimizes the overall objective function.
The inlet velocity of the system is also a crucial factor influencing mass transfer efficiency. The system pressure drop increases with the inlet velocity (Fig. 1d), whereas $\sigma_c$ first decreases and then increases. The concentration distribution is most uniform at an intermediate inlet velocity. This suggests that increasing the inlet velocity improves both the convective and the fluctuating mass transfer in the turbulent flow system. At the same time, the flow rate of solute entering the system also increases with the inlet velocity. Therefore, the high-concentration region becomes larger at the highest inlet velocity than at the intermediate one, leading to an increase in the concentration variance.
2.2 Optimizing neural topology under different inlet velocities
Since the governing equations of the mechanism model for the turbulent mass transfer process are complex, traditional methods cannot efficiently optimize the channel structure, especially when various inlet velocities must be investigated. In the present study, we develop a computational framework to explore the optimal topology under different inlet velocities by integrating pre-trained neural operators with a neural topology (Fig. 2). To address the challenge of the large search space of possible topological structures, we propose an active data-augmentation method to enhance the effectiveness of the neural operator-based optimization algorithm. The method consists of the following three steps.

Step 1.
Two independent neural networks, $\mathcal{G}_c$ and $\mathcal{G}_P$, are trained for rapid prediction of the concentration distribution $c$ and the pressure distribution $P$ under different topological structures $\gamma$ and inlet velocities $u_{in}$, i.e., $c = \mathcal{G}_c(\gamma, u_{in})$ and $P = \mathcal{G}_P(\gamma, u_{in})$.
In this study, the neural operator method of Fourier-DeepONet [45], which has demonstrated high training efficiency and strong generalization capability, is adopted to construct $\mathcal{G}_c$ and $\mathcal{G}_P$. In Fourier-DeepONet, the encoding process by DeepONet consists of two fully-connected neural networks: the branch net, which encodes the input function $\gamma$, and the trunk net, which encodes the input variable $u_{in}$. The output of DeepONet then goes through a decoding process of one Fourier layer, two U-Fourier layers, and a projection layer to produce the output function of the Fourier-DeepONet framework. Details of the training data generation and the neural networks are given in Sections 4.2 and 4.3.
Step 2.
To simultaneously perform TO under different velocity boundary conditions, we construct a neural topology capable of outputting the topological structure $\gamma$ for any given inlet velocity $u_{in}$. Within the TO framework, the pre-trained neural operators are integrated to efficiently compute the corresponding objective values for any given topological structure and inlet velocity. The objective function (Eq. (4)) together with the inequality constraint on solid volume (Eq. (1)) forms the total loss of the TO framework. A gradient-based algorithm is then used to optimize the parameters of the neural topology, so that the trained neural topology can output optimized topological structures with minimum $J$ for any given inlet velocity. The algorithm is detailed in Section 4.4.
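The essence of Step 2, gradient-based descent on topology parameters with the surrogates held fixed, can be sketched with toy quadratic stand-ins for the operators; everything here (the surrogate forms, the finite-difference gradients, the parameter count) is a hypothetical simplification, since the paper differentiates through the actual neural operators.

```python
import numpy as np

# Hypothetical frozen surrogates standing in for the pre-trained neural
# operators: they map topology parameters and inlet velocity directly to
# the two objectives (the real operators predict full fields instead).
def surrogate_sigma_c(theta, u_in):
    return np.sum((theta - 0.3 * u_in) ** 2)

def surrogate_delta_p(theta, u_in):
    return np.sum((theta - 0.7) ** 2) * u_in

def total_loss(theta, u_in, beta):
    return surrogate_sigma_c(theta, u_in) + beta * surrogate_delta_p(theta, u_in)

def optimize(u_in, beta=1.0, lr=0.1, steps=200, eps=1e-6):
    """Gradient descent on the topology parameters with the surrogates
    frozen; gradients are taken by central finite differences for brevity
    (the paper uses automatic differentiation through the operators)."""
    theta = np.zeros(4)
    for _ in range(steps):
        grad = np.zeros_like(theta)
        for i in range(theta.size):
            d = np.zeros_like(theta)
            d[i] = eps
            grad[i] = (total_loss(theta + d, u_in, beta)
                       - total_loss(theta - d, u_in, beta)) / (2 * eps)
        theta -= lr * grad
    return theta
```

For $u_{in} = 1$ and $\beta = 1$ the two quadratic objectives pull the parameters toward 0.3 and 0.7 respectively, so the optimum settles at their balance point 0.5, mirroring how the weight trades the two criteria off.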
Step 3.
Due to the large search space of topological structures, the neural operators trained on the randomly generated initial training dataset may not achieve sufficient prediction accuracy for all possible topological structures. This may cause discrepancies between the optimal topological structures obtained from the neural operator-based TO algorithm and those derived from the mechanism model-based TO algorithm. To address this issue, we develop an active learning method that performs two rounds of data augmentation guided by the current TO results. Details of the active data-augmentation method are discussed in Section 4.5. Using this method, both the neural operators and the TO framework are fine-tuned to obtain the final TO results, which are then validated using the mechanism model.
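The outer active learning loop of Step 3 can be summarized as follows; all the function arguments are hypothetical stand-ins for the paper's components, and the two-round schedule matches the text.

```python
def active_topology_optimization(train_fn, optimize_fn, simulate_fn,
                                 initial_dataset, n_rounds=2):
    """Sketch of the two-round active data augmentation (Step 3).

    train_fn:    trains/fine-tunes the neural operators on a dataset.
    optimize_fn: runs the neural-topology optimization (Step 2) and
                 returns candidate optimal structures.
    simulate_fn: labels a candidate with the mechanism (CFD/CMT) model.
    """
    dataset = list(initial_dataset)
    operators = train_fn(dataset)
    for _ in range(n_rounds):
        candidates = optimize_fn(operators)          # current TO results
        new_data = [simulate_fn(s) for s in candidates]
        dataset.extend(new_data)                     # augment near the optimum
        operators = train_fn(dataset)                # fine-tune the operators
    return optimize_fn(operators)
```

The key design choice is that new labels are requested only at the structures the optimizer currently prefers, which concentrates simulation effort where the surrogates must be most accurate.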
2.3 Optimal topology for different objectives
After introducing the methods, we present the TO results under different inlet velocities. We first compare the objective values between the cases in the training dataset and the TO results in Fig. 3a. Specifically, we show the optimization results under three weights: a small $\beta$, $\beta = 1$, and $\beta = 10$. These weights can be interpreted as optimization dominated by enhancing mass transfer performance, balanced optimization of both objective functions, and optimization driven by minimizing external energy consumption, respectively. The dashed lines in Fig. 3a represent a possible Pareto front, obtained by linearly interpolating the optimization results under the three weights. For the smallest weight, the optimization algorithm achieves a channel structure with the smallest $\sigma_c$ among the examples in the training set, especially at lower inlet velocities. In addition, the system pressure drop increases significantly as the inlet velocity increases, in agreement with the simulation results in Fig. 1d. As a result, reducing $\sigma_c$ no longer dominates absolutely at high inlet velocities, and the optimal topological structure becomes a trade-off between the two objectives.

As $\beta$ increases, the optimization process gradually shifts from being driven predominantly by reducing $\sigma_c$ to being driven predominantly by reducing $\Delta P$. Consequently, the optimal points move toward the lower right corner of Fig. 3a. The $\Delta P$ values corresponding to the optimal topologies with $\beta = 1$ and 10 are close to that of the smooth channel. These results indicate the effectiveness of the proposed TO method, which can output a topology with smaller $\Delta P$ when the TO objective is dominated by the system pressure drop.
Next, we compare the optimized topological structures before and after data augmentation (Fig. 3b). In some cases, the optimal topological structures generated by the TO algorithm before and after data augmentation are quite similar (Fig. 3b, second row). In other cases, however, the optimal topological structure obtained after data augmentation shows a noticeable difference (Fig. 3b, first row). More optimization results (Fig. S5) show that, for most examples, the proposed method obtains a better result after one round of active data augmentation, and the second round further fine-tunes the optimization results.
As discussed above, we obtained good TO results even in the first round, without data augmentation. This performance is attributed to the prediction accuracy and generalization ability of the neural operators trained only with the initial small dataset: they can accurately predict the process objectives under different inlet velocities and topological structures. Our training dataset (Appendix S1) covers nine inlet velocities. To quantify the generalization ability, we design three test datasets: a standard testing dataset whose inlet velocities appear in the training set, an interpolation dataset with inlet velocities inside the training range but absent from the training set, and a more difficult extrapolation dataset with inlet velocities outside the training range. The prediction errors of the neural operators on the three test datasets (Table 1) are less than 1%, 2%, and 5%, respectively. Hence, the trained operators can not only predict new topological structures accurately, but also retain good prediction accuracy for cases whose inlet velocities lie outside the training set. Moreover, although the neural operators are not directly trained on the objective values, the objective values calculated from the predicted physical fields agree well with the results from the mechanism model. More details on the neural operator validation are presented in Appendix S2.
Neural operator output | Test | Interpolation | Extrapolation
$c$ | 0.395% | 1.688% | 3.972%
$P$ | 0.293% | 1.314% | 4.911%
$\sigma_c$ | 0.819% | 1.851% | 2.811%
$\Delta P$ | 0.124% | 1.971% | 1.434%
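Errors of this kind are commonly reported as relative L2 errors for neural operators; the definition below is an assumption about the metric, shown for concreteness.

```python
import numpy as np

def relative_l2_error(pred, true):
    """Relative L2 error, a common accuracy metric for neural operators
    (assumed here; the paper's exact error definition may differ)."""
    return np.linalg.norm(pred - true) / np.linalg.norm(true)

true = np.array([1.0, 2.0, 3.0])
pred = true * 1.01                               # a uniform 1% over-prediction
print(f"{relative_l2_error(pred, true):.3%}")    # 1.000%
```

A uniform 1% over-prediction of every field value yields a 1% relative L2 error, which matches the scale of the test-set column above.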
Moreover, we illustrate how data augmentation of the training dataset allows the TO framework to achieve a better topology. The pressure drop $\Delta P$ of the data points in the initial training dataset mostly ranges between 0.004 and 0.005 (Fig. 3a), while in the TO results it is approximately 0.003, 0.002, and 0.001 for the three weights, respectively. Excluding the smooth channel, the minimum value of $\Delta P$ in the initial training dataset is around 0.003. The training dataset therefore has few data points near the optimal point in terms of $\Delta P$, leading to lower prediction accuracy of the pressure operator for the optimized structures. In contrast, for the concentration operator (Fig. 3c), the distribution of $\sigma_c$ in the training dataset is more uniform, and its prediction for the optimized structures is more accurate. After adding data points to the training set based on the TO results, the number of data points with smaller objective values increases (Fig. 3c), which improves the prediction accuracy of the neural operators for the optimized structures.
2.4 Computational efficiency
We compare the computational cost of the proposed method with that of the traditional method. Although our method requires solving the computational fluid dynamics (CFD) and computational mass transfer (CMT) model equations repeatedly under various conditions to establish the training dataset, each simulation is independent and can be performed in parallel. Based on the trained neural operators, the optimization algorithm obtains the optimal structures for 9 inlet velocities at once. When the weight $\beta$ in the objective function changes, only Step 3 needs to be repeated to obtain a new optimization result. In contrast, structure optimization based on mechanism models requires solving multiple CMT and Lagrange equations iteratively. For example, Jia et al. [19] employed the MMA optimization algorithm and used 31 iterations, in each of which the mechanism model equations must be solved, to obtain the optimal channel structure for fixed values of $\beta$ and $u_{in}$.
For numerical solutions with the mechanism model, we utilize parallel computing with 48 threads on two Intel Xeon E5-2687W v4 CPUs. The neural network training is performed on an NVIDIA GeForce RTX 3090 Ti GPU. The computational cost is summarized in Table 2, which shows that the proposed method has an obvious computational advantage in optimizing multiple structures under different inlet velocities. Furthermore, as shown in Table 1, the trained neural operators have sufficient prediction accuracy for the physical fields and objectives at any inlet velocity in the studied range. Hence, if TO is performed for more inlet velocities, the speedup will be even more significant.
Our method (neural operator + gradient-based algorithm) | Total time: 43.5 h
• Initial data generation: 6.7 h for solving the CMT equations 360 times, and 3 h for solving the CFD equations 540 times.
• Step 1: 2.8 h and 2.5 h for training the two neural operators.
• Step 2: 3 h for a fixed $\beta$.
• Step 3: 10 h from TO result I to TO result II, and 9.5 h from TO result II to the final TO result.
Traditional method (mechanism model + MMA) | Total time: 1080 h
• For fixed $\beta$ and $u_{in}$, about 40 h for solving the CMT and Lagrange equations 31 times [19].
• Repeating the above process for 27 conditions of different $\beta$ or $u_{in}$.
2.5 Experimental validation
To validate our ML results, we conduct turbulent mass transfer experiments on the smooth and optimized flow channels at a fixed inlet velocity and weight. Fig. 4a shows a diagram of the experimental setup. The simulated 2D channel was expanded into a 3D channel with a square cross-section. Three sides of the 3D channel were printed in white material, while the remaining side was fitted with a transparent window to capture the mass transfer behavior of the water–methylene blue system within the channel. The dimensions of the transparent window match those of the computational domain in Fig. 1a. A centrifugal pump provided a continuous inlet feed, with the inlet velocity controlled by a rotameter and a bypass control valve. During the experiment, a high-speed camera continuously captured the fluid mixing within the channel, and the recorded images were converted to grayscale on a computer. Averaging the grayscale values across all frames yielded the time-averaged grayscale images. The time-averaged concentration distribution (Fig. 4b) was then obtained from the time-averaged grayscale image and the calibration curve. The original experimental data and calibration curves are presented in Appendix S5.
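The frame-averaging and calibration step described above can be sketched as follows; the linear calibration function is a hypothetical example, since the paper's actual calibration curve is given in its Appendix S5.

```python
import numpy as np

def time_averaged_concentration(frames, calibration):
    """Average grayscale frames over time, then map the result to a
    concentration field via a calibration curve.

    frames: iterable of 2D grayscale arrays (one per video frame).
    calibration: callable mapping time-averaged grayscale to
                 concentration (the linear form below is illustrative).
    """
    stack = np.stack(list(frames))        # shape (n_frames, H, W)
    mean_gray = stack.mean(axis=0)        # time-averaged grayscale image
    return calibration(mean_gray)

# Hypothetical linear calibration: darker pixels = higher concentration.
calib = lambda g: (255.0 - g) / 255.0
frames = [np.full((2, 2), 255.0), np.full((2, 2), 127.0)]
conc = time_averaged_concentration(frames, calib)
```

Averaging before calibration is appropriate when the calibration is (approximately) linear; for a strongly nonlinear curve, each frame would need to be calibrated first.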

To quantitatively verify the effectiveness of the optimized channel in enhancing mass transfer, the variances of the concentration distributions ($\sigma_c$) for the smooth and optimized fluid channels are compared in Fig. 4c. The experimental results indicate that the mass transfer performance of the optimized fluid channel is significantly improved compared to the smooth case, with the objective function value decreasing by approximately 37%, which aligns well with the simulation results.
2.6 Influence of inlet velocity on optimal topological structures and objectives
According to the TO results in Fig. 3a, the optimal topological structure evidently varies with the inlet velocity. Here, we present further analysis for two cases: a small weight $\beta$ (mass transfer dominated; Fig. 5a) and a large $\beta$ (energy consumption dominated; Fig. 5b).
When $\beta$ is small, the primary objective of the optimization process is to minimize $\sigma_c$, and we show the optimized concentration distributions for three different inlet velocities in Fig. 5a. We find that the vertical location of the solid baffle rises with $u_{in}$, consistent with the position where the concentration drops to 0 in Figs. 1b and d. This indicates that adding solid baffles enhances both the turbulent diffusion coefficient and the convective diffusion rate, promoting the diffusion of solute toward regions of lower concentration and resulting in higher mass transfer rates and a more uniform concentration distribution. Another solid baffle is located at the lower wall, where the concentration boundary layer is more pronounced. As the thickness of the concentration boundary layer at this location decreases with increasing $u_{in}$, the volume of solid in this region decreases accordingly.

When $\beta$ is large, the primary objective of the TO is to reduce $\Delta P$. The optimal topological structures (Fig. 5b) indicate that when the solid baffle is located at the bottom of the design domain, the pressure distribution of the process is similar to that of the smooth channel in Figs. 1b and d. This can be explained by the relatively thick pressure boundary layer that forms near the bottom wall due to the presence of inlet 2: when the solid is positioned near the bottom of the domain, its impact on the system's pressure drop is relatively small. Since the position of the pressure boundary layer does not change with $u_{in}$, the optimal topological structure is independent of $u_{in}$ when the optimization process is dominated by minimizing $\Delta P$.
We compare the objective values before and after the TO in Fig. 5c. When $\beta$ is relatively small, the process performance, controlled by reducing the concentration variance, is significantly improved by the TO compared to the smooth channel, and for smaller inlet velocities the gain in system efficiency is more pronounced. The results for a larger $\beta$ show that the external energy consumption of the smooth channel is lower than that of the optimized channel at low inlet velocities. For higher inlet velocities, however, the pressure performance of the optimized channel outperforms that of the smooth channel.
3 Discussion
In this study, we develop a computational framework that achieves efficient topology optimization under different velocity boundaries by integrating pre-trained neural operators with a neural topology and gradient-based optimization. By incorporating an active data augmentation approach, the framework is both data-efficient and computationally efficient in identifying the optimal solution within a large design space. Our results show that, whether the topology optimization is dominated by mass transfer efficiency or by the system pressure drop, the optimized topological structure is always better than the baseline smooth channel and all randomly generated structures in the training dataset. The TO results are used to guide the design of the experiments, and we find close agreement between the ML predictions and experimental measurements: the optimized channel shows approximately a 37% improvement in mass transfer efficiency compared to the smooth channel. Our framework can be applied to the design of other turbulent mass transfer enhancement techniques, such as incorporating reactive particles or altering the fluid inlet angle.
4 Methods
4.1 Mechanism model
The turbulent mass transfer process (Fig. 1a) can be described by the Reynolds-averaged Navier-Stokes (RANS) equations [48, 49], with indices $i, j$ denoting the $x$ and $y$ coordinates:

$$\frac{\partial u_i}{\partial x_i} = 0, \qquad (5)$$

$$\rho u_j \frac{\partial u_i}{\partial x_j} = -\frac{\partial P}{\partial x_i} + \frac{\partial}{\partial x_j}\left[(\mu + \mu_t)\frac{\partial u_i}{\partial x_j}\right] + F_i, \qquad (6)$$

$$u_j \frac{\partial c}{\partial x_j} = \frac{\partial}{\partial x_j}\left[\gamma\,(D + D_t)\frac{\partial c}{\partial x_j}\right]. \qquad (7)$$

Eqs. (5)–(7) represent the continuity equation, the momentum conservation equation, and the species conservation equation, respectively. Here, $P$, $u_i$, and $x_i$ denote pressure, velocity, and coordinate position, respectively, and $c$ represents the time-averaged concentration. $\rho$, $\mu$, and $D$ are physical constants representing density, laminar viscosity, and laminar mass diffusivity, respectively. $\mu_t$ and $D_t$ represent the turbulent viscosity and turbulent mass diffusivity, which are obtained from the $k$–$\varepsilon$ two-equation model [50] and the $\overline{c'^2}$–$\varepsilon_{c'}$ two-equation model [51], respectively. The two-equation models and their parameters can be found in Ref. [48]. Eqs. (5)–(6) and the closure equations for the turbulent viscosity constitute the CFD equations used to calculate the velocity and pressure distributions. The CMT equations comprise the CFD equations and Eq. (7) for the concentration calculation, along with the closure equations for the turbulent mass diffusivity $D_t$.
Additionally, to account for the influence of the system's topological structure on the physical fields, we introduce a source term $F_i$ in the momentum conservation equation (Eq. (6)) to represent the resistance of solid baffles to the flow. Moreover, in the species conservation equation, a fluid volume fraction $\gamma$ within the range [0, 1] is introduced to represent the effect of the solid baffle on the species diffusion coefficient. Specifically, $F_i$ represents the frictional drag generated by the solid baffles and can be computed using a density model [19]:

$$F_i = -\alpha(\gamma)\, u_i,$$

where $\alpha$ is the interpolation function of the porosity $\gamma$:

$$\alpha(\gamma) = \alpha_{\min} + \left(\alpha_{\max} - \alpha_{\min}\right)\frac{q\,(1 - \gamma)}{q + \gamma},$$

where $\alpha_{\max}$ represents the resistance density of the solid, and $q$ is a positive integer used to adjust the shape of the interpolation curve. In this study, $q$ and $\alpha_{\min}$ are set to 1 and 0, respectively [52]. $\alpha_{\max}$ is a large value that approximates the fluid-to-solid transition, set to 600,000. When $\gamma$ is 0, $\alpha$ approaches $\alpha_{\max}$, resulting in significant frictional resistance $F_i$ and making the domain effectively solid-like. Conversely, when $\gamma$ is 1, the corresponding region is fluid. In this study, we set a velocity inlet and a pressure outlet (with the static pressure fixed at 0) [24]. More computational details can be found in Ref. [24].
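The interpolation above can be sketched numerically as follows; the convex rational form is a common choice in density-based topology optimization and is assumed here to match the description in the text.

```python
import numpy as np

def alpha(gamma, alpha_max=6.0e5, alpha_min=0.0, q=1):
    """Resistance interpolation of the density model (a common convex
    form, assumed here): alpha_max at gamma = 0 (solid), alpha_min at
    gamma = 1 (fluid)."""
    return alpha_min + (alpha_max - alpha_min) * q * (1.0 - gamma) / (q + gamma)

def drag_source(gamma, u):
    """Frictional drag source term F = -alpha(gamma) * u added to the
    momentum equation."""
    return -alpha(gamma) * u

print(alpha(0.0))   # 600000.0 -> effectively solid
print(alpha(1.0))   # 0.0      -> fluid, no added resistance
```

Because the interpolation is convex, intermediate porosities are already penalized strongly, which discourages gray (neither solid nor fluid) regions in the optimized design.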
4.2 Data generation
To train the neural operators, various topological structures are randomly generated and evenly assigned to 9 different inlet velocities. To obtain the corresponding concentration and pressure distributions for given $\gamma$ and $u_{in}$, the mechanism model (Section 4.1) is solved in the commercial CFD software package FLUENT. The value of $\gamma$ is stored using user-defined memory, while the source and closure terms in Eqs. (6)–(7) are implemented through user-defined functions. The total pressure, including the static and dynamic contributions, is normalized. Each data point in the dataset includes the coordinates, along with the corresponding values of $\gamma$, $u_{in}$, $c$, and $P$. The initial dataset is categorized into training, testing, and interpolation/extrapolation datasets based on the value of the inlet velocity. More details on the initial dataset are provided in Section S1.
4.3 Neural operator
Fourier-DeepONet, illustrated in Step 1 of Fig. 2, was developed based on three neural network frameworks: the deep operator network (DeepONet) [31], the Fourier neural operator (FNO) [53], and U-FNO (a block combining the Fourier neural operator and a U-Net) [54]. DeepONet encodes the input variables and functions, while the FNO and U-FNO blocks apply Fourier transforms to decode DeepONet's output into the model output. Fourier-DeepONet exhibits superior generalizability and better accuracy in learning neural operators in high-dimensional spaces [45].
The vanilla DeepONet consists of two components used for encoding the inputs: a branch net and a trunk net. In this study, the trunk net takes the inlet velocity $u_{in}$ as input, while the branch net takes the fluid volume fraction distribution $\gamma$ of the design domain (Fig. 1a) as input. Since solutions of the mechanism model are used as data for training Fourier-DeepONet, the dimension of the input function corresponds to the number of grid nodes of the mechanism model, which is 121 (mesh nodes in the $x$-direction) × 101 (mesh nodes in the $y$-direction). The outputs of the branch net and the trunk net, produced by linear transformations together with a padding operation, are denoted as $B$ and $T$, respectively. The output dimensionality of a single channel in the branch net is fixed, and the number of channels corresponds to the width of the operator layers. Additionally, as shown in Fig. 2, DeepONet computes its output by combining $B$ and $T$ through element-wise multiplication:

$$z_0 = B \odot T.$$
We then utilize an FNO layer and two U-FNO layers to decode the output of DeepONet. The output of the FNO layer, $z_1$, and the outputs of the two U-FNO layers, $z_2$ and $z_3$, are computed as

$$z_1 = \sigma\left(\mathcal{F}^{-1}\left(R_1 \cdot \mathcal{F}(z_0)\right) + W_1 z_0 + b_1\right),$$

$$z_{k+1} = \sigma\left(\mathcal{F}^{-1}\left(R_{k+1} \cdot \mathcal{F}(z_k)\right) + U_{k+1}(z_k) + W_{k+1} z_k + b_{k+1}\right), \quad k = 1, 2.$$

In the FNO and U-FNO layers, $\mathcal{F}$ represents the two-dimensional fast Fourier transform (FFT), $\mathcal{F}^{-1}$ denotes the inverse two-dimensional FFT, $\sigma$ is the activation function, $W_k$ are weight matrices, $R_k$ are complex-valued tensors, and $b_k$ are biases. In the U-FNO layers, $U_k$ denotes a U-Net block. Since the input and output dimensions of the Fourier and U-Net layers must be consistent, a linear layer is added before the activation function in the U-FNO layer to transform the output function to the same dimension as the concentration and pressure distributions; $W'$ denotes the weight matrix of this linear layer.
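A single-channel caricature of the spectral-convolution idea behind the FNO layer is shown below; the single channel, scalar linear path, and ReLU activation are simplifications for illustration only.

```python
import numpy as np

def fourier_layer(z, r_weights, w, b, modes=4):
    """Minimal 2D spectral convolution in the style of FNO (single
    channel, illustrative): filter the lowest Fourier modes, add a
    pointwise linear path, and apply an activation.

    z: 2D real field; r_weights: complex (modes, modes) tensor R;
    w, b: scalar weight and bias of the pointwise linear path.
    """
    z_hat = np.fft.fft2(z)
    filtered = np.zeros_like(z_hat)
    filtered[:modes, :modes] = z_hat[:modes, :modes] * r_weights
    spectral = np.real(np.fft.ifft2(filtered))    # inverse-FFT branch
    return np.maximum(0.0, spectral + w * z + b)  # ReLU activation

z = np.random.default_rng(0).standard_normal((16, 16))
out = fourier_layer(z, r_weights=np.ones((4, 4), dtype=complex), w=1.0, b=0.0)
```

Truncating to the lowest modes gives the layer a global, resolution-independent receptive field, which is what lets the operator handle smoothly varying fields efficiently.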
The projection layer in the decoding part performs nonlinear transformations and a slicing operation on the decoded output function from the Fourier layers, yielding the model output, i.e., the concentration or pressure field.
The projection layer consists of two weight matrices with corresponding biases, followed by the slicing operation. The dimensions of the concentration distribution and the pressure distribution are consistent with the number of nodes in the CFD simulation, i.e., 201 (nodes in the x-direction) × 101 (nodes in the y-direction).
As shown in Fig. 1c, the pressure distribution is more sensitive to changes in the topological structure than the concentration distribution, so the pressure operator requires more channels than the concentration operator. In this study, the numbers of channels were set to 32 and 48, with the larger width assigned to the pressure operator. For other model hyperparameters of DeepONet and the projection layer, please refer to Refs. [31, 45]; for other model hyperparameters of FNO and U-FNO, refer to Refs. [53, 54]. The training loss trajectories of the two neural operators are shown in Fig. S2 (Section S3).
4.4 Optimization of neural topologies
As shown in Fig. 2 Step 2, a topological structure is constructed by a neural network consisting of two layers, each containing 140 neurons. The network outputs preliminary predicted porosity values. As the porosity must take a value of 0 or 1 at every coordinate, the preliminary output needs to be transformed to 0/1 to obtain a reasonable topological structure. In this study, a sigmoid function is used for the 0/1 transformation: it is centered on the average of the preliminary outputs over the coordinate points for the same velocity boundary, and its slope is set by a sharpness hyperparameter, where a larger value produces a sharper transition between outputs of 0 and 1.
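The 0/1 transformation can be illustrated as follows; the centering on the mean and the sharpness value `k` are assumptions standing in for the paper's exact form and hyperparameter:

```python
import numpy as np

def sharp_sigmoid(gamma_raw, k=100.0):
    """Push preliminary porosity predictions toward binary 0/1 values.
    This is a common differentiable choice, not the paper's exact
    transform: a sigmoid centered on the mean of the raw outputs, with
    slope k controlling how sharply values split into 0 and 1."""
    centred = gamma_raw - gamma_raw.mean()
    return 1.0 / (1.0 + np.exp(-k * centred))

raw = np.array([0.1, 0.4, 0.45, 0.9])   # hypothetical network outputs
binary_like = sharp_sigmoid(raw)
print(np.round(binary_like, 3))
```

Because the sigmoid is smooth, gradients still flow to the neural topology during optimization, while large `k` keeps the resulting structure effectively binary.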
The trained neural operators take the inlet velocity and the topological structure as inputs to predict the pressure and concentration distributions, respectively. The objective quantities are then computed according to Eqs. (2)–(4), and the corresponding training losses are evaluated by averaging over the data points in the training dataset, i.e., under different inlet velocities. The total inequality constraint on the solid volume is computed based on Eq. (1).
The optimization of a neural topology does not require observational data: the training loss is composed of the objective function and the topological structure constraint, with the constraint term multiplied by a weight coefficient that is set large enough to ensure the optimal topology satisfies the volume constraint.
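A minimal sketch of such a penalized loss follows; the weight `mu` and volume limit `v_max` are illustrative values, not the paper's:

```python
import numpy as np

def to_loss(j_obj, gamma, v_max, mu=1e3):
    """Penalized topology-optimization objective (sketch): the design
    objective plus a one-sided penalty that activates only when the solid
    volume fraction exceeds its allowed limit v_max."""
    solid_fraction = 1.0 - gamma.mean()           # gamma=1 fluid, gamma=0 solid
    violation = max(solid_fraction - v_max, 0.0)  # inequality residual
    return j_obj + mu * violation

gamma = np.ones((121, 101))
gamma[:30, :30] = 0.0                             # hypothetical solid block
loss = to_loss(j_obj=2.5, gamma=gamma, v_max=0.05)
print(loss > 2.5)  # penalty active: solid fraction ~0.074 exceeds 0.05
```

With a sufficiently large `mu`, minimizing this loss drives the violation toward zero, which is consistent with the constraint loss approaching 0 in Fig. S3.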
During the training, the learning rate decays from its initial value over 5000 steps. The activation function and optimizer are ReLU and Adam, respectively. The training error curve of the TO framework is shown in Fig. S3. We observe that the constraint loss approaches 0, indicating that the TO results satisfy the inequality constraint on the solid volume.
4.5 Active data augmentation
After performing Step 1 and Step 2 using the initial dataset in Section S1, TO result I is obtained, and some optimized structures are depicted in Fig. S5a. In Step 2, we use neural operators in place of the mechanism model to enable fast prediction of the objectives. Hence, good prediction accuracy of the neural operators is crucial; the errors are listed in Table S1. The concentration operator shows lower prediction accuracy for the optimized topological structures than the pressure operator, because the number of topologies for which the concentration field was solved, and hence the size of its training dataset, is much smaller than that of the pressure operator. Moreover, as suggested by TO result I, the optimal structure has only one or two solid structures near the bottom or left boundaries of the design domain, yet the randomly generated structures in our initial dataset do not follow this pattern.
To improve the prediction accuracy of the concentration operator, especially for structures similar to the optimal structure, more data are generated as follows. In the first round of data augmentation, the solid positions are chosen to be the same as those in Fig. S5a, while the solid shapes are randomly generated. We generate ten new structures, three examples of which are shown in Fig. S4. To reduce the computational cost, we use these structures to generate training data only for the concentration operator. TO result II is then obtained through transfer learning of the neural operator and the neural topology.
In the second round of data augmentation, we use the 27 optimal structures obtained from TO result II to generate new training data for both neural operators.
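The overall active-learning loop can be summarized schematically; every function below is a toy placeholder (a real run trains Fourier-DeepONets and calls the CFD solver), so only the control flow reflects the procedure described above:

```python
# Toy stand-ins for the framework's components.
def train_operators(data):
    return {"n_train": len(data)}                  # stands in for Step 1

def optimize_neural_topology(operators):
    return ["optimal_structure"]                   # stands in for Step 2

def cfd_labels_near(optima):
    # Stands in for running CFD on new structures generated near the
    # current optima (same solid positions, randomized shapes).
    return [f"sample_near_{s}" for s in optima]

def active_to(initial_data, n_rounds=2):
    data = list(initial_data)
    ops = train_operators(data)                    # pre-training
    for _ in range(n_rounds):                      # two augmentation rounds
        optima = optimize_neural_topology(ops)
        data += cfd_labels_near(optima)            # augment near optima
        ops = train_operators(data)                # fine-tune / transfer learn
    return optimize_neural_topology(ops), ops["n_train"]

result, n = active_to(["s1", "s2", "s3"])
print(n)  # 3 initial samples + 1 per round = 5
```

The key design choice is that new CFD data are generated only near the current optima, which keeps the labeling cost low while concentrating accuracy where the optimizer needs it.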
Data availability
The simulation and experiment data will be made available upon publication on GitHub at https://github.com/lu-group/neural-topology-optimization.
Code availability
The code for this study is implemented using the Python library DeepXDE [55] and will be publicly available at the GitHub repository https://github.com/lu-group/neural-topology-optimization.
Acknowledgments
This work was partially supported by the NNSFC Grants No. 22178247 (to X.Y.) and No. 22308251 (to S.J.).
References
- [1] J. Sánchez and P. Tanguy. Turbulent flow and mass transfer in a structured packed bed reactor. AIChE Journal, 60(9):3089–3101, 2014.
- [2] X. Jin and X. Li. Mass transfer enhancement in packed bed reactors: A review. Chemical Engineering Science, 75:1–16, 2012.
- [3] Carlos A Ramírez. Mass transfer enhancement by chemical reaction in turbulent tube flow. Chemical Engineering Journal, 138(1-3):628–633, 2008.
- [4] V. Sundararajan and S. Kundu. Turbulent mixing and mass transfer in confined flows. Nature Physics, 5(12):1026–1031, 2009.
- [5] J. Zhang and L. Wang. Optimization of microfluidic channel design for enhanced mass transfer. Chemical Engineering Science, 200:1–12, 2019.
- [6] R. Kumar and A. Ramaswamy. Experimental and computational study of mass transfer in turbulent flow reactors. Chemical Engineering Journal, 243:206–215, 2014.
- [7] L. Tao and D. Xu. Experimental and numerical investigation of turbulent mass transfer in complex geometries. Chemical Engineering Science, 153:109–121, 2017.
- [8] M. Y. Hassan and S. Latif. Mass transfer enhancement in absorption systems using novel structured packing. AIChE Journal, 66(8):3101–3112, 2020.
- [9] Meng-Yue Lu, Chen Yin, Qiang Ma, Hua-Neng Su, Ping Lu, Zhou-Qiao Dai, Wei-Wei Yang, and Qian Xu. Flow field structure design for redox flow battery: Developments and prospects. Journal of Energy Storage, 95:112303, 2024.
- [10] Qun Chen and Ji-an Meng. Field synergy analysis and optimization of the convective mass transfer in photocatalytic oxidation reactors. International Journal of Heat and Mass Transfer, 51(11-12):2863–2870, 2008.
- [11] M Bidi, MRH Nobari, and M Saffar Avval. A numerical evaluation of combustion in porous media by EGM (entropy generation minimization). Energy, 35(8):3483–3500, 2010.
- [12] Qiong Zheng, Feng Xing, Xianfeng Li, Guiling Ning, and Huamin Zhang. Flow field design and optimization based on the mass transport polarization regulation in a flow-through type vanadium flow battery. Journal of Power Sources, 324:402–411, 2016.
- [13] Chenhui Kou, Shengkun Jia, Yiqing Luo, and Xigang Yuan. Performance investigation of the solar thermal decomposition of methane reactor considering discrete and deposited carbon particles. Fuel, 324:124401, 2022.
- [14] Tatsuo Nishimura, Naoki Oka, Yoshimichi Yoshinaka, and Koji Kunitsugu. Influence of imposed oscillatory frequency on mass transfer enhancement of grooved channels for pulsatile flow. International Journal of Heat and Mass Transfer, 43(13):2365–2374, 2000.
- [15] M. P. Bendsoe and O. Sigmund. Topology Optimization: Theory, Methods, and Applications. Springer, Berlin, 2003.
- [16] G. Allaire and F. Jouve. Topology optimization for structural design using finite element analysis. Nature, 414(6864):1232–1237, 2002.
- [17] Cetin B Dilgen, Sumer B Dilgen, David R Fuhrman, Ole Sigmund, and Boyan S Lazarov. Topology optimization of turbulent flows. Computer Methods in Applied Mechanics and Engineering, 331:363–393, 2018.
- [18] Z. Wang and Y. Zhang. Topology optimization of electromagnetic materials. Nature Communications, 11(1):2589, 2020.
- [19] Shengkun Jia, Xuepu Cao, Xigang Yuan, and Kuo-Tsong Yu. Multi-objective topology optimization for the solar thermal decomposition of methane reactor enhancement. Chemical Engineering Science, 231:116265, 2021.
- [20] Xuepu Cao, Yiqing Luo, Xigang Yuan, Zhiwen Qi, and Kuo-Tsong Yu. An optimization approach for improving the exergetic efficiency in mesoscale combustor. Computers & Chemical Engineering, 134:106707, 2020.
- [21] Tingliang Xie and Cong Xu. Numerical and experimental investigations of chaotic mixing behavior in an oscillating feedback micromixer. Chemical Engineering Science, 171:303–317, 2017.
- [22] Mubashshir Ahmad Ansari and Kwang-Yong Kim. Mixing performance of unbalanced split and recombine micromixers with circular and rhombic sub-channels. Chemical Engineering Journal, 162(2):760–767, 2010.
- [23] D. Mikkelsen and O. Sigmund. Topology optimization of flow through porous media. Nature Materials, 3(7):434–438, 2004.
- [24] Chenhui Kou, Yuhui Yin, Yang Zeng, Shengkun Jia, Yiqing Luo, and Xigang Yuan. Physics-informed neural network integrate with unclosed mechanism model for turbulent mass transfer. Chemical Engineering Science, 288:119752, 2024.
- [25] George Em Karniadakis, Ioannis G Kevrekidis, Lu Lu, Paris Perdikaris, Sifan Wang, and Liu Yang. Physics-informed machine learning. Nature Reviews Physics, 3(6):422–440, 2021.
- [26] Maziar Raissi, Paris Perdikaris, and George Em Karniadakis. Physics informed deep learning (Part I): Data-driven solutions of nonlinear partial differential equations. arXiv preprint arXiv:1711.10561, 2017.
- [27] Maziar Raissi, Alireza Yazdani, and George Em Karniadakis. Hidden fluid mechanics: Learning velocity and pressure fields from flow visualizations. Science, 367(6481):1026–1030, 2020.
- [28] Benjamin Fan, Edward Qiao, Anran Jiao, Zhouzhou Gu, Wenhao Li, and Lu Lu. Deep learning for solving and estimating dynamic macro-finance models. Computational Economics, pages 1–37, 2024.
- [29] Mitchell Daneker, Shengze Cai, Ying Qian, Eric Myzelev, Arsh Kumbhat, He Li, and Lu Lu. Transfer learning on physics-informed neural networks for tracking the hemodynamics in the evolving false lumen of dissected aorta. Nexus, 1(2), 2024.
- [30] Wensi Wu, Mitchell Daneker, Kevin T Turner, Matthew A Jolley, and Lu Lu. Identifying heterogeneous micromechanical properties of biological tissues via physics-informed neural networks. arXiv preprint arXiv:2402.10741, 2024.
- [31] Lu Lu, Pengzhan Jin, Guofei Pang, Zhongqiang Zhang, and George Em Karniadakis. Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators. Nature Machine Intelligence, 3(3):218–229, 2021.
- [32] Pengzhan Jin, Shuai Meng, and Lu Lu. MIONet: Learning multiple-input operators via tensor product. SIAM Journal on Scientific Computing, 44(6):A3490–A3514, 2022.
- [33] Lu Lu, Xuhui Meng, Shengze Cai, Zhiping Mao, Somdatta Goswami, Zhongqiang Zhang, and George Em Karniadakis. A comprehensive and fair comparison of two neural operators (with practical extensions) based on fair data. Computer Methods in Applied Mechanics and Engineering, 393:114778, 2022.
- [34] Anran Jiao, Haiyang He, Rishikesh Ranade, Jay Pathak, and Lu Lu. One-shot learning for solution operators of partial differential equations. arXiv preprint arXiv:2104.05512, 2021.
- [35] Anran Jiao, Qile Yan, John Harlim, and Lu Lu. Solving forward and inverse PDE problems on unknown manifolds via physics-informed neural operators. arXiv preprint arXiv:2407.05477, 2024.
- [36] Min Zhu, Handi Zhang, Anran Jiao, George Em Karniadakis, and Lu Lu. Reliable extrapolation of deep neural operators informed by physics or sparse observations. Computer Methods in Applied Mechanics and Engineering, 412:116064, 2023.
- [37] Tom Savage, Nausheen Basha, Jonathan McDonough, James Krassowski, Omar Matar, and Ehecatl Antonio del Rio Chanona. Machine learning-assisted discovery of flow reactor designs. Nature Chemical Engineering, 1(8):522–531, 2024.
- [38] Yunjia Yang, Runze Li, Yufei Zhang, and Haixin Chen. Buffet onset optimization for supercritical airfoils with prior-based pressure profile prediction model. In AIAA SCITECH 2024 Forum, page 1227, 2024.
- [39] Yunjia Yang, Runze Li, Yufei Zhang, Lu Lu, and Haixin Chen. Transferable machine learning model for the aerodynamic prediction of swept wings. Physics of Fluids, 36(7), 2024.
- [40] Yunjia Yang, Runze Li, Yufei Zhang, Lu Lu, and Haixin Chen. Rapid aerodynamic prediction of swept wings via physics-embedded transfer learning. AIAA Journal, pages 1–15, 2024.
- [41] Tingtao Zhou, Xuan Wan, Daniel Zhengyu Huang, Zongyi Li, Zhiwei Peng, Anima Anandkumar, John F. Brady, Paul W. Sternberg, and Chiara Daraio. AI-aided geometric design of anti-infection catheters. Science Advances, 10(1), 2024.
- [42] Lu Lu, Raphaël Pestourie, Steven G Johnson, and Giuseppe Romano. Multifidelity deep neural operators for efficient learning of partial differential equations with application to fast inverse design of nanoscale heat transport. Physical Review Research, 4(2):023210, 2022.
- [43] Lu Lu, Raphael Pestourie, Wenjie Yao, Zhicheng Wang, Francesc Verdugo, and Steven G Johnson. Physics-informed neural networks with hard constraints for inverse design. SIAM Journal on Scientific Computing, 43(6):B1105–B1132, 2021.
- [44] Zhongkai Hao, Chengyang Ying, Hang Su, Jun Zhu, Jian Song, and Ze Cheng. Bi-level physics-informed neural networks for PDE-constrained optimization using Broyden's hypergradients. arXiv preprint arXiv:2209.07075, 2022.
- [45] Min Zhu, Shihang Feng, Youzuo Lin, and Lu Lu. Fourier-DeepONet: Fourier-enhanced deep operator networks for full waveform inversion with improved accuracy, generalizability, and robustness. Computer Methods in Applied Mechanics and Engineering, 416:116300, 2023.
- [46] Zhongyi Jiang, Min Zhu, and Lu Lu. Fourier-MIONet: Fourier-enhanced multiple-input neural operators for multiphase modeling of geological carbon sequestration. Reliability Engineering & System Safety, 251:110392, 2024.
- [47] Jonathan E Lee, Min Zhu, Ziqiao Xi, Kun Wang, Yanhua O Yuan, and Lu Lu. Efficient and generalizable nested Fourier-DeepONet for three-dimensional geological carbon sequestration. Engineering Applications of Computational Fluid Mechanics, 18(1):2435457, 2024.
- [48] Shengkun Jia, Xuepu Cao, Fang Wang, Chao Zhang, Xigang Yuan, and Kuo-Tsong Yu. Renormalization group method for the turbulent mass transport two-equation model. Chemical Engineering Science, 249:117306, 2022.
- [49] George Keith Batchelor. An introduction to fluid dynamics. Cambridge University Press, 1967.
- [50] Brian Edward Launder and Dudley Brian Spalding. The numerical computation of turbulent flows. In Numerical prediction of flow, heat transfer, turbulence and combustion, pages 96–116. Elsevier, 1983.
- [51] Kuo-Tsong Yu and Xigang Yuan. Introduction to computational mass transfer. Springer, 2014.
- [52] Shengkun Jia, Xigang Yuan, and Kuo-Tsong Yu. The investigation of gas distributor in column inlet section based on topology optimization. Chemical Engineering Science, 248:117148, 2022.
- [53] Zongyi Li, Nikola Kovachki, Kamyar Azizzadenesheli, Burigede Liu, Kaushik Bhattacharya, Andrew Stuart, and Anima Anandkumar. Fourier neural operator for parametric partial differential equations. arXiv preprint arXiv:2010.08895, 2020.
- [54] Gege Wen, Zongyi Li, Kamyar Azizzadenesheli, Anima Anandkumar, and Sally M Benson. U-FNO—An enhanced Fourier neural operator-based deep-learning model for multiphase flow. Advances in Water Resources, 163:104180, 2022.
- [55] Lu Lu, Xuhui Meng, Zhiping Mao, and George Em Karniadakis. DeepXDE: A deep learning library for solving differential equations. SIAM Review, 63(1):208–228, 2021.
Appendix S1 Initial dataset
Training dataset.
Since the pressure distribution is more sensitive to variations in the topological structure than the concentration distribution, we use a larger dataset to train the pressure neural operator than the concentration neural operator. Specifically, we randomly generate 891 topological structures, every 99 structures corresponding to one of the 9 values of inlet velocity. The CFD equations are used to solve the pressure distributions, yielding 891 data points for training the pressure neural operator. Then, 351 cases, every 39 structures corresponding to one value of inlet velocity, are randomly selected to solve the CMT equations and obtain concentration distributions, forming the training dataset for the concentration neural operator. The 9 cases of the smooth channel with the 9 inlet velocities are also included in the training datasets of both operators. In total, the training dataset of the pressure neural operator consists of 900 cases, and that of the concentration neural operator consists of 360 cases.
Test dataset.
We randomly generate 18 topological structures, every two structures corresponding to one value of the inlet velocity. Numerical methods are employed to solve the CMT equation system and obtain the corresponding pressure and concentration distributions. These cases serve as the test datasets for the pressure and concentration neural operators.
Interpolation dataset.
To evaluate the predictive accuracy of the neural operators for pressure and concentration distributions at inlet velocities inside the training range, we randomly generate 16 topological structures, every two structures corresponding to one interpolated value of the inlet velocity. This interpolation dataset tests the neural operators' ability to interpolate velocity boundary conditions within the training range.
Extrapolation dataset.
To evaluate the neural operators' ability to extrapolate velocity boundary conditions outside the training range, we randomly generate 4 topological structures, every two structures corresponding to one extrapolated value of the inlet velocity.
Appendix S2 Validation of neural operators
To obtain the pre-trained neural operators in the TO framework, the pressure and concentration neural operators were trained using the initial training dataset. The test, interpolation, and extrapolation datasets were then used to validate the generalization ability of the neural operators. Here, we visualize the ground truth, the network prediction, and the error for one case from each of the three datasets (Fig. S1). The predictions of the neural operators are in good agreement with the reference solutions, demonstrating that the neural operators achieve high prediction accuracy across different topological structures and also retain predictive capability for cases with interpolated or extrapolated inlet velocities.

Appendix S3 Training procedure
Here, we show the losses during network training, including the losses of the neural operators (Fig. S2) and the neural topology optimization (Fig. S3).


Appendix S4 Comparison of three TO results
To validate the effectiveness of the TO algorithm, we use the mechanism model to verify the accuracy of the neural operators' predictions for the optimal topological structures (Table S1). In the active learning-based TO approach, the optimal topological structure is iteratively improved, and we show three topological structure examples from the first active data augmentation in Fig. S4. We also illustrate the evolution of the optimal topological structures under different weights and inlet velocities in Fig. S5.
|                 | Physical field error |        |        | Objective error |        |        |
| Inlet velocity  | 0.1    | 0.5    | 0.9    | 0.1    | 0.5    | 0.9    |
| TO result I     | 3.281% | 2.504% | 2.721% | 0.547% | 0.372% | 2.119% |
|                 | 7.747% | 5.254% | 3.855% | 16.32% | 7.058% | 3.577% |
| TO result II    | 3.728% | 2.506% | 2.746% | 0.470% | 0.350% | 2.520% |
|                 | 6.040% | 2.613% | 1.943% | 7.286% | 1.894% | 2.560% |
| Final TO result | 0.515% | 0.294% | 0.160% | 0.259% | 0.253% | 0.257% |
|                 | 0.537% | 0.346% | 0.546% | 0.550% | 0.335% | 0.480% |


Appendix S5 Experimental data
The experimental measurement data for both the smooth and optimized channel structures are available together with the simulation data (see Data availability). The corresponding time-averaged gray images of the smooth and optimized channels are shown in Fig. S6a. We measured the grayscale of standard concentration solutions using the experimental method described in Section 2.5, establishing a correlation between grayscale and concentration (Fig. S6b), which is well described by a decreasing power function.
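Such a power-law calibration can be reproduced by a least-squares fit in log-log space; the grayscale-concentration data and the coefficients below are synthetic stand-ins, not the measured values or the paper's fitted function:

```python
import numpy as np

# Fit a decreasing power law c = a * g**b (b < 0) to grayscale (g) vs.
# concentration (c) calibration points. All numbers here are synthetic.
gray = np.array([60.0, 90.0, 120.0, 180.0, 240.0])
conc = 50.0 * gray ** -0.8                   # synthetic "measurements"

# Linear least squares in log-log space: log c = log a + b * log g.
b, log_a = np.polyfit(np.log(gray), np.log(conc), 1)
a = np.exp(log_a)
print(round(a, 2), round(b, 2))  # recovers the synthetic coefficients
```

A negative fitted exponent `b` confirms the decreasing character of the calibration curve; the fitted `(a, b)` pair then converts measured grayscale images into concentration fields.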
