
Deep learning black hole metrics from shear viscosity

Yu-Kun Yan: Department of Physics, Shanghai University, Shanghai 200444, China; School of Physics, University of Chinese Academy of Sciences, Beijing 100049, China
Shao-Feng Wu: Department of Physics, Shanghai University, Shanghai 200444, China; Center for Gravitation and Cosmology, Yangzhou University, Yangzhou 225009, China
Xian-Hui Ge: Department of Physics, Shanghai University, Shanghai 200444, China; Center for Gravitation and Cosmology, Yangzhou University, Yangzhou 225009, China
Yu Tian: School of Physics, University of Chinese Academy of Sciences, Beijing 100049, China; Institute of Theoretical Physics, Chinese Academy of Sciences, Beijing 100190, China; Center for Theoretical Physics, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Center for Gravitation and Cosmology, Yangzhou University, Yangzhou 225009, China
Abstract

Based on AdS/CFT correspondence, we build a deep neural network to learn black hole metrics from the complex frequency-dependent shear viscosity. The network architecture provides a discretized representation of the holographic renormalization group flow of the shear viscosity and can be applied to a large class of strongly coupled field theories. Given the existence of the horizon and guided by the smoothness of spacetime, we show that Schwarzschild and Reissner-Nordström metrics can be learned accurately. Moreover, we illustrate that the generalization ability of the deep neural network can be excellent, which indicates that by using the black hole spacetime as a hidden data structure, a wide spectrum of the shear viscosity can be generated from a narrow frequency range. These results are further generalized to an Einstein-Maxwell-dilaton black hole. Our work might not only suggest a data-driven way to study holographic transports but also shed some light on holographic duality and deep learning.

Introduction.—Renormalization group (RG) is a physical scheme to understand various emergent phenomena in the world through iterative coarse graining Kadanoff1966 ; Wilson74 ; Wilson83 ; Polchinski84 . Deep learning (DL) is the core algorithm of the recent wave of artificial intelligence Sejnowski2018 . It has been suggested that RG and DL might have a logic in common Wen16 ; Mehta1401 and their relation has attracted a lot of interest Beny13 ; Saremi13 ; Paul1412 ; Braddea1610 ; Lin1608 ; Ringel1704 ; Oprisa1705 ; Foreman1710 ; Iso1801 ; Wang1802 .

RG is believed to be one of the key elements to understand quantum gravity. In particular, by anti-de Sitter/conformal field theory (AdS/CFT) correspondence Maldacena97 ; Gubser98 ; Witten98 ; Susskind98 , a strongly coupled quantum critical theory in the d-dimensional spacetime is reorganized along the RG scale, inducing a classical theory of gravity in the (d+1)-dimensional AdS spacetime. RG is not the only connection between DL and gravity. Through the study of tensor networks Bridgeman1603 ; Scholl2011 ; Vidal2011 , especially the multi-scale entanglement renormalization ansatz (MERA) Vidal0512 , it has been realized that the way the geometry is emergent from field theories usually involves the network and optimization, which are two important ingredients of DL.

Because of these connections, the deep neural network (DNN) may be capable of providing a research platform for holographic duality Shu1705 ; You1709 ; Hashimoto1802 ; Hashimoto1809 ; You1903 ; Hashimoto1903 ; Hartnoll1906 ; Tan1908 . One can expect at least two benefits: It is helpful to understand how the spacetime emerges and can be used to build a data-driven phenomenological model for strongly coupled field theories. The latter was initiated in Hashimoto1802 , where the inverse problem of the AdS/CFT is studied, that is, how to reconstruct the spacetime metric from the given field theory data by the DNN which implements the AdS/CFT. Subsequently, the so-called AdS/DL correspondence is applied to learn the bulk metric from the lattice QCD data of the finite-temperature chiral condensate. Interestingly, the emergent metric exhibits both a black hole horizon and an IR wall with finite height, signaling the crossover of QCD thermal phases Hashimoto1809 .

Let us briefly introduce the prototype of AdS/DL (i.e., the first numerical experiment in Hashimoto1802 ). It establishes the architecture of the DNN according to the discretized equation of motion of the $\phi^{4}$ theory minimally coupled to gravity. The training data are the one-point function and the conjugate source with a label determined by the near-horizon scalar field. The target of learning is the metric of the Schwarzschild black hole. A key technique is to design the regularization by which the emergent metric is favored to be smooth. It is found that the DNN performs better near the boundary than near the horizon where the relative error is around 30%. In Tan1908 , it has been attempted to learn the Reissner-Nordström (RN) metric by AdS/DL but the mean square error (MSE) ranges from $\mathcal{O}\left(10^{-3}\right)$ to $\mathcal{O}\left(10^{-1}\right)$. Importantly, it was revealed that the form of the regularization term must be fine-tuned for different metrics. This suggests that their DNN may not find the target metric if it is not known in advance, since it is difficult to judge which of the metrics obtained under different regularizations is closer to the target.

In this paper, we will extend the physical range of the AdS/DL nontrivially and illustrate that it can be realized without previous technical problems. Our strategies are as follows. First, AdS/CFT is almost customized for the computation of the transports of strongly coupled quantum critical systems at finite temperatures ZaanenBook . In particular, the application of holography is anchored partially in the prediction of the nearly perfect fluidity KSS , which has been observed in the hot quark gluon plasmas and cold unitary Fermi gases Schaefer0904 . With these in mind, we adopt the complex frequency-dependent shear viscosity as the given field theory data. Second, we propose to build the DNN according to the holographic RG flow of the shear viscosity. Up to the holographic renormalization, this flow was presented in the well-known holographic membrane paradigm Liu0809 ; Strominger1006 , which interpolates the standard AdS/CFT correspondence and the classical black hole membrane paradigm smoothly Thorne86 ; Wilczek97 . Third, we assume the existence of the horizon, which will reduce the learning difficulties. Fourth, the systematic error in Hashimoto1802 comes from adding labels on the data and introducing the regularization. Because the horizon value of the shear response is completely determined by the regularity analysis on the horizon, we can generate the data by the flow from IR to UV. The direction of information transfer is contrary to Hashimoto1802 ; Hashimoto1809 and there is no error caused by the labels of the data. Fifth, we still use the regularization to guide the network to find a smooth metric. However, our training process has two stages and the regularization is only required in the first stage, so we can choose any regularization term as long as it leads to a smaller loss in the second stage. Finally, we will discuss possible extensions and physical implications.

From RG flow to DNN.—Suppose that a strongly coupled field theory is dual to the (3+1)-dimensional Einstein gravity minimally coupled with matter, which admits a homogeneous and isotropic (along the field theory directions) black hole solution with the metric ansatz

$$ds^{2}=-g_{tt}(r)dt^{2}+g_{rr}(r)dr^{2}+g_{xx}(r)d\vec{x}^{2}. \tag{1}$$

When the black hole is perturbed by time-dependent sources, the shear mode $\left(\delta g\right)_{\;x_{2}}^{x_{1}}=h(r)e^{-i\omega t}$ of the gravitational wave is controlled by the equation of motion

$$\frac{1}{\sqrt{-g}}\partial_{r}(\sqrt{-g}g^{rr}\partial_{r}h)+g^{tt}\omega^{2}h=0, \tag{2}$$

provided that the graviton is massless Hartnoll1601 . (Footnote 1: The gauged coordinate-invariance symmetries in the bulk demand that the gravitons are massless and they are dual to the global spacetime symmetries on the boundary ZaanenBook .) In the Hamiltonian form, the wave equation can be written as

$$\Pi=-\sqrt{-g}g^{rr}\partial_{r}h, \tag{3}$$

$$\partial_{r}\Pi=\sqrt{-g}g^{tt}\omega^{2}h, \tag{4}$$

where $\Pi$ is the momentum conjugate to the field $h$. Consider the foliation in the $r$-direction and define the shear response function $\chi=\Pi/(i\omega h)$ on each cutoff surface. Substituting Eq. (3) into Eq. (4), one can obtain

$$\partial_{r}\chi-i\omega\sqrt{\frac{g_{rr}}{g_{tt}}}\left(\frac{\chi^{2}}{g_{xx}}-g_{xx}\right)=0. \tag{5}$$
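The algebra behind Eq. (5) can be spelled out as follows. This is a sketch under two stated assumptions that are not written explicitly in the text: $g^{tt}$ in Eq. (2) is read as $1/g_{tt}$ for the ansatz (1), and $\sqrt{-g}=\sqrt{g_{tt}g_{rr}}\,g_{xx}$ in 3+1 dimensions. Combining Eqs. (2) and (3) gives $\partial_{r}\Pi=\sqrt{-g}\,g^{tt}\omega^{2}h$, and then, with $\Pi=i\omega h\chi$ and $\partial_{r}h=-\Pi/(\sqrt{-g}\,g^{rr})$,

```latex
\begin{aligned}
\partial_{r}\chi
 &= \frac{\partial_{r}\Pi}{i\omega h}-\frac{\Pi\,\partial_{r}h}{i\omega h^{2}}
  = \frac{\sqrt{-g}\,g^{tt}\omega^{2}}{i\omega}
    +\frac{\Pi^{2}}{i\omega h^{2}\sqrt{-g}\,g^{rr}} \\
 &= -i\omega\sqrt{-g}\,g^{tt}+\frac{i\omega\,\chi^{2}}{\sqrt{-g}\,g^{rr}}
  = i\omega\sqrt{\frac{g_{rr}}{g_{tt}}}\left(\frac{\chi^{2}}{g_{xx}}-g_{xx}\right).
\end{aligned}
```

The last step uses $\sqrt{-g}\,g^{tt}=\sqrt{g_{rr}/g_{tt}}\;g_{xx}$ and $1/(\sqrt{-g}\,g^{rr})=\sqrt{g_{rr}/g_{tt}}\,/g_{xx}$, so the two metric components enter only through the ratio $g_{rr}/g_{tt}$.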

Note that this radial flow equation has been derived in Liu0809 , where the focus is on the DC limit. (Footnote 2: In the Wilsonian formulation, the flow equation can be retrieved as the $\beta$-functions of double-trace couplings Son1009 ; Polchinski1010 ; Liu1010 .) We will study the frequency-dependent behavior.

Applying the regularity of $\chi$ on the horizon, one can read off the horizon value of $\chi$ directly

$$\chi(r_{h})=g_{xx}(r_{h}), \tag{6}$$

where $r_{h}$ is the horizon radius. Taking Eq. (6) as the IR boundary condition, the flow equation can be integrated to the UV boundary. However, it should be pointed out that the response function $\chi$ on the UV boundary is not equal to the shear viscosity $\eta$ of the boundary field theory. In the Supplementary Material (SM), we will clarify the relation between them using the Kubo formula of the complex frequency-dependent shear viscosity Read1207 ; Wu2015 and the holographic renormalization Henningson9806 ; HJ . It can be found that for a large class of holographic models, including the Einstein-Maxwell theory which will be studied below, the relation is

$$\eta(\omega)=\left.\chi(\omega,r)+i\omega r\right|_{r\rightarrow\infty}. \tag{7}$$

In the metric ansatz (1), $g_{xx}$ can be fixed as $r^{2}$ without loss of generality but $g_{tt}$ and $g_{rr}$ are independent in general. However, there are some black holes which share the feature $g_{tt}g_{rr}=1$, indicating that the radial pressure is the negative of the energy density Jacobson0707 . For simplicity, we will consider this situation first and return to the more general case later. Thus, the metric ansatz can be reduced to

$$ds^{2}=\frac{1}{z^{2}}\left[-f(z)dt^{2}+\frac{1}{f(z)}dz^{2}+d\vec{x}^{2}\right], \tag{8}$$

where we have used the coordinate $z=r_{h}/r$ so that the horizon is located at $z=1$ and the boundary at $z=0$. Accordingly, Eq. (5) can be rewritten as

$$\left(\eta-\frac{i\omega}{z}\right)^{\prime}+\frac{i\omega}{f}\left[z^{2}\left(\eta-\frac{i\omega}{z}\right)^{2}-\frac{1}{z^{2}}\right]=0, \tag{9}$$

where we have set $r_{h}=1$ and the prime denotes the derivative with respect to $z$. Note that we have replaced $\chi(\omega,z)$ with $\eta(\omega,z)-i\omega/z$ from IR to UV. Compared to Eq. (7) where the replacement occurs only on the UV, we have found that this technique reduces the discretized error considerably. The radially varying function $\eta(\omega,z)$ can be referred to as the holographic RG flow of the shear viscosity. In the following, we will build a DNN according to the flow equation (9).
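For readers who want to check the change of variables from Eq. (5) to Eq. (9), a short worked version is given here. It is a sketch with $r_{h}=1$, and the intermediate expressions for the metric components in the $r$ coordinate are derived from the ansatz (8) rather than quoted from the text.

```latex
\begin{aligned}
& r=\frac{1}{z},\qquad \partial_{r}=-z^{2}\partial_{z},\qquad
  g_{xx}=\frac{1}{z^{2}},\qquad g_{tt}=\frac{f}{z^{2}},\qquad g_{rr}=\frac{z^{2}}{f},\qquad
  \sqrt{\frac{g_{rr}}{g_{tt}}}=\frac{z^{2}}{f},\\[4pt]
& -z^{2}\chi^{\prime}-i\omega\frac{z^{2}}{f}\left(z^{2}\chi^{2}-\frac{1}{z^{2}}\right)=0
  \;\Longrightarrow\;
  \chi^{\prime}+\frac{i\omega}{f}\left(z^{2}\chi^{2}-\frac{1}{z^{2}}\right)=0,
  \qquad \chi=\eta-\frac{i\omega}{z}.
\end{aligned}
```

With the same substitution, the horizon condition (6) becomes $\chi(z=1)=1$, that is, $\eta(\omega,z=1)=1+i\omega$.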

Figure 1: The architecture of the DNN. The green and blue nodes have $N$ layers, which propagate the shear viscosity from IR to UV by discretized RG flow equations (11). The arrows indicate the direction of information transfer.

FIG. 1 is a schematic diagram of the DNN. The $N$ deep layers are located by discretizing the radial direction

$$z(n)=z_{b}+n\Delta z,\;\Delta z=\frac{z_{h}-z_{b}}{N}, \tag{10}$$

where $z_{b}$ ($z_{h}$) is the UV (IR) cutoff and the integer $n$ belongs to $\left[1,N\right]$. The trainable weights of the DNN represent the discretized metrics. The input is $\eta(\omega,z_{h})$ and the output is $\eta(\omega,z_{b})$. The information is transferred from the $N$th layer (IR) to the first layer (UV). The propagation rule of $\eta(\omega,z)$ between layers is determined by the discretized representation of Eq. (9):

$$\begin{aligned}
\mathrm{Re}\,\eta\left(z+\Delta z\right) &= \mathrm{Re}\,\eta\left(z\right)\left[1+\Delta z\frac{2\omega z^{2}}{f\left(z\right)}\left(\mathrm{Im}\,\eta\left(z\right)-\frac{\omega}{z}\right)\right],\\
\mathrm{Im}\,\eta\left(z+\Delta z\right) &= \mathrm{Im}\,\eta\left(z\right)+\Delta z\frac{\omega z^{2}}{f\left(z\right)}\left[\frac{1-f(z)}{z^{4}}-\left(\mathrm{Re}\,\eta\left(z\right)\right)^{2}+\left(\mathrm{Im}\,\eta\left(z\right)-\frac{\omega}{z}\right)^{2}\right].
\end{aligned} \tag{11}$$

Here we have separated the discretized flow equation into real and imaginary parts for the convenience in DL.
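The layer rule of Eq. (11) can be illustrated with a few lines of plain Python. This is a minimal sketch and not the authors' implementation: the names `layer_step` and `propagate_ir_to_uv` are hypothetical, the list `f_weights` stands for the trainable values $f(z(n))$, and each layer is assumed to apply Eq. (11) with step $-\Delta z$ (from $z(n)$ to $z(n-1)$) so that information runs from IR to UV. The default cutoffs are the values quoted later in the text.

```python
def layer_step(re_eta, im_eta, omega, z, f_z, dz):
    """One layer: Eq. (11) applied with step -dz, i.e. from z to z - dz (assumed)."""
    shift = im_eta - omega / z          # Im(eta) - omega / z
    pref = omega * z**2 / f_z           # omega z^2 / f(z)
    d_re = 2.0 * pref * re_eta * shift
    d_im = pref * ((1.0 - f_z) / z**4 - re_eta**2 + shift**2)
    return re_eta - dz * d_re, im_eta - dz * d_im


def propagate_ir_to_uv(omega, f_weights, eta_ir, z_b=0.01, z_h=0.99):
    """Send eta from the N-th layer (IR, z_h) down to the UV cutoff z_b."""
    n_layers = len(f_weights)
    dz = (z_h - z_b) / n_layers
    re_eta, im_eta = eta_ir.real, eta_ir.imag
    for n in range(n_layers, 0, -1):    # n = N, N-1, ..., 1
        z_n = z_b + n * dz              # Eq. (10)
        # f_weights[n - 1] plays the role of the weight f(z(n))
        re_eta, im_eta = layer_step(re_eta, im_eta, omega, z_n, f_weights[n - 1], dz)
    return complex(re_eta, im_eta)


if __name__ == "__main__":
    N = 10
    dz = (0.99 - 0.01) / N
    # Illustrative weights: f(z) = 1 - z^3 sampled on the layers
    weights = [1.0 - (0.01 + n * dz) ** 3 for n in range(1, N + 1)]
    omega = 0.5
    print(propagate_ir_to_uv(omega, weights, complex(1.0, omega)))
```

Keeping the real and imaginary parts as separate real numbers mirrors the split in Eq. (11), which is convenient for frameworks whose layers act on real tensors.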

The loss function we choose is the $L^{2}$-norm

$$L_{\mathrm{DNN}}=\sum_{\mathrm{data}}\left|\eta(\omega,z_{b})-\bar{\eta}(\omega,z_{b})\right|^{2}, \tag{12}$$

up to a regularization term, if existent. Here $\eta$ represents the input data and $\bar{\eta}$ is what the DNN generates.

We need a regularization term which can guide the DNN to find a smooth black hole metric. In principle, the form of the regularization term can be arbitrary as long as it can reduce the final loss. In practice, our regularization term is specified as

$$L_{\mathrm{REG}}=c_{1}\sum_{n=1}^{N-1}\frac{1}{z(n)^{c_{2}}}\left[f(z(n+1))-f(z(n))\right]^{2}+c_{3}\left[f(z(N))-0\right]^{2}, \tag{13}$$

where the two parts are designed for the smoothness of the metric and the existence of the horizon, respectively. Three hyperparameters $c_{1}$, $c_{2}$, and $c_{3}$ are introduced.
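Equations (12) and (13) translate directly into code. The following is a sketch only: the function names are hypothetical, the hyperparameter values in the usage lines are arbitrary placeholders (the text does not quote them here), and `f_weights[n - 1]` is assumed to store $f(z(n))$.

```python
def data_loss(eta_pred, eta_data):
    """Eq. (12): sum over the data of |eta - eta_bar|^2 at the UV cutoff."""
    return sum(abs(p - d) ** 2 for p, d in zip(eta_pred, eta_data))


def regularization(f_weights, c1, c2, c3, z_b=0.01, z_h=0.99):
    """Eq. (13): smoothness term plus horizon term."""
    n_layers = len(f_weights)
    dz = (z_h - z_b) / n_layers
    smooth = 0.0
    for n in range(1, n_layers):        # n = 1, ..., N-1
        z_n = z_b + n * dz
        # [f(z(n+1)) - f(z(n))]^2 / z(n)^c2
        smooth += (f_weights[n] - f_weights[n - 1]) ** 2 / z_n ** c2
    horizon = (f_weights[-1] - 0.0) ** 2  # pushes f(z(N)) toward zero
    return c1 * smooth + c3 * horizon


def total_loss(eta_pred, eta_data, f_weights, c1, c2, c3):
    """Data term plus regularization, as used when the regularization is switched on."""
    return data_loss(eta_pred, eta_data) + regularization(f_weights, c1, c2, c3)


if __name__ == "__main__":
    flat = [0.5] * 10                   # a flat trial metric: no smoothness penalty
    print(regularization(flat, c1=1.0, c2=2.0, c3=4.0))
    print(data_loss([1 + 1j], [1 + 0j]))
```

The weight $1/z(n)^{c_{2}}$ penalizes jumps more strongly near the boundary for positive $c_{2}$, while the $c_{3}$ term only acts on the last (near-horizon) weight.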

Figure 2: The data of the shear viscosity generated by different metrics. (a) and (b) denote the real and imaginary parts, respectively. The curves are generated by the continuous holographic RG flow equation of the shear viscosity, while the markers are generated by the discretized equation.

Generated data and discretized error.—We specify the discretized RG flow and hence the DNN by setting $z_{b}=0.01$, $z_{h}=0.99$, and $N=10$. Using the discretized flow equation (11) with the IR boundary condition $\eta(\omega,z_{h})=1+i\omega$ and the target metric, we can generate the data required by DL. Here we consider the two most famous black holes, i.e. the Schwarzschild and RN. They are characterized respectively by the functions

$$f(z)=1-z^{3}, \tag{14}$$

$$f(z)=1-z^{3}-\frac{q^{2}z^{3}}{4}+\frac{q^{2}z^{4}}{4}, \tag{15}$$

where $q$ is the charge density. We generate 2000 data $(\omega,\ \eta(\omega,z_{b}))$ with even frequency spacing. The training set and validation set account for 90% and 10%, respectively. In Fig. (2), we compare the data with the ones generated by numerically solving the continuous flow equation of the shear viscosity. It is found that the discretized error is small when the target is the Schwarzschild metric, and increases with the frequency and the charge density. The discretized error can be reduced by adding more layers but it requires powerful computing capabilities. As a proof of principle, here we simply assume that the discretized error does not affect our results qualitatively. (Footnote 3: Recently, the discretized error has been taken into account for the application of AdS/DL to the QCD experimental data Hashimoto2005 .)
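The data generation described above can be sketched as follows. Several choices here are assumptions rather than details from the text: the function names are hypothetical, the frequencies are taken as 2000 evenly spaced points in $(0,\omega_{\max}]$, the 90%/10% split is done by a seeded random shuffle, and the flow is stepped in complex form, $\eta\to\eta-\Delta z\,\frac{i\omega z^{2}}{f}\left[\frac{1-f}{z^{4}}-\left(\eta-\frac{i\omega}{z}\right)^{2}\right]$, which is Eq. (11) recombined into one complex update with step $-\Delta z$.

```python
import random


def f_schwarzschild(z):
    """Eq. (14)."""
    return 1.0 - z**3


def f_rn(z, q):
    """Eq. (15), with charge density q."""
    return 1.0 - z**3 - q**2 * z**3 / 4.0 + q**2 * z**4 / 4.0


def eta_uv(omega, f, z_b=0.01, z_h=0.99, n_layers=10):
    """Discretized flow from z_h to z_b with the IR condition eta = 1 + i*omega."""
    dz = (z_h - z_b) / n_layers
    eta = complex(1.0, omega)
    for n in range(n_layers, 0, -1):
        z = z_b + n * dz
        fz = f(z)
        chi = eta - 1j * omega / z
        eta -= dz * (1j * omega * z**2 / fz) * ((1.0 - fz) / z**4 - chi**2)
    return eta


def make_dataset(f, omega_max=1.0, n_points=2000, val_fraction=0.1, seed=0):
    """Return (training set, validation set) of pairs (omega, eta(omega, z_b))."""
    omegas = [omega_max * (k + 1) / n_points for k in range(n_points)]
    data = [(w, eta_uv(w, f)) for w in omegas]
    random.Random(seed).shuffle(data)   # assumed: random split
    n_val = int(round(val_fraction * n_points))
    return data[n_val:], data[:n_val]


if __name__ == "__main__":
    train, val = make_dataset(f_schwarzschild)
    print(len(train), len(val))
    train_rn, val_rn = make_dataset(lambda z: f_rn(z, q=1.0))
    print(train_rn[0])
```

Because the data and the network share the same discretization in this sketch, a network that recovers the target weights exactly would reproduce the data exactly; the comparison with the continuous flow in Fig. 2 is a separate check that this code does not attempt.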

Emergence and generalization.—With the data in hand, we will train the DNN and extract the weights. The training scheme will be given in the SM. In TABLE S.1, we list the training reports after two training stages of various numerical experiments. Among others, it is shown that from the dataset with $\omega\in(0,1]$ (Footnote 4: The frequency is measured in units of the horizon radius.), the Schwarzschild and RN metrics can be learned with high accuracy: the mean relative error (MRE) is around $0.1\%$. Note that the MSE is $\mathcal{O}\left(10^{-7}\right)$. The target and learned metrics have been plotted in FIG. (3.a).
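For concreteness, the two error measures quoted above can be computed as in the sketch below. The precise definitions used for TABLE S.1 are not stated in this section, so the standard forms are assumed here, and the function names and sample numbers are purely illustrative.

```python
def mean_relative_error(learned, target):
    """Assumed definition: average of |learned - target| / |target| over the layers."""
    if len(learned) != len(target):
        raise ValueError("learned and target must have the same length")
    return sum(abs(a - b) / abs(b) for a, b in zip(learned, target)) / len(target)


def mean_square_error(learned, target):
    """Assumed definition: average of |learned - target|^2 over the layers."""
    if len(learned) != len(target):
        raise ValueError("learned and target must have the same length")
    return sum(abs(a - b) ** 2 for a, b in zip(learned, target)) / len(target)


if __name__ == "__main__":
    # Made-up numbers, not results from the paper
    target = [0.9, 0.6, 0.3]
    learned = [0.9009, 0.5994, 0.3003]
    print(mean_relative_error(learned, target))
    print(mean_square_error(learned, target))
```

Since the target weight at the last layer is small but nonzero for $z_{h}=0.99$, the relative error stays well defined there, although it weights the near-horizon region more heavily than the MSE does.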

So far, we have almost naively selected the frequency range of the data as $\Delta\omega=1$. One important question in DL is how well the model generalizes. To proceed, we consider different datasets with the narrow frequency range $\Delta\omega=10^{-2}$ and keep each of them with 2000 data. Interestingly, we find that both Schwarzschild and RN metrics still can be well learned, although the error will increase when the frequency window is close to zero and especially when the charge density is large. This is shown by the MRE of the metrics learned from two typical windows, see the right half of FIG. (3.b). Furthermore, it suggests that the generalization ability of the DNN can be excellent. Indeed, in the left half of FIG. (3.b), we illustrate that using the metric learned from the data with $\Delta\omega=10^{-2}$, one can generate the data with $\Delta\omega=1$ very accurately. In the best performing example, the MRE of the generated data can be $\mathcal{O}\left(10^{-6}\right)$. We also note that the examples with relatively large errors in FIG. (3.b) can be expected because the DC limit of the shear viscosity is determined solely by the physics on the horizon. In particular, when the charge density increases, the RN black hole approaches extremality and the IR CFT associated with the AdS$_{2}\times\mathrm{R}^{2}$ geometry gradually begins to dominate the low-frequency physics Liu0907 ; Edalati0910 . Similarly, we do not expect that the DNN can learn well from a very high-frequency window, where the UV CFT associated with the AdS boundary should dominate. (Footnote 5: In fact, it has been observed in Matteo1903 ; Matteo1910 that the retarded Green function for the shear stress operator at the infinite frequency is determined by the energy density. We thank Matteo Baggioli for the discussion on this point.)

Figure 3: The performance of the DNN. (a) The curves are the target metrics and the markers are the results learned from the data A, which has a wide frequency range. (b) The right half of the bars represents the MRE of the metrics which are learned from the data with narrow ranges B and C. The left half represents the MRE of the wideband data, which are generated using the metrics learned from the narrowband.

Beyond Einstein-Maxwell.—Our DNN is not only applicable to the Einstein-Maxwell theory. In the SM, we demonstrate in general that the DNN can be applied to the Einstein-Maxwell-dilaton (EMD) theory with the typical potential and coupling of the dilaton Gubser0911 . The only nontrivial constraint is that the conformal dimension of the dilaton operator $\Delta_{\phi}$ should be less than 5/2. To be more specific, we further study a concrete EMD theory. It admits the analytical black hole solution with $\Delta_{\phi}=2$ and its holographic renormalization has been given in Kim1608 . This exemplifies our general argument explicitly. We also carry out the numerical experiments as before. It is found that the performance of the DNN for the EMD black hole is similar to that for the RN, see TABLE S.1. Note that at the zero-temperature limit, the EMD black hole becomes a special hyperscaling violating Lifshitz geometry with the asymptotic AdS. Moreover, here the metric components $g_{tt}(r)g_{rr}(r)\neq 1$. So what the DNN has learned is the joint factor

$$f(z)=\left.\frac{1}{r^{2}}\sqrt{\frac{g_{tt}(r)}{g_{rr}(r)}}\right|_{r=r_{h}/z}, \tag{16}$$

as we will explain below.
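As a quick consistency check of Eq. (16), one can verify that the joint factor reduces to the blackening function of the ansatz (8) when $g_{tt}g_{rr}=1$. The sketch assumes $r_{h}=1$, $g_{xx}=r^{2}$, and $f>0$ outside the horizon.

```latex
g_{tt}=r^{2}f,\qquad g_{rr}=\frac{1}{r^{2}f}
\quad\Longrightarrow\quad
\frac{1}{r^{2}}\sqrt{\frac{g_{tt}}{g_{rr}}}
 =\frac{1}{r^{2}}\sqrt{r^{4}f^{2}}=f,
\qquad
\sqrt{\frac{g_{rr}}{g_{tt}}}=\frac{1}{r^{2}f}.
```

In the general case the same combination $\sqrt{g_{rr}/g_{tt}}=1/(r^{2}f)$ is the only way the two components enter Eq. (5), so defining $f$ by Eq. (16) leaves the form of the flow equation (9) unchanged.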

Viscosity and entanglement.—A more general metric like the EMD black hole has two independent components $g_{tt}$ and $g_{rr}$. From Eq. (5), one can find that they appear in the form of the joint factor $g_{rr}/g_{tt}$. Therefore, without the constraint on the energy-momentum tensor for $g_{tt}g_{rr}=1$, the DNN still can be applied to learn the joint factor, but in general each of them cannot be learned separately from the shear response. Nevertheless, if there is another way to determine one component, the other can be obtained by the DNN. For example, there is evidence that the entanglement plays an important role in weaving the spacetime Maldacena0106 ; RT0603 ; Swingle0905 ; Raamsdonk1010 ; Susskind1306 . Among others, it has been shown that the holographic entanglement entropy $S(l)$ can be used to fix the bulk metric wherever the extremal surface reaches Bilson0807 , which can be described as

$$\left.r\sqrt{g_{rr}(r)}\right|_{r=r_{h}/z}=\frac{1}{2\pi^{2}L}z^{2}\partial_{z}\int_{z_{b}}^{z}\frac{z_{\ast}S(z_{\ast})}{\sqrt{z^{4}-z_{\ast}^{4}}}dz_{\ast}, \tag{17}$$

where we have set the gravitational constant $16\pi G=1$ and $z_{\ast}$ is determined by $S^{\prime}(l)=8\pi L/z_{\ast}^{2}$. Note that $l$ and $L$ denote the finite width and the (regularized) infinite length of the rectangle on which the extremal surface is anchored RT0603 . Since the holographic entanglement entropy is only related to $g_{rr}$, it can complement the shear viscosity to determine two metric components.

Conclusion and discussion.—Using a simple DL algorithm, we studied an inverse problem of AdS/CFT: Given the complex frequency-dependent shear viscosity of boundary field theories at finite temperatures, can the bulk metrics of black holes be extracted? We showed that Schwarzschild, RN, and EMD metrics can be learned by the DNN with high accuracy. The network architecture can be taken as a discretized representation of the holographic RG flow of the shear viscosity, hence supporting the underlying relationship among DL, RG, and gravity. We pointed out that our DNN is applicable to any strongly coupled field theory provided that: it is dual to the (3+1)-dimensional Einstein gravity minimally coupled with matter, it allows a homogeneous and isotropic black hole solution, the graviton mass in the wave equation vanishes, and the UV relation (7) holds. The extensions to the symmetry-breaking situations, the higher spacetime dimensions, and the modified theories of gravity should be worthwhile. Among others, we note that using the wave equation with the graviton mass which has been built up in Hartnoll1601 , one can construct the RG flow and the DNN where the graviton mass is encoded into new trainable weights. It is interesting to study whether the DNN can learn the metric and the mass simultaneously. In addition to various extensions, there are two open questions which should be mentioned. (i) Is there a better ansatz for the regularization term? Note that the regularization in this work is not to prevent overfitting as usual in machine learning. Instead, it is a guide to the minimum loss. We might need a deeper physical understanding of the regularization. (ii) How to reduce the discretized error at high frequencies and low temperatures sufficiently? Compared with directly increasing the number of layers, a more efficient method might be to apply the recently proposed DNN models of ordinary differential equations Chen1806 . 
We ultimately hope that our work could suggest a data-driven way to study holographic transports.

Moreover, we found that the complete black hole metric from IR to UV can be well learned from the data with narrow frequency ranges. We also checked that the performance of the DNN is hardly changed by randomly deleting several data points in our numerical experiments. These two facts indicate that the shear viscosity encodes the spacetime in a very different way from the entanglement entropy. The latter probes the deeper spacetime only by the $S(l)$ with a larger $l$, so any data point is necessary to reconstruct the spacetime. Perhaps we can describe the difference concisely as follows: the non-local observable (entanglement entropy) on the boundary probes the bulk spacetime locally, while the local observable (shear viscosity) probes it non-locally.

Furthermore, this non-locality leads to the excellent generalization ability of the DNN, which should be important in the application to the experimental data collected only in a part of the spectrum. Theoretically, from the perspective of machine learning, it usually implies that the data are highly structured. (Footnote 6: Another possibility is that the network has some symmetry Zhai1901 .) This structure is often important but obscure, due to the infamous black-box problem of machine learning. (Footnote 7: For example, using the generative adversarial network (GAN), the approximate statistical predictions have been made recently in the string theory landscape Halverson2001 , where the accurate extrapolation capability has been exhibited on simulating Kähler metrics. It was speculated that this is the first evidence of Reid’s fantasy: all Calabi-Yau manifolds with fixed dimension are connected.) However, here the structure is nothing but the higher-dimensional black hole spacetime. This strong emergence might shed some light on the understanding of DL. Last but not least, the excellent generalization suggests that the strongly coupled field theories with gravity dual could exhibit another feature of the hologram in addition to encoding the higher dimension: The local (a small piece of the hologram) can reproduce the whole, see the schematic diagram FIG. (4).

Figure 4: Schematic diagram: the DNN encodes the black hole spacetime, by which a wide spectrum of the field theory data can be generated from its narrow piece.

Acknowledgments.—We thank Koji Hashimoto for reading the draft and giving valuable comments. We also thank Matteo Baggioli, Yongcheng Ding, Wei-Jia Li, Tomi Ohtsuki, and Fu-Wen Shu for helpful discussions. SFW and XHG are supported by NSFC grants No. 11675097 and No. 11875184, respectively. YT is supported partially by NSFC grants No. 11975235 and No. 11675015. He is also supported by the “Strategic Priority Research Program of the Chinese Academy of Sciences” with Grant No. XDB23030000.

References

Supplementary material for
‘Deep learning black hole metrics from shear viscosity’

I Complex frequency-dependent shear viscosity

Compared to the real shear viscosity at the zero frequency limit, the study of the complex frequency-dependent counterpart is rare. So let us begin by reviewing the Kubo formula of the complex frequency-dependent shear viscosity. In Read1207 , the Kubo formulas for the stress-stress response function at zero wavevector are derived from first principles. The approach given in Read1207 starts from a microscopic Hamiltonian and defines the viscosity tensor as the linear response to a uniform external strain. In Wu2015 , an alternative field-theory approach is proposed, by which the Ward identity of viscosity coefficients in Read1207 is retrieved and extended. Here we will follow Wu2015 to give the definition of the complex frequency-dependent shear viscosity by the generating functional.

For a theorist, the viscosity can be measured by sending a gravitational wave through the system Son2007 . Suppose that a homogeneous and isotropic system lives in the two-dimensional flat space which is perturbed by a uniform gravitational wave. The response tensor $Y^{ijkl}$ can be defined by

$$\delta\langle T^{ij}(t)\rangle_{\mathrm{r}}=-\frac{1}{2}\int dt^{\prime}Y^{ijkl}(t-t^{\prime})\partial_{t^{\prime}}\delta g_{\mathrm{r}kl}(t^{\prime}). \tag{S.1}$$

Here the subscript $\mathrm{r}$ and the subscript $\mathrm{a}$ below indicate that we have invoked the closed time-path formalism to discuss the real-time response. The elastic modulus and the viscosity tensor can be further defined by separating the right-hand side of Eq. (S.1) into two parts,

$$\delta\langle T^{ij}(t)\rangle_{\mathrm{r}}=-\frac{1}{2}\int dt^{\prime}\lambda^{ijkl}(t-t^{\prime})\delta g_{\mathrm{r}kl}(t^{\prime})-\frac{1}{2}\int dt^{\prime}\eta^{ijkl}(t-t^{\prime})\partial_{t^{\prime}}\delta g_{\mathrm{r}kl}(t^{\prime}). \tag{S.2}$$

The stress tensor can be derived by the variation of the generating functional with respect to the metric

$$\langle T^{ij}(t)\rangle_{\mathrm{r}}=\frac{2}{\sqrt{g}}\frac{\delta W}{\delta g_{\mathrm{a}ij}(t)}, \tag{S.3}$$

and the second variation leads to the retarded correlator

$$G_{\mathrm{ra}}^{ij,kl}(t)\equiv\frac{4\delta W}{\delta g_{\mathrm{a}ij}(t)\delta g_{\mathrm{r}kl}(0)}=\delta^{kl}\langle T^{ij}\rangle\delta(t)-\lambda^{ijkl}(t)-\partial_{t}\eta^{ijkl}(t). \tag{S.4}$$

The elastic modulus is the stress response up to the zeroth-order in time derivatives, which can be determined by the constitutive relation of perfect fluids. In hydrodynamic expansion, one has

$$\delta\langle T^{ij}(t)\rangle_{\mathrm{r}}=-\left(P\delta^{ik}\delta^{jl}+\frac{1}{2}\delta^{ij}\delta^{kl}\kappa^{-1}\right)\delta g_{\mathrm{r}kl}(t), \tag{S.5}$$

where $P$ is the pressure and $\kappa^{-1}$ is the inverse compressibility. Then the elastic modulus can be given by

$$\lambda^{ijkl}(t)=\left[P\left(\delta^{ik}\delta^{jl}+\delta^{il}\delta^{jk}\right)+\delta^{ij}\delta^{kl}\kappa^{-1}\right]\delta(t). \tag{S.6}$$

The viscosity tensor can be decomposed as

$$\eta^{ijkl}(t)=\zeta(t)\delta^{ij}\delta^{kl}+\eta(t)\left(\delta^{ik}\delta^{jl}+\delta^{il}\delta^{jk}-\delta^{ij}\delta^{kl}\right)+\eta^{H}(t)\left(\delta^{jk}\epsilon^{il}-\delta^{il}\epsilon^{kj}\right). \tag{S.7}$$

The coefficients $\zeta$, $\eta$, $\eta^{H}$ denote the bulk, shear, and Hall viscosities, respectively. Substituting the last two equations into Eq. (S.4), one can obtain

$$\begin{aligned}
G_{\mathrm{ra}}^{12,12}(t) &= -\lambda^{1212}(t)-\partial_{t}\eta^{1212}(t)\\
&= -P\delta(t)-\partial_{t}\eta(t).
\end{aligned} \tag{S.8}$$

In Fourier space, we have the Kubo formula of the shear viscosity

$$\eta(\omega)=\frac{G_{\mathrm{ra}}^{12,12}(\omega)+P}{i\omega}. \tag{S.9}$$

Note that this formula applies to the complex shear viscosity at all frequencies. In contrast, much of the literature focuses on the real part and the DC limit of the shear viscosity, so the pressure in Eq. (S.9) is often neglected.
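The step from Eq. (S.8) to Eq. (S.9) is a one-line Fourier transform. The sketch below assumes the convention $F(\omega)=\int dt\,e^{i\omega t}F(t)$, so that $\partial_{t}\rightarrow-i\omega$ after integrating by parts with vanishing boundary terms; this convention is an assumption chosen to match the $e^{-i\omega t}$ time dependence of the shear mode.

```latex
G_{\mathrm{ra}}^{12,12}(\omega)
 =\int dt\,e^{i\omega t}\left[-P\delta(t)-\partial_{t}\eta(t)\right]
 =-P+i\omega\,\eta(\omega)
\quad\Longrightarrow\quad
\eta(\omega)=\frac{G_{\mathrm{ra}}^{12,12}(\omega)+P}{i\omega}.
```

The pressure term is a real constant in $G_{\mathrm{ra}}^{12,12}(\omega)$, so after division by $i\omega$ it contributes only to $\mathrm{Im}\,\eta(\omega)$, which is why it can be dropped when only the real part is of interest.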

II Holographic renormalization

We proceed to bridge the shear viscosity $\eta(\omega)$ to the shear response $\chi(\omega)$. The essential procedure is to carry out the holographic renormalization Henningson9806 . Consider that the bulk action includes the Einstein gravity and the minimally coupled matter

$$S_{\mathrm{bulk}}=\int d^{4}x\sqrt{-g}\left(R+6+L_{\mathrm{matter}}\right), \tag{S.10}$$

where we have set the AdS radius $L=1$ and the Newton constant $16\pi G_{N}=1$. Suppose that the background metric is homogeneous and isotropic along the field theory directions

$$ds^{2}=-g_{tt}(r)dt^{2}+g_{rr}(r)dr^{2}+g_{xx}(r)d\vec{x}^{2}. \tag{S.11}$$

Accordingly, the energy-momentum tensor can be written as

$$T_{\mu\nu}=\mathrm{diag}\left(T_{tt}(r),T_{rr}(r),T_{xx}(r),T_{xx}(r)\right). \tag{S.12}$$

Perturbing the Einstein equation around this background, one derives the wave equation of the shear mode [Hartnoll1601]

$$\frac{1}{\sqrt{-g}}\partial_{r}\left(\sqrt{-g}g^{rr}\partial_{r}h\right)+g^{tt}\omega^{2}h=0, \tag{S.13}$$

where we have assumed that the squared graviton mass $m^{2}=g^{xx}T_{xx}-\delta T_{xy}/\delta g_{xy}$ vanishes. Without loss of generality, we set $g_{xx}=r^{2}$ hereafter. We further require that the metric can be expanded near the AdS boundary in the form

$$\begin{aligned} g_{tt} &= r^{2}\left(1+\frac{a_{1}}{r}+\frac{a_{2}}{r^{2}}+\frac{a_{3}}{r^{3}}+\cdots\right), \\ g^{rr} &= r^{2}\left(1+\frac{b_{1}}{r}+\frac{b_{2}}{r^{2}}+\frac{b_{3}}{r^{3}}+\cdots\right), \end{aligned} \tag{S.14}$$

where $a_{i}$ and $b_{i}$ are constants.

Near the boundary, the wave equation of the shear mode has the asymptotic solution

$$h=h^{(0)}+\frac{1}{r^{2}}h^{(2)}+\frac{1}{r^{3}}h^{(3)}+\cdots. \tag{S.15}$$

Here $h^{(0)}$ is the source, $h^{(2)}$ is fixed by $h^{(0)}$ as $h^{(2)}=h^{(0)}\omega^{2}/2$, and $h^{(3)}$ depends on $h^{(0)}$ and the incoming boundary condition at the horizon. In solving the asymptotic equation, one finds

$$\left(a_{1}+b_{1}\right)\omega^{2}h^{(0)}=0. \tag{S.16}$$

The situation $a_{1}=-b_{1}\neq 0$ is rare, if it exists at all, so we focus on $a_{1}=-b_{1}=0$.
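
For readers who want to see where the relation for $h^{(2)}$ and Eq. (S.16) come from, the following is a short worked expansion. It is a sketch under two assumptions that are our reading of the conventions rather than statements from the text: $g^{tt}$ in Eq. (S.13) stands for $1/g_{tt}$, and $g_{xx}=r^{2}$ so that $\sqrt{-g}=r^{2}\sqrt{g_{tt}g_{rr}}$.

$$\begin{aligned} \sqrt{-g}\,g^{rr} &= r^{2}\sqrt{g_{tt}g^{rr}} = r^{4}\left[1+\frac{a_{1}+b_{1}}{2r}+O(r^{-2})\right], \\ \sqrt{-g}\,g^{tt} &= \frac{r^{2}}{\sqrt{g_{tt}g^{rr}}} = 1-\frac{a_{1}+b_{1}}{2r}+O(r^{-2}), \\ \partial_{r}\left(\sqrt{-g}\,g^{rr}\partial_{r}h\right) &= -2h^{(2)}+O(r^{-2}), \\ \sqrt{-g}\,g^{tt}\omega^{2}h &= \omega^{2}h^{(0)}-\frac{a_{1}+b_{1}}{2r}\omega^{2}h^{(0)}+O(r^{-2}). \end{aligned}$$

Multiplying Eq. (S.13) by $\sqrt{-g}$ and adding the last two lines, the order $r^{0}$ terms give $h^{(2)}=h^{(0)}\omega^{2}/2$, while the order $r^{-1}$ terms give $(a_{1}+b_{1})\omega^{2}h^{(0)}=0$, in agreement with Eq. (S.16).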

We write down the Gibbons-Hawking term and the counterterms

$$S_{\mathrm{GH}} = -2\int d^{3}x\sqrt{-\gamma}K, \tag{S.17}$$
$$S_{\mathrm{ct}} = \int d^{3}x\sqrt{-\gamma}\left(-4+R+L_{\mathrm{matter}}^{(1)}\right), \tag{S.18}$$

where $\gamma^{ab}$ is the induced metric, $K$ is the extrinsic curvature, and $L_{\mathrm{matter}}^{(1)}$ is the matter contribution.

We expand the on-shell bulk action, the Gibbons-Hawking term and the counterterms to the quadratic order of the shear mode,

$$\begin{aligned} &\left.S_{\mathrm{bulk}}+S_{\mathrm{GH}}+S_{\mathrm{ct}}\right|_{\mathrm{on-shell,\;quadratic}} \\ &= \int d^{2}x\int_{-\infty}^{\infty}\frac{d\omega}{2\pi}\frac{1}{2}\left[\left(-\frac{r^{2}\omega^{2}}{\sqrt{g_{tt}}}+4r^{2}\sqrt{g_{tt}}-2r\sqrt{\frac{g_{tt}}{g_{rr}}}-\frac{r^{2}g_{tt}^{\prime}}{\sqrt{g_{tt}g_{rr}}}+L_{\mathrm{matter}}^{(2)}\right)\bar{h}h-r^{2}\sqrt{\frac{g_{tt}}{g_{rr}}}\bar{h}h^{\prime}\right], \end{aligned} \tag{S.19}$$

where $\bar{h}$ has the argument $-\omega$ and $L_{\mathrm{matter}}^{(2)}$ denotes the matter contribution, which may be divergent on the boundary.

Substituting the asymptotic solution (S.15) and the metric (S.14) into Eq. (S.19), we obtain the renormalized action:

$$S_{\mathrm{ren}}=\int d^{2}x\int_{-\infty}^{\infty}\frac{d\omega}{2\pi}\frac{1}{2}\left[\left(3a_{3}-2b_{3}+L_{\mathrm{matter}}^{(3)}\right)\bar{h}^{(0)}h^{(0)}+3\bar{h}^{(0)}h^{(3)}\right], \tag{S.20}$$

where $L_{\mathrm{matter}}^{(3)}$ is the finite contribution of the matter.

Invoking the holographic dictionary, one can extract the retarded correlator from $S_{\mathrm{ren}}$:

$$G_{\mathrm{ra}}^{12,12}(\omega)=\left(3a_{3}-2b_{3}+L_{\mathrm{matter}}^{(3)}\right)+3\frac{h^{(3)}}{h^{(0)}}. \tag{S.21}$$

Expanding the response function near the boundary yields

$$\chi\equiv\frac{\Pi}{i\omega h}=-\frac{\sqrt{-g}g^{rr}\partial_{r}h}{i\omega h}=\left.\frac{3}{i\omega}\frac{h^{(3)}}{h^{(0)}}-i\omega r\right|_{r\rightarrow\infty}. \tag{S.22}$$

Then we have

$$G_{\mathrm{ra}}^{12,12}(\omega)=\left(3a_{3}-2b_{3}+L_{\mathrm{matter}}^{(3)}\right)+i\omega\left(\chi+i\omega r\right)_{r\rightarrow\infty}. \tag{S.23}$$

Reading the pressure $P=-G_{\mathrm{ra}}^{12,12}(0)$ from Eq. (S.9) and the DC response $\chi(0)=r_{h}^{2}$ from Eq. (5) of the main text, we find

$$G_{\mathrm{ra}}^{12,12}(\omega)=-P+i\omega\left(\chi+i\omega r\right)_{r\rightarrow\infty}+L_{\mathrm{matter}}^{(3)}\left(\omega\right)-L_{\mathrm{matter}}^{(3)}\left(0\right). \tag{S.24}$$

Notice that, given the usual Dirichlet boundary conditions, $L_{\mathrm{matter}}^{(1)}$ should be an intrinsic scalar on the $(2+1)$-dimensional boundary, which indicates that we can parameterize

$$L_{\mathrm{matter}}^{(3)}\left(\omega\right)-L_{\mathrm{matter}}^{(3)}\left(0\right)=\omega^{2}M, \tag{S.25}$$

where $M$ is a finite real number.

Combining Eq. (S.24), Eq. (S.25) and Eq. (S.9), we obtain the relation between shear response and shear viscosity

$$\eta(\omega)=\left.\chi(\omega)+i\omega r\right|_{r\rightarrow\infty}-i\omega M. \tag{S.26}$$
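
The intermediate algebra behind Eq. (S.26) is short enough to spell out. Inserting Eq. (S.24) with the parameterization (S.25) into the Kubo formula (S.9), the pressure cancels and one is left with

$$\begin{aligned} \eta(\omega) &= \frac{G_{\mathrm{ra}}^{12,12}(\omega)+P}{i\omega} = \frac{i\omega\left(\chi+i\omega r\right)_{r\rightarrow\infty}+\omega^{2}M}{i\omega} \\ &= \left.\chi(\omega)+i\omega r\right|_{r\rightarrow\infty}+\frac{\omega M}{i} = \left.\chi(\omega)+i\omega r\right|_{r\rightarrow\infty}-i\omega M. \end{aligned}$$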

Some remarks on the parameter $M$ are in order. First, for the Einstein-Maxwell theory, $M$ is equal to zero. Second, it also vanishes for some other matter fields. We take the massive scalar field $\phi$ as an example, which is described by

$$L_{\mathrm{matter}}=-\frac{1}{2}g^{\mu\nu}\partial_{\mu}\phi\partial_{\nu}\phi-\frac{1}{2}m_{\phi}^{2}\phi^{2}. \tag{S.27}$$

It is dual to a relevant operator with conformal dimension $\Delta_{\phi}=\frac{3}{2}+\sqrt{\frac{9}{4}+m_{\phi}^{2}}$ when $\Delta_{\phi}<3$. The scalar field yields two counterterms related to $\omega^{2}$, namely

$$L_{\mathrm{matter}}^{(1)}=\phi\nabla^{2}\phi+R\phi^{2}, \tag{S.28}$$

where we have neglected two prefactors. When $\Delta_{\phi}<5/2$, these two terms do not contribute to $M$. Third, in bottom-up holographic models, usually only the IR behavior of the metric is of interest, whereas the holographic renormalization depends on the UV alone. Thus, one can assume that the target IR metric is embedded into a suitable UV background for which $M=0$. In the main text, we focus on UV-complete metrics with $M=0$ for simplicity; more generally, one can follow the same philosophy. Thus, for a large class of strongly coupled theories with a gravity dual, we have obtained

$$\eta(\omega)=\left.\chi(\omega)+i\omega r\right|_{r\rightarrow\infty}. \tag{S.29}$$
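
To illustrate how Eq. (S.29) can be evaluated in practice, the following is a minimal sketch that integrates a radial flow of $\chi$ from the horizon to a large cutoff and then adds $i\omega r$. The flow equation used here, $\partial_{r}\chi=i\omega\left[\chi^{2}/(r^{2}F)-r^{2}/F\right]$ with $F=\sqrt{g_{tt}g^{rr}}$, is obtained by differentiating the definition of $\chi$ in Eq. (S.22) and using Eq. (S.13) with $g^{tt}=1/g_{tt}$; it is meant to agree with the flow equation of the main text only up to conventions. The Schwarzschild form $g_{tt}=g^{rr}=r^{2}(1-r_{h}^{3}/r^{3})$, the logarithmic radial variable, the Runge-Kutta integrator, and all function names and numerical parameters are assumptions made for this example, not the DNN discretization of the paper.

```python
import math


def blackening(r, rh):
    """Assumed Schwarzschild-AdS4 factor f(r) = 1 - (rh/r)^3."""
    return 1.0 - (rh / r) ** 3


def flow_rhs(r, chi, omega, rh):
    """d(chi)/dr = i*omega*[chi^2/(r^2 F) - r^2/F], with F = sqrt(g_tt g^rr) = r^2 f."""
    big_f = r * r * blackening(r, rh)
    return 1j * omega * (chi * chi / (r * r * big_f) - r * r / big_f)


def shear_viscosity(omega, rh=1.0, eps=1e-6, r_max=1e3, n_steps=20000):
    """Integrate chi from r = rh*(1 + eps) to r_max, then return chi + i*omega*r_max."""
    s_start = math.log(eps * rh)      # s = log(r - rh) resolves the near-horizon region
    s_end = math.log(r_max - rh)
    ds = (s_end - s_start) / n_steps

    def deriv(s, chi):
        x = math.exp(s)               # x = r - rh, so d(chi)/ds = x * d(chi)/dr
        return x * flow_rhs(rh + x, chi, omega, rh)

    chi = complex(rh * rh)            # horizon regularity: chi(rh) = rh^2
    s = s_start
    for _ in range(n_steps):          # classical fourth-order Runge-Kutta steps
        k1 = deriv(s, chi)
        k2 = deriv(s + 0.5 * ds, chi + 0.5 * ds * k1)
        k3 = deriv(s + 0.5 * ds, chi + 0.5 * ds * k2)
        k4 = deriv(s + ds, chi + ds * k3)
        chi += ds * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        s += ds
    return chi + 1j * omega * r_max   # Eq. (S.29) evaluated at the cutoff r_max


eta_dc = shear_viscosity(0.0)
eta_low = shear_viscosity(0.01)
```

At $\omega=0$ the flow is trivial and the sketch returns $r_{h}^{2}$, the DC response quoted above. The term $i\omega r$ cancels the linear growth of $\chi$ at large $r$, so the result should be insensitive to `r_max` once the cutoff is large.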

III Einstein-Maxwell-dilaton theory

We will illustrate that our DNN can be applied to the Einstein-Maxwell-dilaton (EMD) theory [Gubser0911]. Consider the bulk action in a $(d+1)$-dimensional spacetime

$$S_{\mathrm{bulk}}=\int d^{d+1}x\sqrt{-g}\left[R-\frac{Z\left(\phi\right)}{4}F^{\mu\nu}F_{\mu\nu}-\frac{1}{2}\nabla_{\mu}\phi\nabla^{\mu}\phi-V\left(\phi\right)\right]. \tag{S.30}$$

Here $V\left(\phi\right)$ is the potential of the dilaton and $Z\left(\phi\right)$ is the coupling between the dilaton and the gauge field. For any choice of the dilaton functions, the wave equation of the shear mode is the same as Eq. (S.13). The dilaton functions should be constrained to admit an asymptotically AdS spacetime solution. Typically, they can be expanded near the AdS boundary as

$$\begin{aligned} Z\left(\phi\right) &= 1+\cdots, \\ V\left(\phi\right) &= -d(d-1)+\frac{1}{2}V_{2}\phi^{2}+\cdots, \end{aligned} \tag{S.31}$$

where the ellipses denote higher-order terms in $\phi$. Using the gauge

$$ds^{2}=d\rho^{2}+\gamma_{ij}(\rho,x)dx^{i}dx^{j},\;A_{\rho}=0, \tag{S.32}$$

we further impose the boundary conditions on the fields [Papadimitriou0505]

$$\gamma_{ij}(\rho,x)\simeq e^{2\rho}\bar{\gamma}_{ij}(x),\;A_{i}(\rho,x)\simeq A_{i}(x),\;\phi(\rho,x)\simeq e^{-(d-\Delta_{\phi})\rho}\bar{\phi}(x), \tag{S.33}$$

where the conformal dimension is

$$\Delta_{\phi}=\frac{d}{2}+\sqrt{\frac{d^{2}}{4}+m_{\phi}^{2}} \tag{S.34}$$

with the mass squared

$$m_{\phi}^{2}=\left.\frac{\partial^{2}}{\partial\phi^{2}}\left[V\left(\phi\right)+\frac{Z\left(\phi\right)}{4}F^{2}\right]\right|_{\phi=0}. \tag{S.35}$$

According to the holographic renormalization, especially the Hamilton-Jacobi approach [HJ], one can build up the most general ansatz for the intrinsic counterterms on the boundary

$$S_{\mathrm{ct}}=-2\int d^{d}x\sqrt{-\gamma}\,U(\gamma^{ij},A_{i},\phi). \tag{S.36}$$

Footnote 8: Here we select the grand canonical ensemble and impose the Dirichlet boundary condition on the dilaton as usual. If the Neumann or a mixed boundary condition is imposed, one should include an additional boundary term that is not intrinsic on the boundary. However, it does not contribute to the parameter $M$, as we will show below.

The function $U$ can be organized in an expansion

$$U=U_{(0)}+U_{(2)}+\cdots+U_{(2\left\lfloor\frac{d}{2}\right\rfloor)}, \tag{S.37}$$

where $U_{(2k)}$ contains $k$ inverse metrics and $\left\lfloor d/2\right\rfloor$ denotes the largest integer not exceeding $d/2$. Hereafter, we focus on $d=3$ and $U=U_{(0)}+U_{(2)}$.

Keeping in mind the U(1) symmetry, one finds that the contribution of the gauge fields starts from $U_{(4)}$. The dilaton may contribute to both $U_{(0)}$ and $U_{(2)}$. The leading terms of $U_{(2)}$ involving the dilaton are $\phi\nabla^{2}\phi$ and $R\phi^{2}$. Thus, as analyzed in the previous section, we can require $\Delta_{\phi}<5/2$ to ensure that the matter parameter $M$ vanishes.

We proceed to study a concrete EMD theory [Gubser0911]. We specify the potential and the coupling as

$$Z\left(\phi\right)=\exp(\phi/\sqrt{3}),\;V\left(\phi\right)=-6\cosh(\phi/\sqrt{3}). \tag{S.38}$$

This theory allows an analytical black hole solution

$$\begin{aligned} ds^{2} &= u^{2}f_{2}(u)\left(-f_{1}(u)dt^{2}+dx^{2}+dy^{2}\right)+\frac{1}{u^{2}f_{1}(u)f_{2}(u)}du^{2}, \\ f_{1}(u) &= 1-\frac{1}{\left(Q+u\right)^{3}}\left(m_{+}+Q^{3}\right), \\ f_{2}(u) &= \left(1+\frac{Q}{u}\right)^{\frac{3}{2}}, \end{aligned} \tag{S.39}$$

associated with the profile of matter fields

$$A = \frac{\sqrt{3Q\left(m_{+}+Q^{3}\right)}}{Q+u_{+}}\frac{u-u_{+}}{Q+u}dt, \tag{S.40}$$
$$\phi(u) = \frac{\sqrt{3}}{2}\log\left(1+\frac{Q}{u}\right), \tag{S.41}$$

where $m_{+}$ and $Q$ are two independent parameters. Note that $m_{+}$ is related to the horizon radius $u_{+}$. Interestingly, when the black hole approaches extremality, the solution interpolates between the asymptotically AdS geometry in the UV and a conformal-to-AdS$_{2}$ geometry in the IR, with the Lifshitz and hyperscaling-violating exponents $z=\infty$ and $-\theta/z=1$.
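
The background functions in Eqs. (S.39)-(S.41) are simple enough to code directly, which also makes the relation between $m_{+}$ and $u_{+}$ explicit. The sketch below assumes that the horizon is the root of $f_{1}$, so that $m_{+}=(Q+u_{+})^{3}-Q^{3}$; this relation and all function names are introduced here for illustration and are not quoted from the text.

```python
import math


def m_plus_from_horizon(u_plus, q):
    """Assumed horizon condition f1(u_plus) = 0, i.e. m_+ = (Q + u_+)^3 - Q^3."""
    return (q + u_plus) ** 3 - q ** 3


def f1(u, q, m_plus):
    """Blackening-type function f1(u) of Eq. (S.39)."""
    return 1.0 - (m_plus + q ** 3) / (q + u) ** 3


def f2(u, q):
    """Function f2(u) of Eq. (S.39)."""
    return (1.0 + q / u) ** 1.5


def gauge_potential(u, q, u_plus):
    """Time component of the gauge field in Eq. (S.40); it vanishes at the horizon."""
    m_plus = m_plus_from_horizon(u_plus, q)
    prefactor = math.sqrt(3.0 * q * (m_plus + q ** 3)) / (q + u_plus)
    return prefactor * (u - u_plus) / (q + u)


def dilaton(u, q):
    """Dilaton profile of Eq. (S.41); it decays to zero at the boundary u -> infinity."""
    return 0.5 * math.sqrt(3.0) * math.log(1.0 + q / u)


# Example parameters: horizon radius u_+ = 1 with charge parameter Q = 1.
u_plus, q = 1.0, 1.0
m_plus = m_plus_from_horizon(u_plus, q)
```

With these helper functions, the metric components $g_{tt}=u^{2}f_{1}f_{2}$ and $g_{xx}=u^{2}f_{2}$ follow by direct multiplication.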

Substituting the background solution into Eq. (S.34) and Eq. (S.35), we obtain the conformal dimension $\Delta_{\phi}=2$. Since it is less than $5/2$, we can infer that the matter parameter $M=0$. Moreover, the holographic renormalization of this theory has been studied in [Kim1608]. Consequently, we can double check $M=0$ by calculating the renormalized action explicitly. From [Kim1608], we read the counterterms

$$S_{\mathrm{ct}}=\int d^{3}x\sqrt{-\gamma}\left(-4+R+\frac{1}{3}\phi n^{u}\partial_{u}\phi-\frac{1}{6}\phi^{2}\right). \tag{S.42}$$

Note that we have selected the grand canonical ensemble and imposed the mixed boundary condition on the dilaton, to be consistent with the first law of thermodynamics. Here $n^{u}$ is the radial component of the outward unit vector normal to the boundary.
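
The value $\Delta_{\phi}=2$ quoted above can be cross-checked numerically from Eqs. (S.34), (S.35), and (S.38). The sketch below keeps only the potential term in Eq. (S.35), under the assumption that $F^{2}$ vanishes at the AdS boundary so that the $Z(\phi)F^{2}$ term does not contribute there; the finite-difference step and the function names are choices made for this example.

```python
import math


def potential(phi):
    """Dilaton potential of Eq. (S.38): V(phi) = -6 cosh(phi / sqrt(3))."""
    return -6.0 * math.cosh(phi / math.sqrt(3.0))


def second_derivative(func, x, h=1e-4):
    """Central finite-difference estimate of the second derivative of func at x."""
    return (func(x + h) - 2.0 * func(x) + func(x - h)) / (h * h)


def conformal_dimension(mass_squared, d=3):
    """Eq. (S.34): Delta = d/2 + sqrt(d^2/4 + m^2)."""
    return 0.5 * d + math.sqrt(0.25 * d * d + mass_squared)


# Assumption: only V contributes to m_phi^2 at the boundary (the F^2 term is dropped).
mass_squared = second_derivative(potential, 0.0)
delta_phi = conformal_dimension(mass_squared)
```

Analytically, $V''(0)=-6\cdot\frac{1}{3}=-2$, so $\Delta_{\phi}=\frac{3}{2}+\sqrt{\frac{9}{4}-2}=2$, which the numerical estimate reproduces up to finite-difference error.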

In order to use the formalism of the previous section, we need to change the coordinate $u\rightarrow r$, defined by

$$r=\sqrt{g_{xx}}=\left(1+\frac{Q}{u}\right)^{3/4}u. \tag{S.43}$$

The exact solution is lengthy. To be explicit, we expand Eq. (S.43) at small $Q$, which gives the approximate solution

$$u=\frac{1}{4}\left(-3Q+\sqrt{3Q^{2}+16r^{2}}\right). \tag{S.44}$$
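
A quick way to gauge the quality of the small-$Q$ inversion in Eq. (S.44) is to map $r\rightarrow u$ with the approximate formula and then map back with the exact relation in Eq. (S.43). The sketch below does this for one illustrative pair of values; the function names and the sample numbers are assumptions made for the example.

```python
import math


def r_of_u(u, q):
    """Exact coordinate map of Eq. (S.43): r = (1 + Q/u)^(3/4) * u."""
    return u * (1.0 + q / u) ** 0.75


def u_of_r_small_q(r, q):
    """Approximate inverse of Eq. (S.44), valid for small Q."""
    return 0.25 * (-3.0 * q + math.sqrt(3.0 * q * q + 16.0 * r * r))


# Illustrative values: a small charge parameter and a point well outside the horizon.
q, r = 0.1, 5.0
u_approx = u_of_r_small_q(r, q)
residual = abs(r_of_u(u_approx, q) - r)   # round-trip mismatch in r
```

Expanding both expressions shows that they agree through order $Q^{2}$, so the round-trip mismatch should shrink like $Q^{3}$ as $Q$ decreases, and it vanishes identically at $Q=0$.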

Then we read off

$$L_{\mathrm{matter}}^{(1)} = \frac{1}{3}\left[\phi n^{r}\partial_{r}\phi-\frac{1}{2}\phi^{2}\right], \tag{S.45}$$
$$L_{\mathrm{matter}}^{(2)} = \frac{1}{12}\left[\frac{1}{2}\sqrt{g_{tt}}r^{2}\phi^{2}-\frac{\sqrt{g_{tt}}}{\sqrt{g_{rr}}}r^{2}\phi\phi^{\prime}\right]\bar{h}h, \tag{S.46}$$
$$L_{\mathrm{matter}}^{(3)} = \frac{1}{16}Q^{3}\bar{h}h. \tag{S.47}$$

As expected, one can find

$$L_{\mathrm{matter}}^{(3)}\left(\omega\right)-L_{\mathrm{matter}}^{(3)}\left(0\right)=\omega^{2}M=0. \tag{S.48}$$

Also, we have checked that $M$ is still zero for large $Q$.

Then we can carry out the numerical experiments. We set the horizon radius $u_{+}=1$ for convenience. The performance of the DNN for the EMD black hole is similar to that for the RN black holes.

IV Training scheme and report

For all numerical experiments in this paper, we implement the same training scheme. We train the network in two stages. In the first stage, the initial weights are randomly selected from $\left(0,2\right)$, the loss function is given by the sum of Eq. (12) and Eq. (13) of the main text, and we adopt the RMSProp optimizer [RMSProp]. In the second stage, the DNN is initialized with the trained weights of the first stage, the loss function is reset to Eq. (12) of the main text without the regularization, and the network is trained again with the Adam optimizer [Adam]. After each stage, one can read the loss, extract the weights, and calculate their error. We find that the second stage usually improves the performance of the DNN. In particular, the loss (without regularization) of the first stage can be reduced by several orders of magnitude. Moreover, tuning the regularization factors in the first stage can improve the performance of the DNN in the second stage. With this in mind, we scan the parameter space of regularization factors.

In both stages of training, we fix the batch size $c_{\mathrm{bs}}=512$, whereas the learning rate is adjusted more than once and is chosen empirically. The number of epochs in each stage is large enough that training does not stop until the validation loss has almost ceased to decrease. The regularization factor $c_{3}$ is set to $15c_{1}$. We focus on tuning the regularization factors $c_{1}$ and $c_{2}$ because the performance of the DNN is more sensitive to them than to other hyperparameters. Initially, $c_{1}$ and $c_{2}$ are limited to some suitable ranges. Then we scan the two-dimensional parameter space. Considering the statistical fluctuation due to the randomized initialization of the network, we train $5$ times for each set of regularization factors [Iten1807]. We gradually shift and shrink the parameter ranges and reduce the step sizes, so that the scan concentrates around the values of $c_{1}$ and $c_{2}$ for which the DNN produces smaller losses. We stop the scan when the step sizes satisfy $\Delta c_{1}<10^{-5}$ and $\Delta c_{2}<0.01$.
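
The scanning procedure described above can be sketched as a coarse-to-fine grid search. The code below is only an illustration under several assumptions that are not stated in the text: a square grid of five points per direction, a window that is halved and recentred on the best point of each round, and the minimum over the repeated runs as the score of each grid point. The callable `train_once` is a hypothetical stand-in for one full two-stage training run that returns the final loss; here it is replaced by a toy quadratic.

```python
import random


def refine_scan(train_once, c1_range, c2_range, n_grid=5, n_repeats=5,
                tol_c1=1e-5, tol_c2=1e-2, shrink=0.5, seed=0):
    """Coarse-to-fine scan over (c1, c2); returns (loss, c1, c2) of the best point."""
    rng = random.Random(seed)
    (lo1, hi1), (lo2, hi2) = c1_range, c2_range
    best = None                                   # best point over all rounds
    while True:
        step1 = (hi1 - lo1) / (n_grid - 1)
        step2 = (hi2 - lo2) / (n_grid - 1)
        round_best = None
        for i in range(n_grid):
            for j in range(n_grid):
                c1, c2 = lo1 + i * step1, lo2 + j * step2
                # Repeat training to average out random initialization (assumed: keep the minimum).
                loss = min(train_once(c1, c2, rng) for _ in range(n_repeats))
                if round_best is None or loss < round_best[0]:
                    round_best = (loss, c1, c2)
        if best is None or round_best[0] < best[0]:
            best = round_best
        if step1 < tol_c1 and step2 < tol_c2:     # stopping rule on the step sizes
            return best
        # Shrink the window and recentre it on the best point of this round.
        half1 = 0.5 * shrink * (hi1 - lo1)
        half2 = 0.5 * shrink * (hi2 - lo2)
        lo1, hi1 = round_best[1] - half1, round_best[1] + half1
        lo2, hi2 = round_best[2] - half2, round_best[2] + half2


def toy_loss(c1, c2, rng):
    """Hypothetical stand-in for one training run (rng is unused in this toy)."""
    return 1e6 * (c1 - 3e-4) ** 2 + (c2 - 1.5) ** 2


best_loss, best_c1, best_c2 = refine_scan(toy_loss, (0.0, 1e-3), (0.0, 3.0))
```

In a real scan, `train_once` would run both training stages with $c_{3}=15c_{1}$ and return the second-stage loss, so each grid point costs five full trainings.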

We select as optimal the regularization factors that give the minimum loss. We then read the minimum loss and the corresponding trained weights of the network, and calculate the MRE of the weights and the MRE of the wideband data generated by the weights learned from narrowband data. These quantities are taken as the final performance of the DNN. We refer to them simply as the “minimum loss”, the “learned metric”, the “MRE of learned metrics”, and the “MRE of generated data”, respectively. They are shown in FIG. 3 of the main text and in TABLE S.1.
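
For concreteness, the following sketch shows one common way to compute such an error measure. It assumes that MRE stands for the mean relative error, averaged over the entries of the weights or of the generated data; the precise definition is the one given in the main text, and the function name and sample numbers here are illustrative only.

```python
def mean_relative_error(predicted, target):
    """Mean of |predicted - target| / |target| over all entries (assumed MRE definition)."""
    if len(predicted) != len(target) or len(target) == 0:
        raise ValueError("predicted and target must be non-empty and of equal length")
    total = sum(abs(p - t) / abs(t) for p, t in zip(predicted, target))
    return total / len(target)


# Toy example with hypothetical learned and target metric values.
mre_example = mean_relative_error([1.1, 1.8], [1.0, 2.0])
```

Because `abs` also handles complex numbers, the same function can be applied to the complex shear viscosity data as well as to real metric values.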

Table S.1: Training reports of various numerical experiments. We have five target metrics, and each of them is learned from three datasets: A with $\omega\in(0,1]$, B with $\omega\in(0.99,1]$, and C with $\omega\in(0.01,0.02]$. We list the two optimal regularization factors and three quantities that characterize the performance of the DNN.

| Target | Data | Optimal $c_{1}$ | Optimal $c_{2}$ | Minimum loss | MRE of learned metrics | MRE of generated data |
| --- | --- | --- | --- | --- | --- | --- |
| Schwarzschild | A | $1.20\times 10^{-3}$ | $2.02$ | $1\times 10^{-13}$ | $1\times 10^{-3}$ | / |
| Schwarzschild | B | $1.4\times 10^{-4}$ | $1.52$ | $6\times 10^{-13}$ | $2\times 10^{-3}$ | $5\times 10^{-6}$ |
| Schwarzschild | C | $3.4\times 10^{-4}$ | $1.49$ | $1\times 10^{-14}$ | $5\times 10^{-3}$ | $4\times 10^{-4}$ |
| RN $q=1$ | A | $9\times 10^{-5}$ | $2.00$ | $1\times 10^{-13}$ | $1\times 10^{-3}$ | / |
| RN $q=1$ | B | $2.6\times 10^{-4}$ | $1.30$ | $7\times 10^{-13}$ | $4\times 10^{-3}$ | $3\times 10^{-5}$ |
| RN $q=1$ | C | $1.1\times 10^{-4}$ | $0.90$ | $1\times 10^{-14}$ | $3\times 10^{-2}$ | $2\times 10^{-3}$ |
| RN $q=2$ | A | $7\times 10^{-5}$ | $1.25$ | $1\times 10^{-12}$ | $7\times 10^{-4}$ | / |
| RN $q=2$ | B | $6\times 10^{-5}$ | $1.15$ | $2\times 10^{-11}$ | $1\times 10^{-2}$ | $5\times 10^{-5}$ |
| RN $q=2$ | C | $4.2\times 10^{-4}$ | $0.50$ | $2\times 10^{-14}$ | $4\times 10^{-2}$ | $7\times 10^{-3}$ |
| EMD $Q=1$ | A | $8.5\times 10^{-4}$ | $1.85$ | $2\times 10^{-13}$ | $1\times 10^{-3}$ | / |
| EMD $Q=1$ | B | $2.2\times 10^{-4}$ | $1.20$ | $1\times 10^{-12}$ | $6\times 10^{-3}$ | $5\times 10^{-5}$ |
| EMD $Q=1$ | C | $2.6\times 10^{-4}$ | $1.00$ | $1\times 10^{-14}$ | $1\times 10^{-2}$ | $1\times 10^{-3}$ |
| EMD $Q=5$ | A | $2.5\times 10^{-4}$ | $1.16$ | $6\times 10^{-13}$ | $6\times 10^{-4}$ | / |
| EMD $Q=5$ | B | $1.1\times 10^{-4}$ | $0.85$ | $2\times 10^{-12}$ | $4\times 10^{-3}$ | $2\times 10^{-5}$ |
| EMD $Q=5$ | C | $3.0\times 10^{-4}$ | $0.40$ | $2\times 10^{-14}$ | $4\times 10^{-2}$ | $7\times 10^{-3}$ |