p-Adic Statistical Field Theory and Convolutional Deep Boltzmann Machines
Abstract
Understanding how deep learning architectures work is a central scientific problem. Recently, a correspondence between neural networks (NNs) and Euclidean quantum field theories (QFTs) has been proposed. This work investigates this correspondence in the framework of p-adic statistical field theories (SFTs) and neural networks. In this case, the fields are real-valued functions defined on an infinite regular rooted tree with valence p, a fixed prime number. This infinite tree provides the topology for a continuous deep Boltzmann machine (DBM), which is identified with a statistical field theory on this infinite tree. In the p-adic framework, there is a natural method to discretize SFTs. Each discrete SFT corresponds to a Boltzmann machine (BM) with a tree-like topology. This method allows us to recover the standard DBMs and gives new convolutional DBMs. The new networks use significantly fewer parameters than the classical ones.
1 Introduction
Deep neural networks have been successfully applied to various tasks, including self-driving cars, natural language processing, and visual recognition, among many others [1, 2]. There is consensus about the need to develop a theoretical framework to understand how deep learning architectures work. Recently, physicists have proposed the existence of a correspondence between neural networks (NNs) and quantum field theories (QFTs), more precisely statistical field theories (SFTs); see [3]-[12] and the references therein. This correspondence takes different forms depending on the architecture of the networks involved.
In [12], the study of the above-mentioned correspondence was initiated in the framework of non-Archimedean statistical field theory. In this case, the background space (the set of real numbers) is replaced by the set of p-adic numbers, where p is a fixed prime number. The p-adics are organized in a tree-like structure; this feature facilitates the description of hierarchical architectures. In [12], a p-adic version of the convolutional deep Boltzmann machines is introduced, considering only binary data and without an implementation. By adapting the mathematical techniques introduced by Le Roux and Bengio in [13], the author shows that these machines are universal approximators for binary data tasks.
In this article, we continue discussing the correspondence between SFTs and NNs in the p-adic framework. Compared with [12], here we consider more general architectures and data types. We note that dealing with general data is challenging both in theory and in implementation practice. We argue that p-adic analysis still provides the right framework to understand the dynamics of NNs with large tree-like hierarchical architectures. In our approach, a NN corresponds to the discretization of a p-adic SFT. The discretization process is carried out in a rigorous and general way. Moreover, such a discretization allows us to obtain many recently developed deep BMs. For instance, the NNs constructed in [5] are a particular case of the ones introduced here. We also discuss the implementation of a class of p-adic convolutional networks and obtain the desired results on a feature detection task based on handwritten images of decimal digits.
The main novelty of our p-adic convolutional DBMs is that they use significantly fewer parameters than the conventional ones. A detailed discussion is given in Section 6. We note that connections between p-adic numbers and neural networks have been considered before. Neural networks whose states are p-adic numbers were studied in [14, 15]. These models are completely different from the ones considered here. These ideas have been used to develop non-Archimedean models of brain activity and mental processes [16]. In [17, 18], p-adic versions of the cellular neural networks were studied. These models involved abstract evolution equations.
2 p-Adic Numbers
In this section, we introduce basic concepts of the p-adic numbers. For more detailed information, refer to Section 7 (Appendix A).
From now on, $p$ denotes a fixed prime number. Any non-zero p-adic number $x$ has a unique expansion of the form
$$x = p^{\gamma}\sum_{k=0}^{\infty} x_k p^{k},$$
with $x_0 \neq 0$, where $\gamma = \gamma(x)$ is an integer, and the $x_k$s are numbers from the set $\{0, 1, \ldots, p-1\}$. The set of all possible numbers of such form constitutes the field of p-adic numbers $\mathbb{Q}_p$. There are natural field operations, sum and multiplication, on p-adic numbers; see, e.g., [36]. There is also a natural norm in $\mathbb{Q}_p$ defined as $|x|_p = p^{-\gamma}$, where $\gamma = \gamma(x)$ depends on $x$, for a nonzero p-adic number $x$.

The field of p-adic numbers $\mathbb{Q}_p$ with the distance induced by $|\cdot|_p$ is a complete ultrametric space. The ultrametric property refers to the fact that $|x - z|_p \leq \max\{|x - y|_p, |y - z|_p\}$ for any $x$, $y$, $z$ in $\mathbb{Q}_p$. In this article, we work with p-adic integers, which are the p-adic numbers satisfying $|x|_p \leq 1$. All such p-adic integers constitute the unit ball $\mathbb{Z}_p$. The unit ball is closed under addition and multiplication, so it is a commutative ring. Along this article, we work mainly with locally constant functions supported in the unit ball, i.e., with functions of type $\varphi : \mathbb{Z}_p \to \mathbb{R}$ such that $\varphi(x + x') = \varphi(x)$ for all $x'$ in $\mathbb{Z}_p$ with $|x'|_p \leq p^{-l}$, where the integer $l$ is fixed and independent of $x$. The simplest example of such a function is the characteristic function of the unit ball: $\Omega(|x|_p) = 1$ if $x \in \mathbb{Z}_p$, otherwise $\Omega(|x|_p) = 0$. To check that $\Omega(|x|_p)$ is locally constant, we use that $\mathbb{Z}_p$ is closed under addition. We denote by $\mathcal{D}(\mathbb{Z}_p)$ the real vector space of test functions supported in the unit ball. There is a natural integration theory so that $\int_{\mathbb{Z}_p}\varphi(x)\,dx$ gives a well-defined real number. The measure $dx$ is the so-called Haar measure of $\mathbb{Q}_p$. Further details are given in Section 7.3.
Since the p-adic numbers are infinite series, any computational implementation involving these numbers requires a truncation process: $x \mapsto x_0 + x_1 p + \cdots + x_{l-1}p^{l-1}$, for $x \in \mathbb{Z}_p$. The set of all truncated integers mod $p^l$ is denoted as $G_l$. This set can be represented as a rooted tree with $l$ levels; see Figure 1.
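As a small illustration of this construction, the following sketch (our own, not code from the paper; the function names `digits` and `parent` are ours) enumerates the truncated integers mod $p^l$ as digit tuples and checks that truncating the highest digit organizes them into a rooted tree in which every vertex has exactly $p$ children.

```python
from collections import defaultdict

def digits(n, p, l):
    """Base-p digits (i_0, ..., i_{l-1}) of n mod p**l, lowest digit first."""
    return tuple((n // p**k) % p for k in range(l))

def parent(v):
    """Dropping the highest digit truncates mod p**(l-1): the parent vertex."""
    return v[:-1]

p, l = 3, 2
leaves = [digits(n, p, l) for n in range(p**l)]   # the p**l = 9 branches

# Group the leaves by parent: every level-1 vertex has exactly p children,
# mirroring the tree picture of G_l described in the text.
children = defaultdict(list)
for v in leaves:
    children[parent(v)].append(v)
```

Iterating `parent` all the way up reproduces the whole tree of Figure 1.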
The unit ball $\mathbb{Z}_p$ is an infinite rooted tree with a fractal structure; see Figure 2. Section 7 (Appendix A) provides a review of the basic aspects of p-adic analysis required here. We note that the word 'field' is used in two different contexts throughout the article. In a mathematical context, we refer to algebraic fields; in a physical context, we refer to Euclidean quantum fields.
3 Non-Archimedean statistical field theories
We fix a prime $p$, a set of constants controlling the theory, and an integrable function $w$. A p-adic continuous Boltzmann machine (or a p-adic continuous BM) is a statistical field theory involving two scalar fields $v$, $h$. The function $v$ is called the visible field and the function $h$ is called the hidden field. We assume that the fields perform thermal fluctuations and that their expectation values are zero.


The size of the fluctuations is controlled by an energy functional of the form
where . The first term
is an analogue of the free-field energy. The second term
is an analogue of the interaction energy. The results presented in this section are valid for more general functionals in which the first term of the energy functional is replaced by
The motivation behind the definition of the energy functionals is that the discretizations of these functionals give the energy functionals considered in [5], [13], [38]-[39].
All the thermodynamic properties of the system are described by the partition function of the fluctuating fields, which is defined as
where $k_B$ is the Boltzmann constant and $T$ is the temperature. We normalize the units in such a way that $k_B T = 1$. The measure $\mathcal{D}v\,\mathcal{D}h$ is ill-defined. However, it is expected that such a measure can be defined rigorously by a limit process. The statistical field theory corresponding to the energy functional is defined as the probability measure
on the space .
The information about the local properties of the system is contained in the correlation functions of the fields: given finitely many points of $\mathbb{Z}_p$, split into two disjoint subsets attached to the visible and hidden fields, we set
These functions are also called the n-point Green functions.
To study these functions, one introduces two auxiliary external fields called currents and adds to the energy functional a linear interaction energy of these currents with the fields,
so that the energy functional becomes the sum of the original one and this interaction term. The partition function formed with this energy is
where
The functional derivatives of this partition function with respect to the currents, evaluated at zero currents, give the correlation functions of the system:
This functional is called the generating functional of the theory.
The description of the SFTs presented above is based on the classical version of these theories [40]-[41]. In [31], a mathematically rigorous formulation of p-adic SFTs is presented, where the fields are functions from $\mathbb{Q}_p^N$ into $\mathbb{R}$, with $N$ arbitrary. We expect that this theory can be extended to the SFTs presented here.
4 Discrete SFTs and p-adic discrete Boltzmann machines
A central difference between the p-adic SFTs and the classical ones is that in the p-adic case, the discretization process can be carried out in a simple and rigorous way. More specifically, the discretization of a p-adic SFT is constructed by restricting the energy functional to a finite-dimensional vector subspace $\mathcal{D}_l$ of the space of test functions $\mathcal{D}(\mathbb{Z}_p)$. The test functions in $\mathcal{D}_l$ have the form
$$\varphi(x) = \sum_{i \in G_l} \varphi(i)\,\Omega(p^{l}|x - i|_p),$$
where $\varphi(i) \in \mathbb{R}$, $i \in G_l$, and $\Omega(p^{l}|x - i|_p)$ is the characteristic function of the ball $i + p^{l}\mathbb{Z}_p$. Here, it is important to notice that $G_l$ is a finite, Abelian, additive group. In the p-adic world, the discrete functions are a particular case of the p-adic continuous functions; more precisely, $\mathcal{D}_l \subset \mathcal{D}(\mathbb{Z}_p)$. There is no Archimedean counterpart of this result.
By taking $l$ sufficiently large, the restriction of the energy functional to $\mathcal{D}_l$ has the form
(1)
We refer to Section 8 (Appendix B) for further details of this calculation. From now on, we refer to eq. (1) as the energy functional of a discrete p-adic SFT.
In the general case, eq. (1) is a p-adic analogue of the neural networks introduced in [5]. In [5], each hidden state and each visible state interact through their own weight; such a network therefore requires a number of weights quadratic in the number of units, whereas our counterpart requires only one weight per element of $G_l$, because the interaction depends only on the difference of the indices. See Section 6 for further discussion.
We now attach to the discrete energy functional a p-adic discrete BM. For any visible and hidden states $v$ and $h$, the Boltzmann probability distribution attached to the discrete energy functional is given by
When there is no risk of confusion, we will omit the dependence on the parameters from the notation. For a suitable choice of the parameters, the energy functional corresponds to a p-adic analogue of the convolutional deep belief networks studied in [38].
We note that the energy functional has translational symmetry, i.e., it is invariant under the transformations $v(i) \to v(i + \tau)$, $h(i) \to h(i + \tau)$, for any $\tau \in G_l$. This transformation is well-defined since $G_l$ is an additive group. In the case of applications to image processing, the group property also implies that the convolution operation does not alter image dimensions. The convolutional p-adic discrete BMs introduced here are a specific type of deep Boltzmann machines (DBMs), also called deep belief networks (DBNs).
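Since $G_l$ is a finite additive group (addition mod $p^l$), convolution over it is cyclic and its output has exactly as many points as its input. A minimal pure-Python sketch (our own toy example, not the paper's code):

```python
def group_convolve(a, w):
    """Convolution over the additive group Z/(p**l)Z: indices wrap around,
    so the output has exactly as many points as the input."""
    n = len(a)
    return [sum(a[(i - j) % n] * w[j] for j in range(n)) for i in range(n)]

a = [1, 0, 0, 0, 0, 0, 0, 0, 1]    # a toy signal on G_2 with p = 3 (9 points)
w = [1, 2, 0, 0, 0, 0, 0, 0, 0]    # a kernel supported near 0
out = group_convolve(a, w)          # same length as a: no boundary shrinkage
```

The wrap-around index `(i - j) % n` is exactly the group addition that makes the energy functional translation invariant.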
5 Experimental Results
We implement a p-adic discrete Boltzmann machine for processing binary images, so the fields take binary values; in this case, we use the following energy functional:
for some natural number $r$. Note that, compared to eq. (1), the quadratic and biquadratic terms are omitted since they do not play any role when the states are binary variables. The condition on the support of $w$ implies that the convolution operation is restricted to a small neighborhood of radius $p^{-r}$ around each pixel; when the support of $w$ is the whole unit ball, the convolution involves all the points in the image.
Our numerical experiment is based on the MNIST dataset, where each image is considered as a sample of the visible state. Our first task is to train the network to maximize the log-likelihood of the visible states. We choose $p$ and $l$ to match the image dimension. In general, $p$ and $l$ depend on the size of the images to be processed. Typically $p$ is chosen as a small prime number, e.g., $p = 2$ or $p = 3$.
To tune the parameters of the network, we first transform each image $I$ into a test function $\varphi_I$. The test function is defined in terms of the tree structure of $G_l$ in the following way: we take the whole image $I$ as the root of the tree. We then divide $I$ into three even horizontal slices (sub-images). These sub-images are the vertices at level 1, and they are the children of $I$. Each sub-image at level 1 is then divided vertically into three sub-images; these are its children. All the new sub-images correspond to the vertices at level 2. We repeat this process until reaching the top level, where each vertex corresponds to a pixel. The test function $\varphi_I$ assigns to each top-level vertex the value of the corresponding pixel. See Figure 3 for the construction of the tree corresponding to a small image. Figure 4 shows the graph of a test function corresponding to an image from the MNIST data set. For further details, the reader may consult the Appendix in [18].
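The recursive splitting just described can be sketched in a few lines. This is our own illustration (the helper `image_to_leaves` is hypothetical, and we assume each side of the image is a power of $p$):

```python
def image_to_leaves(img, p, level=0):
    """Flatten a 2-D image (list of rows) into tree-leaf order by recursively
    cutting it into p slices, alternating horizontal and vertical cuts."""
    rows, cols = len(img), len(img[0])
    if rows == 1 and cols == 1:
        return [img[0][0]]                 # a single pixel: one leaf value
    if level % 2 == 0:                     # even level: p horizontal slices
        step = rows // p
        pieces = [img[k*step:(k+1)*step] for k in range(p)]
    else:                                  # odd level: p vertical slices
        step = cols // p
        pieces = [[row[k*step:(k+1)*step] for row in img] for k in range(p)]
    out = []
    for piece in pieces:
        out.extend(image_to_leaves(piece, p, level + 1))
    return out

img = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]    # a toy 3x3 "image", p = 3
leaves = image_to_leaves(img, p=3)          # pixel values in tree-leaf order
```

The resulting list of leaf values is exactly the vector of coefficients of the test function on the top-level vertices of the tree.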



We note that in a p-adic discrete deep RBM, the visible and hidden states are functions on a finite tree. Only the vertices at the top level, which are marked as orange and blue balls, are allowed to have states; see Figure 5. The remaining tree vertices (the black dots in Figure 5) only codify the hierarchical relation between states. The visible and hidden vertices connected by the same type of lines share the same weight.
We adapted the contrastive divergence learning method to the p-adic framework [48]. The technical details are presented in Section 9.2. We implement two different types of networks. In the first type, the function $w$ is supported in the entire tree; in the second type, the function $w$ is supported in a proper subset of the tree. We use the full MNIST handwritten digits, without considering labels, to train a six-layer p-adic feature detector. The results are shown in Figure 6.

After the processing by the network ends, it is necessary to transform the test function back into an image.
6 Conclusions
The standard RBMs [48] and the $\varphi^4$ neural networks introduced in [5] are particular classes of p-adic discrete BMs. Indeed, if $w$ is a test function and the interaction between the visible and hidden fields is mediated by a general kernel, then in the corresponding discrete energy functional, after a suitable rescaling of the weights, the interaction of the visible and hidden states takes the form $\sum_{i,j} v(i)\, w_{ij}\, h(j)$. Here $w_{ij}$ is an ordinary matrix, which means that its entries depend neither on the algebraic structure of $G_l$ nor on its topology.
In the case in which the interaction between the visible and hidden fields is given by a convolution, the corresponding discrete energy functional depends on the group structure of $G_l$, and the corresponding neural network is a particular case of a DBM; see Figure 5.
Under a suitable condition on the parameters, a p-adic discrete BM admits arbitrarily large copies; this is a consequence of the fact that the p-adic numbers have a tree-like structure. We expect that for $l$ sufficiently large the statistical properties of the network can be studied using a p-adic continuous SFT.
Our numerical experiments show that p-adic discrete convolutional deep BMs alone can be used to process real data; this opens the possibility of using these networks as layers in specialized NNs.
In [12] the first author conjectured that the limit
(2)
exists in some sense, where the normalization involves the Lebesgue (Haar) measure of the corresponding balls. Then the correlation between the network activity in different regions of the underlying tree can be understood by computing the correlation functions of the corresponding continuous SFT.
For practical applications the NNs should be discrete entities. This type of NN naturally corresponds to discrete SFTs. To use Euclidean QFT to study NNs, it is convenient to have continuous versions of these networks. Thus, a clear way of passing from discrete SFTs to continuous ones is required. The existence of the limit (2) is a very difficult problem in classical QFT. In [31] the existence of this limit was established for p-adic $\varphi^4$ theories involving one scalar field; we expect that these techniques can be extended to the QFTs considered here.
It is widely accepted in the artificial intelligence community that the Boltzmann probability distributions should approximate any finite probability distribution very well, which means that the corresponding NNs are universal approximators. We argue that this property is connected with the topology and the structure of the NNs, and that the problem of designing good architectures for NNs is out of the scope of QFT. We expect that QFT techniques will be useful to understand the qualitative behavior of large NNs, which can be well approximated as 'continuous' NNs.
The study of the correspondence between p-adic Euclidean QFTs and NNs is just starting. We envision that the next step is to develop perturbative calculations of the correlation functions via Feynman diagrams, to study connections with Ginzburg–Landau theory, and to develop practical applications of the p-adic convolutional BMs.




7 Appendix A: Basic facts on p-adic analysis
In this section we review some basic results on p-adic analysis required in this article. For a detailed exposition on p-adic analysis the reader may consult [20], [43]-[45]. For a quick review of p-adic analysis the reader may consult [46].
7.1 The field of -adic numbers
The field of p-adic numbers $\mathbb{Q}_p$ is defined as the completion of the field of rational numbers $\mathbb{Q}$ with respect to the p-adic norm $|\cdot|_p$, which is defined as
$$|x|_p = \begin{cases} 0 & \text{if } x = 0, \\ p^{-\gamma} & \text{if } x = p^{\gamma}\dfrac{a}{b}, \end{cases} \qquad (3)$$
where $a$ and $b$ are integers coprime with $p$. The integer $\gamma = ord(x)$, with $ord(0) := +\infty$, is called the p-adic order of $x$. The metric space $(\mathbb{Q}_p, |\cdot|_p)$ is a complete ultrametric space. Ultrametric means that $|x + y|_p \leq \max\{|x|_p, |y|_p\}$. As a topological space $\mathbb{Q}_p$ is homeomorphic to a Cantor-like subset of the real line; see, e.g., [20], [37], [43].
Any non-zero p-adic number $x$ has a unique expansion of the form
$$x = p^{ord(x)}\sum_{j=0}^{\infty} x_j p^{j},$$
where $x_j \in \{0, 1, \ldots, p-1\}$ and $x_0 \neq 0$. In addition, for any $x \in \mathbb{Q}_p \setminus \{0\}$ we have $|x|_p = p^{-ord(x)}$.
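For readers who want to experiment, the p-adic order and norm of a rational number are easy to compute exactly. The following sketch is ours (not code from the paper) and uses exact rational arithmetic:

```python
from fractions import Fraction

def p_order(n, p):
    """p-adic order of a nonzero integer: the exponent of p dividing n."""
    n, k = abs(n), 0
    while n % p == 0:
        n //= p
        k += 1
    return k

def p_norm(x, p):
    """p-adic absolute value |x|_p = p**(-ord(x)), with |0|_p = 0."""
    if x == 0:
        return Fraction(0)
    x = Fraction(x)
    gamma = p_order(x.numerator, p) - p_order(x.denominator, p)
    return Fraction(1, p**gamma) if gamma >= 0 else Fraction(p**(-gamma))
```

For instance, `p_norm(12, 2)` equals 1/4 because $12 = 2^2 \cdot 3$; the ultrametric inequality $|x+y|_p \leq \max\{|x|_p, |y|_p\}$ can be checked on any pair of inputs.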
7.2 Topology of $\mathbb{Q}_p$
For $r \in \mathbb{Z}$, denote by $B_r(a) = \{x \in \mathbb{Q}_p : |x - a|_p \leq p^{r}\}$ the ball of radius $p^{r}$ with center at $a \in \mathbb{Q}_p$, and take $B_r := B_r(0)$. The ball $B_0$ equals the ring of p-adic integers $\mathbb{Z}_p$. The balls are both open and closed subsets in $\mathbb{Q}_p$. We use $\Omega(p^{-r}|x - a|_p)$ to denote the characteristic function of the ball $B_r(a)$. Two balls in $\mathbb{Q}_p$ are either disjoint or one is contained in the other. As a topological space $\mathbb{Q}_p$ is totally disconnected, i.e., the only connected subsets of $\mathbb{Q}_p$ are the empty set and the points. A subset of $\mathbb{Q}_p$ is compact if and only if it is closed and bounded in $\mathbb{Q}_p$; see, e.g., [20, Section 1.3] or [43, Section 1.8]. The balls and spheres are compact subsets. Thus $\mathbb{Q}_p$ is a locally compact topological space.
7.2.1 Tree-like structures
Any p-adic integer $x$ admits an expansion of the form $x = \sum_{j=0}^{\infty} x_j p^{j}$ for some digits $x_j \in \{0, 1, \ldots, p-1\}$. The set of p-adic truncated integers modulo $p^{l}$, $G_l$, consists of all the integers of the form $i = i_0 + i_1 p + \cdots + i_{l-1}p^{l-1}$. These numbers form a complete set of representatives for the elements of the additive group $\mathbb{Z}_p / p^{l}\mathbb{Z}_p$, which is isomorphic to the set of integers (written in base $p$) modulo $p^{l}$. By restricting $|\cdot|_p$ to $G_l$, it becomes a normed space, and with the metric induced by $|\cdot|_p$, $G_l$ becomes a finite ultrametric space. In addition, $G_l$ can be identified with the set of branches (vertices at the top level) of a rooted tree with $l$ levels and $p^{l}$ branches. By definition the root of the tree is the only vertex at level 0. There are exactly $p$ vertices at level 1, which correspond to the possible values of the digit $i_0$ in the p-adic expansion of $i$. Each of these vertices is connected to the root by a non-directed edge. At level $k$, with $1 \leq k \leq l$, there are exactly $p^{k}$ vertices, and each vertex corresponds to a truncated expansion of $i$ of the form $i_0 + \cdots + i_{k-1}p^{k-1}$. The vertex corresponding to such a truncation is connected to a vertex at the level $k-1$ if and only if their difference is divisible by $p^{k-1}$. See Figure 1. The balls are infinite rooted trees.
7.3 The Haar measure
Since $(\mathbb{Q}_p, +)$ is a locally compact topological group, there exists a Haar measure $dx$, which is invariant under translations, i.e., $d(x + a) = dx$ [47]. If we normalize this measure by the condition $\int_{\mathbb{Z}_p} dx = 1$, then $dx$ is unique. It follows immediately that the ball $a + p^{l}\mathbb{Z}_p$ has measure $p^{-l}$ for any $a \in \mathbb{Z}_p$ and $l \geq 0$.
On a few occasions we use the two-dimensional Haar measure $dx\,dy$ of the additive group $(\mathbb{Q}_p^{2}, +)$, normalized by the condition $\int_{\mathbb{Z}_p^{2}} dx\,dy = 1$. For a quick review of integration in the p-adic framework the reader may consult [46] and the references therein.
7.4 The Bruhat-Schwartz space in the unit ball
A real-valued function $\varphi$ defined on $\mathbb{Z}_p$ is called a Bruhat-Schwartz function (or a test function) if for any $x \in \mathbb{Z}_p$ there exists an integer $l$ such that
$$\varphi(x + x') = \varphi(x) \quad \text{for all } x' \in \mathbb{Z}_p \text{ with } |x'|_p \leq p^{l}. \qquad (4)$$
The $\mathbb{R}$-vector space of Bruhat-Schwartz functions supported in the unit ball is denoted by $\mathcal{D}(\mathbb{Z}_p)$. For $\varphi \in \mathcal{D}(\mathbb{Z}_p)$, the largest number $l$ satisfying (4) is called the exponent of local constancy (or the parameter of constancy) of $\varphi$. A function in $\mathcal{D}(\mathbb{Z}_p)$ can be written as a finite linear combination
$$\varphi(x) = \sum_{j} c_j\, \Omega(p^{r_j}|x - x_j|_p),$$
where the $x_j$ are points in $\mathbb{Z}_p$, the $r_j$ are integers, the $c_j$ are real numbers, and $\Omega(p^{r_j}|x - x_j|_p)$ denotes the characteristic function of the ball $x_j + p^{r_j}\mathbb{Z}_p$.
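Given such a representation, Haar integrals of locally constant functions reduce to finite sums: a function constant on each ball $i + p^{l}\mathbb{Z}_p$ integrates to the sum of its $p^{l}$ values weighted by the measure $p^{-l}$ of each ball. A small exact-arithmetic sketch of ours (not code from the paper):

```python
from fractions import Fraction

def integrate(values, p, l):
    """Haar integral of a function that is constant on each ball i + p**l Z_p:
    each of the p**l balls has measure p**-l, so the integral is a finite sum."""
    assert len(values) == p**l      # one value per ball, indexed by G_l
    return sum(Fraction(v) for v in values) * Fraction(1, p**l)
```

For example, the constant function 1 on $\mathbb{Z}_p$ integrates to 1, matching the normalization of the Haar measure, while the characteristic function of a single ball integrates to $p^{-l}$.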
8 Appendix B: Discretization of the energy functional
The elements of $G_l$ have the form $i = i_0 + i_1 p + \cdots + i_{l-1}p^{l-1}$, where the $i_k$s are p-adic digits. We denote by $\mathcal{D}_l$ the $\mathbb{R}$-vector space of all test functions of the form
$$\varphi(x) = \sum_{i \in G_l} \varphi(i)\,\Omega(p^{l}|x - i|_p),$$
here $\Omega(p^{l}|x - i|_p)$ is the characteristic function of the ball $i + p^{l}\mathbb{Z}_p$. Notice that $\varphi$ is supported on $\mathbb{Z}_p$ and that $\mathcal{D}_l$ is a finite-dimensional vector space spanned by the basis $\{\Omega(p^{l}|x - i|_p)\}_{i \in G_l}$.
By identifying $\varphi$ with the column vector $[\varphi(i)]_{i \in G_l}$, we get that $\mathcal{D}_l$ is isomorphic to $\mathbb{R}^{p^{l}}$ endowed with a suitable norm. Furthermore,
$$\mathcal{D}_l \hookrightarrow \mathcal{D}_{l+1} \hookrightarrow \mathcal{D}(\mathbb{Z}_p),$$
where $\hookrightarrow$ denotes a continuous embedding.
The restriction of to the subspace gives a discretization of denoted as . Indeed, by assuming that
we have
and by using that , , , are test functions supported in the unit ball, and taking sufficiently large, we have
and consequently
We take , ,
, , , , for , , and . We also rescale to , then
We now recall that is an additive group, then
and consequently
9 Appendix C: Some probability distributions
From now on, we assume that the visible and hidden fields are binary variables. However, most of our mathematical formulation is valid under the assumption that the visible and hidden fields are discrete variables. We set $\{v(i)\}_{i \in G_l}$, $\{h(i)\}_{i \in G_l}$ to be the respective visible and hidden random variable sets. The random variables take values in $\{0, 1\}$. For the sake of simplicity, we will identify the random vector $[v(i)]_{i \in G_l}$ with $v$, and the random vector $[h(i)]_{i \in G_l}$ with $h$. We identify $G_l$ with the set of branches (vertices at the top level) of a rooted tree with $l$ levels and $p^{l}$ branches. Attached to each branch $i$ there are two states: $v(i)$, $h(i)$. With this notation, $v$ is a realization of the visible field, and $h$ is a realization of the hidden field. The joint distribution of the random vectors is given by the following Boltzmann probability distribution:
(5) |
where
and $E$ is defined in eq. (1).
By using the joint distribution of the visible field and the hidden field (5), we compute the marginal probability distributions as follows:
Since we are assuming that the states are binary, the energy functional takes the form
The classical RBM has the advantage of conditional independence between the visible units as well as between the hidden units. The p-adic BM shares the same advantage, i.e., by fixing the hidden field $h$, the random variables $v(i)$, $i \in G_l$, become independent. An analogous assertion is valid if we fix the visible field $v$. More precisely, the conditional probability distributions satisfy
Indeed, by direct computation, we have
Similarly, we can prove that
(6) |
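Conditional independence is what makes block Gibbs sampling practical: all hidden units can be sampled at once given the visible state. The following is a generic binary sketch of ours (in the p-adic machine the weight matrix would be induced by the convolution kernel $w$, not a free matrix):

```python
import math
import random

random.seed(0)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sample_hidden(v, W, c):
    """Sample all hidden units at once: given v, each hidden unit is an
    independent Bernoulli variable with a sigmoid activation probability."""
    probs = [sigmoid(sum(wij * vj for wij, vj in zip(row, v)) + ci)
             for row, ci in zip(W, c)]
    states = [1 if random.random() < pr else 0 for pr in probs]
    return states, probs

v = [1, 0, 1, 1]
W = [[0.0] * 4 for _ in range(3)]   # zero couplings: unbiased units
c = [0.0] * 3
h, probs = sample_hidden(v, W, c)   # probs == [0.5, 0.5, 0.5]
```

With zero weights and biases every conditional probability is exactly 1/2, as expected from the energy being identically zero.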
9.1 Gradient of the Log-likelihood
The log-likelihood given a single visible state $v$ is given by
Taking the derivative with respect to the parameters gives the following mean-like representation:
(7) |
In the case of multiple visible states , the log-likelihood is defined in the average sense, i.e., .
Taking the derivative with respect to gives
(8) |
Now, let $i_1 < \cdots < i_m$ be the ordered indexes of all the positive states.
(9) |
Note that the above formula is different from the classical RBM, [48, Formula 29].
We derive the derivatives with respect to the bias parameters similarly, as in the classical RBM:
(11) |
and
(12) |
9.2 Contrastive divergence learning
As in the classical case, the exact computation of the gradient of the log-likelihood involves an exponential number of terms; see (10). We adapt the contrastive divergence (CD) method, introduced by Hinton [49], to the p-adic case to approximate the gradient of the log-likelihood.
The approximation of eqs. (10)–(13) using the contrastive divergence method can be represented as follows:
(13) |
where $k$ is a pre-determined positive integer, $v^{(0)}$ is a training example, and $v^{(k)}$ is a sample of the Gibbs chain after $k$ steps. More precisely, we implement Gibbs sampling in the following way. First, we obtain a sample $h^{(0)}$ of the hidden field using the conditional distribution $P(h \mid v^{(0)})$. Then, we obtain a sample $v^{(1)}$ of the visible field using $P(v \mid h^{(0)})$. We repeat this process until we get $v^{(k)}$.
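The CD-k update just described can be sketched as follows. This is a generic binary-RBM sketch of ours, not the paper's code: in the p-adic convolutional network the matrix `W` is built from the kernel $w$, so in practice the gradient would be accumulated on the kernel entries rather than on a free matrix.

```python
import math
import random

random.seed(1)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def bernoulli(p):
    return 1.0 if random.random() < p else 0.0

def cd_k(v0, W, b, c, k=1, lr=0.1):
    """One CD-k parameter update: run a k-step Gibbs chain from the training
    example v0, then move the parameters along (positive - negative) phase."""
    n_h, n_v = len(W), len(W[0])
    ph = lambda v: [sigmoid(sum(W[i][j]*v[j] for j in range(n_v)) + c[i]) for i in range(n_h)]
    pv = lambda h: [sigmoid(sum(W[i][j]*h[i] for i in range(n_h)) + b[j]) for j in range(n_v)]
    ph0 = ph(v0)
    v, h = v0, [bernoulli(q) for q in ph0]
    for _ in range(k):                       # Gibbs chain: h -> v -> h
        v = [bernoulli(q) for q in pv(h)]
        h = [bernoulli(q) for q in ph(v)]
    phk = ph(v)
    for i in range(n_h):                     # positive minus negative phase
        for j in range(n_v):
            W[i][j] += lr * (ph0[i]*v0[j] - phk[i]*v[j])
    for j in range(n_v):
        b[j] += lr * (v0[j] - v[j])
    for i in range(n_h):
        c[i] += lr * (ph0[i] - phk[i])
    return W, b, c

v0 = [1.0, 0.0, 1.0]
W = [[0.0]*3 for _ in range(2)]; b = [0.0]*3; c = [0.0]*2
W, b, c = cd_k(v0, W, b, c, k=1)
```

Each update is bounded by the learning rate, and the chain length `k` trades bias for computation, exactly as in the classical CD method.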
The following formulas are utilized in the calculation. Let $h_{-i}$ denote the state of all hidden units except for the $i$-th one:
Similarly, we have
(14) |
References
- [1] LeCun, Y., Bengio, Y. & Hinton, G. E. Deep learning. Nature 521, 436–444 (2015).
- [2] Bahri Y., Kadmon J., Pennington J., Schoenholz S.S., Sohl-Dickstein J., Ganguli S. Statistical Mechanics of Deep Learning. Annual Review of Condensed Matter Physics 11, 501-528 (2020).
- [3] Buice Michael A., Chow Carson C. Beyond mean field theory: statistical field theory for neural networks. J. Stat. Mech. Theory Exp. 3, P03003, 21 pp. (2013).
- [4] Buice Michael A., Cowan Jack D., Field-theoretic approach to fluctuation effects in neural networks. Phys. Rev. E 75, no. 5, 051919, 14 pp. (2007).
- [5] Bachtis D., Aarts G., & Lucini B. Quantum field-theoretic machine learning. Physical Review D, 103 (7), 074510 pp. 14 (2021).
- [6] Dyer E., Gur-Ari G. Asymptotics of wide networks from Feynman diagrams. https://doi.org/10.48550/arXiv.1909.11304.
- [7] Erbin H., Lahoche V., Ousmane Samary D. Nonperturbative renormalization for the neural network-QFT correspondence. Mach. Learn.: Sci. Technol. 3 015027 (2022).
- [8] Halverson J., Maiti A. and Stoner K. Neural networks and quantum field theory. Mach. Learn. Sci. Technol. 2, 035002 (2021).
- [9] Maiti A., Stoner K. and Halverson J. Symmetry-via-duality: Invariant neural network densities from parameter-space correlators. https://arxiv.org/abs/2106.00694.
- [10] Helias Moritz, Dahmen David. Statistical field theory for neural networks. Lecture Notes in Physics, 970 (Springer, Cham, 2020).
- [11] Yaida, S. (2020, August). Non-Gaussian processes and neural networks at finite widths. In Mathematical and Scientific Machine Learning (pp. 165-192). PMLR.
- [12] Zúñiga-Galindo, W. A. (2023). p-adic statistical field theory and deep belief networks. Physica A: Statistical Mechanics and its Applications, 128492.
- [13] Le Roux Nicolas, Bengio Yoshua. Representational power of restricted Boltzmann machines and deep belief networks. Neural Comput. 20 , no. 6, 1631–1649 (2008).
- [14] Albeverio, Sergio; Khrennikov, Andrei; Tirozzi, Brunello p-adic dynamical systems and neural networks, Math. Models Methods Appl. Sci. 9, no. 9, 1417–1437, 1999.
- [15] Khrennikov, Andrei; Tirozzi, Brunello Learning of p-adic neural networks. Stochastic processes, physics and geometry: new interplays, II (Leipzig, 1999), 395–401, CMS Conf. Proc., 29, Amer. Math. Soc., Providence, RI, 2000.
- [16] Khrennikov A, Information Dynamics in Cognitive, Psychological, Social and Anomalous Phenomena; Springer: Berlin/Heidelberg, Germany, 2004.
- [17] Zambrano-Luna, B.A., Zúñiga-Galindo, W.A. p-adic Cellular Neural Networks. J Nonlinear Math Phys (2022). https://doi.org/10.1007/s44198-022-00071-8
- [18] Zambrano-Luna, B. A., Zúñiga-Galindo, W. A. (2023). p-adic cellular neural networks: Applications to image processing. Physica D: Nonlinear Phenomena, 133668.
- [19] Volovich I. V., Number theory as the ultimate physical theory. p-Adic Numbers Ultrametric Anal. Appl. 2, 77–87 (2010).
- [20] Vladimirov V. S., Volovich I. V., Zelenov E. I. p-Adic analysis and mathematical physics (Singapore, World Scientific, 1994).
- [21] Dragovich B., Khrennikov A. Yu., Kozyrev S. V., Volovich I. V. On p-adic mathematical physics. p-Adic Numbers Ultrametric Anal. Appl. 1(1), 1–17 (2009).
- [22] Lerner E. Y., Misarov M. D. Scalar models in p-adic quantum field theory and hierarchical models. Theor. Math. Phys. 78, 248–257 (1989).
- [23] Missarov M. D. p-Adic theory as a functional equation problem. Lett. Math. Phys. 39(3), 253-260 (1997).
- [24] Missarov M. D. p-Adic renormalization group solutions and the Euclidean renormalization group conjectures. p-Adic Numbers Ultrametric Anal. Appl. 4(2), 109-114 (2012).
- [25] Khrennikov A. Yu. p-Adic Valued Distributions in Mathematical Physics (Dordrecht, Kluwer Academic Publishers, 1994).
- [26] Kochubei Anatoly N., Sait-Ametov Mustafa R. Interaction measures on the space of distributions over the field of -adic numbers. Infin. Dimens. Anal. Quantum Probab. Relat. Top. 6(3), 389–411 (2003).
- [27] Khrennikov, Andrei, Kozyrev, Sergei, Zúñiga-Galindo, W. A. Ultrametric Equations and its Applications. Encyclopedia of Mathematics and its Applications 168 (Cambridge, Cambridge University Press, 2018).
- [28] Zúñiga-Galindo W. A. Non-Archimedean white noise, pseudodifferential stochastic equations, and massive Euclidean fields. J. Fourier Anal. Appl. 23 (2), 288–323 (2017)
- [29] Mendoza-Martínez M. L., Vallejo J. A., Zúñiga-Galindo, W. A. Acausal quantum theory for non-Archimedean scalar fields. Rev. Math. Phys. 31(4), 1950011, 46 pp. (2019).
- [30] Arroyo-Ortiz, Edilberto, Zúñiga-Galindo, W. A. Construction of -adic covariant quantum fields in the framework of white noise analysis. Rep. Math. Phys. 84(1), 1–34 (2019).
- [31] Zúñiga-Galindo W. A., Non-Archimedean statistical field theory, Rev. Math. Phys. 34 (8), Paper No. 2250022, 41 pp. (2022).
- [32] Liu Ding et al. Machine learning by unitary tensor network of hierarchical tree structure. New J. Phys. 21, 073059 (2019).
- [33] Cheng Song, Wang Lei, Xiang Tao, and Zhang Pan. Tree tensor networks for generative modeling. Phys. Rev. B 99, 155131 (2019).
- [34] Li Sujie, Feng Pan, Zhou Pengfei, and Zhang Pan. Boltzmann machines as two-dimensional tensor networks. Phys. Rev. B 104, 075154 (2021).
- [35] Orús, R. Tensor networks for complex quantum systems. Nat Rev Phys 1, 538–550 (2019).
- [36] Koblitz Neal. p-Adic Numbers, p-adic Analysis, and Zeta-Functions. Graduate Texts in Mathematics No. 58 (New York, Springer-Verlag, 1984).
- [37] Chistyakov D. V. Fractal geometry of images of continuous embeddings of p-adic numbers and solenoids into Euclidean spaces.Theoret. and Math. Phys. 109 (1996), no. 3, 1495–1507 (1997).
- [38] Honglak Lee, Roger Grosse, Rajesh Ranganath, and Andrew Y. Ng. Unsupervised Learning of Hierarchical Representations with Convolutional Deep Belief Networks. Communications of the ACM, vol. 54, no. 10, pp. 95-103 (2011).
- [39] Le Roux Nicolas, Bengio Yoshua. Deep belief networks are compact universal approximators. Neural Comput. 22, no. 8, 2192–2207 (2010).
- [40] Kleinert Hagen, Schulte-Frohlinde V. Critical properties of $\varphi^4$-theories (Singapore, World Scientific, 2001).
- [41] Mussardo Giuseppe. Statistical Field Theory. An Introduction to Exactly Solved Models of Statistical Physics (Oxford , Oxford University Press, 2010).
- [42] G. E. Hinton, R.R. Salakhutdinov. Reducing the dimensionality of data with neural networks. Science 313, 5786 (2006), doi: 10.1126/science.112764.
- [43] Albeverio S., Khrennikov A. Yu., Shelkovich V. M. Theory of p-adic distributions: linear and nonlinear models (Cambridge University Press, Cambridge, 2010).
- [44] Kochubei Anatoly N. Pseudo-differential equations and stochastics over non-Archimedean fields (New York, Marcel Dekker, 2001).
- [45] Taibleson M. H. Fourier analysis on local fields (Princeton University Press, 1975).
- [46] Bocardo-Gaspar Miriam, García-Compeán H., Zúñiga-Galindo W. A. Regularization of p-adic string amplitudes, and multivariate local zeta functions. Lett. Math. Phys. 109, no. 5, 1167–1204 (2019).
- [47] Halmos P. Measure Theory (D. Van Nostrand Company Inc., New York, 1950).
- [48] Fischer A., Igel C. An Introduction to Restricted Boltzmann Machines. In: Alvarez, L., Mejail, M., Gomez, L., Jacobo, J. (eds) Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications. CIARP 2012. Lecture Notes in Computer Science, vol 7441. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33275-3_2.
- [49] Hinton G. E., Training products of experts by minimizing contrastive divergence, Neural Comput. 14, no. 8, pp. 1771–1800 (2002).
- [50] D. Koller and N. Friedman, Probabilistic Graphical Models: Principles and Techniques (The MIT Press, Cambridge, MA, 2009).