Abstract
We introduce block Markov chains (BMCs) indexed by an infinite rooted tree.
It turns out that BMCs define a new class of tree-indexed Markovian processes. We clarify the structure of BMCs in connection with Markov chains (MCs) and Markov random fields (MRFs). Mainly, we show that probability measures which are BMCs for every root are indeed Markov chains, and yet they form a strict subclass of Markov random fields on the considered tree. Conversely, we characterize a class of MCs which are BMCs. Furthermore, we establish that in the one-dimensional case the class of BMCs coincides with that of MCs. However, a slight perturbation of the one-dimensional lattice leads to an example of a BMC which is not a MC.
1. Introduction
Markov random fields (MRFs) on lattices have become standard tools in several branches of science and technology, including computer science, machine learning, graphical models and statistical physics. In particular, MRFs are known to provide pertinent models for interacting particle systems in statistical mechanics.
We notice that MRFs were introduced by Dobrushin in [?] for the multi-dimensional integer lattice, and were then developed on trees [?], [?], [?]. MRFs constitute multi-dimensional extensions of Markov chains [?], but with a deeper Markovian structure. In fact, even in the one-dimensional case MRFs were shown to be distinct from MCs [?].
MRFs play a crucial role in many areas such as computer science, image recognition, graphical models, psychology, and an increasing number of biological and neurological models. The reader is referred to [?], [?], [?] and the references cited therein for further applications.
In the present paper we introduce the notion of block Markov chains indexed by the vertex set V of a rooted tree T = (V, E). The definition of this notion is quite natural. In the one-dimensional case V = ℕ with distinguished vertex (root) o = 0, a Markov chain (X_n) with (finite) state space A is defined through the well-known Markov property
P(X_{n+1} = x_{n+1} | X_0 = x_0, …, X_n = x_n) = P(X_{n+1} = x_{n+1} | X_n = x_n).
The above property can be reformulated by means of the joint probability measure μ of the process on A^V as follows:
μ(σ_{S(u)} = ξ | σ_{[o,u]} = η) = μ(σ_{S(u)} = ξ | σ_u = η_u),    (1.1)
where S(u) is the set of direct successors of the site u and [o, u] is the set of vertices on the path from the root o to u. Both S(u) and the set S_{[u]} of successive descendants of the vertex u w.r.t. the considered root o admit suitable natural generalizations for general trees. Roughly speaking, a BMC is a probability measure on A^V satisfying a block version of the Markov property (1.1) for a fixed root.
The main purpose of this paper is to clarify the structure of BMCs in connection with MCs and MRFs. Mainly, we show that a probability measure which is a BMC for every root is a MC in the sense of [?]. The correlation functions of BMCs are different from those of MCs and MRFs; consequently, their Markov structures are also different. Namely, it turns out that some additional conditional independence conditions are necessary for a MC on the considered tree to be a BMC.
On the other hand, we show that in the one-dimensional case the notions of MCs and BMCs coincide. This coincidence makes BMCs a strict subclass of MRFs in the one-dimensional case. However, we emphasize that a slight modification of the one-dimensional lattice leads to a counter-example that confirms the substantial difference between MCs and BMCs over multi-dimensional trees.
We notice that the natural hierarchical structure of rooted trees, due to the absence of loops, plays a crucial role in the very definition of BMCs. Therefore, the results do not carry over to general graphs. We forecast that BMCs will play a crucial role in connection with Gibbs measures on trees and their associated phenomena of phase transitions (see [?], [?] and [?]). Namely, phase transition phenomena were associated with interesting p-adic models such as the Potts model and the Ising–Vannimenus model [?], [?].
In fact, a work under preparation is dedicated to the clarification of a bridge between BMCs and some p-adic models.
In [?], [?] we clarified the structure of quantum Markov states on a quasi-local algebra over trees in terms of classical Markovian measures and Gibbs measures on the spectrum of a maximal abelian subalgebra. We stress that this classical Markovian measure is indeed a BMC. This makes a new bridge between classical and quantum Markov fields.
Let us outline the paper. Section 2 is devoted to some notions and notations on rooted trees.
In Section 3, we recall the basic definitions of MCs and MRFs on graphs.
Section 4 is devoted to the definition of BMCs as well as their correlation functions. Section 5 is dedicated to results on the connection of BMCs with MCs and MRFs on trees.
In Section 6 we deal with the one-dimensional case, for which the vertex set is the classical 1D integer lattice equipped with its natural tree structure. In Section 7 we develop a counter-example of a BMC which is not a MC.
2. Rooted trees
Recall [?] that a tree is a connected graph with no cycles; equivalently, it is a connected graph which becomes disconnected when any one of its edges is removed.
Let be given an infinite tree T = (V, E), with vertex set V and edge set E. First, we fix a vertex o ∈ V as a "root". Recall that two vertices x and y are said to be nearest neighbors, denoted x ∼ y, if they are joined through an edge (i.e. {x, y} ∈ E). A list x = x_0 ∼ x_1 ∼ ⋯ ∼ x_d = y of vertices is called a path from the site x to the site y. The distance d(x, y) on the tree is the length of the shortest path from x to y.
For x ∈ V, the set of its direct successors (children) is defined by
S(x) := { y ∈ V : x ∼ y and d(o, y) = d(o, x) + 1 },    (2.2)
and the successive generations of successors of x w.r.t. the root o are defined by induction as follows:
S_0(x) := {x},
S_{n+1}(x) := ⋃_{y ∈ S_n(x)} S(y),  n ∈ ℕ.
The "future" w.r.t. the vertex x is defined by:
S_{[x]} := ⋃_{n ≥ 0} S_n(x).    (2.3)
Note that in the homogeneous case, for which k(x) := |S(x)| ≡ k is constant, the graph is the semi-infinite Cayley tree of order k.
Namely, for k = 1, the graph is reduced to the one-dimensional integer lattice ℕ.
Consider the map x ↦ x⁻ from V into itself characterized by
o⁻ = o  and  x ∈ S(x⁻) for every x ≠ o;
in other words, x⁻ is the parent of the vertex x. Let x ∈ V. If d(o, x) = n then
o = x^(n) ∼ x^(n−1) ∼ ⋯ ∼ x^(1) ∼ x    (2.4)
is the minimal edge-path joining the root o to the vertex x, where x^(j) denotes the j-th iterate of the map x ↦ x⁻.
The set
[o, x[ := { o = x^(n), x^(n−1), …, x^(1) }    (2.5)
represents the "past" of the vertex x for the root o.
The set of nearest-neighbor vertices of x is given as follows:
N_x := { y ∈ V : x ∼ y }.    (2.6)
It is clear that N_x = S(x) ∪ {x⁻} whenever x ≠ o.
In the sequel, the tree is assumed to be locally finite, i.e. |N_x| < ∞ for each x ∈ V; in this case the integer deg(x) := |N_x| is called the degree of x.
The tree can be regarded as growing (upward) away from its fixed root o. Each vertex x then has |S(x)| branches leading to its "children", namely the elements of S(x), with the possibility of leaves, that is, vertices without children (S(x) = ∅).
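The notions above can be made concrete on a finite truncation of a tree. The following sketch (an illustration of ours, not code from the paper; function and variable names are hypothetical) builds the first levels of the semi-infinite Cayley tree of order k and computes the sets S(x) and W_n = {x : d(o, x) = n}:

```python
def cayley_tree(k, depth):
    """Build the semi-infinite Cayley tree of order k, truncated at `depth`.

    Vertices are tuples: () is the root o, and (i1, ..., in) with each
    i in range(k) is a vertex at distance n from o.  Returns the vertex
    list together with the parent map x -> x-.
    """
    vertices, parent = [()], {}
    for n in range(depth):
        for v in [u for u in vertices if len(u) == n]:
            for i in range(k):
                child = v + (i,)
                vertices.append(child)
                parent[child] = v
    return vertices, parent

def direct_successors(v, vertices):
    # S(v): nearest neighbours one level further from the root.
    return [u for u in vertices if len(u) == len(v) + 1 and u[:-1] == v]

def level(vertices, n):
    # W_n: vertices at distance n from the root.
    return [v for v in vertices if len(v) == n]
```

For the order-2 tree the levels have cardinalities |W_n| = 2^n, and each vertex's neighbourhood is its children plus its parent, matching N_x = S(x) ∪ {x⁻}.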
3. Some Reminders on Markov fields
Let A be a finite set. By a stochastic process we mean a family σ = (σ_x)_{x ∈ V} of random variables defined on a probability space and valued in A. The process is defined through its joint probability measure μ on the Borel space (Ω, F), where Ω := A^V and F is the cylindrical σ-algebra, which is generated by the cylinder sets of the following form:
C_{Λ, ξ} := { σ ∈ Ω : σ_Λ = ξ },    (3.7)
where Λ ⊂ V is finite and ξ ∈ A^Λ. For the sake of shortness we write μ(σ_Λ = ξ) instead of μ(C_{Λ, ξ}) and σ_x instead of σ_{{x}}. For Λ ⊆ Λ' and η ∈ A^{Λ'}, we denote by η_Λ the restriction of η to Λ.
Recall that
μ(σ_Λ = ξ) = Σ_{η ∈ A^{Λ'} : η_Λ = ξ} μ(σ_{Λ'} = η),    (3.8)
where Λ ⊆ Λ' are finite subsets of V.
The conditional probability is defined as follows:
μ(σ_Λ = ξ | σ_{Λ'} = η) := μ(σ_Λ = ξ, σ_{Λ'} = η) / μ(σ_{Λ'} = η),    (3.9)
where Λ ∩ Λ' = ∅ and η ∈ A^{Λ'} is such that
μ(σ_{Λ'} = η) > 0.
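As an illustration, the conditional probability (3.9) can be evaluated by brute force from a joint measure on finitely many sites. This is a minimal sketch of ours with hypothetical names, not code from the paper:

```python
def conditional(mu, lam, xi, lam2, eta):
    """mu: dict mapping full configurations (tuples over sites 0..N-1)
    to probabilities.  Returns mu(sigma_lam = xi | sigma_lam2 = eta),
    assuming the conditioning event has positive probability,
    exactly as in the ratio defining (3.9)."""
    num = sum(p for c, p in mu.items()
              if all(c[i] == x for i, x in zip(lam, xi))
              and all(c[j] == y for j, y in zip(lam2, eta)))
    den = sum(p for c, p in mu.items()
              if all(c[j] == y for j, y in zip(lam2, eta)))
    return num / den
```

For instance, with a joint law on two sites, conditioning on the second site simply renormalizes the matching cylinder weights.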
Denote by
F_Λ := σ(σ_x ; x ∈ Λ)  and  F_{Λ^c} := σ(σ_x ; x ∈ V ∖ Λ)    (3.10)
the σ-algebras generated by (σ_x)_{x ∈ Λ} and (σ_x)_{x ∈ V ∖ Λ}, respectively.
DEFINITION 1
[?]
A probability measure μ on (Ω, F) is said to be a Markov random field (MRF) if it takes strictly positive values on finite cylinder sets of the form (3.7) and if, for every finite Λ ⊂ V and every ξ ∈ A^Λ,
μ(σ_Λ = ξ | F_{Λ^c}) = μ(σ_Λ = ξ | F_{∂Λ}),    (3.11)
where ∂Λ := { y ∈ V ∖ Λ : y ∼ x for some x ∈ Λ } denotes the outer boundary of Λ.
The set of Markov random fields over T will be denoted by MRF(T).
The conditional probabilities (3.11) are assumed to be invariant under graph isomorphisms.
DEFINITION 2
[?] A probability measure μ on (Ω, F) is said to be a Markov chain (MC) over the tree T if for each subtree T' = (V', E') the restriction of μ to the measurable space (A^{V'}, F_{V'}) defines a Markov random field, i.e.
μ(σ_Λ = ξ | F_{V' ∖ Λ}) = μ(σ_Λ = ξ | F_{∂Λ ∩ V'})    (3.12)
for all finite Λ ⊂ V' and all ξ ∈ A^Λ.
The set of Markov chains over T will be denoted by MC(T).
Remark 1
The class MC(T) is clearly included in MRF(T). Conversely, in [?] it was proven that if the tail σ-field is trivial
then the considered Markov field is indeed a MC.
4. Structure of Block Markov chains on trees
In what follows, a root o for the tree T is fixed. For each n ∈ ℕ, we denote by
W_n := { x ∈ V : d(o, x) = n }
the set of vertices whose distance to the root equals n.
Let Λ_n := ⋃_{k=0}^{n} W_k. For the sake of shortness, when confusion seems impossible we will use the notations W and Λ instead of W_n and Λ_n, respectively.
Let us fix an enumeration of the elements of W_n as follows:
W_n = { x_1^(n), x_2^(n), …, x_{|W_n|}^(n) },
where |W_n| denotes the cardinality of W_n.
DEFINITION 3
A probability measure μ on (Ω, F) is called an o-block Markov chain (o-BMC) if it satisfies
μ(σ_{W_{n+1}} = ξ | σ_{Λ_n} = η) = μ(σ_{W_{n+1}} = ξ | σ_{W_n} = η_{W_n})    (4.13)
for all n ∈ ℕ, all ξ ∈ A^{W_{n+1}} and all η ∈ A^{Λ_n} with μ(σ_{Λ_n} = η) > 0. Equation (4.13) will be referred to as the block Markov property.
The set of o-block Markov chains over the tree T will be denoted by BMC_o(T).
In [?] a triplet (𝒜, ℬ, 𝒞) of σ-algebras such that
P(C | 𝒜 ∨ ℬ) = P(C | ℬ)  for all C ∈ 𝒞    (4.14)
was referred to as a Markov triple. In these notations, (4.13) means that (F_{Λ_{n−1}}, F_{W_n}, F_{W_{n+1}}) is a Markov triple.
Remark 2
The word "block" in Definition 3 comes from the conditioning w.r.t. the σ-algebra F_{W_n} generated by the whole block W_n, rather than the σ-algebra F_{[o,x[}, while this latter represents the past of the vertex x w.r.t. the root o.
The following elementary formula for conditional probabilities will be used frequently in the sequel:
P(A ∩ B | C) = P(A | B ∩ C) P(B | C).    (4.15)
Let μ be an o-BMC. According to (4.15), for η ∈ A^{Λ_n} with μ(σ_{Λ_n} = η) > 0, we have
μ(σ_{Λ_n} = η) = μ(σ_{W_n} = η_{W_n} | σ_{Λ_{n−1}} = η_{Λ_{n−1}}) μ(σ_{Λ_{n−1}} = η_{Λ_{n−1}}).
For k ≤ n, the same reasoning as above implies that
μ(σ_{Λ_k} = η_{Λ_k}) = μ(σ_{W_k} = η_{W_k} | σ_{Λ_{k−1}} = η_{Λ_{k−1}}) μ(σ_{Λ_{k−1}} = η_{Λ_{k−1}}).
Since Λ_{k−1} = Λ_k ∖ W_k, the block Markov property (4.13) leads to
μ(σ_{W_k} = η_{W_k} | σ_{Λ_{k−1}} = η_{Λ_{k−1}}) = μ(σ_{W_k} = η_{W_k} | σ_{W_{k−1}} = η_{W_{k−1}}).
Therefore
μ(σ_{Λ_n} = η) = μ(σ_o = η_o) ∏_{k=0}^{n−1} μ(σ_{W_{k+1}} = η_{W_{k+1}} | σ_{W_k} = η_{W_k}).    (4.16)
Remark 3
The BMC μ is thus characterized by the initial distribution μ(σ_o = ·) on A together with the family of transition probabilities
P_n(η_{W_n}, η_{W_{n+1}}) := μ(σ_{W_{n+1}} = η_{W_{n+1}} | σ_{W_n} = η_{W_n}),  n ∈ ℕ.
The "stochastic" matrices (P_n) are clearly inhomogeneous. This makes the measure μ a multi-dimensional Markovian process which is inhomogeneous both in space and in time.
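To make Remark 3 concrete, one can encode each level configuration space A^{W_n} as a finite set of labels and build μ from an initial law and level-transition matrices as in (4.16). The following numerical sketch (our own illustration; the sizes n0, n1, n2 of the level spaces are made up) then verifies the block Markov property (4.13) directly:

```python
import itertools
import random

random.seed(0)

def random_kernel(rows, cols):
    # A random row-stochastic matrix: each row is a probability vector.
    P = []
    for _ in range(rows):
        w = [random.random() for _ in range(cols)]
        s = sum(w)
        P.append([x / s for x in w])
    return P

# Hypothetical cardinalities of the level spaces A^{W_0}, A^{W_1}, A^{W_2}.
n0, n1, n2 = 2, 4, 8
p0 = random_kernel(1, n0)[0]   # initial distribution on the root level
P1 = random_kernel(n0, n1)     # block transitions W_0 -> W_1
P2 = random_kernel(n1, n2)     # block transitions W_1 -> W_2

# Joint measure built from the factorization (4.16).
mu = {(a, b, c): p0[a] * P1[a][b] * P2[b][c]
      for a, b, c in itertools.product(range(n0), range(n1), range(n2))}

def cond_c_given_ab(a, b, c):
    # mu(sigma_{W_2} = c | sigma_{W_0} = a, sigma_{W_1} = b)
    den = sum(mu[(a, b, cc)] for cc in range(n2))
    return mu[(a, b, c)] / den

def cond_c_given_b(b, c):
    # mu(sigma_{W_2} = c | sigma_{W_1} = b)
    den = sum(mu[(aa, b, cc)] for aa in range(n0) for cc in range(n2))
    return sum(mu[(aa, b, c)] for aa in range(n0)) / den
```

Both conditionals reduce to P2[b][c], so any measure of the form (4.16) satisfies (4.13), as the derivation above asserts.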
The following theorem extends the local Markov property (4.13) to a global one, which concerns the conditional independence of the σ-algebras F_{Λ_n} and F_{W_{n+m}} given F_{W_n}.
THEOREM 1
Let μ be a block Markov chain on (Ω, F). Then
μ(σ_{W_{n+m}} = ξ | σ_{Λ_n} = η) = μ(σ_{W_{n+m}} = ξ | σ_{W_n} = η_{W_n})    (4.17)
for all n ∈ ℕ, m ≥ 1, ξ ∈ A^{W_{n+m}} and η ∈ A^{Λ_n}.
Proof. If m = 1 then (4.17) is precisely (4.13), so it holds true.
We proceed by induction on m, assuming (4.17) for some m ≥ 1. By (3.8) and (4.15), one has
μ(σ_{W_{n+m+1}} = ξ | σ_{Λ_n} = η) = Σ_{ζ ∈ A^{W_{n+m}}} μ(σ_{W_{n+m+1}} = ξ | σ_{W_{n+m}} = ζ, σ_{Λ_n} = η) μ(σ_{W_{n+m}} = ζ | σ_{Λ_n} = η).
Denoting by η' ∈ A^{Λ_{n+m}} any configuration agreeing with η on Λ_n and with ζ on W_{n+m}, from (4.13) one gets
μ(σ_{W_{n+m+1}} = ξ | σ_{Λ_{n+m}} = η') = μ(σ_{W_{n+m+1}} = ξ | σ_{W_{n+m}} = ζ).
Thus, averaging over the sites of Λ_{n+m} ∖ (Λ_n ∪ W_{n+m}) left unspecified, we obtain
μ(σ_{W_{n+m+1}} = ξ | σ_{W_{n+m}} = ζ, σ_{Λ_n} = η) = μ(σ_{W_{n+m+1}} = ξ | σ_{W_{n+m}} = ζ).    (4.18)
On the other hand, the induction hypothesis leads to
μ(σ_{W_{n+m}} = ζ | σ_{Λ_n} = η) = μ(σ_{W_{n+m}} = ζ | σ_{W_n} = η_{W_n}).
Therefore
μ(σ_{W_{n+m+1}} = ξ | σ_{Λ_n} = η) = Σ_{ζ} μ(σ_{W_{n+m+1}} = ξ | σ_{W_{n+m}} = ζ) μ(σ_{W_{n+m}} = ζ | σ_{W_n} = η_{W_n}).
Finally, repeating the same computation with the conditioning event {σ_{W_n} = η_{W_n}} in place of {σ_{Λ_n} = η}, one finds that the right-hand side equals μ(σ_{W_{n+m+1}} = ξ | σ_{W_n} = η_{W_n}), which proves (4.17) for m + 1.
COROLLARY 1
In the notations of Theorem 1, if x ∈ W_{n+m} then
μ(σ_x = a | σ_{Λ_n} = η) = μ(σ_x = a | σ_{W_n} = η_{W_n})    (4.19)
for all a ∈ A.
Proof.
From Theorem 1, for each ξ ∈ A^{W_{n+m}},
μ(σ_{W_{n+m}} = ξ | σ_{Λ_n} = η) = μ(σ_{W_{n+m}} = ξ | σ_{W_n} = η_{W_n}).
Summing up over the configurations ξ such that ξ_x = a, one finds (4.19).
The following result proposes a multi-dimensional analogue of the Chapman-Kolmogorov equation.
THEOREM 2
Let μ be a BMC on (Ω, F). Then for n ∈ ℕ and m, p ≥ 1 one has
μ(σ_{W_{n+m+p}} = ξ | σ_{W_n} = η) = Σ_{ζ ∈ A^{W_{n+m}}} μ(σ_{W_{n+m+p}} = ξ | σ_{W_{n+m}} = ζ) μ(σ_{W_{n+m}} = ζ | σ_{W_n} = η)    (4.20)
for all ξ ∈ A^{W_{n+m+p}} and η ∈ A^{W_n}.
Proof.
For each ζ ∈ A^{W_{n+m}}, using the same reasoning as in (4.18), we get
μ(σ_{W_{n+m+p}} = ξ | σ_{W_{n+m}} = ζ, σ_{W_n} = η) = μ(σ_{W_{n+m+p}} = ξ | σ_{W_{n+m}} = ζ).
Then, by (4.15),
μ(σ_{W_{n+m+p}} = ξ, σ_{W_{n+m}} = ζ | σ_{W_n} = η) = μ(σ_{W_{n+m+p}} = ξ | σ_{W_{n+m}} = ζ) μ(σ_{W_{n+m}} = ζ | σ_{W_n} = η).
Summing up over ζ, one gets (4.20).
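In terms of the transition matrices of Remark 3, equation (4.20) says that composite block transitions are matrix products of the one-step stochastic matrices. A small numeric sketch (illustrative only, with made-up dimensions for the level spaces):

```python
import random

random.seed(1)

def stochastic(rows, cols):
    # Random row-stochastic matrix.
    m = [[random.random() for _ in range(cols)] for _ in range(rows)]
    return [[x / sum(r) for x in r] for r in m]

P1 = stochastic(3, 5)   # one-step block transitions W_n -> W_{n+m}
P2 = stochastic(5, 4)   # one-step block transitions W_{n+m} -> W_{n+m+p}

# Composite transition obtained by summing over the intermediate level,
# as in the block Chapman-Kolmogorov equation (4.20).
P02 = [[sum(P1[a][b] * P2[b][c] for b in range(5)) for c in range(4)]
       for a in range(3)]
```

Each row of the composite matrix is again a probability vector, reflecting that the product of stochastic matrices is stochastic.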
5. Connection with MCs and MRFs
LEMMA 1
Let n ∈ ℕ. If Λ is a subset of W_{n+1} then the subgraph of the tree T whose set of vertices is Λ_n ∪ Λ is itself a tree.
Proof.
First, we see that if x ∈ Λ_n then the path [o, x[ from the root to x is contained in Λ_n. Therefore, the subgraph whose vertex set is Λ_n is connected. Since every element x of Λ is joined to its parent x⁻ ∈ W_n ⊂ Λ_n, we conclude that the subgraph with vertex set Λ_n ∪ Λ is connected. Taking into account the fact that every connected subgraph of a tree is a subtree, the proof is complete.
THEOREM 3
Let μ be a Markov chain on (Ω, F). Then for each n ∈ ℕ the following property holds true:
μ(σ_x = a | σ_{Λ_n} = η) = μ(σ_x = a | σ_{x⁻} = η_{x⁻})  for every x ∈ W_{n+1} and a ∈ A.    (5.21)
If, in addition, the σ-algebras σ(σ_x), x ∈ W_{n+1}, are conditionally independent given F_{Λ_n}, then μ is an o-BMC.
Proof.
First let us write W_{n+1} = {x_1, …, x_{|W_{n+1}|}} and fix x = x_j. According to (4.15), we have
μ(σ_x = a | σ_{Λ_n} = η) = μ(σ_x = a, σ_{Λ_n} = η) / μ(σ_{Λ_n} = η).
By Lemma 1, the subgraph of T whose set of vertices is Λ_n ∪ {x} is a tree. Restricted to this subtree, μ is a Markov random field, and the boundary of {x} within the subtree reduces to the parent x⁻. Then
μ(σ_x = a | σ_{Λ_n} = η) = μ(σ_x = a | σ_{x⁻} = η_{x⁻}),
where the last equality derives from the fact that μ is a Markov chain in the sense of Definition 2. Since x⁻ ∈ W_n, we get (5.21).
For the second part of the proof, the conditional independence of the σ(σ_x), x ∈ W_{n+1}, given F_{Λ_n} leads to
μ(σ_{W_{n+1}} = ξ | σ_{Λ_n} = η) = ∏_{x ∈ W_{n+1}} μ(σ_x = ξ_x | σ_{Λ_n} = η).
Hence, (5.21) shows that the right-hand side depends on η only through η_{W_n}, which yields (4.13). Therefore μ is an o-block Markov chain, for the considered root o. This achieves the proof.
LEMMA 2
If μ is an o-BMC on (Ω, F) and x ∈ W_{n+m} with m ≥ 1, then
μ(σ_x = a | σ_Λ = η) = μ(σ_x = a | σ_{W_n} = η_{W_n})    (5.22)
for all a ∈ A and all finite Λ ⊆ Λ_n containing W_n.
Proof. Since Λ ⊆ Λ_n, according to (4.19) one gets, for every η' ∈ A^{Λ_n} agreeing with η on Λ,
μ(σ_x = a | σ_{Λ_n} = η') = μ(σ_x = a | σ_{W_n} = η'_{W_n}) = μ(σ_x = a | σ_{W_n} = η_{W_n}),
since W_n ⊆ Λ forces η'_{W_n} = η_{W_n}.
Again from (4.17), the event {σ_Λ = η} is a disjoint union of events {σ_{Λ_n} = η'}, on each of which the conditional probability of {σ_x = a} takes the common value μ(σ_x = a | σ_{W_n} = η_{W_n}). This leads to
μ(σ_x = a | σ_Λ = η) = μ(σ_x = a | σ_{W_n} = η_{W_n}).
This completes the proof.
Remark 4
Notice that Definition 2 extends the notion of Markov chain introduced in [?] and [?] to inhomogeneous trees and inhomogeneous transition probabilities. It was shown in [?] that the class of homogeneous Markov chains is strictly included in the class of Markov random fields. In the inhomogeneous case we have the following.
THEOREM 4
Let μ be a probability measure on (Ω, F). If μ is an o-BMC for each o ∈ V then it is a MC.
Proof. Consider a subtree T' = (V', E') of T. Let Λ ⊂ V' be finite and ξ ∈ A^Λ. If Λ = V' then there is nothing to condition on, and (3.12) is trivial. Otherwise, let us denote Λ = {x_1, …, x_p} with p = |Λ|.
Remark that for the root x_1 the set V' ∖ Λ is separated from x_1 by the boundary sites ∂Λ ∩ V'. As μ is an x_1-BMC, by Lemma 2 the conditional distribution of σ_{x_1} given σ_{V' ∖ Λ} depends only on the neighboring boundary sites of x_1. Since μ is an x_2-BMC, by Lemma 2 the same reduction holds for σ_{x_2}, and so on.
Iterating this procedure over x_1, …, x_p and recombining the resulting conditionals by means of (4.15), we get
μ(σ_Λ = ξ | σ_{V' ∖ Λ} = η) = μ(σ_Λ = ξ | σ_{∂Λ ∩ V'} = η_{∂Λ ∩ V'}),
because of the identity (5.23) relating the successive conditionings.
Therefore, the measure μ satisfies (3.12). This finishes the proof, the verification of (5.23) being left to the reader.
COROLLARY 2
⋂_{o ∈ V} BMC_o(T) ⊆ MC(T) ⊆ MRF(T).
6. One-dimensional BMC
In this section we consider the one-dimensional lattice V = ℕ equipped with its natural structure of tree, where the edge set is E = {{n, n + 1} : n ∈ ℕ}. Here, for a root o, the levels are W_n = {o − n, o + n} ∩ ℕ.
PROPOSITION 1
Let μ be a probability measure on (Ω, F). The following assertions are equivalent:
-
(i) μ is an o-BMC for each root o;
-
(ii) μ is an o'-BMC for some root o';
-
(iii) μ is a Markov chain.
In particular, a probability measure on (Ω, F) is Markovian for the backward direction if and only if it is Markovian for the forward direction.
Proof.
(i) ⇒ (ii) is straightforward.
(ii) ⇒ (iii): Without loss of generality we can assume that μ is a 0-BMC. Observe that for the root 0 one has W_n = {n} and Λ_n = {0, 1, …, n}, so that (4.13) is the usual forward Markov property
μ(σ_{n+1} = a | σ_0 = η_0, …, σ_n = η_n) = μ(σ_{n+1} = a | σ_n = η_n).
Let us now examine the backward direction. Let m < n and let η be a configuration on {m, …, n} with positive probability. Applying the factorization (4.16) for the root 0, we get
μ(σ_m = η_m, …, σ_n = η_n) = μ(σ_m = η_m) ∏_{k=m}^{n−1} μ(σ_{k+1} = η_{k+1} | σ_k = η_k),
because W_k = {k}. According to (4.15), it follows that
μ(σ_m = η_m | σ_{m+1} = η_{m+1}, …, σ_n = η_n) = μ(σ_m = η_m, …, σ_n = η_n) / μ(σ_{m+1} = η_{m+1}, …, σ_n = η_n) = μ(σ_m = η_m | σ_{m+1} = η_{m+1}).
Thus
μ(σ_m = η_m | σ_{m+1} = η_{m+1}, …, σ_n = η_n) = μ(σ_m = η_m | σ_{m+1} = η_{m+1})
for all m < n, and μ is Markovian in the backward direction as well; in other words, μ is an o-BMC for every root o. Combining the forward and backward Markov properties, for every finite interval Λ the conditional distribution of σ_Λ given the outside depends only on the endpoints neighboring Λ. Therefore, μ is a Markov chain.
(iii) ⇒ (i): If μ is a Markov chain then its restriction to the subtree with vertex set Λ_{n+1} is a Markov random field, and the boundary of W_{n+1} inside Λ_{n+1} is precisely W_n. Hence
μ(σ_{W_{n+1}} = ξ | σ_{Λ_n} = η) = μ(σ_{W_{n+1}} = ξ | σ_{W_n} = η_{W_n}).
By taking n and the root o arbitrary, this implies that μ is an o-block Markov chain, which completes the proof.
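The forward/backward symmetry underlying Proposition 1 can be checked numerically on a short inhomogeneous chain: the reversed conditional probabilities again satisfy the Markov property. A self-contained sketch of ours (names and dimensions are hypothetical):

```python
import random
from itertools import product

random.seed(2)

def stochastic(rows, cols):
    # Random row-stochastic matrix.
    m = [[random.random() for _ in range(cols)] for _ in range(rows)]
    return [[x / sum(r) for x in r] for r in m]

p0 = stochastic(1, 3)[0]     # initial distribution of X0
Q1 = stochastic(3, 3)        # transition X0 -> X1
Q2 = stochastic(3, 3)        # transition X1 -> X2

# Joint law of (X0, X1, X2).
mu = {(a, b, c): p0[a] * Q1[a][b] * Q2[b][c]
      for a, b, c in product(range(3), repeat=3)}

def backward_full(a, b, c):
    # P(X0 = a | X1 = b, X2 = c)
    den = sum(mu[(x, b, c)] for x in range(3))
    return mu[(a, b, c)] / den

def backward_one(a, b):
    # P(X0 = a | X1 = b)
    den = sum(mu[(x, b, y)] for x in range(3) for y in range(3))
    return sum(mu[(a, b, y)] for y in range(3)) / den
```

Both conditionals cancel the common factor Q2[b][c], so the backward process is Markov whatever the (inhomogeneous) kernels, in line with the proposition.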
Remark 5
Proposition 1 may be summarized by saying that, for each n and each choice of root, the triple (F_{Λ_{n−1}}, F_{W_n}, F_{W_{n+1}}) is a Markov triple in the sense of (4.14). However, a slight modification of the one-dimensional lattice provides a counter-example in the multi-dimensional case; this is the object of the following section.
7. Counter-example
Consider the one-dimensional lattice together with one additional vertex attached to one of its sites, so that the resulting graph T = (V, E) is again a tree (see the figure).
Consider a Markov chain (X_n) with a given initial measure and transition matrix, and define a stochastic process (Y_x)_{x ∈ V} by means of (X_n). Let μ be the probability measure on (Ω, F) associated with (Y_x). For suitable roots o' and o'', it is easy to check that μ is an o'-BMC. However, μ is not an o''-BMC: for the root o'' the levels W_n gather sites from both branches of the tree, and a direct computation of the two sides of (4.13) yields two different values. Hence μ is not an o''-BMC.
Furthermore, the probability measure μ is not a MC. In fact, by considering the subtree whose vertex set excludes the additional vertex, one checks that the restriction of μ fails the Markov random field property (3.12).
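The mechanism at work in such counter-examples is the classical fact that a function of a Markov chain need not be Markov. The following self-contained toy check (our illustration, not the construction of this section) exhibits it on a deterministic 3-state cycle:

```python
# A deterministic 3-state cycle 0 -> 1 -> 2 -> 0, started uniformly.
P = {0: 1, 1: 2, 2: 0}          # next-state map of the underlying chain X
f = {0: 0, 1: 0, 2: 1}          # observation map: Y = f(X)

# Joint law of (Y0, Y1, Y2) under the uniform initial distribution.
mu = {}
for x0 in range(3):
    y = (f[x0], f[P[x0]], f[P[P[x0]]])
    mu[y] = mu.get(y, 0) + 1 / 3

def cond(event_num, event_den):
    # Conditional probability of event_num given event_den under mu.
    num = sum(p for y, p in mu.items() if event_num(y))
    den = sum(p for y, p in mu.items() if event_den(y))
    return num / den

# P(Y2 = 1 | Y1 = 0, Y0 = 0) versus P(Y2 = 1 | Y1 = 0).
lhs = cond(lambda y: y == (0, 0, 1), lambda y: y[:2] == (0, 0))
rhs = cond(lambda y: y[1:] == (0, 1), lambda y: y[1] == 0)
```

Here the extra knowledge of Y0 pins down the hidden state, so lhs = 1 while rhs = 1/2: the observed process (Y_n) violates the Markov property even though (X_n) satisfies it.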
|