
A residual-based message passing algorithm for constraint satisfaction problems

Chun-Yan Zhao1 Corresponding author: [email protected]    Yan-Rong Fu1    Jin-Hua Zhao2,3 Corresponding author: [email protected] 1College of Science, University of Shanghai for Science and Technology, Shanghai 200093, China 2Guangdong Provincial Key Laboratory of Nuclear Science, Institute of Quantum Matter, South China Normal University, Guangzhou 510006, China 3Guangdong-Hong Kong Joint Laboratory of Quantum Matter, Southern Nuclear Science Computing Center, South China Normal University, Guangzhou 510006, China
Abstract

Message passing algorithms, whose iterative nature captures well the complicated interactions among interconnected variables in complex systems and extracts information from the fixed point of iterated messages, provide a powerful toolkit for tackling hard computational tasks in optimization, inference, and learning problems. In the context of constraint satisfaction problems (CSPs), when a control parameter (such as the constraint density) is tuned, multiple threshold phenomena emerge, signaling fundamental structural transitions in their solution space. Finding solutions around these transition points is exceedingly challenging for algorithm design, as message passing algorithms there suffer from large message fluctuations far from convergence. Here we introduce a residual-based updating step into message passing algorithms, in which messages varying greatly between consecutive steps are given high priority in the updating process. For the specific example of model RB, a typical prototype of random CSPs with growing domains, we show that our algorithm improves the convergence of message updating and increases the success probability of finding solutions around the satisfiability threshold at a low computational cost. Our approach to message passing algorithms should be of value for exploring their power in developing algorithms to find ground-state solutions and for understanding the detailed structure of the solution space of hard optimization problems.

I Introduction

The difficulty in understanding complex systems manifests not only in finding a proper description of their interaction topology, but also in quantifying the collective and cooperative behaviors among their constituents at multiple scales Newman-AIP-2012 . Message passing algorithms prove to be an efficient tool in tackling hard computational problems formulated in graphical models Koller.Friedman-2009 , which provide a simple description of complex interacting systems. In a typical algorithmic procedure, one defines messages on the connections of basic units in a graphical representation, derives iterative equations of messages to account for the propagation of correlations among variables, and extracts information on problem solutions from the messages after a sufficiently large number of updating steps. The main roots of message passing algorithms lie in coding theory in information theory research Richardson.Urbanke-2008 and the statistical physics of disordered systems Mezard.Montanari-2009 . Message passing algorithms have been heavily explored in optimization problems Garey.Johnson-1979 ; Zhou-2015 , inference problems Zdeborova.Krzakala-AdvInPhys-2016 , learning problems Braunstein.Zecchina-PRL-2006 ; Mezard-PRE-2017 , and so on. Different frameworks of message passing algorithms are based on mean-field theory at a certain level of approximation; typical examples encompass the belief propagation (BP) algorithm Mezard.Parisi-EPJB-2001 , the survey propagation algorithm Mezard.Parisi.Zecchina-Science-2002 , and the cluster variational algorithm Kikuchi-PRB-1951 ; Pelizzola-JPhysA-2005 .

Constraint satisfaction problems (CSPs) Garey.Johnson-1979 are among the most typical computational tasks in complex systems research. A typical instance of CSPs can be formulated as a bipartite graph with two types of nodes, denoting $m$ constraints and $n$ variables respectively, and edges only between constraints and variables. In a random setting, edges can be established from a null graph with only nodes by connecting a given constraint node with multiple variable nodes chosen from the variable node set with uniform probability. A basic computational question for CSPs is to determine the existence and the number of assignments to the variables which satisfy all the constraints in a given instance. CSPs are heavily discussed in computational complexity theory in theoretical computer science Garey.Johnson-1979 , as many prototype NP-hard problems can be formulated as CSPs. In the language of statistical physics, satisfiability and optimization problems can be understood through the energy landscapes of configurations of a physical system with continuous or discrete states, while their global optima correspond to the ground-state energy of the landscapes. Thus we can adopt statistical mechanical frameworks, such as the mean-field theory of spin glasses Mezard.Montanari-2009 , to estimate the energy and the entropy of their ground-state solutions. Typical CSPs studied with the spin glass theory are the random $k$-satisfiability ($k$-SAT) problem Krzakala.etal-PNAS-2007 ; Montanari.RicciTersenghi.Semerjian-JStatMech-2008 , the minimum vertex cover problem Weigt.Hartmann-PRL-2000 ; Weight.Zhou-PRE-2006 , and the $q$-coloring problem Zdeborova.Krzakala-PRE-2007 . These problems are random CSPs with fixed domains, whose domain size is independent of the variable number $n$ (that is, 2, 2, and $q$ as given integers, respectively). Yet many practical problems, such as the Latin square problem, the $n$-queens problem and so on, can be transformed into random CSPs with growing domains.
Among the models for these CSPs, model RB (revised B), proposed in Xu-JAIR-2000 , is a popular one, whose domain size grows polynomially with the variable number $n$.

Spin glass mean-field theory on satisfiability problems shows that fundamental structural transitions exist in the geometric organization of ground-state configurations. Taking the random $k$-SAT ($k\geqslant 3$) problem Krzakala.etal-PNAS-2007 as an example, with an increase in the constraint density (the ratio of the constraint number to the variable number), there are clustering, condensation, and SAT/UNSAT phase transitions separated by multiple thresholds, where drastic changes in the numbers and sizes of the clusters of ground-state configurations take place. Furthermore, there is a deep connection between these solution space phase transitions and the algorithmic performance in finding solutions. Specifically, around these thresholds, messages in mean-field theory often show large fluctuations, rendering impossible the convergence of messages from which a typical message passing algorithm extracts information. Meanwhile, each mean-field theory involves an approximation at some level on the organization of the solution space. For example, in the BP algorithm Mezard.Parisi-EPJB-2001 , solutions are assumed to be organized in a single cluster, or a pure state. When a level of approximation fails, there are often two choices: either we generalize the current mean-field assumption to a theoretical framework with a more complicated and computationally demanding form, or we introduce modifications at an algorithmic level to achieve better performance by leveraging the existing form of the message passing algorithm.

Our focus in this paper is to suppress the fluctuation of iterative messages around the thresholds, a critical problem for message-passing-based algorithms, and thus to improve the performance of the existing BP algorithm for solving CSPs at the algorithmic level. Our main contribution is to define residuals in the message updating process, based on which the existing BP algorithm is modified. We show that, for the specific CSP model RB, compared with the usual BP algorithm, our modified algorithm significantly improves the convergence of the updated messages and increases the probability of finding ground-state solutions around the satisfiability threshold at a lower computational cost.

The layout of the paper is as follows. In Section 2, we briefly introduce the concept and some known results of model RB, which is a typical CSP with growing domains. Section 3 presents the maximal residual BP (MRBP) algorithm, which modifies the current BP algorithm based on residuals in the process of updating messages. In Section 4, we illustrate and analyze the numerical results of our algorithm. In Section 5, we conclude the paper with some discussions.

II Model RB: an example of constraint satisfaction problems

Over the past few decades, studies on CSPs have mainly focused on the initial standard CSP models, named A, B, C and D Smith-AI-1996 ; Gent-C-2001 ; Prosser-AI-1996 . However, Achlioptas et al. proved that the random instances generated by the four standard models suffer trivial asymptotic insolubility as the problem size grows Achlioptas-C-2001 . To overcome this defect, various improved models have been proposed successively from different perspectives Xu-JAIR-2000 ; Molloy-SIAMJC-2003 ; Frieze-RSA-2006 ; Smith-TCS-2001 ; Gao-JAIR-2007 . Model RB, proposed by Xu and Li to avoid the trivial unsatisfiability of model B Xu-JAIR-2000 , modifies the domain size of variables and the number of constraints of model B without special restrictions on the constraint structure.

Model RB consists of a set of $n$ variables $X=\{x_{1},x_{2},\ldots,x_{n}\}$ and a set of $m$ constraints $C=\{C_{1},C_{2},\ldots,C_{m}\}$ with $m=rn\ln n$, where $r>0$ is a constant. Each variable $x_{i}$ ($i=1,2,\ldots,n$) has a nonempty domain $D=\{1,2,\ldots,d_{n}\}$ of $d_{n}=n^{\alpha}$ ($\alpha>0$ is a constant) possible values. Each constraint $C_{a}$ ($a=1,2,\ldots,m$) involves $k$ ($\geqslant 2$) different variables, and there is a set of incompatible assignments $Q_{a}\subset D^{k}$ which specifies the disallowed combinations of values for the $k$ variables. More precisely, we generate a random instance of model RB with the following two steps.

Step 1.

To construct a single constraint, we randomly select $k$ ($\geqslant 2$) distinct variables out of the $n$ variables, and then randomly select exactly $pd_{n}^{k}$ ($p$ is the constraint tightness) different incompatible assignments out of the $d_{n}^{k}$ possible ones as the set $Q_{a}$ restricting the values of these $k$ variables.

Step 2.

We repeat the above step to obtain $m=rn\ln n$ independent constraints and their corresponding sets $Q_{a}$ ($a=1,2,\ldots,m$).

A random instance of model RB is simply the conjunction of the above $m$ randomly selected constraints.
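The two-step generation procedure can be sketched in Python. The function names and the instance representation (a list of pairs of a variable tuple and the incompatible set $Q_{a}$) are illustrative choices of ours, not part of the model definition.

```python
import itertools
import math
import random

def generate_rb_instance(n, k, alpha, r, p, seed=None):
    """Generate a random model RB instance (illustrative sketch).

    Returns the domain size d_n and a list of m constraints, each a pair
    (variables, Q_a), where Q_a is the set of incompatible assignments.
    """
    rng = random.Random(seed)
    d = round(n ** alpha)                  # domain size d_n = n^alpha
    m = round(r * n * math.log(n))         # constraint number m = r n ln n
    num_incompatible = round(p * d ** k)   # |Q_a| = p * d^k
    constraints = []
    for _ in range(m):
        variables = tuple(rng.sample(range(n), k))   # k distinct variables
        # exactly p*d^k distinct incompatible assignments out of d^k ones
        q_a = set(rng.sample(list(itertools.product(range(d), repeat=k)),
                             num_incompatible))
        constraints.append((variables, q_a))
    return d, constraints

def is_satisfied(assignment, constraints):
    """Check whether an assignment (sequence of values) satisfies all constraints."""
    return all(tuple(assignment[i] for i in variables) not in q_a
               for variables, q_a in constraints)
```

Enumerating all $d_{n}^{k}$ tuples is affordable only for small instances; it is meant to mirror the definition, not to be efficient.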

Let $\Pr({\rm SAT})$ be the probability that a random instance is satisfiable. It is shown in Ref. Xu-JAIR-2000 that

Theorem 1

Let $p_{s}=1-{\rm e}^{-\frac{\alpha}{r}}$. If $\alpha>\frac{1}{k}$ and $r>0$ are two constants and $k{\rm e}^{-\frac{\alpha}{r}}\geqslant 1$, then

\lim_{n\rightarrow\infty}\Pr({\rm SAT})=\left\{\begin{array}{ll}1,&\hbox{if $p<p_{s}$;}\\ 0,&\hbox{if $p>p_{s}$.}\end{array}\right. \qquad (1)
Theorem 2

Let $r_{s}=-\frac{\alpha}{\ln(1-p)}$. If $\alpha>\frac{1}{k}$ and $0<p<1$ are two constants and $k\geqslant\frac{1}{1-p}$, then

\lim_{n\rightarrow\infty}\Pr({\rm SAT})=\left\{\begin{array}{ll}1,&\hbox{if $r<r_{s}$;}\\ 0,&\hbox{if $r>r_{s}$.}\end{array}\right. \qquad (2)

The two theorems show that model RB exhibits a sharp satisfiability threshold at the critical values $p_{s}$ and $r_{s}$, with $p$ and $r$ being two control parameters of the problem. In contrast, the exact satisfiability threshold cannot yet be rigorously determined for some random CSPs with fixed domains Achlioptas-Nature-2005 ; Mertens-RSA-2006 . It has been theoretically proved that almost all instances of model RB have no tree-like resolution proofs of less than exponential size Xu-TCS-2006 . In other words, random instances of model RB in the transition region are extremely difficult to solve. Numerical results, using complete and incomplete search algorithms, have confirmed the exact satisfiability thresholds and the hardness of forced and unforced satisfiable instances Xu-AI-2007 . Although model RB shows an asymptotic phase transition phenomenon, it is still very challenging to find solutions of a random instance, and the tractable problem size has been restricted to $n\sim 100$ Cai-AI-2011 . Therefore, benchmarks based on model RB are widely used in various kinds of algorithm competitions, such as CSP, SAT and pseudo-Boolean competitions, and the results have verified the intrinsic computational hardness of these benchmarks.
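Theorems 1 and 2 translate directly into a small numerical helper; the function name and the dictionary output are hypothetical conveniences of ours, and the side conditions of the theorems are checked explicitly.

```python
import math

def rb_thresholds(alpha, r=None, p=None, k=2):
    """Satisfiability thresholds of model RB (Theorems 1 and 2).

    Returns p_s = 1 - exp(-alpha/r) when r is given, and
    r_s = -alpha / ln(1 - p) when p is given; a threshold is set to None
    when the extra condition of the corresponding theorem fails.
    """
    out = {}
    if r is not None:
        assert alpha > 1.0 / k and r > 0
        # Theorem 1 additionally requires k * exp(-alpha/r) >= 1
        p_s = 1.0 - math.exp(-alpha / r)
        out["p_s"] = p_s if k * math.exp(-alpha / r) >= 1.0 else None
    if p is not None:
        assert alpha > 1.0 / k and 0.0 < p < 1.0
        # Theorem 2 additionally requires k >= 1/(1-p)
        r_s = -alpha / math.log(1.0 - p)
        out["r_s"] = r_s if k >= 1.0 / (1.0 - p) else None
    return out
```

For the parameters used later in the paper ($\alpha=0.8$, $r=3$, $k=2$), this gives $p_{s}\simeq 0.23$, matching Table 1.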

Both analytical and algorithmic studies have been carried out on model RB. On the analytical side, Fan and Shen defined a large class of random CSP models, $d$-$k$-CSP, which unifies several related models including model RB Fan-AI-2012 ; Shen-JCO-2016 . Later, it was proved that the restriction on $k$ can be weakened, which simplifies the generation of random instances Zhou-JCO-2015 . To study the transition behaviors of model RB due to finite-size effects, Zhao et al. used finite-size scaling analysis to bound the width of the transition region Zhao-IPL-2011 . On the algorithmic side, statistical physics concepts and methods, especially the replica method and the cavity method in spin glass mean-field theory, have played a very significant role in constructing solutions for random CSPs Olivier-TCS-2001 ; Marino-NC-2016 . Zhao et al. proposed two different types of message passing algorithms guided by BP to solve random CSPs with large domains like model RB Zhao-JSTAT-2011 . Subsequently, Zhao et al. proposed a reinforced BP algorithm, which can effectively force the BP equations to converge to a solution of model RB by introducing an external field with a certain probability Zhao-PRE-2012 . Moreover, it was shown that the solutions of model RB are grouped in disconnected clusters before the theoretical satisfiability phase transition threshold Zhao-PRE-2012 . Recently, Zhao et al. proposed three kinds of BP decimation algorithms, which improved the performance of BP algorithms by improving the updating schedule of the BP iterative equations Zhao-JSTAT-2021 .

III Maximal residual belief propagation algorithm

Here we propose the MRBP algorithm, which combines modifications in the updating process based on residuals with the usual BP algorithm. We first lay down the general procedure of the BP algorithm and the relevant equations, then we define the residuals of messages in the updating process and present our algorithm in detail.

For $k\geqslant 2$, model RB is NP-complete. We therefore take the binary case ($k=2$) of model RB as the tested problem, as model RB with $k\geqslant 3$ can be reduced to the case of $k=2$ in polynomial time.

III.1 Belief propagation algorithm

Figure 1: Part of a factor graph representing a random instance of binary model RB. Circles and squares denote the variable nodes and the constraint nodes, respectively. In the message passing algorithm of model RB, $\mu_{a\rightarrow i}(\sigma_{i})$ and $\eta_{i\rightarrow a}(\sigma_{i})$ are the messages passed between $a$ and $i$ along the edge $(a,i)$ in opposite directions.

An instance of binary model RB admits a natural factor graph representation, as shown in figure 1. A factor graph is a bipartite undirected graph, in which one type of nodes represents the variables (denoted by $i,j,\cdots$) and the other type represents the constraints (denoted by $a,b,\cdots$). An edge $(a,i)$ connects a constraint node $a$ and a variable node $i$ if and only if $a$ contains $i$. The average degree of a variable node in the graph is $r\ln n$. Locally such a graph is tree-like: the typical size of a loop in the graph scales as $\ln n/\ln\ln n$ for large $n$.

An instance of model RB can be treated as a statistical mechanical system of interacting variables with discrete states, with an interaction topology in the form of a factor graph. In a random instance of binary model RB, we let $\vec{\sigma}=(\sigma_{1},\sigma_{2},\ldots,\sigma_{n})$ ($\sigma_{i}\in D$, $i=1,2,\ldots,n$) be an assignment to the $n$ variables. We further let $\vec{\sigma}_{a}=(\sigma_{a1},\sigma_{a2})$ be an assignment to the two variables involved in the constraint $a$ ($\in\{1,2,\ldots,m\}$). We define the energy of the constraint $a$ as $f_{a}(\vec{\sigma}_{a})=1$ if $\vec{\sigma}_{a}\in Q_{a}$ and $f_{a}(\vec{\sigma}_{a})=0$ if $\vec{\sigma}_{a}\notin Q_{a}$. The total energy of an assignment $\vec{\sigma}$, denoted as $E(\vec{\sigma})$, is the sum of the energy terms of all the $m$ constraints, which equals the number of unsatisfied constraints:

E(\vec{\sigma})=\sum_{a=1}^{m}f_{a}(\vec{\sigma}_{a}). \qquad (3)
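Equation (3) amounts to counting the constraints whose restricted assignment falls in $Q_{a}$; a minimal sketch, assuming the same hypothetical instance representation as above (a list of (variables, $Q_{a}$) pairs):

```python
def energy(assignment, constraints):
    """E(sigma) of Eq. (3): the number of constraints a with sigma_a in Q_a."""
    return sum(tuple(assignment[i] for i in variables) in q_a
               for variables, q_a in constraints)
```

A zero-energy assignment is exactly a solution of the instance.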

Based on the locally tree-like property of the factor graph representation of model RB, we derive the framework of the BP algorithm Mezard.Montanari-2009 . Associated with each edge $(a,i)$ are two cavity messages $\mu_{a\rightarrow i}(\sigma_{i})$ and $\eta_{i\rightarrow a}(\sigma_{i})$, passed in opposite directions. See figure 1 for an example. Specifically, $\mu_{a\rightarrow i}(\sigma_{i})$ is the probability that the constraint $a$ is satisfied if the variable $i$ takes the value $\sigma_{i}$, and $\eta_{i\rightarrow a}(\sigma_{i})$ is the probability that the variable $i$ takes the value $\sigma_{i}$ in the absence of the constraint $a$. Let $\{\mu_{a\rightarrow i}^{t}(\sigma_{i})\}$ and $\{\eta_{i\rightarrow a}^{t}(\sigma_{i})\}$ denote the sets of messages passed along all the edges in the factor graph at time $t$ ($=0,1,\cdots$). According to the cavity method in the statistical physics of disordered systems Mezard.Montanari-2009 ; Zhao-JSTAT-2011 , the self-consistent BP equations of the cavity messages read

\eta_{i\rightarrow a}^{t}(\sigma_{i})=\frac{1}{Z_{i\rightarrow a}}\prod_{b\in V(i)\backslash a}\mu_{b\rightarrow i}^{t}(\sigma_{i}), \qquad (4)

\mu_{a\rightarrow i}^{t+1}(\sigma_{i})=\frac{1}{Z_{a\rightarrow i}}\sum_{j\in V(a)\backslash i,\,\sigma_{j}\in D}\left[1-f_{a}(\vec{\sigma}_{a})\right]\eta_{j\rightarrow a}^{t}(\sigma_{j}). \qquad (5)

Here $Z_{i\rightarrow a}$ and $Z_{a\rightarrow i}$ are both normalization factors, $V(i)\backslash a$ is the set of constraints adjacent to the variable $i$ with $a$ removed, and $V(a)\backslash i$ is the set of variables adjacent to the constraint $a$ with $i$ removed.
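For the binary case ($k=2$), one synchronous sweep of equations (4) and (5) can be sketched as below. The dictionary-of-edges layout and the function name are our own illustrative choices; following the definition of $\mu_{a\rightarrow i}$ as a satisfaction probability, the sum in equation (5) keeps only value pairs outside $Q_{a}$.

```python
from collections import defaultdict

def bp_iteration(constraints, d, mu):
    """One synchronous sweep of Eqs. (4)-(5) for binary model RB (sketch).

    constraints: list of ((i, j), q_a) pairs, q_a the set of incompatible
    value pairs; mu[(a, i)] is a length-d list approximating mu_{a->i}.
    Returns the updated messages mu^{t+1}.
    """
    # V(i): constraints adjacent to each variable
    v_of = defaultdict(list)
    for a, (variables, _) in enumerate(constraints):
        for i in variables:
            v_of[i].append(a)

    # Eq. (4): eta_{i->a}(sigma) ~ product of mu_{b->i}(sigma), b in V(i)\a
    eta = {}
    for i, adj in v_of.items():
        for a in adj:
            msg = [1.0] * d
            for b in adj:
                if b != a:
                    msg = [x * y for x, y in zip(msg, mu[(b, i)])]
            z = sum(msg)
            eta[(i, a)] = [x / z for x in msg] if z > 0 else [1.0 / d] * d

    # Eq. (5): mu_{a->i}(sigma_i) ~ sum over sigma_j of satisfying pairs only
    new_mu = {}
    for a, ((i, j), q_a) in enumerate(constraints):
        for u, w in ((i, j), (j, i)):  # u receives, w is summed out
            msg = [0.0] * d
            for s_u in range(d):
                for s_w in range(d):
                    pair = (s_u, s_w) if u == i else (s_w, s_u)
                    if pair not in q_a:  # combinations satisfying a
                        msg[s_u] += eta[(w, a)][s_w]
            z = sum(msg)
            new_mu[(a, u)] = [x / z for x in msg] if z > 0 else [1.0 / d] * d
    return new_mu
```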

In a typical run of the BP algorithm, the messages on the edges of the factor graph of a given problem instance are initialized uniformly at random in $[0,1]$. A message iteration process is then carried out, in which the messages are updated following the self-consistent equations (4) and (5). After a sufficiently long iterative process, the messages may converge to a fixed point. The convergence criterion is measured by a precision parameter $\varepsilon$ ($\ll 1$): when the maximal absolute difference between two consecutive steps of the messages on all edges of the factor graph is smaller than $\varepsilon$, we say that the message updating process has converged. From the fixed point of the messages, statistical mechanical properties of model RB in an average sense can be extracted, such as the energy and the entropy of ground-state solutions.

To find a solution configuration of a given problem instance, we adopt the BP decimation (BPD) algorithm, which assigns values to the variables one by one based on their marginal probabilities after message updating. With the fixed point of the messages $\{\mu_{a\rightarrow i}^{\ast}(\sigma_{i})\}$ on all edges $(a,i)$, the marginal probability of a variable $x_{i}$ ($i=1,2,\ldots,n$) is computed as

\mu_{i}(\sigma_{i})=\frac{\prod_{a\in V(i)}\mu_{a\rightarrow i}^{\ast}(\sigma_{i})}{\sum_{\sigma_{i}\in D}\prod_{a\in V(i)}\mu_{a\rightarrow i}^{\ast}(\sigma_{i})}. \qquad (6)

In a decimation step, the variable with the largest marginal probability is selected and assigned to its most biased value. In the decimation procedure, we iteratively alternate between a variable fixing process and a message updating process, with a graph simplification process after each fixing of variables.
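Equation (6) and the greedy pick of a decimation step can be sketched as follows; the message container mirrors the hypothetical dictionary layout keyed by directed edges used above.

```python
from collections import defaultdict

def marginals(mu, n, d):
    """Eq. (6): marginals of the variables from fixed-point messages mu[(a, i)]."""
    prod = defaultdict(lambda: [1.0] * d)
    for (a, i), msg in mu.items():
        prod[i] = [x * y for x, y in zip(prod[i], msg)]
    out = {}
    for i in range(n):
        z = sum(prod[i])
        out[i] = [x / z for x in prod[i]] if z > 0 else [1.0 / d] * d
    return out

def pick_decimation(marg, free):
    """Among the free variables, pick the one with the largest marginal
    component, together with its most biased value."""
    i_star = max(free, key=lambda i: max(marg[i]))
    sigma_star = max(range(len(marg[i_star])), key=lambda s: marg[i_star][s])
    return i_star, sigma_star
```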

III.2 Maximal residual belief propagation algorithm

Figure 2: Local procedure of message updating based on maximal residuals on a factor graph. The edge $(a_{\gamma},i_{\gamma})$ with the largest residual is selected for updating and then marked with 0, and the numbers 1 and 2 represent the order of the message updating.
Figure 3: In the decimation process of the BPD algorithm, we classify the variables into three different types, indicated by A, B and C.

The convergence of the messages in the BP iterative equations affects the performance of the algorithm. However, the BP equations usually do not converge when the control parameter approaches the theoretical satisfiability threshold. The core idea of this work is the introduction of residuals for the messages. We modify the message updating process, dynamically updating the messages passed between the constraints and the variables based on the maximal residual in the BP iterative equations, to improve the convergence and the performance of the BP algorithm. The residual of a message measures the relative difference between two consecutive steps of the message sent from a constraint to a variable during the iteration. Considering all the messages in the BP algorithm as a time series $\{\mu_{a\rightarrow i}^{t}\}$ with $t=0,1,\cdots$, the residual on an edge $(a,i)$ at time $t$ is defined as

\gamma_{a\rightarrow i}^{t}=\frac{|\mu_{a\rightarrow i}^{t}-\mu_{a\rightarrow i}^{t-1}|}{\mu_{a\rightarrow i}^{t}}. \qquad (7)

A large residual indicates a large fluctuation of the message $\mu_{a\rightarrow i}^{t}$; the messages at these steps are thus far from their values at the convergence point of the whole message passing algorithm, if there is one.
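A minimal sketch of the residual of equation (7), extended componentwise to vector-valued messages; the small `eps` guard against division by zero is our own choice, not part of the definition.

```python
def residual(mu_t, mu_prev, eps=1e-12):
    """Eq. (7): relative change of a message between two consecutive steps.

    For vector-valued messages we take the largest componentwise residual.
    """
    return max(abs(x - y) / max(x, eps) for x, y in zip(mu_t, mu_prev))

def max_residual_edge(mu, mu_prev, marked):
    """Select the unmarked directed edge (a, i) with the largest residual."""
    candidates = {e: residual(mu[e], mu_prev[e])
                  for e in mu if e not in marked}
    return max(candidates, key=candidates.get) if candidates else None
```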

Based on the residual on each edge at a time step, the MRBP algorithm dynamically adjusts the order of updating messages according to their real-time status, in which the messages attached to the edge with the largest residual are prioritized in the update. See figure 2 for an example. At time $t$, we select the edge $(a_{\gamma},i_{\gamma})$ with the largest residual. To reduce the greediness of the algorithm and achieve better performance, the updating of messages is considered from the point of view of the node $a_{\gamma}$ rather than of the single edge $(a_{\gamma},i_{\gamma})$. Therefore, we first use equation (4) to compute the messages sent simultaneously from the variables $i_{\gamma}$ and $j_{\gamma}$ associated with the constraint node $a_{\gamma}$ to their neighboring constraint nodes other than $a_{\gamma}$ (denoted by $V(i_{\gamma})\cup V(j_{\gamma})\setminus a_{\gamma}$), as indexed by the number 1 in figure 2. Then, we follow equation (5) to update the messages that the constraint nodes in $V(i_{\gamma})\cup V(j_{\gamma})\setminus a_{\gamma}$ send to their adjacent variables other than $i_{\gamma}$ and $j_{\gamma}$, as indexed by the number 2 in figure 2. To avoid falling into a local optimum in the iteration process, we mark the edge with the largest residual after this update, and the residuals on the unmarked edges are recalculated. The selection of the current unmarked edge with the largest residual and the recalculation of residuals are carried out iteratively until all edges are marked, ensuring that the messages on all edges are updated during the BP iteration process. In this way, the messages between constraints and variables are constantly updated until they converge to a fixed point or the maximal number of iterations is reached.
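The marking schedule described above can be sketched as a generic loop; `compute_residual` and `update_edge` are hypothetical caller-supplied callbacks standing in for equation (7) and for the updates of equations (4) and (5) around the selected constraint node.

```python
def residual_sweep(edges, compute_residual, update_edge):
    """One MRBP sweep (sketch): repeatedly pick the unmarked edge with the
    largest residual, update the messages around it, and mark it, until all
    edges are marked.  Returns the set of marked edges."""
    marked = set()
    while len(marked) < len(edges):
        # residuals of the unmarked edges are recalculated at each step
        residuals = {e: compute_residual(e) for e in edges if e not in marked}
        e_star = max(residuals, key=residuals.get)
        update_edge(e_star)  # Eqs. (4)-(5) around the selected constraint node
        marked.add(e_star)
    return marked
```

With constant residuals the loop simply visits the edges in decreasing residual order, which is the intended priority schedule.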

During the decimation process of the BPD algorithm, we iteratively fix variables until all of them have been assigned values. Once a variable in a factor graph is assigned a certain value, the variables in the factor graph fall into three different types, as illustrated in figure 3.

  • Type A: The variables that have already been assigned values.

  • Type B: The variables not yet assigned values, but connected to the same constraints as assigned variables.

  • Type C: All other variables.

The MRBP algorithm is detailed in the following pseudocode.

 
MRBP algorithm
 

Input: A factor graph of a random instance generated by model RB, the maximal number of iterations $t_{\rm max}$ used in the two MRBP-UPDATE subroutines below, and a precision parameter $\varepsilon$.

Output: A solution or ‘UNCONVERGED’.

Step 1

At $T=0$:

1.1

Run the subroutine MRBP-UPDATE.

1.2

For each variable $i$, compute the marginal probability $\mu_{i}(\sigma_{i})$ by equation (6).

1.3

Select the variable $i^{\ast}$ with the highest marginal probability among the $n$ variables, and assign it to its most biased value $\sigma_{i^{\ast}}^{\ast}=\arg\max\mu_{i^{\ast}}(\sigma_{i^{\ast}})$.

Step 2

From $T=1$ to $n-1$:

2.1

Run the subroutine MRBP-UPDATE.

2.2

For each free (unfixed) variable $i$, compute the marginal probability $\mu_{i}(\sigma_{i})$ by equation (6).

2.3

Select the variable $i^{\ast}$ with the highest marginal probability among the $(n-T)$ free variables, and assign it to its most biased value $\sigma_{i^{\ast}}^{\ast}=\arg\max\mu_{i^{\ast}}(\sigma_{i^{\ast}})$.

2.4

Check the energy function $E(\vec{\sigma})$ of the variables with assigned values according to equation (3). If $E(\vec{\sigma})>0$, break the loop and output ‘UNCONVERGED’.

Step 3

If the assignment $\vec{\sigma}^{\ast}=(\sigma_{1}^{\ast},\sigma_{2}^{\ast},\ldots,\sigma_{n}^{\ast})$ of the $n$ variables satisfies the $m$ constraints simultaneously, output $\vec{\sigma}^{\ast}$ as a solution of the instance; otherwise output ‘UNCONVERGED’.

 

The subroutine MRBP-UPDATE updates the messages on the edges of a factor graph until a fixed point is reached or a maximal number of iteration steps is exceeded.

 
Subroutine MRBP-UPDATE:

Step 1

At $t=0$: for each edge $(a,i)$, randomly initialize $\mu_{a\rightarrow i}^{0}(\sigma_{i})\in[0,1]$.

Step 2

For the constraints from $a=1$ to $m$:

2.1

For each edge $(a,i)$ with $i\in V(a)$, use equation (4) to obtain $\eta_{i\rightarrow a}^{0}(\sigma_{i})$. In the case of $V(i)\backslash a=\emptyset$, set $\eta_{i\rightarrow a}^{0}(\sigma_{i})=1/d_{n}$ for any $\sigma_{i}\in D$.

2.2

Use equation (5) to calculate $\mu_{a\rightarrow i}^{1}(\sigma_{i})$.

Step 3

From $t=1$ to $t_{\rm max}$:

3.1

For each edge $(a,i)$, calculate the residual $\gamma_{a\rightarrow i}^{t}$ using equation (7). Select the edge $(a_{\gamma},i_{\gamma})$ with the largest residual among all unmarked edges and mark it.

3.2

For $i\in V(a_{\gamma})$, use equation (4) to calculate $\eta_{i\rightarrow a}^{t}(\sigma_{i})$, where $a\in V(i)\backslash a_{\gamma}$. Update $\mu_{a\rightarrow j}^{t+1}(\sigma_{j})$ using equation (5), where $j\in V(a)\backslash i$.

3.3

If $|\mu_{a\rightarrow i}^{t+1}(\sigma_{i})-\mu_{a\rightarrow i}^{t}(\sigma_{i})|<\varepsilon$ holds on every edge $(a,i)$, break the loop and set $\mu_{a\rightarrow i}^{\ast}(\sigma_{i})=\mu_{a\rightarrow i}^{t}(\sigma_{i})$. Otherwise go back to 3.1 until all edges are marked.

3.4

If $t=t_{\rm max}$, output ‘UNCONVERGED’.

 

The second subroutine MRBP-UPDATE obtains the fixed point of message updating in the decimation process, during which some variables have already been assigned values.

 
Subroutine MRBP-UPDATE:

Step 1

At $t=0$:

  • For variables of Type A: skip;

  • For variables of Type B: if the assignment of the variable $i$ and the value of the fixed variable satisfy the corresponding constraint $a$, set $\mu_{a\rightarrow i}^{0}(\sigma_{i})=1$; otherwise, set $\mu_{a\rightarrow i}^{0}(\sigma_{i})=0$;

  • For variables of Type C: uniformly initialize the values of $\mu_{a\rightarrow i}^{0}(\sigma_{i})\in[0,1]$ at random.

Step 2

For the constraints from $a=1$ to $m$:

2.1

For each edge $(a,i)$ with $i\in V(a)$:

  • For variables of Type A and Type B: skip;

  • For variables of Type C: use equation (4) to obtain $\eta_{i\rightarrow a}^{0}(\sigma_{i})$. In the case of $V(i)\backslash a=\emptyset$, set $\eta_{i\rightarrow a}^{0}(\sigma_{i})=1/d_{n}$ for any $\sigma_{i}\in D$.

2.2

Update $\mu_{a\rightarrow i}^{1}(\sigma_{i})$.

  • For variables of Type A: skip;

  • For variables of Type B: let $\mu_{a\rightarrow i}^{1}(\sigma_{i})=\mu_{a\rightarrow i}^{0}(\sigma_{i})$;

  • For variables of Type C: use equation (5) to update $\mu_{a\rightarrow i}^{1}(\sigma_{i})$.

Step 3

From $t=1$ to $t_{\rm max}$:

3.1

Compute the residual $\gamma_{a\rightarrow i}^{t}$.

  • For variables of Type A and Type B: skip;

  • For variables of Type C: for each edge $(a,i)$, calculate the residual $\gamma_{a\rightarrow i}^{t}$ using equation (7). Select the edge $(a_{\gamma},i_{\gamma})$ with the largest residual among all unmarked edges and mark it.

3.2

For $i\in V(a_{\gamma})$, use equation (4) to calculate $\eta_{i\rightarrow a}^{t}(\sigma_{i})$, where $a\in V(i)\backslash a_{\gamma}$. Update $\mu_{a\rightarrow j}^{t+1}(\sigma_{j})$ using equation (5), where $j\in V(a)\backslash i$.

3.3

Determine whether the iterative equations converge or not.

  • For variables of Type A and Type B: skip;

  • For variables of Type C: if $|\mu_{a\rightarrow i}^{t+1}(\sigma_{i})-\mu_{a\rightarrow i}^{t}(\sigma_{i})|<\varepsilon$ holds on every edge $(a,i)$, break the loop and set $\mu_{a\rightarrow i}^{\ast}(\sigma_{i})=\mu_{a\rightarrow i}^{t}(\sigma_{i})$. Otherwise go back to 3.1 until all edges are marked.

3.4

If $t=t_{\rm max}$, output ‘UNCONVERGED’.

 

IV Results

Table 1: For different problem sizes $n$, the domain size $d_{n}$, the constraint number $m$, and the theoretical satisfiability threshold $p_{s}$ obtained by Theorem 1 are shown correspondingly.
$n$  $\alpha$  $r$  $d_{n}$  $m$  $p_{s}$
20  0.8  3  11  180  $\simeq 0.23$
40  0.8  3  19  443  $\simeq 0.23$
60  0.8  3  26  737  $\simeq 0.23$
80  0.8  3  33  1052  $\simeq 0.23$
Figure 4: Fraction of satisfiable instances as a function of $p$ obtained by the MRBP algorithm in solving 50 random instances generated by model RB.
Figure 5: Entropy $S(x_{i}^{T})$ of the decimated variables at each step $T$ in the MRBP algorithm for binary model RB as a function of $T/n$ at $p=0.1$.
Figure 6: Mean degree of freedom $s(p)$ of the $n$ variables as a function of $p$ during the execution of the MRBP algorithm on a random instance generated by binary model RB for $n=40$.

Here we test our MRBP algorithm based on residuals and show that it not only reduces the fluctuation of the messages, but also significantly improves the convergence of the BP iterative equations. Hence, it can effectively construct a solution of a random instance generated by model RB when approaching the satisfiability threshold.

As a representative parameter set to generate random instances of binary model RB, we take $\alpha=0.8$ and $r=3$. We note that random instances with other values of the parameters $\alpha$ and $r$ yield similar algorithmic performance around the thresholds. In Table 1, the corresponding quantities and thresholds are shown for different problem sizes $n$. In the MRBP algorithm, we take $t_{\max}=400$ and $\varepsilon=10^{-4}$.

As the first part of our results, we consider the performance of the MRBP algorithm. The fraction of successful runs over 50 random instances, generated by binary model RB for $n\in\{20,40,60,80\}$, is shown in figure 4. Almost all instances are satisfiable when the constraint tightness $p<0.19$. The MRBP algorithm can construct a solution efficiently when $p<0.20$. However, the algorithm fails with probability tending to 1 when $p>0.22$. Therefore, the algorithm performs well as $p$ approaches the theoretical satisfiability threshold $p_{s}\simeq 0.23$. Unfortunately, it fails in the extremely hard region, which may be closely related to the structural transition of the solution space. It is worth noting that the performance of the MRBP algorithm is almost independent of the problem size $n$.

In the decimation process of the MRBP algorithm, we measure the entropy of the selected variable at each time step $T$ for different problem sizes $n$. We define the entropy of the decimated variable at $T$ as

S(x_{i}^{T})=-\sum_{\sigma_{i}\in D}\mu_{i}(\sigma_{i})\ln\mu_{i}(\sigma_{i}). (8)

In figure 5, the entropy as a function of $T/n$ for $p=0.1$ and $n\in\{20,40\}$ is presented. For a given $n$, the entropy curve shows a concave shape with a minimum at $T/n\approx 0.5$, where half of the variables are fixed with certain values. For $n=20$ and $40$, the entropy $S(x_{i}^{T})$ always stays below the maximum reached when each variable takes values from its domain with equal probability, which guarantees that the MRBP algorithm can select a polarized variable among the remaining free variables to fix.
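The selection rule described above can be illustrated with a few lines of Python: equation (8) applied to each marginal, then picking the lowest-entropy (most polarized) variable. The marginals below are made-up values for illustration only.

```python
import math

def entropy(marginal):
    """Equation (8): entropy of one variable's BP marginal mu_i."""
    return -sum(p * math.log(p) for p in marginal if p > 0.0)

# A uniform marginal over d values has the maximal entropy ln(d);
# the decimation step prefers a polarized (low-entropy) variable.
marginals = {1: (0.25, 0.25, 0.25, 0.25),   # maximally undecided
             2: (0.90, 0.05, 0.03, 0.02)}   # strongly polarized
most_polarized = min(marginals, key=lambda i: entropy(marginals[i]))
```

Here `most_polarized` is variable 2, whose marginal concentrates on one value of its domain.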

We then consider the mean degree of freedom of the $n$ variables in the decimation procedure of the MRBP algorithm, which is defined as

s(p)=\frac{1}{n}\sum_{i=1}^{n}\sum_{\sigma_{i}\in D}[-\mu_{i}(\sigma_{i})\ln\mu_{i}(\sigma_{i})]. (9)

In figure 6, the relationship between the mean degree of freedom $s(p)$ and the constraint tightness $p$ for $n=40$ is reported. As the constraint tightness $p$ increases, $s(p)$ monotonically decreases to 0. Therefore, we suspect that as $p$ increases, some variables are frozen at certain assignments, which prevents the MRBP algorithm from escaping from locally optimal solutions. Further improving the performance of the algorithm requires a more detailed study of the structural evolution of the solution space of model RB.
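Equation (9) is simply the per-variable entropy of equation (8) averaged over all $n$ variables, which a short sketch makes concrete. The two marginals below are made-up values; they show why frozen (delta-like) variables drive $s(p)$ toward 0.

```python
import math

def entropy(marginal):
    # Equation (8): entropy of one variable's marginal.
    return -sum(p * math.log(p) for p in marginal if p > 0.0)

def mean_degree_of_freedom(marginals):
    # Equation (9): equation (8) averaged over all n variables.
    return sum(entropy(mu) for mu in marginals) / len(marginals)

# A frozen variable has a delta-like marginal with zero entropy, so the
# average s(p) decreases toward 0 as more variables freeze.
frozen = (1.0, 0.0, 0.0)
free = (1.0 / 3, 1.0 / 3, 1.0 / 3)
s = mean_degree_of_freedom([frozen, free])
```

With one frozen and one fully free variable, `s` equals $\tfrac{1}{2}\ln 3$, halfway between zero freedom and the maximum for a domain of size 3.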

Figure 7: (a) Number of convergent instances and number of instances for which a solution can be found by algorithm 3 in Ref. Zhao-JSTAT-2011 when solving 50 random instances of binary model RB for $n=40$. (b) Same as (a) but obtained from the MRBP algorithm. (c) and (d) Comparison of the two algorithms on the number of convergent instances and the number of successful instances, respectively.
Figure 8: Comparison of the success probability among the MRBP algorithm, the ABP algorithm with $\gamma=0.5$, the SBP algorithm with $\gamma=0.5$, the VABP algorithm in Ref. Zhao-JSTAT-2021 and algorithm 3 in Ref. Zhao-JSTAT-2011 . We consider here $n=40$.
Figure 9: Comparison of the average number of iterations required by the MRBP algorithm, the ABP algorithm with $\gamma=0.5$, the SBP algorithm with $\gamma=0.5$, the VABP algorithm in Ref. Zhao-JSTAT-2021 and algorithm 3 in Ref. Zhao-JSTAT-2011 at each decimation step $T$. We consider here $n=60$ and $p=0.15$.
Figure 10: Running time of the MRBP algorithm as a function of $p$. We consider here $n=20$.

As the second part of our results, we present a comparison between the usual BP algorithm (algorithm 3 in Ref. Zhao-JSTAT-2011 ) and the MRBP algorithm. Figure 7 shows their performance in terms of message convergence and solving power. As seen from figure 7(a), for most instances, if the BP algorithm converges, it can usually construct a solution of the instance. However, this is not the case for the MRBP algorithm. As shown in figure 7(b), when $p<0.21$, if the MRBP algorithm converges, it almost always finds a solution of the instance. However, when $p\geqslant 0.21$, the assignment obtained after the algorithm has converged may not be a solution of the instance. In other words, compared with algorithm 3 in Ref. Zhao-JSTAT-2011 , although the convergence of the MRBP algorithm is greatly improved, the constructed assignment may not be a solution of the instance. As we can see from figure 7(c), the convergence of the MRBP algorithm is significantly better than that of algorithm 3 in Ref. Zhao-JSTAT-2011 . For the MRBP algorithm, the number of convergent instances first decreases and then increases as $p$ increases. Moreover, the convergence is worst at $p=0.21$, where the probability of the MRBP algorithm finding a solution of a random instance is extremely low. In figure 7(d), the MRBP algorithm consistently shows a higher success probability in finding ground-state solutions. It is obvious that the MRBP algorithm not only improves the convergence of the BP iterative equations, but also greatly improves the efficiency of finding solutions.

We also compare the solving efficiency of the MRBP algorithm with three related BP-based algorithms (the ABP algorithm with $\gamma=0.5$, the SBP algorithm with $\gamma=0.5$ and the VABP algorithm) in Ref. Zhao-JSTAT-2021 and algorithm 3 in Ref. Zhao-JSTAT-2011 . The results on 50 random instances generated by binary model RB for $n=40$ are shown in figure 8, which illustrate that the MRBP algorithm can find a solution of the problem with a high probability when approaching the satisfiability phase transition region.

Furthermore, from the perspective of computational cost, we compare the average number of iterations required at each time step for the decimated variable when the BP equations converge. In figure 9, the average number of iterations is shown for $n=60$ at $p=0.15$, where all five algorithms can construct a solution of a random instance with probability 1. The number of iterations required by the MRBP algorithm in the early decimation process is much lower than that of the ABP algorithm with $\gamma=0.5$, the SBP algorithm with $\gamma=0.5$, the VABP algorithm in Ref. Zhao-JSTAT-2021 and algorithm 3 in Ref. Zhao-JSTAT-2011 . After approximately 90% of the variables are fixed, the iterations required by the five algorithms drop drastically. Therefore, compared with other related algorithms, the MRBP algorithm greatly improves the convergence of message passing algorithms.

Finally, we analyze the time complexity of the MRBP algorithm. The typical time complexity of BP algorithms is $O(M)$, where $M$ is the number of edges in a graph instance. In our modification of BP algorithms, there are extra $O(1)$ message updating steps after each iteration of messages on all edges. Thus the running time scales as $O(n^{2}\ln n)$, in which $n^{2}\ln n$ is simply the number of edges in model RB here. In figure 10, we show the running time of the MRBP algorithm as a function of $p$ for $n=20$. The running time increases sharply when $p$ is close to the satisfiability threshold, a common phenomenon for search algorithms approaching thresholds in CSPs.

Summing up the above results, for model RB, the MRBP algorithm greatly improves the convergence of the BP iterative equations and shows significantly higher probabilities of finding solutions when approaching the satisfiability threshold, with a large reduction in the number of iterations and thus in computational cost.

V Conclusion

In this paper, we propose an improved message passing algorithm based on residuals of messages to solve CSPs. The residual of a message quantifies its fluctuation, viewed as a time series, between two consecutive time steps. As a modification of the usual message updating process, messages with the maximal residuals are given higher priority in the updating process. We test our MRBP algorithm on model RB, a random CSP with growing domains, whose exact satisfiability threshold is known from previous analytical study. Numerical results show that our algorithm outperforms the BP algorithm, with better convergence of the message updating, a higher probability of constructing a solution for the problem, and a lower computational cost.

To explore the potential implications of residuals for optimization problems and message passing algorithms, several lines of research can be carried out as future work. The first is to combine our algorithm with a detailed description of the landscape of hard optimization problems to develop better problem-solving algorithms. The second is to extend our algorithm to combinatorial optimization problems and CSPs with fixed domains, such as the minimum vertex cover problem and the $k$-SAT problem, which have a more complex and yet more refined picture of the solution space structure of ground-state configurations. At the same time, the idea of residuals of messages can be introduced into message passing algorithms in contexts other than sparse graphs, such as approximate message passing algorithms on dense graphs, cluster variational message passing algorithms on lattice structures, and so on.

Acknowledgements

J.-H. Zhao is supported by Guangdong Major Project of Basic and Applied Basic Research No. 2020B0301030008, Science and Technology Program of Guangzhou No. 2019050001, the Chinese Academy of Sciences Grant QYZDJ-SSW-SYS018, and the National Natural Science Foundation of China (Grant No. 12171479). C.-Y. Zhao is supported by the National Natural Science Foundation of China (Grant Nos. 11301339 and 11491240108).

References

  • (1) Newman M E J 2012 Resource Letter CS-1: Complex Systems Am. J. Phys. 79 800-10
  • (2) Koller D and Friedman N 2009 Probabilistic Graphical Models: Principles and Techniques (Cambridge: The MIT Press)
  • (3) Richardson T and Urbanke R 2008 Modern Coding Theory (Cambridge: Cambridge University Press)
  • (4) Mézard M and Montanari A 2009 Information, Physics, and Computation (Oxford: Oxford University Press)
  • (5) Garey M and Johnson D S 1979 Computers and Intractability: A Guide to the Theory of NP-Completeness (San Francisco: W H Freeman & Co)
  • (6) Zhou H-J 2015 Spin Glasses and Message Passing (Beijing: Scientific Press)
  • (7) Zdeborová L and Krzakala F 2016 Statistical physics of inference: thresholds and algorithms Adv. Phys. 65 453-552
  • (8) Braunstein A and Zecchina R 2006 Learning by Message Passing in Networks of Discrete Synapses Phys. Rev. Lett. 96 030201
  • (9) Mézard M 2017 Mean-field message-passing equations in the Hopfield model and its generalizations Phys. Rev. E 95 022117
  • (10) Mézard M and Parisi G 2001 The Bethe lattice spin glass revisited Eur. Phys. J. B 20 217-33
  • (11) Mézard M, Parisi G and Zecchina R 2002 Analytic and algorithmic solution of random satisfiability problems Science 297 812-5
  • (12) Kikuchi R 1951 A theory of cooperative phenomena Phys. Rev. 81 988-1003
  • (13) Pelizzola A 2005 Cluster variation method in statistical physics and probabilistic graphical models J. Phys. A: Math. Gen. 38 R309-39
  • (14) Krzakala F, Montanari A, Ricci-Tersenghi F, Semerjian G and Zdeborová L 2007 Gibbs states and the set of solutions of random constraint satisfaction problems Proc. Natl. Acad. Sci. USA 104 10318-23
  • (15) Montanari A, Ricci-Tersenghi F and Semerjian G 2008 Clusters of solutions and replica symmetry breaking in random $k$-satisfiability J. Stat. Mech. 2008 P04004
  • (16) Weigt M and Hartmann A K 2000 Number of Guards Needed by a Museum: A Phase Transition in Vertex Covering of Random Graphs Phys. Rev. Lett. 84 6118-21
  • (17) Weigt M and Zhou H-J 2006 Message passing for vertex covers Phys. Rev. E 74 046110
  • (18) Zdeborová L and Krzakala F 2007 Phase transitions in the coloring of random graphs Phys. Rev. E 76 031131
  • (19) Xu K and Li W 2000 Exact Phase Transitions in Random Constraint Satisfaction Problems J. Artif. Intell. Res. 12 93-103
  • (20) Smith B M and Dyer M E 1996 Locating the phase transition in binary constraint satisfaction problems Artif. Intell. 81 155-81
  • (21) Gent I P, Macintyre E, Prosser P, Smith B M and Walsh T 2001 Random constraint satisfaction: Flaws and structure Constraints 6 345-72
  • (22) Prosser P 1996 An empirical study of phase transitions in binary constraint satisfaction problems Artif. Intell. 81 81-109
  • (23) Achlioptas D, Molloy M S O, Kirousis L M, Stamatiou Y C, Kranakis E and Krizanc D 2001 Random constraint satisfaction: A more accurate picture Constraints 6 329-44
  • (24) Molloy M 2003 Models for random constraint satisfaction problems SIAM J. Comput. 32 935-49
  • (25) Frieze A and Molloy M 2006 The satisfiability threshold for randomly generated binary constraint satisfaction problems Random Struct. Algor. 28 323-39
  • (26) Smith B M 2001 Constructing an asymptotic phase transition in random binary constraint satisfaction problems Theor. Comput. Sci. 265 265-83
  • (27) Gao Y and Culberson J 2007 Consistency and random constraint satisfaction models J. Artif. Intell. Res. 28 517-57
  • (28) Achlioptas D, Naor A and Peres Y 2005 Rigorous location of phase transitions in hard optimization problems Nature 435 759-64
  • (29) Mertens S, Mézard M and Zecchina R 2006 Threshold values of random $K$-SAT from the cavity method Random Struct. Algor. 28 340-73
  • (30) Xu K and Li W 2006 Many hard examples in exact phase transitions Theor. Comput. Sci. 355 291-302
  • (31) Xu K, Boussemart F, Hemery F and Lecoutre C 2007 Random constraint satisfaction: Easy generation of hard (satisfiable) instances Artif. Intell. 171 514-34
  • (32) Cai S-W, Su K-L and Sattar A 2011 Local search with edge weighting and configuration checking heuristics for minimum vertex cover Artif. Intell. 175 1672-96
  • (33) Fan Y and Shen J 2012 A general model and thresholds for random constraint satisfaction problems Artif. Intell. 193 1-17
  • (34) Shen J and Ren Y-F 2016 Bounding the scaling window of random constraint satisfaction problems J. Comb. Optim. 31 786-801
  • (35) Zhou G-Y, Gao Z-S and Liu J 2015 On the constraint length of random $k$-CSP J. Comb. Optim. 30 188-200
  • (36) Zhao C-Y and Zheng Z-M 2011 Threshold behaviors of a random constraint satisfaction problem with exact phase transitions Inform. Process. Lett. 111 985-8
  • (37) Martin O C, Monasson R and Zecchina R 2001 Statistical mechanics methods and phase transitions in optimization problems Theor. Comput. Sci. 265 3-67
  • (38) Marino R, Parisi G and Ricci-Tersenghi F 2016 The backtracking survey propagation algorithm for solving random K-SAT problems Nat. Commun. 7 12996
  • (39) Zhao C-Y, Zhou H-J, Zheng Z-M and Xu K 2011 A message-passing approach to random constraint satisfaction problems with growing domains J. Stat. Mech. 2011 P02019
  • (40) Zhao C-Y, Zhang P, Zheng Z-M and Xu K 2012 Analytical and belief-propagation studies of random constraint satisfaction problems with growing domains Phys. Rev. E 85 016106
  • (41) Zhao C-Y and Fu Y-R 2021 Belief propagation guided decimation algorithms for random constraint satisfaction problems with growing domains J. Stat. Mech. 2021 033408