This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

\hideLIPIcs

Graduate School of Information Sciences, Tohoku University, [email protected]://orcid.org/0000-0002-9912-6898Partially supported by JSPS KAKENHI Grant Numbers JP19K11814 and JP20H05793, Japan.Kyoto University, [email protected] Kyoto University, Japan and https://www.lab2.kuis.kyoto-u.ac.jp/minato/[email protected]://orcid.org/0000-0002-1397-1020 Nagoya University, Japan and https://www.math.mi.i.nagoya-u.ac.jp/~otachi/ [email protected]://orcid.org/0000-0002-0087-853X JSPS KAKENHI Grant Numbers JP18K11168, JP18K11169, JP20H05793, JP21K11752. Kyushu Institute of Technology, [email protected] Graduate School of Information Sciences, Tohoku University, Japan and http://www.ecei.tohoku.ac.jp/alg/suzuki/[email protected]://orcid.org/0000-0002-5212-0202Partially supported by JSPS KAKENHI Grant Numbers JP20K11666 and JP20H05794, Japan.Japan Advanced Institute of Science and Technology, Japan and http://www.jaist.ac.jp/~uehara[email protected]://orcid.org/0000-0003-0895-3765JSPS KAKENHI Grant Numbers JP20H05961, JP20H05964, JP20K11673. National Institute of Informatics, [email protected] Iwate University, Japan and http://www.kono.cis.iwate-u.ac.jp/~yamanaka/[email protected]://orcid.org/0000-0002-4333-8680Partially supported by JSPS KAKENHI Grant Numbers 19K11812, Japan.Graduate School of Information Sciences, Tohoku University, [email protected]://orcid.org/0000-0002-5175-465X \CopyrightTakehiro Ito, Jun Kawahara, Shin-ichi Minato, Yota Otachi, Toshiki Saitoh, Akira Suzuki, Ryuhei Uehara, Takeaki Uno, Katsuhisa Yamanaka, and Ryo Yoshinaka \ccsdesc[100]Mathematics of computing \rightarrow Discrete mathematics \rightarrow Combinatorics \rightarrow Combinatorial algorithms \fundingThis research is partially supported by JSPS Kakenhi Grant Number 18H04091. \EventEditorsJohn Q. Open and Joan R. Access \EventNoEds2 \EventLongTitle42nd Conference on Very Important Topics (CVIT 2016) \EventShortTitleCVIT 2016 \EventAcronymCVIT \EventYear2016 \EventDateDecember 24–27, 2016 \EventLocationLittle Whinging, United Kingdom \EventLogo \SeriesVolume42 \ArticleNo23

Sorting Balls and Water: Equivalence and Computational Complexity

Takehiro Ito    Jun Kawahara    Shin-ichi Minato    Yota Otachi    Toshiki Saitoh    Akira Suzuki    Ryuhei Uehara    Takeaki Uno    Katsuhisa Yamanaka    Ryo Yoshinaka
Abstract

Various forms of sorting problems have been studied over the years. Recently, two kinds of sorting puzzle apps are popularized. In these puzzles, we are given a set of bins filled with colored units, balls or water, and some empty bins. These puzzles allow us to move colored units from a bin to another when the colors involved match in some way or the target bin is empty. The goal of these puzzles is to sort all the color units in order. We investigate computational complexities of these puzzles. We first show that these two puzzles are essentially the same from the viewpoint of solvability. That is, an instance is sortable by ball-moves if and only if it is sortable by water-moves. We also show that every yes-instance has a solution of polynomial length, which implies that these puzzles belong to NP. We then show that these puzzles are NP-complete. For some special cases, we give polynomial-time algorithms. We finally consider the number of empty bins sufficient for making all instances solvable and give non-trivial upper and lower bounds in terms of the number of filled bins and the capacity of bins.

keywords:
Ball sort puzzle, recreational mathematics, sorting pairs in bins, water sort puzzle
category:
\relatedversion

1 Introduction

The ball sort puzzle and the water sort puzzle are popularized recently via smartphone apps.111These sort puzzles are released by several different developers. As far as the authors checked, it seems that the first one is released by IEC Global Pty Ltd in January, 2020. Both puzzles involve bins filled with some colored units (balls or water), and the goal is to somehow sort them. The most significant feature of these puzzles is that each bin works as a stack. That is, the items in a bin have to follow the “last-in first-out” (LIFO) rule.

Refer to caption
Figure 1: An example of the ball sort puzzle.

In the ball sort puzzle, we are given hnhn colored balls in nn bins of capacity hh and kk additional empty bins. For a given (unsorted) initial configuration, the goal of this puzzle is to arrange the balls in order; that is, to make each bin either empty or full with balls of the same color. (The ordering of bins does not matter in this puzzle.) The rule of this puzzle is simple: (0) Each bin works like a stack, that is, we can pick up the top ball in the bin. (1) We can move the top ball of a bin to the top of another bin if the second bin is empty or it is not full and its top ball before the move and the moved ball have the same color. An example with h=3h=3, n=4n=4, and k=1k=1 is given in Figure 1.

Refer to caption
Figure 2: Rules of the water sort puzzle. From the initial configuration (0), we obtain configuration (1) by moving two units of water of label 11 from a to f, configuration (2) by moving two units of water of label 11 from a to b, and configuration (3) by moving one unit of water of label 22 from c to e.

The water sort puzzle is similar to the ball sort puzzle. Each ball is replaced by colored water of a unit volume in the water sort puzzle. In the water sort puzzle, the rules (0) and (1) are the same as the ball sort puzzle except one liquid property: Colored water units are merged when they have the same color and they are consecutive in a bin. When we pick up a source bin and move the top water unit(s) to a target bin, the quantity of the colored water on the top of the bin to be moved varies according to the following conditions (Figure 2). If the target bin has enough margin, all the water of the same color moves to the target bin. On the other hand, a part of the water of the same color moves up to the limit of the target bin if the target bin does not have enough margin.

In this paper, we investigate computational complexities of the ball and water sort puzzles. We are given n+kn+k bins of capacity hh. In an initial configuration, nn bins are full (i.e., filled with hh units) and kk bins are empty, where each unit has a color in the color set CC. Our task is to fill nn bins monochromatically, that is, we need to fill each of them with hh units of the same color. We define 𝖡𝖲𝖯\mathsf{BSP} and 𝖶𝖲𝖯\mathsf{WSP} as the problems of deciding whether a given initial configuration can be reconfigured to a sorted configuration by a sequence of ball moves and water moves, respectively. We assume that instances are encoded in a reasonable way, which takes Θ(nhlog|C|+logk)\Theta(nh\log|C|+\log k) bits. Without loss of generality, we assume that each color cCc\in C occurs exactly hjhj times for some positive integer jj in any instance since otherwise the instance is a trivial no-instance. (This implies that |C||C| is at most nn.)

1.1 Our results

We first prove that the problems 𝖡𝖲𝖯\mathsf{BSP} and 𝖶𝖲𝖯\mathsf{WSP} are actually equivalent. By their definitions, a yes-instance of 𝖶𝖲𝖯\mathsf{WSP} is a yes-instance of 𝖡𝖲𝖯\mathsf{BSP} as well. Thus our technical contribution here is to prove the opposite direction. As a byproduct, we show that a yes-instance of the problems admits a reconfiguration sequence of length polynomial in n+hn+h as a yes-certificate. This implies that the problems belong to NP. We also show that 𝖡𝖲𝖯\mathsf{BSP} and 𝖶𝖲𝖯\mathsf{WSP} are solvable in time O(hn)O(h^{n}).

We next show that 𝖡𝖲𝖯\mathsf{BSP} and 𝖶𝖲𝖯\mathsf{WSP} are indeed NP-complete. By slightly modifying this proof, we also show that even for some kind of trivial yes-instances of 𝖶𝖲𝖯\mathsf{WSP}, it is NP-complete to find a shortest reconfiguration to a sorted configuration.

We show that if the capacity hh is 22 and the number of colors is nn, then 𝖡𝖲𝖯\mathsf{BSP} and 𝖶𝖲𝖯\mathsf{WSP} are polynomial-time solvable. In this case, we present an algorithm that finds shortest reconfiguration sequences for yes-instances.

We also consider the following question: how many empty bins do we need to ensure that all initial configurations can reach a sorted configuration? Observe that 𝖡𝖲𝖯\mathsf{BSP} and 𝖶𝖲𝖯\mathsf{WSP} are trivial if khnk\geq hn. It is not difficult to see that knk\geq n is also sufficient by using an idea based on bucket sort. By improving this idea, we show that kh1hnk\geq\lceil\frac{h-1}{h}n\rceil is sufficient for all instances. We also show that some instances need k1964min{n,h}k\geq\lfloor\frac{19}{64}\min\{n,h\}\rfloor empty bins.

1.2 Related studies

Various forms of sorting problems have been studied over the years. For example, sorting by reversals is well investigated in the contexts of sorting network [6, Sect. 5.2.2], pancake sort [2], and ladder lotteries [8]. There is another extension of sorting problem from the viewpoint of recreational mathematics defined as follows. For each i{1,,n}i\in\{1,\dots,n\}, there are hh balls labeled ii. These hnhn balls are given in nn bins in which each bin is of capacity hh. In one step, we are allowed to swap any pair of balls. Then how many swaps do we need to sort the balls? This sorting problem was first posed by Peter Winker for the case h=2h=2 in 2004 [7], and an almost optimal algorithm for that case was given by Ito et al. in 2012 [4].

The ball/water sorting puzzles are interesting also from the viewpoints of not only just puzzles but also combinatorial reconfiguration. Recently, the reconfiguration problems attracted the attention in theoretical computer science (see, e.g., [5]). These problems arise when we need to find a step-by-step transformation between two feasible solutions of a problem such that all intermediate results are also feasible and each step abides by a fixed reconfiguration rule, that is, an adjacency relation defined on feasible solutions of the original problem. In this sense, the ball/water sort puzzles can be seen as typical implementations of the framework of reconfiguration problems, while their reconfiguration rules are non-standard. In most reconfiguration problems, representative reconfiguration rules are reversible; that is, if a feasible solution AA can be reconfigured to another feasible solution BB, then BB can be reconfigured to AA as well (see e.g., the puzzles in [3]). In the ball sort puzzle, we can reverse the last move if we have two or more balls of the same color at the top of the source bin, or there is only one ball in the source bin. Otherwise, we cannot reverse the last move. That is, some moves are reversible while some moves are one-way in these puzzles. (In Figure 1, only the last move is reversible.) In the water sort puzzle, basically, we cannot separate units of water of the same color once we merge them, however, there are some exceptional cases. An example is given in Figure 2; the move from configuration (2) to configuration (3) is reversible, but the other moves are not.

2 Equivalence of balls and water

This section presents fundamental properties of 𝖡𝖲𝖯\mathsf{BSP} and 𝖶𝖲𝖯\mathsf{WSP}, from which we will conclude that

  • a configuration is a yes-instance of 𝖡𝖲𝖯\mathsf{BSP} if and only if it is a yes-instance of 𝖶𝖲𝖯\mathsf{WSP}, and

  • 𝖡𝖲𝖯\mathsf{BSP} and 𝖶𝖲𝖯\mathsf{WSP} both belong to NP\mathrm{NP}.

2.1 Notation used in this section

Let BB and CC be finite sets of bins and colors, respectively. The set of sequences of colors of length at most hh is denoted by ChC^{\leq h}. A sequence of cCc\in C of length ll is called monochrome and denoted by clc^{l}. The empty sequence is denoted as ε\varepsilon, which is, in our terminology, also called monochrome. A configuration is a map S:BChS\colon B\to C^{\leq h}. An instance of 𝖡𝖲𝖯\mathsf{BSP} and 𝖶𝖲𝖯\mathsf{WSP} is a configuration SS such that |S(b)|{h,0}|S(b)|\in\{h,0\} for all bBb\in B. A bin bb is sorted under SS if S(b){ε,ch}S(b)\in\{\varepsilon,c^{h}\} for some cCc\in C. A configuration SS is sorted or goal if all bins are sorted under SS. Sometimes we identify bBb\in B with S(b)S(b) when SS is clear from the context.

The iith element of a sequence α\alpha is denoted by α[i]\alpha[i] for 1i|α|1\leq i\leq|\alpha|. The top color of a color sequence α\alpha is TC(α)=α[|α|]\mathrm{TC}(\alpha)=\alpha[|\alpha|] if α\alpha is not empty. In case α\alpha is empty, TC(α)\mathrm{TC}(\alpha) is defined to be ε\varepsilon. A border in α\alpha is a non-negative integer ii such that either 1i<|α|1\leq i<|\alpha| and α[i]α[i+1]\alpha[{i}]\neq\alpha[{i+1}] or i=0i=0, and the set of borders of α\alpha is denoted by 𝒟(α)\mathcal{D}(\alpha). We count non-trivial borders under SS as D(S)=bB|𝒟(S(b)){0}|D(S)=\sum_{b\in B}|\mathcal{D}(S(b))\setminus\{0\}|. For example, in Figure 2, the values of DD are 4,3,3,34,3,3,3 in (0), (1), (2), (3), respectively.

We now turn to define a move of the sort puzzles with the terminologies introduced above. Intuitively, we pick up two bins b1b_{1} and b2b_{2} from BB, and pour the top item(s) from b1b_{1} to b2b_{2} if b2b_{2} is empty or the top item of b2b_{2} has the same color. The quantity of the items are different in the cases of balls and water. In the ball case, a unit is moved if possible. On the other hand, two or more units of water of the same color move at once until all units are moved or b2b_{2} becomes full. We define the move more precisely below.

A ball-move can modify a configuration SS into SS^{\prime}, denoted as SSS\rightarrow S^{\prime}, if there are b1,b2Bb_{1},b_{2}\in B and cCc\in C such that

  • S(b1)=S(b1)cS(b_{1})=S^{\prime}(b_{1})\cdot c, TC(S(b2)){c,ε}\mathrm{TC}(S(b_{2}))\in\{c,\varepsilon\}, S(b2)=S(b2)cS^{\prime}(b_{2})=S(b_{2})\cdot c, and

  • S(b)=S(b)S(b)=S^{\prime}(b) for all bB{b1,b2}b\in B\setminus\{b_{1},b_{2}\}.

A water-move can modify SS into SS^{\prime}, denoted as SSS\Rightarrow S^{\prime}, if there are m1m\geq 1, b1,b2Bb_{1},b_{2}\in B and cCc\in C such that

  • S(b1)=S(b1)cmS(b_{1})=S^{\prime}(b_{1})\cdot c^{m}, TC(S(b2)){c,ε}\mathrm{TC}(S(b_{2}))\in\{c,\varepsilon\}, and S(b2)=S(b2)cmS^{\prime}(b_{2})=S(b_{2})\cdot c^{m},

  • either TC(S(b1))c\mathrm{TC}(S^{\prime}(b_{1}))\neq c or |S(b2)|=h\left|S^{\prime}(b_{2})\right|=h, and

  • S(b)=S(b)S(b)=S^{\prime}(b) for all bB{b1,b2}b\in B\setminus\{b_{1},b_{2}\}.

The reflexive and transitive closures of \rightarrow and \Rightarrow are denoted as \rightarrow^{*} and \Rightarrow^{*}, respectively. Clearly any single water-move can be simulated by some number of ball-moves. Thus we have {\Rightarrow^{*}}\subseteq{\rightarrow^{*}}.

2.2 Fundamental Theorems of 𝖡𝖲𝖯\mathsf{BSP} and 𝖶𝖲𝖯\mathsf{WSP}

Hereafter we fix an arbitrary initial configuration S0S_{0}. Some notions introduced below are relative to S0S_{0}, though it may not be explicit. The top-border table of a configuration SS is a map TBS:B{0,1,,h1}\mathrm{TB}_{S}\colon B\to\{0,1,\dots,h-1\} defined as TBS(b)=max𝒟(S(b))\mathrm{TB}_{S}(b)=\max\mathcal{D}(S(b)).

The main object of this section is to observe that top-border tables play an essential role in analyzing the solvability of an instance. Note that TBS(b)=0\mathrm{TB}_{S}(b)=0 if and only if bb is monochrome. Once we have reached a configuration with TBS(b)=0\mathrm{TB}_{S}(b)=0 for all bBb\in B, we can achieve a goal configuration by trivial moves. Figure 3 illustrates some notions introduced in this section.

By the definition of moves, one can easily observe the following properties. {observation} If STS\rightarrow^{*}T, the following holds for all bins bBb\in B:

  1. 1.

    Moves monotonically remove borders from top to bottom:

    𝒟(T(b))={i𝒟(S(b))iTBT(b)};\mathcal{D}(T(b))=\{\,i\in\mathcal{D}(S(b))\mid i\leq\mathrm{TB}_{T}(b)\,\}\,;
  2. 2.

    The contents below the top borders of TT do not change:

    T(b)[i]=S(b)[i] for all iTBT(b);T(b)[i]=S(b)[i]\text{ for all }i\leq\mathrm{TB}_{T}(b)\,;
  3. 3.

    If bb is not monochrome in TT, the colors of all the units above the border in TT are the color of the unit just above that border in SS:

    TC(T(b))=S(b)[TBT(b)+1] if TBT(b)0.\mathrm{TC}(T(b))=S(b)[\mathrm{TB}_{T}(b)+1]\text{ if\/ $\mathrm{TB}_{T}(b)\neq 0$.}
S0S_{0}
SS
SS^{\prime}
Figure 3: The configuration SS can be obtained from the initial configuration S0S_{0} and can be modified into the tight one SS^{\prime}. The top-borders TBS\mathrm{TB}_{S} of SS are shown with thick lines. This figure does not care the units below those borders. All of S0S_{0}, SS, and SS^{\prime} have red(TBS)=9\mathcal{F}_{\textrm{red}}(\mathrm{TB}_{S})=9 red units, green(TBS)=7\mathcal{F}_{\textrm{green}}(\mathrm{TB}_{S})=7 green units, and blue(TBS)=3\mathcal{F}_{\textrm{blue}}(\mathrm{TB}_{S})=3 blue units above the borders in total. The colors red, green, and blue require red(TBS)=0\mathcal{M}_{\textrm{red}}(\mathrm{TB}_{S})=0, green(TBS)=1\mathcal{M}_{\textrm{green}}(\mathrm{TB}_{S})=1, and blue(TBS)=1\mathcal{M}_{\textrm{blue}}(\mathrm{TB}_{S})=1 monochrome bins, respectively, which coincide with the number of the monochrome bins of respective colors in SS^{\prime}.

We first give a necessary condition for S0SS_{0}\rightarrow^{*}S.

Recall that throughout the game play, the total number of units of each color does not change, i.e., if S0SS_{0}\rightarrow^{*}S, it holds for every color cCc\in C,

bB|{i1ih and S0(b)[i]=c}|=bB|{i1i|S(b)| and S(b)[i]=c}|.\sum_{b\in B}|\{\,i\mid 1\leq i\leq h\text{ and }S_{0}(b)[i]=c\,\}|=\sum_{b\in B}|\{\,i\mid 1\leq i\leq|S(b)|\text{ and }S(b)[i]=c\,\}|\,.

Since the units under the top border of each bin have not been moved (Observation 2.2-2), the total numbers of units of each color cc above the borders of TBS\mathrm{TB}_{S} coincide in S0S_{0} and SS. We count those units of color cc as

c(TBS)=bB|{iTBS(b)<ih and S0(b)[i]=c}|.\mathcal{F}_{c}(\mathrm{TB}_{S})=\sum_{b\in B}|\{\,i\mid\mathrm{TB}_{S}(b)<i\leq h\text{ and }S_{0}(b)[i]=c\,\}|\,.

Those units may have been moved, but units of color cc can go only to empty bins or bins whose top color is cc in SS. Let us partition the bins into |C|+1|C|+1 groups with respect to SS, where

  • Bε(TBS)={bBTBS(b)=0}B_{\varepsilon}(\mathrm{TB}_{S})=\{\,b\in B\mid\mathrm{TB}_{S}(b)=0\,\}: monochrome bins,

  • Bc(TBS)={bBBεS0(b)[TBS(b)+1]=c}B_{c}(\mathrm{TB}_{S})=\{\,b\in B\setminus B_{\varepsilon}\mid S_{0}(b)[\mathrm{TB}_{S}(b)+1]=c\,\}: non-monochrome bins with the top color cc.

Note that if bBε(TBS)b\notin B_{\varepsilon}(\mathrm{TB}_{S}), the top color of bb in SS is TC(S(b))=S(b)[TBS(b)+1]=S0(b)[TBS(b)+1]\mathrm{TC}(S(b))=S(b)[\mathrm{TB}_{S}(b)+1]=S_{0}(b)[\mathrm{TB}_{S}(b)+1]. Since each bin bBcb\in B_{c} may have at most hTBS(b)h-\mathrm{TB}_{S}(b) units of color cc above the border TBS(b)\mathrm{TB}_{S}(b), there are

𝒢c(TBS)=bBc(hTBS(b))\mathcal{G}_{c}(\mathrm{TB}_{S})=\sum_{b\in B_{c}}(h-\mathrm{TB}_{S}(b))

units of color cc can be on the top layer of non-monochrome bins. Thus, we must have at least

c(TBS)=max{0,c(TBS)𝒢c(TBS)h}\mathcal{M}_{c}(\mathrm{TB}_{S})=\max\left\{0,\,\left\lceil\frac{\mathcal{F}_{c}(\mathrm{TB}_{S})-\mathcal{G}_{c}(\mathrm{TB}_{S})}{h}\right\rceil\right\}

monochrome bins in BεB_{\varepsilon} which have some units of color cc. Therefore, it is necessary that

cCc(TBS)|Bε(TBS)|.\sum_{c\in C}\mathcal{M}_{c}(\mathrm{TB}_{S})\leq|B_{\varepsilon}(\mathrm{TB}_{S})|\,. (1)

We remark that the values of the functions introduced above, namely c\mathcal{F}_{c}, BcB_{c}, c\mathcal{M}_{c}, are all determined by the top-border table TBS\mathrm{TB}_{S}. Those values coincide for SS and SS^{\prime} if TBS=TBS\mathrm{TB}_{S}=\mathrm{TB}_{S^{\prime}}. Let us say that SS is consistent with S0S_{0} if and only if TBS\mathrm{TB}_{S} satisfies Eq. (1). Moreover, SS is tight if SS has exactly c(TBS)\mathcal{M}_{c}(\mathrm{TB}_{S}) monochrome bins with at least one unit of every color cCc\in C.

Lemma 2.1.

Suppose SS is consistent with S0S_{0}. There is a tight configuration SS^{\prime} such that TBS=TBS\mathrm{TB}_{S}=\mathrm{TB}_{S^{\prime}} and SSS\Rightarrow^{*}S^{\prime}.

Proof 2.2.

Recall that it is necessary that SS has at least c(TBS)\mathcal{M}_{c}(\mathrm{TB}_{S}) monochrome bins of each color cc. If SS has more than that, one can move all the units of any of the monochrome bins of color cc to other non-empty bins. The definition of c\mathcal{M}_{c} guarantees that we can repeat this until we have just c(TBS)\mathcal{M}_{c}(\mathrm{TB}_{S}) monochrome bins of color cc.

Of course TBS\mathrm{TB}_{S} being consistent with S0S_{0} does not imply S0SS_{0}\rightarrow^{*}S. Suppose S0SSS_{0}\rightarrow^{*}S\rightarrow S^{\prime} with TBSTBS\mathrm{TB}_{S}\neq\mathrm{TB}_{S^{\prime}}, where SS^{\prime} has one less border than SS by moving a unit above the top border of some bin bb to some other bins. Note that S(b)S(b) is not monochrome. During the move, the bin bb can keep units below its top border only. Therefore, the number of necessary monochrome bins of each color cc for this move is

cb(TBS)=max{0,c(TBS)(𝒢c(TBS)(hTBS(b)))h}\mathcal{M}_{c}^{b}(\mathrm{TB}_{S})=\max\left\{0,\,\left\lceil\frac{\mathcal{F}_{c}(\mathrm{TB}_{S})-(\mathcal{G}_{c}(\mathrm{TB}_{S})-(h-\mathrm{TB}_{S}(b)))}{h}\right\rceil\right\}

and it is necessary that

cCcb(TBS)|Bε(TBS)|.\sum_{c\in C}\mathcal{M}_{c}^{b}(\mathrm{TB}_{S})\leq|B_{\varepsilon}(\mathrm{TB}_{S})|\,.

Those conditions depend on top-border configuration of SS. For two top-border tables τ\tau and τ\tau^{\prime} and a bin bBb\in B, we write τbτ\tau\stackrel{{\scriptstyle b}}{{\Rrightarrow}}\tau^{\prime} if

  • τ(a)=τ(a)\tau(a)=\tau^{\prime}(a) for all aB{b}a\in B\setminus\{b\},

  • τ(b)>0\tau(b)>0 and τ(b)=max{i𝒟(S0(b))0i<τ(b)}\tau^{\prime}(b)=\max\{\,i\in\mathcal{D}(S_{0}(b))\mid 0\leq i<\tau(b)\,\},

  • cCcb(τ)|Bε(τ)|\sum_{c\in C}\mathcal{M}_{c}^{b}(\tau)\leq|B_{\varepsilon}(\tau)|.

We write ττ\tau\stackrel{{\scriptstyle}}{{\Rrightarrow}}\tau^{\prime} if τbτ\tau\stackrel{{\scriptstyle b}}{{\Rrightarrow}}\tau^{\prime} for some bBb\in B. Figure 4 may help intuitive understanding the above argument.

Theorem 2.3.

Suppose a configuration SS is consistent with S0S_{0}. Then, there is a configuration TT such that STS\Rightarrow^{*}T if and only if TBSTBT\mathrm{TB}_{S}\stackrel{{\scriptstyle}}{{\Rrightarrow}}^{*}\mathrm{TB}_{T}, where \stackrel{{\scriptstyle}}{{\Rrightarrow}}^{*} is the reflexive and transitive closure of \stackrel{{\scriptstyle}}{{\Rrightarrow}}.

Proof 2.4.

Suppose STS\Rightarrow T and TBSTBT\mathrm{TB}_{S}\neq\mathrm{TB}_{T}. Then TBS\mathrm{TB}_{S} and TBT\mathrm{TB}_{T} must satisfy the condition for TBSTBT\mathrm{TB}_{S}\stackrel{{\scriptstyle}}{{\Rrightarrow}}\mathrm{TB}_{T}.

Suppose TBSTBT\mathrm{TB}_{S}\stackrel{{\scriptstyle}}{{\Rrightarrow}}\mathrm{TB}_{T}. We assume without loss of generality that SS is tight by Lemma 2.1. The condition cCcb(τ)|Bε(τ)|\sum_{c\in C}\mathcal{M}_{c}^{b}(\tau)\leq|B_{\varepsilon}(\tau)| implies that units of color c=S0(b)[TBS(b)+1]c=S_{0}(b)[\mathrm{TB}_{S}(b)+1] in concern can be distributed to bins other than bb. If cb(TBS)=c(TBS)\mathcal{M}_{c}^{b}(\mathrm{TB}_{S})=\mathcal{M}_{c}(\mathrm{TB}_{S}), one can move the units in bb above TBT(b)\mathrm{TB}_{T}(b) to other bins whose top color is cc. If cb(TBS)>c(TBS)\mathcal{M}_{c}^{b}(\mathrm{TB}_{S})>\mathcal{M}_{c}(\mathrm{TB}_{S}), one can move those units to an empty bin, which must be available in SS, since SS is tight.

b1b_{1}b2b_{2}b3b_{3}b4b_{4}S1S_{1}
b1b_{1}b2b_{2}b3b_{3}b4b_{4}S2S_{2}
b1b_{1}b2b_{2}b3b_{3}b4b_{4}S3S_{3}
Figure 4: All configurations S1S_{1}, S2S_{2} and S3S_{3} are tight and have the same top-border table τ=TBS1=TBS2=TBS3\tau=\mathrm{TB}_{S_{1}}=\mathrm{TB}_{S_{2}}=\mathrm{TB}_{S_{3}}, where τ(b1)=7\tau(b_{1})=7, τ(b2)=5\tau(b_{2})=5, and τ(b3)=4\tau(b_{3})=4. In those configurations, one can move the units above the top border of any of b1b_{1} or b2b_{2}, but it is impossible for b3b_{3}. This solely depends on the top-border table τ\tau and the total number red(τ)=16\mathcal{F}_{\textrm{red}}(\tau)=16 of red units above the top borders. It is independent of how those red units are distributed over the bins in the current configuration.
Corollary 2.5.

An initial configuration S0S_{0} is a yes-instance of 𝖡𝖲𝖯\mathsf{BSP} if and only if it is a yes-instance of 𝖶𝖲𝖯\mathsf{WSP}.

Proof 2.6.

Since any water-move can be simulated by some ball-moves, it is enough to show the converse. We claim that, for any configurations SS and SS^{\prime} such that TBS=TBS\mathrm{TB}_{S}=\mathrm{TB}_{S^{\prime}}, if STS\rightarrow T, then there is TT^{\prime} such that STS^{\prime}\Rightarrow^{*}T^{\prime} and TBT=TBT\mathrm{TB}_{T}=\mathrm{TB}_{T^{\prime}}.

By Theorem 2.3, either TBS=TBT\mathrm{TB}_{S}=\mathrm{TB}_{T} or TBSTBT\mathrm{TB}_{S}\stackrel{{\scriptstyle}}{{\Rrightarrow}}\mathrm{TB}_{T}. If TBS=TBT\mathrm{TB}_{S}=\mathrm{TB}_{T}, then T=ST^{\prime}=S^{\prime} proves the claim. Otherwise, the claim immediately follows from Theorem 2.3. Thus, if S0TS_{0}\rightarrow^{*}T and TT is a goal configuration, then one can have S0TS_{0}\Rightarrow^{*}T^{\prime} for some TT^{\prime} with TBT=TBT\mathrm{TB}_{T}=\mathrm{TB}_{T^{\prime}}. By modifying TT^{\prime} tight by Lemma 2.1, we get a goal configuration.

If τbτ\tau\stackrel{{\scriptstyle b}}{{\Rrightarrow}}\tau^{\prime}, from the values c(τ)\mathcal{F}_{c}(\tau) and 𝒢c(τ)\mathcal{G}_{c}(\tau), one can compute c(τ)\mathcal{F}_{c}(\tau^{\prime}) and 𝒢c(τ)\mathcal{G}_{c}(\tau^{\prime}) in constant time, by preprocessing S0S_{0}, using the following equations:

c(τ)\displaystyle\mathcal{F}_{c}(\tau^{\prime}) ={c(τ)+(τ(b)τ(b))if S0(b)[τ(b)+1]=c,c(τ)otherwise,\displaystyle=\begin{cases}\mathcal{F}_{c}(\tau)+(\tau(b)-\tau^{\prime}(b))&\text{if $S_{0}(b)[\tau^{\prime}(b)+1]=c$,}\\ \mathcal{F}_{c}(\tau)&\text{otherwise,}\end{cases}
𝒢c(τ)\displaystyle\mathcal{G}_{c}(\tau^{\prime}) ={𝒢c(τ)(hτ(b))if S0(b)[τ(b)+1]=c,𝒢c(τ)+(hτ(b))if S0(b)[τ(b)+1]=c and τ(b)>0,𝒢c(τ)otherwise.\displaystyle=\begin{cases}\mathcal{G}_{c}(\tau)-(h-\tau(b))&\text{if $S_{0}(b)[\tau(b)+1]=c$,}\\ \mathcal{G}_{c}(\tau)+(h-\tau^{\prime}(b))&\text{if $S_{0}(b)[\tau^{\prime}(b)+1]=c$ and $\tau^{\prime}(b)>0$,}\\ \mathcal{G}_{c}(\tau)&\text{otherwise.}\end{cases}
Corollary 2.7.

𝖡𝖲𝖯\mathsf{BSP} and 𝖶𝖲𝖯\mathsf{WSP} belong to NP\mathrm{NP}.

Proof 2.8.

By Theorem 2.3, S0S_{0} is a yes-instance if and only if there is a bin sequence (b1,,bm)(b_{1},\dots,b_{m}) with m=D(S0)m=D(S_{0}) (=bB|𝒟(S0(b)){0}|=\sum_{b\in B}|\mathcal{D}(S_{0}(b))\setminus\{0\}|) that admits a top-border table sequence (τ0,,τm)(\tau_{0},\dots,\tau_{m}) such that τ0=TBS0\tau_{0}=\mathrm{TB}_{S_{0}} and τi1biτi\tau_{i-1}\stackrel{{\scriptstyle b_{i}}}{{\Rrightarrow}}\tau_{i} for all ii. Each τi\tau_{i} is uniquely determined by τi1\tau_{i-1} and bib_{i} and one can verify τi1biτi\tau_{i-1}\stackrel{{\scriptstyle b_{i}}}{{\Rrightarrow}}\tau_{i} in constant time, by maintaining the values of c\mathcal{F}_{c} and 𝒢c\mathcal{G}_{c} (or just c𝒢c\mathcal{F}_{c}-\mathcal{G}_{c}).

A solution for an instance is essentially an order of borders of the instance to remove.

Corollary 2.9.

𝖡𝖲𝖯\mathsf{BSP} and 𝖶𝖲𝖯\mathsf{WSP} can be solved in O(hn)O(h^{n}) time.

Proof 2.10.

There are at most bB|𝒟(S0(b))|hn\prod_{b\in B}|\mathcal{D}(S_{0}(b))|\leq h^{n} distinct top-border tables.

Corollary 2.11.

For every yes-instance of 𝖡𝖲𝖯\mathsf{BSP}, there exists a sequence of ball-moves to a goal configuration of length at most (2h1)hn(2h-1)hn.

Proof 2.12.

In 𝖡𝖲𝖯\mathsf{BSP}, h1h-1 ball-moves are enough to eliminate a border from a tight configuration, if there is a way to eliminate the border. To make the obtained configuration tight takes at most hh moves, since the obtained configuration may have at most one monochrome bin to empty when the previous configuration is tight. Since there can be at most (h1)n(h-1)n borders to eliminate, every yes-instance has a solution consisting of at most (2h1)hn(2h-1)hn ball-moves.

Corollary 2.13.

For every yes-instance of 𝖶𝖲𝖯\mathsf{WSP}, there exists a sequence of water-moves to a goal configuration of length at most 2(h1)nl2(h-1)nl where l=min{h,n}l=\min\{h,n\}.

Proof 2.14.

In 𝖶𝖲𝖯\mathsf{WSP}, l=min{h,n}l=\min\{h,n\} water-moves are enough to eliminate a border from a tight configuration, if there is a way to eliminate the border. To make the obtained configuration tight, it takes again ll moves. To see this, observe that the obtained configuration may have at most one monochrome bin to empty as in the ball case. If one water-move does not empty the source monochrome bin, it must make the target bin full. Since one cannot make more than nn full bins, to empty a monochrome bin takes at most ll water-moves. Therefore, every yes-instance has a solution consisting of at most 2(h1)nl2(h-1)nl water-moves.

3 NP-completeness

In this section, we show that 𝖡𝖲𝖯\mathsf{BSP} and 𝖶𝖲𝖯\mathsf{WSP} are NP-complete even with two colors. By Corollaries 2.5 and 2.7, it suffices to show that 𝖶𝖲𝖯\mathsf{WSP} with two colors is NP-hard. By slightly modifying the proof, we also show that given a trivial yes-instance of 𝖶𝖲𝖯\mathsf{WSP}, it is NP-complete to find a shortest sequence of water-moves to a sorted configuration, where an instance is trivial in the sense that it contains many (say hnhn) empty bins.

Theorem 3.1.

𝖡𝖲𝖯\mathsf{BSP} and 𝖶𝖲𝖯\mathsf{WSP} are NP-complete even with two colors.

Proof 3.2.

As mentioned above, it suffices to show that 𝖶𝖲𝖯\mathsf{WSP} with two colors is NP-hard. We prove it by a reduction from the following problem 3-Partition, which is known to be NP-complete even if BB is bounded from above by some polynomial in mm [1].

3-Partition

  • Input:

    Positive integers a1,a2,a3,,a3ma_{1},a_{2},a_{3},\dots,a_{3m} such that i=13mai=mB\sum_{i=1}^{3m}a_{i}=mB for some positive integer BB and B/4<ai<B/2B/4<a_{i}<B/2 for 1i3m1\leq i\leq 3m.

  • Question:

    Is there a partition of {1,2,,3m}\{1,2,\dots,3m\} into mm subsets A1,A2,,AmA_{1},A_{2},\dots,A_{m} such that iAjai=B\sum_{i\in A_{j}}a_{i}=B for 1jm1\leq j\leq m?

In the reduction, we use two colors red and blue. A non-empty bin is red (blue) if it contains red (blue, resp.) units only. A non-empty bin is red-blue if its top units are red and the other units are blue. From an instance a1,,a3m\langle a_{1},\dots,a_{3m}\rangle of 3-Partition, we define an instance SS of 𝖶𝖲𝖯\mathsf{WSP} as follows (see Figure 5):

  • the capacity of each bin is BB;

  • for each i[3m]i\in[3m], it contains a full red-blue bin bib_{i} with aia_{i} red units and BaiB-a_{i} blue units;

  • it contains one empty bin.

We show that SS is a yes-instance of 𝖶𝖲𝖯\mathsf{WSP} if and only if a1,,a3m\langle a_{1},\dots,a_{3m}\rangle is a yes-instance of 3-Partition.

Refer to caption
Figure 5: The reduction from 3-Partition to 𝖶𝖲𝖯\mathsf{WSP}.

To show the if direction, assume that a1,,a3m\langle a_{1},\dots,a_{3m}\rangle is a yes-instance of 3-Partition. We can construct a sequence of water-moves from SS to a sorted configuration as follows. Let A1,,AmA_{1},\dots,A_{m} be a partition of {1,,3m}\{1,\dots,3m\} such that iAjai=B\sum_{i\in A_{j}}a_{i}=B for 1jm1\leq j\leq m. Using one empty bin, we can sort bj1,bj2,bj3b_{j_{1}},b_{j_{2}},b_{j_{3}} with Aj={j1,j2,j3}A_{j}=\{j_{1},j_{2},j_{3}\} as follows (see Figure 5). We first move all red units to the empty bin. Since the number of red units is aj1+aj2+aj3=Ba_{j_{1}}+a_{j_{2}}+a_{j_{3}}=B, the bin becomes full. Now bj1b_{j_{1}}, bj2b_{j_{2}}, and bj3b_{j_{3}} are blue bins containing (Baj1)+(Baj2)+(Baj3)=2B(B-a_{j_{1}})+(B-a_{j_{2}})+(B-a_{j_{3}})=2B blue units in total. We move the units in bj3b_{j_{3}} to bj1b_{j_{1}} and bj2b_{j_{2}}. After that, bj1b_{j_{1}} and bj2b_{j_{2}} become full and bj3b_{j_{3}} is empty. Using this new empty bin, we can continue and sort all bins.

To show the only-if direction, we prove the following slightly modified statement. (What we need is the case of ρ=β=0\rho=\beta=0.)

Let ρ\rho and β\beta be non-negative integers, and S0S_{0} be the instance of 𝖶𝖲𝖯\mathsf{WSP} obtained from SS by adding ρ\rho full red bins and β\beta full blue bins. If S0S_{0} is a yes-instance of the decision problem, then a1,,a3m\langle a_{1},\dots,a_{3m}\rangle is a yes-instance of 3-Partition.

Assume that there is a sorting sequence S0,,SS_{0},\dots,S_{\ell}, where SS_{\ell} is a sorted configuration. We use induction on mm. (We need the dummy bins for the induction step.) The case of m=1m=1 is trivial since all instances of 3-Partition with m=1m=1 are yes-instances. Assume that m2m\geq 2 and that the statement holds for strictly smaller instances of 3-Partition.

Observe that S0S_{0} contains 3m+ρ+β+13m+\rho+\beta+1 bins and (3m+ρ+β)B(3m+\rho+\beta)B units of water ((m+ρ)B(m+\rho)B red units and (2m+β)B(2m+\beta)B blue units). Since the capacity of bins is BB, every SiS_{i} contains at most one empty bin, and if there is one, then the other bins are full. Observe also that each bin in each SiS_{i} is either red, blue, red-blue, or empty.

We say that a move opens a bin bib_{i} if the move makes bib_{i} red units free for the first time. Observe that each red unit in b1,,b3mb_{1},\dots,b_{3m} has to move at most once since otherwise non-monochromatic bins will remain. Thus, all b1,,b3mb_{1},\dots,b_{3m} will eventually be opened. Let bpb_{p}, bqb_{q}, and brb_{r} be the first three bins opened in the sorting sequence in this order.

Assume that brb_{r} is opened by the move from Sj1S_{j-1} to SjS_{j}. Let α=ap+aq+ar\alpha=a_{p}+a_{q}+a_{r}.

Claim 1.

SjS_{j} satisfies the following conditions.

  1. 1.

    There are ρ+1\rho+1 red bins, and they contain at least ρB+α\rho B+\alpha units.

  2. 2.

    There are β+3\beta+3 blue bins, and they contain (β+3)Bα(\beta+3)B-\alpha units.

  3. 3.

    There are 3m33m-3 red-blue bins, and they contain (2m3)B+α(2m-3)B+\alpha blue units and at most mBαmB-\alpha red units.

  4. 4.

    There is no empty bin. (This follows for free from the other conditions.)

Proof 3.3.

Observe that we cannot create a new red-blue bin by any sequence of moves. Thus the number of red-blue bins is 3m33m-3. Since each red-blue bin bib_{i} contains exactly BaiB-a_{i} blue units and at most aia_{i} red units, the red-blue bins contain exactly (2m3)B+α(2m-3)B+\alpha blue units and at most mBαmB-\alpha red units in total.

The blue bins contain exactly (β+3)Bα(\beta+3)B-\alpha units. Since brb_{r} contains exactly BarB-a_{r} units, the other blue bins contain (β+2)B(ap+aq)(\beta+2)B-(a_{p}+a_{q}) units, and thus the number of blue bins is at least ((β+2)B(ap+aq))/B+1β+3\lceil((\beta+2)B-(a_{p}+a_{q}))/B\rceil+1\geq\beta+3, where the inequality holds as ap+aq<Ba_{p}+a_{q}<B.

Since the red bins contain at least ρB+α\rho B+\alpha units, the number of red bins is at least ρ+α/Bρ+1\rho+\lceil\alpha/B\rceil\geq\rho+1. Recall that the total number of bins is 3m+ρ+β+13m+\rho+\beta+1. Thus, the number of red bins is exactly ρ+1\rho+1 and the number of blue bins is exactly β+3\beta+3.

Claim 2.

α=B\alpha=B.

Proof 3.4.

In SjS_{j}, the ρ+1\rho+1 red bins contain at least ρB+α\rho B+\alpha units. This implies that αB\alpha\leq B. Suppose to the contrary that α<B\alpha<B. We show that under this assumption, the conditions in 1 cannot be violated by any sequence of moves. This contradicts that the final state SS_{\ell} does not contain any red-blue bin.

Let TT be a configuration satisfying the conditions in 1 and TT^{\prime} be a configuration obtained from TT by one move. It suffices to show that TT^{\prime} still satisfies the conditions in 1. Assume that the move is from a bin bb to another bin bb^{\prime}.

First consider the case where we moved blue units. In this case, bb and bb^{\prime} have to be blue bins. Hence, to show that all properties are satisfied, it suffices to show that bb is not empty after the move. Indeed, if bb becomes empty after the move, then β+2\beta+2 blue bins contain (β+3)Bα>(β+2)B(\beta+3)B-\alpha>(\beta+2)B units in total. This contradicts the capacity of bins.

Next consider the case where we moved red units. The bins bb and bb^{\prime} are red or red-blue. If bb becomes empty (when it was red) or blue (when it was red-blue), then the total number of red bins and red-blue bins becomes 3m+ρ33m+\rho-3 but they still contain (3m+ρ3)B+α(3m+\rho-3)B+\alpha units ((m+ρ)B(m+\rho)B red and (2m3)B+α(2m-3)B+\alpha blue). This contradicts the capacity of bins. Thus, we know that the type of bb does not change by the move. If the move is either from red to red, from red-blue to red-blue, or from red-blue to red, then the all conditions are satisfied. Assume that the move is from red to red-blue and that the red-blue bins contain more than mBαmB-\alpha red units after the move. Now the 3m33m-3 red-blue bins contain more than (2m3)B+α+mBα=(3m3)B(2m-3)B+\alpha+mB-\alpha=(3m-3)B units. This again contradicts the capacity of bins.

By the claims above and the capacity of bins, SjS_{j} satisfies the following conditions.

  1. 1.

    There are ρ+1\rho+1 full red bins.

  2. 2.

    There are β+3\beta+3 blue bins, and they contain (β+2)B(\beta+2)B units.

  3. 3.

    There are 3m33m-3 full red-blue bins, and they contain (2m2)B(2m-2)B blue units and (m1)B(m-1)B red units.

  4. 4.

    There is no empty bin.

Observe that each red-blue bin bib_{i} in SjS_{j} is not opened so far, and thus contains BaiB-a_{i} blue units (and aia_{i} red units as it is full).

Let us take a look at the sorting sequence from SjS_{j} to SS_{\ell}. Since there is no empty bin and all bins containing red units are full in SjS_{j}, we need to move blue units in blue bins first. Unless we make an empty bin, the situation does not change. Let ShS_{h} be the first configuration after SjS_{j} that contains an empty bin. By the capacity of bins and the discussion so far, ShS_{h} satisfies the following conditions.

  1. 1.

    There are ρ+1\rho+1 full red bins.

  2. 2.

    There are β+2\beta+2 full blue bins.

  3. 3.

    For each i{1,,3m}{p,q,r}i\in\{1,\dots,3m\}\setminus\{p,q,r\}, the bin bib_{i} is a full red-blue bin that contains aia_{i} red units and BaiB-a_{i} blue units.

  4. 4.

    There is one empty bin.

Without loss of generality, assume that {p,q,r}={3m2,3m1,3m}\{p,q,r\}=\{3m-2,3m-1,3m\}. Let SS^{\prime} be the instance of 𝖶𝖲𝖯\mathsf{WSP} obtained from a1,,a3(m1)a_{1},\dots,a_{3(m-1)} by the reduction above, and S0S^{\prime}_{0} be the one obtained from SS^{\prime} by adding ρ+1\rho+1 full red bins and β+2\beta+2 full blue bins. Observe that S0S^{\prime}_{0} can be obtained from ShS_{h} by renaming the bins. Thus the sorting sequence from ShS_{h} to SS_{\ell} can be applied to S0S^{\prime}_{0} by appropriately renaming the bins. Therefore, we can apply the induction hypothesis to S0S^{\prime}_{0} and thus a1,,a3(m1)a_{1},\dots,a_{3(m-1)} is a yes-instance of 3-Partition. Since α=a3m2+a3m1+a3m=B\alpha=a_{3m-2}+a_{3m-1}+a_{3m}=B, the instance a1,,a3ma_{1},\dots,a_{3m} is also a yes-instance of 3-Partition.

Corollaries 2.11, 2.13 and 3.1 imply that given an integer tt and a configuration SS, it is NP-complete to decide whether there is a sequence of length at most tt from SS to a sorted configuration under both settings in 𝖡𝖲𝖯\mathsf{BSP} and 𝖶𝖲𝖯\mathsf{WSP}. We observe a slightly stronger result for 𝖶𝖲𝖯\mathsf{WSP}.

Corollary 3.5.

Given an integer tt and an instance SS of 𝖶𝖲𝖯\mathsf{WSP}, it is NP-complete to decide whether there is a sorting sequence for SS with length at most tt even if SS is guaranteed to be a yes-instance.

Proof 3.6.

From an instance a1,,a3m\langle a_{1},\dots,a_{3m}\rangle of 3-Partition, we first construct an instance of 𝖶𝖲𝖯\mathsf{WSP} as described in the proof of Theorem 3.1, and then add a sufficient number of empty bins to guarantee that the resultant instance is a yes-instance. This is always possible with a polynomial number of empty bins as we see in Section 5.222It is not difficult to show that this instance only needs a constant number of bins. In fact, two empty bins are enough to perform a greedy algorithm for sorting. Let SS denote the constructed instance. We set t=5mt=5m

The proof of Theorem 3.1 implies that if a1,,a3m\langle a_{1},\dots,a_{3m}\rangle is a yes-instance, then SS admits a sorting sequence of length 5m5m. (See Figure 5.)

Conversely, assume that SS admits a sorting sequence of length 5m5m. Recall that each of the 3m3m red-blue bins in SS contains aia_{i} red units at the top and BaiB-a_{i} blue units at the bottom for some ii. Since each full blue bin in the final configuration contains units from at least two original bins, we need at least one move for it. Thus we need at least 2m2m moves to make 2m2m full blue bins. This implies that we have at most 3m3m moves that involve red units. Actually, the number of such moves is exactly 3m3m since each red unit has to move at least once. Since B/4<ai<B/2B/4<a_{i}<B/2 for all ii, each of the mm full red bins in the final configuration contains units from at least three original bins, and thus it needs at least three moves. If some red bin contains units from more than three original bins, then it needs at least four moves. This contradicts the assumption that we have at most 3m3m moves for red units. Thus we can conclude that each red bin in the final configuration contains units from exactly three original bins. This gives a solution to the instance of 3-Partition as the capacity of the bins is BB.

4 Polynomial-time algorithms when h=2h=2 and |C|=n|C|=n

In this section, we focus on the special case of h=2h=2 and |C|=n|C|=n. In popular apps, it is often the case that hh is a small constant and |C|=n|C|=n. The case h=2h=2 is the first nontrivial case in this setting. Under this setting, every ball-move is a water-move and vice versa, except for moving (a) unit(s) from a bin with two units of the same color to an empty bin, which is a vacuous move. Therefore, in this section we do not distinguish water-moves and ball-moves and simply call them moves. We prove that in this setting, all instances with k2k\geq 2 are yes-instances. We also show that we can find a shortest sorting sequence in O(n)O(n) time (if any exists).

We say that a bin of capacity 22 is a full bin if it contains two units, and a half bin if it contains one unit.

G(S)\vec{G}(S):12345678910111213141516SS:1223314565677848910910111211121314131415151616
Figure 6: Configuration SS and its graph representation G(S)\vec{G}(S).
Theorem 4.1.

If h=2h=2, |C|=n|C|=n, and k2k\geq 2, then all instances of 𝖶𝖲𝖯\mathsf{WSP} are yes-instances. Moreover, a shortest sorting sequence can be found in O(n)O(n) time.

Proof 4.2.

For the first claim of the theorem, it is enough to consider the case k=2k=2. Under a configuration SS, we say a color cCc\in C is sorted if the two units of color cc are in the same bin. For a configuration SS, we define a directed multigraph G(S)=(V,A)\vec{G}(S)=(V,A) as follows (see Figure 6). The vertex set VV is the color set CC. We add one directed edge from cc to cc^{\prime} for each full bin that contains a unit of color cc at the bottom and a unit of color cc^{\prime} at the top. That is, AA consists of S(b)S(b) for all full bins bb. (We may add self-loops here.) For the directed multigraph G(S)\vec{G}(S), we denote its underlying multigraph by G(S)=(V,E)G(S)=(V,E), where EE is a multiset obtained from AA by ignoring the directions. Since |C|=n|C|=n, each color appears twice in SS. When SS consists of full bins, G(S)G(S) is a 2-regular graph with nn edges, which means that G(S)G(S) is a set of cycles. We call a self-loop a trivial cycle. If SS also contains half bins, then G(S)G(S) is a disjoint union of cycles and paths.

Let pSp_{S} denote the number of nontrivial directed cycles in G(S)=(V,A)\vec{G}(S)=(V,A), qSq_{S} the number of vertices of indegree 2 in G(S)=(V,A)\vec{G}(S)=(V,A), and rSr_{S} the number of vertices with no self-loop, i.e., the number of unsorted colors. We prove that if SS has either two empty bins or one empty and two half bins, then a shortest sorting sequence has length pS+qS+rSp_{S}+q_{S}+r_{S}. Clearly the initial configuration, which has two empty bins, satisfies the condition.

We first prove that there exists a sorting sequence of length pS+qS+rSp_{S}+q_{S}+r_{S}. We use an induction on pS+qS+rSp_{S}+q_{S}+r_{S}. When pS+qS+rS=0p_{S}+q_{S}+r_{S}=0, all colors are sorted and thus SS is a goal. Now we turn to the inductive step, which consists of four cases.

(Case 1) Suppose SS has no half bins and qS=0q_{S}=0. Note that when SS has no half bins, it has two empty bins. Then one can move the top unit of an arbitrary bin to an empty bin. This eliminates a directed cycle in the graph. Then we have one empty and two half bins. The claim follows from the induction hypothesis.

(Case 2) Suppose SS has no half bins and there is a vertex cc whose indegree is 2. By moving the two units of cc to an empty bin, we use two moves, while decreasing both qSq_{S} and rSr_{S}. Then we have one empty and two half bins. The claim follows from the induction hypothesis.

(Case 3) Suppose SS has two half bins one of which has a color of outdegree 0. Then, the other unit of that color is not below another unit in some bin. By moving that unit on top of the half bin, we can sort the color by one move. This decreases rSr_{S} by one, and does not change qSq_{S}. After this move, still we have two half bins. The claim follows from the induction hypothesis.

(Case 4) Otherwise, SS has two half bins bb and bb^{\prime} and both colors in the half bins have outdegree 1. Let c0c_{0} be the color in bb and (c0,c1,,cm)(c_{0},c_{1},\ldots,c_{m}) the sequence of vertices such that (ci1,ci)A(c_{i-1},c_{i})\in A for all ii where cmc_{m} has outdegree 0. By the assumption, cmc_{m} is not the color of the unit of bb^{\prime}. Therefore, cmc_{m} has indegree 2, i.e., both units of cmc_{m} appear as the top units of full bins. We sort the color cmc_{m} using the empty bin. It takes two moves and decreases both rSr_{S} and qSq_{S} by one. We then sort cm1,,c0c_{m-1},\ldots,c_{0} in this order by mm moves, which decreases rSr_{S} by mm, where we require no additional empty bins and finally the bin bb will be empty. We have one empty and two half bins. The claim follows from the induction hypothesis.

We next prove that there exists no sorting sequence of length less than pS+qS+rSp_{S}+q_{S}+r_{S} with regardless of the number kk of empty bins. Note that if all colors are sorted, pS+qS+rS=0p_{S}+q_{S}+r_{S}=0. We show that any move decreases the potential by at most one.

(Case 1) If the color of the moved unit belongs to a directed cycle, this reduces pSp_{S} by one but not qSq_{S}. The other unit of the same color has a unit of another color on its top. Therefore this cannot reduce rSr_{S}.

(Case 2) If the color of the moved unit has indegree 2, this reduces qSq_{S} by one but none of pSp_{S} or rSr_{S}.

(Case 3) Otherwise, any other kind of moves cannot reduce pSp_{S} or qSq_{S}. Clearly one move cannot reduce rSr_{S} by two.

Since pS+qS+rS=O(n)p_{S}+q_{S}+r_{S}=O(n), we can sort any instance in O(n)O(n) moves. Each feasible move can be found in O(1)O(1) time, which completes the proof.

Theorem 4.3.

If h=2h=2, |C|=n|C|=n, and k=1k=1, then 𝖶𝖲𝖯\mathsf{WSP} can be solved in O(n)O(n) time. For a yes-instance, a shortest sorting sequence can be found in O(n)O(n) time.

Proof 4.4.

For a configuration SS, we again use the directed multigraph G(S)=(V,A)\vec{G}(S)=(V,A) and its underlying multigraph by G(S)=(V,E)G(S)=(V,E) defined in the proof of Theorem 4.1. Let S0S_{0} be a given instance of 𝖶𝖲𝖯\mathsf{WSP} with h=2h=2, k=1k=1, and |C|=n|C|=n. To simplify, we assume that S0S_{0} has no full bin that contains two units of the same color without loss of generality. (Hereafter, when we have a full bin that contains two units of the same color, we remove it from the configuration and never touch. In terms of G(S)\vec{G}(S), G(S)\vec{G}(S) has no self-loop, and once G(S)\vec{G}(S) produces a vertex with a self-loop, we remove it from G\vec{G}.) Observe that if a configuration SS consists of nn^{\prime} full bins and one empty bin, then G(S)G(S) is a 2-regular multigraph with nn^{\prime} edges. That is, G(S)G(S) is a set of cycles of size at least 2 since we remove vertices with self-loops.

Now we focus on a cycle CC^{\prime} in GG. When CC^{\prime} is a directed cycle in G\vec{G} (as C3C_{3} in Figure 6), it is easy to remove the vertices in CC^{\prime} from GG by using one empty bin. When CC^{\prime} contains exactly one vertex vv of outdegree 22 in G\vec{G}, CC^{\prime} contains another vertex uu of indegree 22. In this case, we first move two units of color uu to the empty bin, and we can sort the other bins using two half bins. Lastly, we can get two units of color vv together, and obtain an empty bin again. Now we consider the case that CC^{\prime} contains (at least) two vertices vv and vv^{\prime} of outdegree 22 in G\vec{G} (as C5C_{5} in Figure 6). In this case, we can observe that we eventually get stuck with two half bins of colors vv and vv^{\prime}. That is, SS is a no-instance if and only if G(S)\vec{G}(S) contains at least one cycle CC^{\prime} that has at least two vertices of outdegree 22 in G(S)\vec{G}(S).

In summary, we can conclude that the original instance S0S_{0} can be sorted if and only if every cycle in G(S0)G(S_{0}) contains at most one vertex of outdegree 22 in G(S0)\vec{G}(S_{0}). The construction of G(S0)\vec{G}(S_{0}) and G(S0)G(S_{0}) can be done in O(n)O(n) time, and this condition can be checked in O(n)O(n) time. Moreover, it is easy to observe that sorting items in each cycle of length nn^{\prime} in G(S0)G(S_{0}) requires n+1n^{\prime}+1 moves, and n+1n^{\prime}+1 moves are sufficient. Therefore, the optimal sorting requires n+n+\ell moves, where \ell is the number of cycles in G(S0)G(S_{0}), which can be computed in O(n)O(n) time.

By the proofs of Theorems 4.1 and 4.3, we have the following corollary:

Corollary 4.5.

If h=2h=2 and |C|=n|C|=n, any yes-instance of 𝖶𝖲𝖯\mathsf{WSP} has a solution of length O(n)O(n).

5 The number of sufficient bins

In this section, we consider the minimum number k(n,h)k(n,h) for nn and hh such that all instances of 𝖶𝖲𝖯\mathsf{WSP} with nn full bins of capacity hh with k(n,h)k(n,h) empty bins are yes-instances. We can easily see that such a number exists and that k(n,h)nk(n,h)\leq n as follows. Let SS be an instance with nn full and kk empty bins of capacity hh such that knk\geq n. For each color cc used in SS, let jcj_{c} be the positive integer such that SS contains jchj_{c}\cdot h units of color cc. We assign jcj_{c} empty bins to each color cc. Since cjc=n\sum_{c}j_{c}=n, we can easily “bucket sort” SS by moving water units in the nn bins to empty bins following the assignment of the empty bins to the colors. On the other hand, as shown in Theorem 4.3, we have no-instances when k=1k=1 even if h=2h=2 (see Section 6 for a larger no-instance).

In the following, we improve the upper bound of k(n,h)k(n,h) and show a lower bound.

hhnnkkjj
Figure 7: Refinement of the naive bucket sort.
Theorem 5.1.

k(n,h)h1hnk(n,h)\leq\lceil\frac{h-1}{h}n\rceil.

Proof 5.2.

It suffices to show the lemma for the case where n=|C|n=|C|. We refine the bucket-sort base algorithm described above. See Figure 7. Suppose that we have at least k=h1hnk=\lceil\frac{h-1}{h}n\rceil empty bins. Then, using those empty bins, one can make j=nhj=\lfloor\frac{n}{h}\rfloor other bins monochrome, since k(h1)jk\geq(h-1)j. Now we have k+j=nk+j=n monochrome bins. When some of those bins have the same color, we merge them. Then we can dedicate different nn bins to pairwise different colors, and we can use them to sort the remaining bins by the bucket sort.

We note that h1hn=n\lceil\frac{h-1}{h}n\rceil=n when h>nh>n, which coincides with the naive bucket sort algorithm.

Theorem 5.3.

k(n,h)1964min{n,h}k(n,h)\geq\lceil\frac{19}{64}\min\{n,h\}\rceil.

Proof 5.4.

We show a configuration of h=nh=n bins where every bin has the same contents (1,,n)(1,\dots,n). In the case where n>hn>h, one can add nhn-h extra bins whose contents are already sorted. In the case where n<hn<h, one can arbitrarily pad the non-empty bins of the above configuration with hnh-n extra units of each color at the bottom.

4504494414404014003213201451640444332211111191191640640190
Figure 8: When the first unit of color 11 has been moved, where n=640n=640 and k=190k=190.

Suppose that it is solvable with kk empty bins. We consider the very first moment when a unit of color 11 has been moved. Figure 8 shows an example configuration with n=640n=640 and k=190k=190. Note that any initially empty bin will always be monochrome. Let UU be the set of colors cc such that at least one unit of the color cc occupies a bin which was initially empty, where |U|k|U|\leq k, and rcr_{c} be the number of units of color cc which have been removed from the initial location. Particularly r1=1r_{1}=1. Then we have rc+1rcr_{c+1}\geq r_{c} for all colors c1c\geq 1, since a unit of color cc can be removed only after all the units above have been removed. Moreover, if cUc\notin U, units of color cc which can be put only on units of the same color. That is, from both the source and target bins involved in the move, the unit of color c+1c+1 must have been removed. Each of the target bins accepts at most (nc)(n-c) units of water color cc on top of the unit initially located there. Therefore, for cUc\notin U, rc(rc+1rc)(nc)r_{c}\leq(r_{c+1}-r_{c})(n-c) (see Figure 9), i.e.,

rc+1rcrcnc.r_{c+1}-r_{c}\geq\left\lceil\frac{r_{c}}{n-c}\right\rceil\,. (2)

We will give a lower bound of the size of UU that admits a sequence of integers (r1,,rn)(r_{1},\dots,r_{n}) with 1r1rnn1\leq r_{1}\leq\dots\leq r_{n}\leq n that satisfies Eq. (2) for cUc\notin U. Suppose that (r1,,rn)(r_{1},\dots,r_{n}) satisfies the condition with cUc\in U and cc+1Uc_{c+1}\notin U. Then, one can easily see that U=U{c}{c+1}U^{\prime}=U\setminus\{c\}\cup\{c+1\} admits a sequence (r1,,rc1,rc,rc+1,rn)(r_{1},\dots,r_{c-1},r_{c}^{\prime},r_{c+1},r_{n}) with rc=rc+1r_{c}^{\prime}=r_{c+1} that satisfies Eq. (2). Therefore, we may and will assume that U={nk+1,,n}U=\{n-k+1,\dots,n\}.

ncn-crc+1r_{c+1}rcr_{c}c+1c+1cc
Figure 9: If cUc\notin U, rc(rc+1rc)(nc)r_{c}\leq(r_{c+1}-r_{c})(n-c).

For cnkc\leq n-k, r1=1r_{1}=1 and rc+1rc1r_{c+1}-r_{c}\geq 1 implies rccr_{c}\geq c. Hence, for n/2<cnkn/2<c\leq n-k,

rc+1rcrcnc>n/2nn/2=1r_{c+1}-r_{c}\geq\left\lceil\frac{r_{c}}{n-c}\right\rceil>\frac{n/2}{n-n/2}=1

implies rc2cn/2r_{c}\geq 2c-n/2. Hence, for (5/8)n<cnk(5/8)n<c\leq n-k,

rc+1rcrcnc>(5/4)nn/2n(5/8)n=2r_{c+1}-r_{c}\geq\left\lceil\frac{r_{c}}{n-c}\right\rceil>\frac{(5/4)n-n/2}{n-(5/8)n}=2

implies rc3c(9/8)nr_{c}\geq 3c-(9/8)n. Hence, for (11/16)n<cnk(11/16)n<c\leq n-k,

rc+1rcrcnc>(33/16)n(9/8)nn(11/16)n=3r_{c+1}-r_{c}\geq\left\lceil\frac{r_{c}}{n-c}\right\rceil>\frac{(33/16)n-(9/8)n}{n-(11/16)n}=3

implies rc4c(29/16)nr_{c}\geq 4c-(29/16)n. On the other hand, rcnr_{c}\leq n. That is, cnkc\leq n-k implies c(45/64)nc\leq(45/64)n, i.e., k(19/64)nk\geq(19/64)n.

6 Concluding remarks

In this paper, we investigate the problems of solvability of ball and water sort puzzles. We show that both problems are equivalent and NP-complete. We also show that even for trivial instances of the water sort puzzle, it is NP-complete to find a shortest sorting sequence.

741841941752852952763863963(a)
104411055111661117711188211992124421255212663127731088310993(b)
Figure 10: No-instances for (a) h=3h=3, k=2k=2, n=9n=9 and (b) h=4h=4, k=3k=3, n=12n=12.

As discussed in Section 5, any instance can be sorted when kk is large enough. Especially, as discussed in Section 4, k=2k=2 is enough for h=2h=2, while we have a no-instance for k=1k=1 and h=2h=2. On the other hand, there exist no-instances with k=2k=2 for h=3h=3 and k=3k=3 for h=4h=4 (Figure 10). We constructed them by hand and verified that they are no-instances by using a computer program. It is interesting to give the boundary of kk for a given hh, especially, h=3h=3, that allows us to sort any input. Does kk depend on both hh and nn, or is it independent from nn?

Recently, a Japanese puzzle maker produces a commercial product of the ball sort puzzle.333https://www.hanayamatoys.co.jp/product/products/product-cat/puzzle/ They extend the ball sort puzzle by introducing three different aspects as follows. (1) It contains three different capacities of bins. That is, it allows to use different size of bins. (2) The balls are not only colored, but also numbered. In some problems, the puzzle asks us to not only arrange the balls of the same color into a bin, but also they should be in order in each bin. In some other problems, the puzzle asks us to arrange the balls so that the sums of the numbers in every non-empty bins are equal. (3) It contains transparent balls. That is, these transparent balls are just used for arranging balls, and their final positions do not matter in the final state. Those extensions seems to be reasonable future work.

References

  • [1] Michael R. Garey and David S. Johnson. Computers and Intractability — A Guide to the Theory of NP-Completeness. Freeman, 1979.
  • [2] William H. Gates and Christos H. Papadimitriou. Bounds for sorting by prefix reversal. Discret. Math., 27(1):47–57, 1979. doi:10.1016/0012-365X(79)90068-2.
  • [3] R. A. Hearn and E. D. Demaine. Games, Puzzles, and Computation. A K Peters Ltd., 2009.
  • [4] Hiro Ito, Junichi Teruyama, and Yuichi Yoshida. An almost optimal algorithm for Winkler’s sorting pairs in bins. Progress in Informatics, 9:3–7, 2012.
  • [5] Takehiro Ito, Erik D. Demaine, Nicholas J.A. Harvey, Christos H. Papadimitriou, Martha Sideri, Ryuhei Uehara, and Yushi Uno. On the complexity of reconfiguration problems. Theoretical Computer Science, 412:1054–1065, 2011.
  • [6] Donald E. Knuth. The Art of Computer Programming: Sorting and Searching, volume 3. Addison-Wesley Publishing Company, 2nd edition, 1998.
  • [7] Peter Winkler. Mathematical puzzles: A Connoisseur’s collection, volume 143, pages 149–151. A K Peters, 2004.
  • [8] Katsuhisa Yamanaka, Shin-ichi Nakano, Yasuko Matsui, Ryuhei Uehara, and Kento Nakada. Efficient Enumeration of All Ladder Lotteries and Its Application. Theoretical Computer Science, 411:1714–1722, 2010.