This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Minimizing the Number of Unions

Žarko Ranđelović Centre for Mathematical Sciences
Wilberforce Road
Cambridge CB3 0WB, U.K.
((June 2023))
Abstract

For a given number of kk-sets, how should we choose them so as to minimize the union-closed family that they generate? Our main aim in this paper is to show that, if 𝒜\mathcal{A} is a family of kk-sets of size (tk)\binom{t}{k}, and tt is sufficiently large, then the union-closed family generated by 𝒜\mathcal{A} has size at least that generated by the family of all kk-sets from a tt-set. This proves (for this size of family) a conjecture of Roberts. We also make some related conjectures, and give some other results, including a new proof of the result of Leck, Roberts and Simpson that exactly determines this minimum (for all sizes of the family) when k=2k=2.

1 Introduction

Let 𝒜={A1,,An}\mathcal{A}=\{A_{1},...,A_{n}\} be a family of kk-sets where the AiA_{i} are distinct. We are interested in minimizing the size of the union-closed family generated by 𝒜\mathcal{A}, namely

<𝒜>={Ai1Ai2Ais|1sn,1i1,i2,..,isn}.\text{\text{\textless}}\mathcal{A}\text{\textgreater}=\{A_{i_{1}}\cup A_{i_{2}}...\cup A_{i_{s}}|1\leq s\leq n,1\leq i_{1},i_{2},..,i_{s}\leq n\}.

It is natural to imagine that it is best to take the sets to be ‘as close together as possible’, so for example all kk-sets from a tt-set if the size is (tk)\binom{t}{k}. Note that the union-closed family generated by this family has size |[t](k)||[t]^{(\geq k)}|, where as usual [t][t] denotes the set {1,2,,t}\{1,2,...,t\} and for any set XX we write X(k)X^{(\geq k)} for the family of all subsets of XX of size at least kk. Our main aim is to prove this when tt is large.

Theorem 1.

Let 𝒜\mathcal{A} be a family of kk-sets of size (tk)\binom{t}{k}. Then for tt sufficiently large we have |<𝒜>||[t](k)||\text{\text{\textless}}\mathcal{A}\text{\textgreater}|\geq|[t]^{(\geq k)}|.

This was conjectured by Roberts [7]. In fact, he conjectured a result that should hold for all sizes of 𝒜\mathcal{A} : we discuss this in the final section of the paper. In principle the answer could have depended on the size of the ground set, say NN, but in fact it does not. The first possible case would be k=2k=2, and for this Leck, Roberts and Simpson showed an exact result. Initial segments of colex are best (colex on (k)\mathbb{N}^{(k)} is the order in which A<BA<B if max(A\B)<max(B\A)\max(A\backslash B)<\max(B\backslash A) where (k)\mathbb{N}^{(k)} is the collection of all subsets of \mathbb{N} of size kk).

Theorem 2.

(Leck, Roberts and Simpson [6]) Let 𝒜\mathcal{A} be a family of 22-sets, and let \mathcal{B} be the first |𝒜||\mathcal{A}| 22-sets in the colex order on (2)\mathbb{N}^{(2)}. Then |<𝒜>||<>||\text{\text{\textless}}\mathcal{A}\text{\textgreater}|\geq|\text{\text{\textless}}\mathcal{B}\text{\textgreater}|.

We give a new proof of this result. When k3k\geq 3 colex is not best. We discuss this at the end of the paper. We will use standard notation for set systems and graphs and their parameters. See Bollobás [1] for general background.

2 Proofs of Results

We will start by giving an overview of the proof of Theorem 11. Let NN be the size of the ground set. The idea of the proof is that we will first reduce the ground set to size close to tt. To reduce the size of the ground set we first show that it cannot be ”too big” and then we remove points belonging to few sets. If NN is not close to tt then after removing points belonging to few sets we get a set XX of size NN^{\prime} none of whose elements belong to too few of the remaining sets. Then we show that most subsets of XX are indeed unions. We can show that NN^{\prime} is close to tt. Then unless NN is close to tt starting with XX we can add elements from the ground set one by one making many more unions. This will give more unions than the lower bound we are trying to prove. For the close to tt case, similar to above removing elements belonging to few sets we get a set YY such that nearly all subsets of YY are unions, but this time we will have |Y|t|Y|\geq t. We will get too many unions unless we have both |Y|=t|Y|=t and YY being the whole ground set which is then the [t](k)[t]^{(k)} case.

We will first prove a few lemmas. For the following lemmas and the proof of Theorem 1 we will always use the notation that n=(tk)n=\binom{t}{k} where n,t,kn,t,k are integers such that tk>1t\geq k>1 and that 𝒜={A1,..,An}\mathcal{A}=\{A_{1},..,A_{n}\} is our family of kk-sets. We will also define G=i=1nAiG=\cup_{i=1}^{n}A_{i}. We may view GG as the ”ground set”.

The following lemma will be used to show that the ground set size cannot be too big.

Lemma 3.

If ss\in\mathbb{N} and |G|sk|G|\geq sk then |<𝒜>|2s1|\text{\text{\textless}}\mathcal{A}\text{\textgreater}|\geq 2^{s}-1.

Proof. We will prove a more general statement. Suppose that ll\in\mathbb{N} and B1,..,BlB_{1},..,B_{l} are subsets of \mathbb{N} such that |Bi|k|B_{i}|\leq k for all ii. Let T=i=1lBiT=\cup_{i=1}^{l}B_{i} and suppose |T|sk|T|\geq sk. Finally let

={Bi1Bi2Bir|1rl,1i1,i2,..,irl}.\mathcal{F}=\{B_{i_{1}}\cup B_{i_{2}}...\cup B_{i_{r}}|1\leq r\leq l,1\leq i_{1},i_{2},..,i_{r}\leq l\}.

We will show that ||2s1|\mathcal{F}|\geq 2^{s}-1. We will prove that there is some STS\subset T such that |S|=s|S|=s and for any non-empty XSX\subset S there is some BB\in\mathcal{F} such that BS=XB\cap S=X. This will prove the above statement and hence the lemma. To show this we will induct on ss.

For s=1s=1 it is trivial as we may choose any xB1x\in B_{1} and set S={x}S=\{x\}.

Suppose it is true for some s11s-1\geq 1. Now consider the sets B1,..,BlB_{1},..,B_{l} and TT. For every xTx\in T we define Tx={i|xBi}T_{x}=\{i|x\in B_{i}\}. Pick an xTx\in T with the smallest |Tx||T_{x}|. Now without loss of generality xB1x\in B_{1}. Let T=T\B1T^{\prime}=T\backslash B_{1}. We have |T|(s1)k|T^{\prime}|\geq(s-1)k. Now consider the collection of sets

={BiT|iTx}.\mathcal{H}=\{B_{i}\cap T^{\prime}|i\not\in T_{x}\}.

Suppose that yTy\in T^{\prime}. We cannot have Tx=TyT_{x}=T_{y} since yB1y\not\in B_{1} but |Tx||Ty||T_{x}|\leq|T_{y}| so there must be some iTy\Txi\in T_{y}\backslash T_{x}. But then BiTB_{i}\cap T^{\prime}\in\mathcal{H} so yBBy\in\cup_{B\in\mathcal{H}}B. Therefore

BB=T.\cup_{B\in\mathcal{H}}B=T^{\prime}.

Now by induction there is some STS^{\prime}\subset T^{\prime} of size s1s-1 such that for every non-empty XSX\subset S^{\prime} there is some BB which is a union of some sets in \mathcal{H} such that BS=XB\cap S^{\prime}=X. In other words the projection of a union-closed family containing \mathcal{H} onto SS^{\prime} contains all non-empty subsets of SS^{\prime}. From the definition of \mathcal{H} this means that there is some BB\in\mathcal{F} such that BS=XB\cap S^{\prime}=X and xBx\not\in B. Now consider all possible unions of sets in the family

{Bi|iTx}{B1}\{B_{i}|i\not\in T_{x}\}\cup\{B_{1}\}

and let S=S{x}S=S^{\prime}\cup\{x\}. We get that the projection of \mathcal{F} onto SS contains all non-empty subsets of SS and |S|=s|S|=s. This completes the induction step and the proof. ∎

We can see that |[t](k)|<2t|[t]^{(\geq k)}|<2^{t} and so if |<𝒜>|<|[t](k)||\text{\text{\textless}}\mathcal{A}\text{\textgreater}|<|[t]^{(\geq k)}| then by Lemma 3 we must have |G|tk|G|\leq tk. This is good as we have restricted the ground set which will make it easier to work with. The next step is showing that we can essentially remove all elements in the ground set that are in too few sets. For any xGx\in G define dx=|{i|xAi}|d_{x}=|\{i|x\in A_{i}\}|. Also let

𝒜x={AG|xA,|A|k,i(xAiAiA)}.\mathcal{A}_{x}=\{A\subset G|x\in A,|A|\geq k,\forall i\ (x\in A_{i}\Rightarrow A_{i}\not\subset A)\}.

In other words 𝒜x\mathcal{A}_{x} is the set of all subsets of GG of size at least kk for whom xx is one of the reasons that they are not in <𝒜>\text{\textless}\mathcal{A}\text{\textgreater}.

The following lemma will be used to show that after removing elements in too few sets, most subsets of the remaining elements are in fact in <𝒜>\text{\textless}\mathcal{A}\text{\textgreater}.

Lemma 4.

Suppose that ss\in\mathbb{N} and xGx\in G such that dxs(|G|k2)d_{x}\geq s\binom{|G|}{k-2}. Then we must have |𝒜x|2|G|s.|\mathcal{A}_{x}|\leq 2^{|G|-s}.

Proof. Consider the family 𝒯={Bi|xAi,Bi=Ai\{x}}\mathcal{T}=\{B_{i}|x\in A_{i},B_{i}=A_{i}\backslash\{x\}\}. We have that |𝒯|=dx|\mathcal{T}|=d_{x}. We may assume that G\{x}=[p]G\backslash\{x\}=[p] for some pp. Now we have 𝒯[p](k1)\mathcal{T}\subset[p]^{(k-1)}. Notice that 𝒜x\mathcal{A}_{x} consists of all subsets of GG containing xx that do not have a subset in 𝒯\mathcal{T}. For 0rp10\leq r\leq p-1 we define the upper shadow of a family [p](r)\mathcal{F}\subset[p]^{(r)} to be

+={A{i}|A,i[p]\A}.\partial_{+}\mathcal{F}=\{A\cup\{i\}|A\in\mathcal{F},i\in[p]\backslash A\}.

Also define +k=+++\partial_{+}^{k}\mathcal{F}=\partial_{+}\partial_{+}...\partial_{+}\mathcal{F} where +\partial_{+} is applied kk times (+0=\partial_{+}^{0}\mathcal{F}=\mathcal{F}). For N,rN,r\in\mathbb{N} define the lex order [N](r)[N]^{(r)} to be the order in which A<BA<B if min(A\B)<min(B\A)\min(A\backslash B)<\min(B\backslash A). Now let \mathcal{I} be the initial segment of lex on [p](k1)[p]^{(k-1)} of size |𝒯||\mathcal{T}|. For 1rp1\leq r\leq p we define the lower shadow of a family [p](r)\mathcal{F}\subset[p]^{(r)} to be

={A\{i}|A,iA}.\partial\mathcal{F}=\{A\backslash\{i\}|A\in\mathcal{F},i\in A\}.

Consider the map g:[p][p]g:[p]\rightarrow[p] given by g(i)=p+1ig(i)=p+1-i. If F:𝒫([p])𝒫([p])F:\mathcal{P}([p])\rightarrow\mathcal{P}([p]) is given by F(A)=g(Ac)F(A)=g(A^{c}) then FF is a bijection that takes any initial segment of lex into a corresponding initial segment of colex. For any [p](r)\mathcal{B}\subset[p]^{(r)} we have F(+())=F()F(\partial_{+}(\mathcal{B}))=\partial F(\mathcal{B}). Applying the Kruskal-Katona theorem (Kruskal [5], Katona [4] and see Bollobás [1]) we see that |F(𝒯)||F()||\partial F(\mathcal{T})|\geq|\partial F(\mathcal{I})| and hence |+𝒯||+||\partial_{+}\mathcal{T}|\geq|\partial_{+}\mathcal{I}|.

Note that the upper shadow of an initial segment of lex is an initial segment of lex. We can now show by induction that for any 1rpk+11\leq r\leq p-k+1 we have that |+r𝒯||+r||\partial_{+}^{r}\mathcal{T}|\geq|\partial_{+}^{r}\mathcal{I}|. We have by above that this is true for r=1r=1. If it is true for r<pk+1r<p-k+1 then if \mathcal{I^{\prime}} is the initial segment of lex of length |+r𝒯||\partial_{+}^{r}\mathcal{T}| on [p](k1+r)[p]^{(k-1+r)} then we know that |||+r||\mathcal{I^{\prime}}|\geq|\partial_{+}^{r}\mathcal{I}|. Also since the upper shadow of an initial segment of lex is an initial segment of lex we have that +r\partial_{+}^{r}\mathcal{I} is an initial segment of lex on [p](k1+r)[p]^{(k-1+r)} and hence +r\partial_{+}^{r}\mathcal{I}\subset\mathcal{I}^{\prime}. Now applying the Kruskal-Katona theorem

|+(r+1)𝒯||+||+(r+1)|.|\partial_{+}^{(r+1)}\mathcal{T|}\geq|\partial_{+}\mathcal{I^{\prime}}|\geq|\partial_{+}^{(r+1)}\mathcal{I}|.

So by induction |+r𝒯||+r||\partial_{+}^{r}\mathcal{T}|\geq|\partial_{+}^{r}\mathcal{I}| for all 1rpk+11\leq r\leq p-k+1. Now if we define the total upper shadow of a family [p](r)\mathcal{F}\subset[p]^{(r)}

U={A[p]|XAfor someX}U_{\mathcal{F}}=\{A\subset[p]|X\subset A\ \text{for some}\ X\in\mathcal{F}\}

then by above |U𝒯||U||U_{\mathcal{T}}|\geq|U_{\mathcal{I}}|. Notice also that |𝒜x|=|[p](k1)||U𝒯||\mathcal{A}_{x}|=|[p]^{(\geq k-1)}|-|U_{\mathcal{T}}|. Since we want an upper bound on |𝒜x||\mathcal{A}_{x}| we just need to show that UU_{\mathcal{I}} contains almost all sets in [p](k1)[p]^{(\geq k-1)}.

We know that in [p][p] the number of (k1)(k-1)-sets containing a fixed element is (p1k2)\binom{p-1}{k-2}. We also know that p=|G|1p=|G|-1 So the number of (k1)(k-1)-sets containing at least one element from [s][s] is at most s(|G|k2)dxs\binom{|G|}{k-2}\leq d_{x} which also means that sps\leq p. Note that in lex these sets are all before sets not containing any elements in [s][s]. so we must have that \mathcal{I} contains all sets with at least one element from [s][s] which means that the complement of UU_{\mathcal{I}} in [p](k1)[p]^{(\geq k-1)} is a subset of 𝒫([p]\[s])\mathcal{P}([p]\backslash[s]). So we have

|𝒜x||[p](k1)||U||𝒫([p]\[s])|2|G|s.|\mathcal{A}_{x}|\leq|[p]^{(\geq k-1)}|-|U_{\mathcal{I}}|\leq|\mathcal{P}([p]\backslash[s])|\leq 2^{|G|-s}.



The following will be a useful fact about getting many new unions.

Lemma 5.

Suppose that \mathcal{H} is a family of sets and AA is a set such that ASSA\not\subset\cup_{S\in\mathcal{H}}S. Then |{AS|S}|(1+12|A|1)|||\mathcal{H}\cup\{A\cup S|S\in\mathcal{H}\}|\geq(1+\frac{1}{2^{|A|-1}})|\mathcal{H}|.

Proof. Suppose that xA\SSx\in A\backslash\cup_{S\in\mathcal{H}}S. If two sets S1,S2S_{1},S_{2} in \mathcal{H} have AS1=AS2A\cup S_{1}=A\cup S_{2} then S1S_{1} and S2S_{2} can only differ on A(SS)A\cap(\cup_{S\in\mathcal{H}}S) which is a fixed set of size at most |A|1|A|-1. So at most 2|A|12^{|A|-1} different sets in \mathcal{H} can give the same union with AA. This means that

|{AS|S}|12|A|1||.|\{A\cup S|S\in\mathcal{H}\}|\geq\frac{1}{2^{|A|-1}}|\mathcal{H}|.

However, no set containing AA is in \mathcal{H} so we get the desired result. ∎

We now come to the heart of the proof.

Lemma 6.

For any real Dk1D_{k}\geq 1 there is a large enough constant BB, depending only on kk and DkD_{k}, such that if t>Bt>B and |G|<t+Dklogt|G|<t+D_{k}\log t then |<𝒜>|2ti=0k1(ti)|\text{\textless}\mathcal{A}\text{\textgreater}|\geq 2^{t}-\sum_{i=0}^{k-1}\binom{t}{i} and equality holds if and only if 𝒜=X(k)\mathcal{A}=X^{(k)} where XX is a set with |X|=t|X|=t.

Proof. Given DkD_{k} suppose that t>Bt>B and |G|<t+Dklogt|G|<t+D_{k}\log t where BB will be chosen later. Starting with GG and all of the AiA_{i} we remove elements from GG one by one (and hence removing any of the AiA_{i} containing those elements) if they are in less than 3log(2tk)(2tk2)3\log(2tk)\binom{2t}{k-2} of the remaining sets AiA_{i} until we get a GG^{\prime} which satisfies that no xGx\in G^{\prime} is in less than 3log(2tk)(2tk2)3\log(2tk)\binom{2t}{k-2} of the remaining sets. Notice that if |G|t1|G^{\prime}|\leq t-1 then at some point in our process we were left with G′′GG^{\prime\prime}\subset G such that |G′′|=t1|G^{\prime\prime}|=t-1. We can see that at that point we have removed at most DKlogt+1D_{K}\log t+1 elements. This means that we have not removed that many sets either. In fact we have removed at most 3(Dklogt+1)log(2tk)(2tk2)3(D_{k}\log t+1)\log(2tk)\binom{2t}{k-2} sets. We will make sure that B>3kB>3k to have that log(2tk)=logt+log(2k)2logt\log(2tk)=\log t+\log(2k)\leq 2\log t, Dklogt+12DklogtD_{k}\log t+1\leq 2D_{k}\log t and 2t3(tk)2t\leq 3(t-k). Now we just need to make sure that

t1log2t>12t1/23(k2)(k1)Dk\frac{t-1}{\log^{2}t}>12t^{1/2}\cdot 3^{(k-2)}(k-1)D_{k}

which is possible as (t1)/t1/2t1=e12logt1>log3t/48(t-1)/t^{1/2}\geq\sqrt{t}-1=e^{\frac{1}{2}\log t}-1>\log^{3}t/48 so we ensure that B>e6003(k2)(k1)DkB>e^{600\cdot 3^{(k-2)}(k-1)D_{k}}. Now the number of sets removed when we are left with G′′G^{\prime\prime} is at most

3(Dklogt+1)log(2tk)(2tk2)123(k2)Dklog2t(tk)k2(k2)!<t1/2(t1k1).3(D_{k}\log t+1)\log(2tk)\binom{2t}{k-2}\leq\frac{12\cdot 3^{(k-2)}D_{k}\log^{2}t(t-k)^{k-2}}{(k-2)!}<t^{-1/2}\binom{t-1}{k-1}.

This means that for large enough BB we could have only removed less than (t1k1)\binom{t-1}{k-1} sets and hence when we get G′′G^{\prime\prime} we still have more than (tk)(t1k1)=(t1k)\binom{t}{k}-\binom{t-1}{k-1}=\binom{t-1}{k} sets remaining. This is a contradiction as |G′′|=t1|G^{\prime\prime}|=t-1 and hence we have that |G|t|G^{\prime}|\geq t. Now we will show that if BB is large enough we can make sure that the vast majority of subsets of GG^{\prime} are indeed in <𝒜>\text{\textless}\mathcal{A}\text{\textgreater}. Define for any xGx\in G^{\prime}

𝒜x=𝒜x𝒫(G).\mathcal{A}_{x}^{\prime}=\mathcal{A}_{x}\cap\mathcal{P}(G^{\prime}).

From the proof of Lemma 4, we can see that the lemma is true for any nn, not just n=(tk)n=\binom{t}{k}. We will apply Lemma 4 on GG^{\prime} and the remaining sets when we are left with GG^{\prime}. For large enough BB we have |G||G|<2t|G^{\prime}|\leq|G|<2t so if we define dG(x)=|{Ai|xAi,AiG}|d_{G^{\prime}}(x)=|\{A_{i}|x\in A_{i},A_{i}\subset G^{\prime}\}| we have dG(x)3log(2tk)(|G|k2)d_{G^{\prime}}(x)\geq 3\log(2tk)\binom{|G^{\prime}|}{k-2} for all xGx\in G^{\prime}. Now by Lemma 4

|𝒜x|2|G|[3log(2tk)]2|G|2log(2tk)2|G|2tk.|\mathcal{A}_{x}^{\prime}|\leq 2^{|G^{\prime}|-[3\log(2tk)]}\leq 2^{|G^{\prime}|-2\log(2tk)}\leq\frac{2^{|G^{\prime}|}}{2tk}.

Now since every set in G(k)G^{\prime(\geq k)} that is not in <𝒜>\text{\textless}\mathcal{A}\text{\textgreater} must be in some 𝒜x\mathcal{A}_{x}^{\prime} we have by the union bound

|<𝒜>||G(k)||xG𝒜x|2|G|i=0k1(|G|i)|G|2|G|2tk.\displaystyle|\text{\textless}\mathcal{A}\text{\textgreater}|\geq|G^{\prime(\geq k)}|-|\cup_{x\in G^{\prime}}\mathcal{A}_{x}^{\prime}|\geq 2^{|G^{\prime}|}-\sum_{i=0}^{k-1}\binom{|G^{\prime}|}{i}-|G^{\prime}|\frac{2^{|G^{\prime}|}}{2tk}. (1)

Since |G|t>3k|G^{\prime}|\geq t>3k we have (|G|i)<12(|G|i+1)\binom{|G^{\prime}|}{i}<\frac{1}{2}\binom{|G^{\prime}|}{i+1} for 0ik10\leq i\leq k-1 because |G|i>2(i+1)|G^{\prime}|-i>2(i+1). Since exponential beats polynomial, for large enough tt we have |G|k<2|G|2k+1|G^{\prime}|^{k}<\frac{2^{|G^{\prime}|}}{2^{k+1}}. We will take a large enough BB to have that i=0k1(|G|i)(|G|k)<|G|k<2|G|2k+1\sum_{i=0}^{k-1}\binom{|G^{\prime}|}{i}\leq\binom{|G^{\prime}|}{k}<|G^{\prime}|^{k}<\frac{2^{|G^{\prime}|}}{2^{k+1}}. Now if |G|>t+1|G^{\prime}|>t+1 by (1)(1) we have

|<𝒜>|2|G|2|G|2k+12|G|22|G|42t|\text{\textless}\mathcal{A}\text{\textgreater}|\geq 2^{|G^{\prime}|}-\frac{2^{|G^{\prime}|}}{2^{k+1}}-\frac{2^{|G^{\prime}|}}{2}\geq\frac{2^{|G^{\prime}|}}{4}\geq 2^{t}

so we are done and if |G|=t+1|G^{\prime}|=t+1 then 3|G|<4t2tk3|G^{\prime}|<4t\leq 2tk so again by (1)(1)

|<𝒜>|2|G|2|G|2k+12|G|3(11813)2|G|>122|G|=2t|\text{\textless}\mathcal{A}\text{\textgreater}|\geq 2^{|G^{\prime}|}-\frac{2^{|G^{\prime}|}}{2^{k+1}}-\frac{2^{|G^{\prime}|}}{3}\geq\bigg{(}1-\frac{1}{8}-\frac{1}{3}\bigg{)}2^{|G^{\prime}|}>\frac{1}{2}2^{|G^{\prime}|}=2^{t}

and we are also done. So we may assume that |G|=t|G^{\prime}|=t and that G=[t]G^{\prime}=[t]. Similar to when we considered G′′G^{\prime\prime} we have now removed at most 1t1/2(t1k1)\frac{1}{t^{1/2}}\binom{t-1}{k-1} sets and since n=(tk)n=\binom{t}{k} we know that at most 1t1/2(t1k1)\frac{1}{t^{1/2}}\binom{t-1}{k-1} sets in [t](k)[t]^{(k)} are not in 𝒜\mathcal{A} so for any x[t]x\in[t] we have

dG(x)t1t(t1k1)t12k(t2k2)t1/2(t2k2)d_{G^{\prime}}(x)\geq\frac{\sqrt{t}-1}{\sqrt{t}}\binom{t-1}{k-1}\geq\frac{t-1}{2k}\binom{t-2}{k-2}\geq\lceil t^{1/2}\rceil\binom{t-2}{k-2}

for large enough BB. Now by Lemma 4 on GG^{\prime} we have that

|𝒜x|2tt1/2<2tt2k+1|\mathcal{A}_{x}^{\prime}|\leq 2^{t-t^{1/2}}<\frac{2^{t}}{t2^{k+1}}

for large enough BB since t1/2>logtlog2+(k+1)t^{1/2}>\frac{\log t}{\log 2}+(k+1) for large enough tt. Now we have that

|xG𝒜x|xG|𝒜x|2t2k+1.|\cup_{x\in G^{\prime}}\mathcal{A}_{x}^{\prime}|\leq\sum_{x\in G^{\prime}}|\mathcal{A}_{x}^{\prime}|\leq\frac{2^{t}}{2^{k+1}}.

From before we have i=0k1(ti)2t2k+1\sum_{i=0}^{k-1}\binom{t}{i}\leq\frac{2^{t}}{2^{k+1}} and thus

|<𝒜>[t](k)|2t2t2k+12t2k+1=2t(112k).|\text{\textless}\mathcal{A}\text{\textgreater}\cap[t]^{(\geq k)}|\geq 2^{t}-\frac{2^{t}}{2^{k+1}}-\frac{2^{t}}{2^{k+1}}=2^{t}\bigg{(}1-\frac{1}{2^{k}}\bigg{)}.

Now if GGG\neq G^{\prime} there is some Ai𝒜,AiGA_{i}\in\mathcal{A},A_{i}\not\subset G^{\prime} so by Lemma 5

|<𝒜>|(1+12k1)|<𝒜>[t](k)|(1+12k1)2t(112k)>2t|\text{\textless}\mathcal{A}\text{\textgreater}|\geq\bigg{(}1+\frac{1}{2^{k-1}}\bigg{)}|\text{\textless}\mathcal{A}\text{\textgreater}\cap[t]^{(\geq k)}|\geq\bigg{(}1+\frac{1}{2^{k-1}}\bigg{)}2^{t}\bigg{(}1-\frac{1}{2^{k}}\bigg{)}>2^{t}

and we are done. If G=GG=G^{\prime} we must indeed have that 𝒜=[t](k)\mathcal{A}=[t]^{(k)} and so |<𝒜>|=2ti=0k1(ti)|\text{\textless}\mathcal{A}\text{\textgreater}|=2^{t}-\sum_{i=0}^{k-1}\binom{t}{i}. By the above we only have equality when 𝒜=X(k)\mathcal{A}=X^{(k)} for some set XX where |X|=t|X|=t. This proves the lemma. ∎

We now prove our main result. All we need to do is reduce the ground set size enough to apply Lemma 66. To do this we will apply Lemmas 3,43,4 and 55.

Proof of Theorem 1. We will show that there is a CkC_{k} such that if t>Ckt>C_{k} we have |<𝒜>||[t](k)|=2ti=0k1(ti)|\text{\textless}\mathcal{A}\text{\textgreater}|\geq|[t]^{(\geq k)}|=2^{t}-\sum_{i=0}^{k-1}\binom{t}{i}. If |G|tk|G|\geq tk then by Lemma 3 we have |<𝒜>|2t1|\text{\textless}\mathcal{A}\text{\textgreater}|\geq 2^{t}-1 so we may assume that |G|<tk|G|<tk. Now let s=2log(2tk)s=2\lceil\log(2tk)\rceil. Keep removing elements from GG one by one (and all of the sets AiA_{i} containing them) until no element that is left is in less than s(tkk2)s\binom{tk}{k-2} of the remaining sets. Suppose that GG^{\prime} is the set of the remaining elements. We have removed at most tks(tkk2)tks\binom{tk}{k-2} sets. Since 2k([t/2]+i)tk2k([t/2]+i)\geq tk for all i1i\geq 1 we have that

tk(tkk2)(tk)k1(k2)!(k1)(2k)k1([t/2]+kk1).tk\binom{tk}{k-2}\leq\frac{(tk)^{k-1}}{(k-2)!}\leq(k-1)(2k)^{k-1}\binom{[t/2]+k}{k-1}.

We will make sure Ck>2kC_{k}>2k. Now let p=2log(2tk)(k1)(2k)k1<3logt(2k)kp=2\lceil\log(2tk)\rceil(k-1)(2k)^{k-1}<3\log t(2k)^{k}. Then we have removed at most p([t/2]+kk1)p\binom{[t/2]+k}{k-1} sets. But we have also removed at least (tk)(|G|k)\binom{t}{k}-\binom{|G^{\prime}|}{k} sets. If t>m[t/2]+kt>m\geq[t/2]+k then

(tk)(mk)=l=mt1(lk1)(tm)([t/2]+kk1).\binom{t}{k}-\binom{m}{k}=\sum_{l=m}^{t-1}\binom{l}{k-1}\geq(t-m)\binom{[t/2]+k}{k-1}.

Since exponential beats polynomial take CkC_{k} large enough so that

t[t2]kt3=elogt3>3logt(2k)k>p.t-\bigg{[}\frac{t}{2}\bigg{]}-k\geq\frac{t}{3}=\frac{e^{\log t}}{3}>3\log t(2k)^{k}>p.

Then if |G|[t/2]+k|G^{\prime}|\leq[t/2]+k then we have removed at least

(tk)([t/2]+kk)(t[t/2]k)([t/2]+kk1)>p([t/2]+kk1)\binom{t}{k}-\binom{[t/2]+k}{k}\geq(t-[t/2]-k)\binom{[t/2]+k}{k-1}>p\binom{[t/2]+k}{k-1}

sets which is impossible. So we must have |G|[t/2]+k|G^{\prime}|\geq[t/2]+k and in fact if |G|=tr|G^{\prime}|=t-r then we must have rp<3logt(2k)kr\leq p<3\log t(2k)^{k}. We will first show that most subsets of GG^{\prime} are actually in <𝒜>\text{\textless}\mathcal{A}\text{\textgreater}.

Notice that if xGx\in G^{\prime} then dG(x)s(tkk2)s(|G|k2)d_{G^{\prime}}(x)\geq s\binom{tk}{k-2}\geq s\binom{|G^{\prime}|}{k-2}. This means we can apply Lemma 4 to GG^{\prime}. Define 𝒜x\mathcal{A}_{x}^{\prime} as in Lemma 6. We obtain that for all xGx\in G^{\prime} we have

|𝒜x|2|G|s2|G|22log(2tk)2|G|2tk.|\mathcal{A}_{x}^{\prime}|\leq 2^{|G^{\prime}|-s}\leq\frac{2^{|G^{\prime}|}}{2^{2\log(2tk)}}\leq\frac{2^{|G^{\prime}|}}{2tk}.

This means that

|xG𝒜x|xG|𝒜x||G|2|G|2tk2|G|1.|\cup_{x\in G^{\prime}}\mathcal{A}_{x}^{\prime}|\leq\sum_{x\in G^{\prime}}|\mathcal{A}_{x}^{\prime}|\leq|G^{\prime}|\frac{2^{|G^{\prime}|}}{2tk}\leq 2^{|G^{\prime}|-1}.

If AGA\subset G^{\prime} and |A|k|A|\geq k then A<𝒜>A\not\in\text{\textless}\mathcal{A}\text{\textgreater} if and only if there is some xGx\in G^{\prime} such that AAxA\in A_{x}^{\prime}. This means that

|<𝒜>𝒫(G)|=2|G||xG𝒜x|i=0k1(|G|i)2|G|2|G|1i=0k1(|G|i).\displaystyle|\text{\textless}\mathcal{A}\text{\textgreater}\cap\mathcal{P}(G^{\prime})|=2^{|G^{\prime}|}-|\cup_{x\in G^{\prime}}\mathcal{A}_{x}^{\prime}|-\sum_{i=0}^{k-1}\binom{|G^{\prime}|}{i}\geq 2^{|G^{\prime}|}-2^{|G^{\prime}|-1}-\sum_{i=0}^{k-1}\binom{|G^{\prime}|}{i}. (2)

We can easily bound the sum of the binomials since if t>6k+2t>6k+2 then we have |G|>t/2>3k+1|G^{\prime}|>t/2>3k+1 so just like in the proof of Lemma 6 we will take CkC_{k} large enough so that i=0k1(|G|i)(|G|k)<|G|k<2|G|k12|G|2\sum_{i=0}^{k-1}\binom{|G^{\prime}|}{i}\leq\binom{|G^{\prime}|}{k}<|G^{\prime}|^{k}<2^{|G^{\prime}|-k-1}\leq 2^{|G^{\prime}|-2} since exponential beats polynomial. Now from (2) we have that |<𝒜>𝒫(G)|2|G|2|\text{\textless}\mathcal{A}\text{\textgreater}\cap\mathcal{P}(G^{\prime})|\geq 2^{|G^{\prime}|-2}. We will now show that there is a constant DkD_{k} dependent on kk such that if |G|t+Dklogt|G|\geq t+D_{k}\log t then we must have |<𝒜>|2t|\text{\textless}\mathcal{A}\text{\textgreater}|\geq 2^{t}. Let 0=<𝒜>𝒫(G)\mathcal{H}_{0}=\text{\textless}\mathcal{A}\text{\textgreater}\cap\mathcal{P}(G^{\prime}) and G0=GG_{0}=G^{\prime}. First of if |G|t+2|G^{\prime}|\geq t+2 then we are done so suppose that |G|<t+2|G^{\prime}|<t+2. We will now show that we can get a lot more unions by adding sets with new elements. If we can pick some xG\Gx\in G\backslash G^{\prime} and a set AiA_{i} containing xx then notice that by taking 1=0{AAi|A0}\mathcal{H}_{1}=\mathcal{H}_{0}\cup\{A\cup A_{i}|A\in\mathcal{H}_{0}\} and G1=G0AiG_{1}=G_{0}\cup A_{i} we see that by Lemma 5

|1|(1+12k1)|0|.|\mathcal{H}_{1}|\geq\bigg{(}1+\frac{1}{2^{k-1}}\bigg{)}|\mathcal{H}_{0}|.

This is useful, because we have multiplied the total number of unions by a constant bigger than 11 but dependent on kk and we have added at most kk new elements. We may keep on going as long as there are new elements and construct 2,3,,q\mathcal{H}_{2},\mathcal{H}_{3},...,\mathcal{H}_{q} and G2,G3,,GqG_{2},G_{3},...,G_{q} with |Hi|(1+12k1)|Hi1||H_{i}|\geq(1+\frac{1}{2^{k-1}})|H_{i-1}| and |Gi\Gi1|k|G_{i}\backslash G_{i-1}|\leq k for 1iq1\leq i\leq q and Gq=GG_{q}=G. This means that if there are too many elements in GG we will get that the total number of unions is too big. First of notice that |G|=tr>t3logt(2k)k|G^{\prime}|=t-r>t-3\log t(2k)^{k}. Now suppose that |G|t+2+(2k+3klogt(2k)k)log1+12k12|G|\geq t+2+(2k+3k\log t(2k)^{k})\log_{1+\frac{1}{2^{k-1}}}2. This means that we can repeat the above process of adding a set and less than kk new elements at least q(2+3logt(2k)k)log1+12k12q\geq(2+3\log t(2k)^{k})\log_{1+\frac{1}{2^{k-1}}}2 times so we have that

|<𝒜>||0|(1+12k1)(2+3logt(2k)k)log1+12k122|G|222+3logt(2k)k2t.|\text{\textless}\mathcal{A}\text{\textgreater}|\geq|\mathcal{H}_{0}|\bigg{(}1+\frac{1}{2^{k-1}}\bigg{)}^{(2+3\log t(2k)^{k})\log_{1+\frac{1}{2^{k-1}}}2}\geq 2^{|G^{\prime}|-2}\cdot 2^{2+3\log t(2k)^{k}}\geq 2^{t}.

This means that there are constants EkE_{k} and Dk1D_{k}\geq 1 dependent only on kk such that if t>Ekt>E_{k} and |G|t+Dklogt|G|\geq t+D_{k}\log t we have that |<𝒜>|2t|\text{\textless}\mathcal{A}\text{\textgreater}|\geq 2^{t}. If |G|<t+Dklogt|G|<t+D_{k}\log t then since DkD_{k} depends only on kk using Lemma 6 we get that there is a BB dependent only on kk such that if t>Bt>B and |G|<t+Dklogt|G|<t+D_{k}\log t we have the desired result. Taking Ck=max(Ek,B)C_{k}=\max(E_{k},B) proves the theorem. ∎

In the rest of this section we will prove Theorem 2 which solves the case k=2k=2 completely. We will show that if nn\in\mathbb{N} and t2t\geq 2 is the smallest positive integer such that n(t2)n\leq\binom{t}{2} then f(n,2)=2t2(t2)ntf(n,2)=2^{t}-2^{\binom{t}{2}-n}-t. We note that t=2n+32t=\lfloor\sqrt{2n}+\frac{3}{2}\rfloor.

Proof of Theorem 2. As above n=|𝒜|n=|\mathcal{A}| and let GG be the graph with edge set 𝒜={A1,..,An}\mathcal{A}=\{A_{1},..,A_{n}\} and vertex set A1A2AnA_{1}\cup A_{2}...\cup A_{n}. Suppose that |<𝒜>||<>||\text{\textless}\mathcal{A}\text{\textgreater}|\leq|\text{\textless}\mathcal{B}\text{\textgreater}|. Let tt\in\mathbb{N} be such that (t12)<n(t2)\binom{t-1}{2}<n\leq\binom{t}{2}. We see that |G|t|G|\geq t. Notice that [t](2)\mathcal{B}\subset[t]^{(2)} so |<>|2tt1<2t1|\text{\textless}\mathcal{B}\text{\textgreater}|\leq 2^{t}-t-1<2^{t}-1. For any vertex xGx\in G let d(x)d(x) be the degree of xx in GG. By considering all edges with xx we can make at least 2d(x)12^{d(x)}-1 distinct unions. Thus we know that the maximal degree is bounded by Δ(G)<t\Delta(G)<t. We also may assume that diamG2\operatorname{diam}G\leq 2 since if there are some x,yGx,y\in G such that the distance between them is d(x,y)3d(x,y)\geq 3 then we may replace the family A1,..,AnA_{1},..,A_{n} with the family obtained from those sets by identifying xx and yy. This will keep the size of the family to be nn and will not increase |<𝒜>||\text{\textless}\mathcal{A}\text{\textgreater}| as it will just identify x,yx,y in all of the unions. Denote by Γ(x)\Gamma(x) the set containing xx and all elements adjacent to xx in GG. We prove the following claim.

Claim. For every xGx\in G we have |G||Γ(x)|<t|G|-|\Gamma(x)|<t.

Proof of Claim. Suppose xGx\in G. Since diamG2\operatorname{diam}G\leq 2 for every yG\Γ(x)y\in G\backslash\Gamma(x) there is a zΓ(x)z\in\Gamma(x) such that yzE(G)yz\in E(G). This means if we consider the family

𝒞={A(G\Γ(x))|A<𝒜>}\mathcal{C}=\{A\cap(G\backslash\Gamma(x))|A\in\text{\textless}\mathcal{A}\text{\textgreater}\}

then we have that each singleton subset of G\Γ(x)G\backslash\Gamma(x) is in 𝒞\mathcal{C}. But 𝒞\mathcal{C} is closed under unions and also has size at most |<𝒜>|<2t1|\text{\textless}\mathcal{A}\text{\textgreater}|<2^{t}-1 meaning that |G\Γ(x)|<t|G\backslash\Gamma(x)|<t which proves the claim. ∎

With the bound on Δ(G)\Delta(G) this gives |G|2t1|G|\leq 2t-1. We will show that |G|t|G|-t must be small. Just like in the proof of Theorem 1 instead of counting how many sets are in <𝒜>\text{\textless}\mathcal{A}\text{\textgreater} we will instead count how many are in =𝒫(G)\<𝒜>\mathcal{F}=\mathcal{P}(G)\backslash\text{\textless}\mathcal{A}\text{\textgreater}. Observe that if SS\in\mathcal{F} and SS\neq\emptyset then for some xSx\in S for all yS\{x}y\in S\backslash\{x\} we have xyE(G)xy\not\in E(G). So for each set in \mathcal{F} there is some element which prevents it from being in <𝒜>\text{\textless}\mathcal{A}\text{\textgreater}. Similar to our above definition of 𝒜x\mathcal{A}_{x} let

𝒯x={SG|xS,yS\{x}xyE(G)}.\mathcal{T}_{x}=\{S\subset G\ |x\in S,\forall y\in S\backslash\{x\}\ xy\not\in E(G)\}.

Denote by sxs_{x} the degree of xx in GcG^{c}. We have |𝒯x|=2sx|\mathcal{T}_{x}|=2^{s_{x}} and by the above claim sx<ts_{x}<t for any xx. Note that =(xG𝒯x){}\mathcal{F}=(\cup_{x\in G}\mathcal{T}_{x})\cup\{\emptyset\} so we have the bound

||1+xG|𝒯x|1+xG2sx.\displaystyle|\mathcal{F}|\leq 1+\sum\limits_{x\in G}|\mathcal{T}_{x}|\leq 1+\sum\limits_{x\in G}2^{s_{x}}. (3)

We also know that xGsx=(|G|2)n\sum\limits_{x\in G}s_{x}=\binom{|G|}{2}-n. If ab>0a\geq b>0 then clearly 2a+1+2b1>2a+2b2^{a+1}+2^{b-1}>2^{a}+2^{b}. Now consider variables sx0s_{x}^{\prime}\in\mathbb{Z}_{\geq 0} for xGx\in G. If there are distinct x,yGx,y\in G such that 0<sxsy<t10<s_{x}^{\prime}\leq s_{y}^{\prime}<t-1 then by above replacing sx,sys_{x}^{\prime},s_{y}^{\prime} with sx1,sy+1s_{x}^{\prime}-1,s_{y}^{\prime}+1 increases xG2sx\sum\limits_{x\in G}2^{s_{x}^{\prime}}. So the biggest the sum xG2sx\sum\limits_{x\in G}2^{s_{x}^{\prime}} could be subject to the constraints

xGsx=(|G|2)n\displaystyle\sum\limits_{x\in G}s_{x}^{\prime}=\binom{|G|}{2}-n (4)
sx<t\displaystyle s_{x}^{\prime}<t (5)

is if all except at most one of the sxs_{x}^{\prime} are either 0 or t1t-1. Let (|G|2)n=d(t1)+m\binom{|G|}{2}-n=d(t-1)+m with non-negative integers d,md,m and 0m<t10\leq m<t-1. Since the sxs_{x} satisfy constraints (4) and (5) we must have |G|d|G|\geq d. By (3) this means that

||d2t1+2m+|G|d.\displaystyle|\mathcal{F}|\leq d2^{t-1}+2^{m}+|G|-d. (6)

Note that |<𝒜>|=2|G||||\text{\textless}\mathcal{A}\text{\textgreater}|=2^{|G|}-|\mathcal{F}| thus

2tt1|<𝒜>|2|G|d2t12m|G|+d.\displaystyle 2^{t}-t-1\geq|\text{\textless}\mathcal{A}\text{\textgreater}|\geq 2^{|G|}-d2^{t-1}-2^{m}-|G|+d. (7)

We now want to bound dd in terms of |G||G| and tt. We know that

d(|G|2)nt1(|G|2)(t12)t1=(|G|t+1)(|G|+t2)2(t1)\displaystyle d\leq\frac{\binom{|G|}{2}-n}{t-1}\leq\frac{\binom{|G|}{2}-\binom{t-1}{2}}{t-1}=\frac{(|G|-t+1)(|G|+t-2)}{2(t-1)}
(|G|t+1)3(t1)2(t1)=32(|G|t+1).\displaystyle\leq\frac{(|G|-t+1)3(t-1)}{2(t-1)}=\frac{3}{2}(|G|-t+1).

We can also get a lower bound on dd since

d=(|G|2)nt1(|G|2)(t2)t1=(|G|t)(|G|+t1)2(t1)(|G|t)2(t1)2(t1)=|G|t.d=\Bigl{\lfloor}\frac{\binom{|G|}{2}-n}{t-1}\Bigr{\rfloor}\geq\Bigl{\lfloor}\frac{\binom{|G|}{2}-\binom{t}{2}}{t-1}\Bigr{\rfloor}=\Bigl{\lfloor}\frac{(|G|-t)(|G|+t-1)}{2(t-1)}\Bigr{\rfloor}\geq\Bigl{\lfloor}\frac{(|G|-t)2(t-1)}{2(t-1)}\Bigr{\rfloor}=|G|-t.

Now rearranging the terms in (7) and using the lower bound on dd we obtain

(d+3)2t12t+2m+d2t12|G|+1+t|G|+d2|G|.(d+3)2^{t-1}\geq 2^{t}+2^{m}+d2^{t-1}\geq 2^{|G|}+1+t-|G|+d\geq 2^{|G|}.

Now let l=|G|t+1l=|G|-t+1. From the upper bound on dd and the previous inequality we obtain

32l+3d+32l\frac{3}{2}l+3\geq d+3\geq 2^{l}

so we have 3l+62l+13l+6\geq 2^{l+1}. Note that for l=3l=3 this does not hold and by induction if 2l+1>3l+62^{l+1}>3l+6 then 2l+2>2(3l+6)>3(l+1)+62^{l+2}>2(3l+6)>3(l+1)+6 so we must have l2l\leq 2.

Now suppose that l=2l=2. Then |G|=t+1|G|=t+1. Since (t+12)(t2)(t+12)n<(t+12)(t12)\binom{t+1}{2}-\binom{t}{2}\leq\binom{t+1}{2}-n<\binom{t+1}{2}-\binom{t-1}{2} we have that td(t1)+m<2t1t\leq d(t-1)+m<2t-1 so either d=1d=1 or m=0,d=2m=0,d=2 which means that d2t1+2m2t+1d2^{t-1}+2^{m}\leq 2^{t}+1. If d=1d=1 then d2t1+2m2td2^{t-1}+2^{m}\leq 2^{t} so in any case d2t1+2md2t1d2^{t-1}+2^{m}-d\leq 2^{t}-1 Now from (6) we have ||2t1+(t+1)=2t+t|\mathcal{F}|\leq 2^{t}-1+(t+1)=2^{t}+t. So we have

|<𝒜>|=2t+1||2t+12tt=2tt>2tt1|<𝒜>||\text{\textless}\mathcal{A}\text{\textgreater}|=2^{t+1}-|\mathcal{F}|\geq 2^{t+1}-2^{t}-t=2^{t}-t>2^{t}-t-1\geq|\text{\textless}\mathcal{A}\text{\textgreater}|

which is a contradiction.

Since |G|t|G|\geq t we must have l=1l=1. We may assume that G=[t]G=[t]. We now have that |<𝒜>||\text{\textless}\mathcal{A}\text{\textgreater}| is minimized exactly when |||\mathcal{F}| is maximized. Now let r=(t2)nr=\binom{t}{2}-n. We know that i=1tsi=r\sum_{i=1}^{t}s_{i}=r but if we look at 𝒯1,,𝒯t\mathcal{T}_{1},...,\mathcal{T}_{t} we see that at least rr sets appear in two of the 𝒯i\mathcal{T}_{i}. This is because all sets corresponding to the edges in GcG^{c} appear in two of the 𝒯i\mathcal{T}_{i}. This means that

||=|(xG𝒯x){}|i=1t2sir+1.\displaystyle|\mathcal{F}|=|(\cup_{x\in G}\mathcal{T}_{x})\cup\{\emptyset\}|\leq\sum_{i=1}^{t}2^{s_{i}}-r+1. (8)

We consider the sum X=i=1t2siX=\sum_{i=1}^{t}2^{s_{i}} over all possible GcG^{c} with rr edges. Without loss of generality we may assume that s1s2sts_{1}\leq s_{2}\leq...\leq s_{t}. Now suppose that some edge in GcG^{c} is not incident with tt. Removing this edge we decrease XX by at most 2st2^{s_{t}} (since two of the sis_{i} decrease by 1. Now we can add an edge incident to tt (that is not already present) because r<(t2)(t12)=t1r<\binom{t}{2}-\binom{t-1}{2}=t-1. This increases XX by at least 2st+12^{s_{t}}+1. So overall we have strictly increased XX. This means that any GcG^{c} which maximizes XX has all edges incident to the highest degree vertex so it is a star. In this case X=2r+r+t1X=2^{r}+r+t-1. So we have that

||2r+r+t1r+1=2r+t.|\mathcal{F}|\leq 2^{r}+r+t-1-r+1=2^{r}+t.

Now we have that

|<𝒜>|2t||2t2rt.\displaystyle|\text{\textless}\mathcal{A}\text{\textgreater}|\geq 2^{t}-|\mathcal{F}|\geq 2^{t}-2^{r}-t.

Notice that a set S[t]S\subset[t] is not in <>\text{\textless}\mathcal{B}\text{\textgreater} if and only if |S|1|S|\leq 1 or S={t}SS=\{t\}\cup S^{\prime} where S{tr,tr+1,,t1}S^{\prime}\subset\{t-r,t-r+1,...,t-1\}. This means that |<>|=2tt1(2r1)=2t2rt|\text{\textless}\mathcal{B}\text{\textgreater}|=2^{t}-t-1-(2^{r}-1)=2^{t}-2^{r}-t. Thus |<𝒜>||<>||\text{\textless}\mathcal{A}\text{\textgreater}|\geq|\text{\textless}\mathcal{B}\text{\textgreater}|. This completes the proof. ∎

We have now shown that

f(n,2)=2t2(t2)nt.f(n,2)=2^{t}-2^{\binom{t}{2}-n}-t.

Which families are extremal? If we have |<𝒜>|=|<>||\text{\textless}\mathcal{A}\text{\textgreater}|=|\text{\textless}\mathcal{B}\text{\textgreater}| then by above we must have |G|=t|G|=t and we must maximize the sum in (8) so GcG^{c} must be a star which means that this is the only case up to isomorphism when we have equality. Therefore the extremal families are only those isomorphic to \mathcal{B}.

3 Related results

A conjecture of Roberts [7] asserts that Theorem 1 should be true for all tt, not just for tt large.

Conjecture 7.

(Roberts [7]) Let k,tk,t\in\mathbb{N} where tkt\geq k. Let n=(tk)n=\binom{t}{k} and let 𝒜={A1,..,An}\mathcal{A}=\{A_{1},..,A_{n}\} be a family of nn distinct kk-sets. Then |<𝒜>||[t](k)|.|\text{\textless}\mathcal{A}\text{\textgreater}|\geq|[t]^{(\geq k)}|.

In a different direction, what happens for values between the binomial coefficients? Suppose that (tk)<n<(t+1k)\binom{t}{k}<n<\binom{t+1}{k}. It is natural to assume that to minimize |<𝒜>||\text{\textless}\mathcal{A}\text{\textgreater}| over all families 𝒜\mathcal{A} of nn distinct kk-sets we can pick a family on the ground set [t+1][t+1] that contains [t](k)[t]^{(k)} and some sets containing t+1t+1. We can see that our proof of Theorem 1 will give that if |<𝒜>||\text{\textless}\mathcal{A}\text{\textgreater}| is minimized then the size of the ground set GG must be close to tt if tt is sufficiently large. We can also see that for most values of nn we can apply the idea of Lemma 6 as well to show that the ground set must have size t+1t+1. With these ideas in mind we are able to solve a few special cases when nn is very close to (tk)\binom{t}{k}. If n=(tk)ln=\binom{t}{k}-l where l<(k+12)l<\binom{k+1}{2} then for sufficiently large tt (even larger than we needed for (tk)\binom{t}{k}) following the same reasoning as before if |<𝒜>|=f(n,k)|\text{\textless}\mathcal{A}\text{\textgreater}|=f(n,k) then we may assume that 𝒜[t](k)\mathcal{A}\subset[t]^{(k)}.

Now we will show that |<𝒜>||\text{\textless}\mathcal{A}\text{\textgreater}| is minimized when 𝒜=[t]k)\𝒞l\mathcal{A}=[t]^{k)}\backslash\mathcal{C}_{l}^{\prime} where 𝒞l\mathcal{C}_{l} is the initial segment of colex on [t1](k1)[t-1]^{(k-1)} of length ll and 𝒞l={A{t}|A𝒞l}\mathcal{C}_{l}^{\prime}=\{A\cup\{t\}|A\in\mathcal{C}_{l}\} . First notice that the only sets in [t](k)[t]^{(\geq k)} that are not in <𝒜>\text{\textless}\mathcal{A}\text{\textgreater} of size at least k+1k+1 must have size exactly k+1k+1. We then show the following proposition.

Proposition 8.

Let l<(k+12)l<\binom{k+1}{2} be a positive integer and 𝒳\mathcal{X} be a family of kk-sets where |𝒳|=l|\mathcal{X}|=l. Let b𝒳b_{\mathcal{X}} be the number of (k+1)(k+1)-sets with the property that at least kk of their kk-subsets are in 𝒳\mathcal{X}. Then lb𝒳k(b𝒳2)l\geq b_{\mathcal{X}}k-\binom{b_{\mathcal{X}}}{2}.

Proof. Let C1,..,Cb𝒳C_{1},..,C_{b_{\mathcal{X}}} be the (k+1)(k+1)-sets counted in b𝒳b_{\mathcal{X}}. Then if iji\neq j then CiC_{i} and CjC_{j} have at most one subset of size kk in common. Let 𝒳={X1,..,Xl}\mathcal{X}=\{X_{1},..,X_{l}\}. The number of pairs (i,j)(i,j) such that XiCjX_{i}\subset C_{j} must be at least b𝒳kb_{\mathcal{X}}k by counting for each jj. On the other hand for any jjj\neq j^{\prime} there is at most one ii such that both (i,j)(i,j) and (i,j)(i,j^{\prime}) are counted. So ii is counted kik_{i} times where (ki2)\binom{k_{i}}{2} is the number of pairs jjj\neq j^{\prime} such that CjCj=XiC_{j}\cap C_{j^{\prime}}=X_{i} and we have i=1l(ki2)(b𝒳2)\sum_{i=1}^{l}\binom{k_{i}}{2}\leq\binom{b_{\mathcal{X}}}{2}. Since ki1+(ki2)k_{i}\leq 1+\binom{k_{i}}{2} by taking the sum over ii the total number of pairs counted is at most l+(b𝒳2)l+\binom{b_{\mathcal{X}}}{2}. This gives that

lkb𝒳(b𝒳2)l\geq kb_{\mathcal{X}}-\binom{b_{\mathcal{X}}}{2}

as required. ∎

Going back to our problem notice that since 𝒜\mathcal{A} is [t](k)[t]^{(k)} with ll sets taken away we have that every set of size at least k+1k+1 in 𝒫([t])\mathcal{P}([t]) is in <𝒜>\text{\textless}\mathcal{A}\text{\textgreater} except those (k+1)(k+1)-sets that have at least kk of its kk-subsets removed. By the above proposition the number of those (k+1)(k+1)-sets is at most the largest sls_{l} such that lslk(sl2)=k+(k1)++(ksl+1)l\geq s_{l}k-\binom{s_{l}}{2}=k+(k-1)+...+(k-s_{l}+1). Notice that when we have removed exactly the sets of 𝒞l\mathcal{C}_{l}^{\prime} from [t](k)[t]^{(k)} then that removes exactly sls_{l} sets of size k+1k+1 from <𝒜>\text{\textless}\mathcal{A}\text{\textgreater}. This means that it is indeed optimal to take [t](k)\𝒞l[t]^{(k)}\backslash\mathcal{C}_{l}^{\prime} for sufficiently large tt.

What about the case when n=(tk)+ln=\binom{t}{k}+l? In this case we can show that if ll is fixed then there is a large enough tt depending on both k,lk,l such that |<𝒜>||\text{\textless}\mathcal{A}\text{\textgreater}| is minimized when

𝒜=[t](k){Y1,..,Yl}\mathcal{A}=[t]^{(k)}\cup\{Y_{1},..,Y_{l}\}

where Yi={1,2,..,k2,k2+i,t+1}Y_{i}=\{1,2,..,k-2,k-2+i,t+1\}. Suppose that |<𝒜>|=f(n,k)|\text{\textless}\mathcal{A}\text{\textgreater}|=f(n,k). As before using Lemma 3 to get a bound on the ground set and then using the ideas in Lemma 6 and the proof of Theorem 1 we may assume that 𝒜\mathcal{A} contains (tk)o((t1k1))\binom{t}{k}-o(\binom{t-1}{k-1}) sets in [t](k)[t]^{(k)} . Now suppose that X1,..,XsX_{1},..,X_{s} are all of the sets in 𝒜\mathcal{A} that are not in [t](k)[t]^{(k)}. Trivially sls\geq l. Further we let Xi=Xi[t]X_{i}^{\prime}=X_{i}\cap[t] and =𝒜[t](k)\mathcal{B}=\mathcal{A}\cap[t]^{(k)}. Notice that if we have a set B<>B\in\text{\textless}\mathcal{B}\text{\textgreater} that contains one of the XiX_{i}^{\prime} then there is a set X<𝒜>X\in\text{\textless}\mathcal{A}\text{\textgreater} such that X[t]X\not\subset[t] but X[t]=BX\cap[t]=B. This gives a lot of sets in <𝒜>\text{\textless}\mathcal{A}\text{\textgreater} in addition to those that are in 𝒫([t])\mathcal{P}([t]).

Let δs\delta_{s} be the proportion of sets in 𝒫([t])\mathcal{P}([t]) that are in the total upper shadow of an initial segment of lex of length ss on [t](k1)[t]^{(k-1)}. Notice that (for tt sufficiently large) if s=ls=l then δs\delta_{s} is actually the proportion of sets in 𝒫([(l+1)k])\mathcal{P}([(l+1)k]) that are in the total upper shadow of that same initial segment of lex because that segment is inside [(l+1)k][(l+1)k]. The same is true if s=l+1s=l+1. Crucially δl\delta_{l} and δl+1\delta_{l+1} depend only on k,lk,l and notice that δl+1>δl\delta_{l+1}>\delta_{l}. We see that if 𝒞=[t](k){Y1,..,Yl}\mathcal{C}=[t]^{(k)}\cup\{Y_{1},..,Y_{l}\} we have that

|<𝒞>|2t(1+δl)\displaystyle|\text{\textless}\mathcal{C}\text{\textgreater}|\sim 2^{t}(1+\delta_{l}) (9)

for large tt. Notice that the total upper shadow of 𝒳={X1,..,Xs}\mathcal{X}^{\prime}=\{X_{1}^{\prime},..,X_{s}^{\prime}\} contains at least ll sets in [t](k1)[t]^{(k-1)} where it is exactly ll if and only if s=ls=l and |Xi|=k1|X_{i}^{\prime}|=k-1 for all ii. If there are at least l+1l+1 sets then applying the Kruskal-Katona theorem similar to the proof of Lemma 4 the total upper shadow of 𝒳\mathcal{X}^{\prime} in [t][t] has size at least δt+12t\delta_{t+1}2^{t}. For large enough tt we can guarantee that |<>|=(1o(1))2t|\text{\textless}\mathcal{B}\text{\textgreater}|=(1-o(1))2^{t}. This gives that there are at least (δl+1o(1))2t(\delta_{l+1}-o(1))2^{t} sets in <𝒜>\text{\textless}\mathcal{A}\text{\textgreater} that are not subsets of [t][t]. Finally from (9) we have that

|<𝒜>|(1+δl+1o(1))2t>|<𝒞>|f(n,k)|<𝒜>||\text{\textless}\mathcal{A}\text{\textgreater}|\geq(1+\delta_{l+1}-o(1))2^{t}>|\text{\textless}\mathcal{C}\text{\textgreater}|\geq f(n,k)\geq|\text{\textless}\mathcal{A}\text{\textgreater}|

which is a contradiction.

Thus we must have that s=ls=l and |Xi|=k1|X_{i}^{\prime}|=k-1 for all ii. Now let aia_{i} be the element in Xi\XiX_{i}\backslash X_{i}^{\prime}. This means that =[t](k)\mathcal{B}=[t]^{(k)}. Now we know that each set in the total upper shadow of 𝒳\mathcal{X}^{\prime} gives rise to at least 11 set in <𝒜>\text{\textless}\mathcal{A}\text{\textgreater} that is not contained in [t][t]. So

|<𝒜>||[t](k)|+|𝒰𝒳|\displaystyle|\text{\textless}\mathcal{A}\text{\textgreater}|\geq|[t]^{(\geq k)}|+|\mathcal{U}_{\mathcal{X}^{\prime}}|

where 𝒰𝒳\mathcal{U}_{\mathcal{X^{\prime}}} is the total upper shadow of 𝒳\mathcal{X}^{\prime} in [t][t]. If Yi=Yi[t]Y_{i}^{\prime}=Y_{i}\cap[t] and 𝒴={Y1,..,Yl}\mathcal{Y^{\prime}}=\{Y_{1}^{\prime},..,Y_{l}^{\prime}\} we get that |𝒰𝒳||𝒰𝒴||\mathcal{U}_{\mathcal{X}^{\prime}}|\geq|\mathcal{U}_{\mathcal{Y^{\prime}}}|. This means that |<𝒜>||[t](k)|+|𝒰𝒴|=|<[t](k){Y1,..,Yl}>||\text{\textless}\mathcal{A}\text{\textgreater}|\geq|[t]^{(\geq k)}|+|\mathcal{U}_{\mathcal{Y}^{\prime}}|=|\text{\textless}[t]^{(k)}\cup\{Y_{1},..,Y_{l}\}\text{\textgreater}| for sufficiently large tt which is what we wanted to show. Notice that also if some aiaja_{i}\neq a_{j} then [t][t] gives rise to both [t]{ai},[t]{aj}[t]\cup\{a_{i}\},[t]\cup\{a_{j}\} in <𝒜>\text{\textless}\mathcal{A}\text{\textgreater} so we have |<𝒜>|>|<[t](k){Y1,..,Yl}>||\text{\textless}\mathcal{A}\text{\textgreater}|>|\text{\textless}[t]^{(k)}\cup\{Y_{1},..,Y_{l}\}\text{\textgreater}| which is a contradiction. Thus we must have all aia_{i} to be the same.

We have therefore found extremal families for some nn that are very close to (tk)\binom{t}{k} but we need tt to be large enough. Still, we do not know for most values in between the binomial coefficients or for small values of tt.

We will finish with a conjecture for the general result for all nn. Colex is not the right order. As noted by Leck, Roberts and Simpson [6] if we consider

𝒜=[4](3){{1,2,5},{1,3,5},{1,4,5}}\mathcal{A}=[4]^{(3)}\cup\{\{1,2,5\},\{1,3,5\},\{1,4,5\}\}

and let \mathcal{B} be the initial segment of colex on (3)\mathbb{N}^{(3)} of length 77. Then |<𝒜>|=12|\text{\textless}\mathcal{A}\text{\textgreater}|=12, but |<>|=13|\text{\textless}\mathcal{B}\text{\textgreater}|=13. If (tk)<n<(t+1k)\binom{t}{k}<n<\binom{t+1}{k} then as mentioned before for most nn if |<𝒜>|=f(n,k)|\text{\textless}\mathcal{A}\text{\textgreater}|=f(n,k) then the ground set of 𝒜\mathcal{A} has size t+1t+1 so we may assume that 𝒜[t+1](k)\mathcal{A}\subset[t+1]^{(k)}. If we further assume that [t](k)𝒜[t]^{(k)}\subset\mathcal{A} then the we have |<𝒜>|=|[t](k)|+|𝒰𝒳||\text{\textless}\mathcal{A}\text{\textgreater}|=|[t]^{(\geq k)}|+|\mathcal{U}_{\mathcal{X}^{\prime}}| where 𝒳\mathcal{X}^{\prime} is as above and |𝒳|=n(tk)|\mathcal{X}^{\prime}|=n-\binom{t}{k}. As before this is minimized when 𝒳\mathcal{X}^{\prime} is the initial segment of lex on [t](k1)[t]^{(k-1)}. So something like ”colex but lex inside a given maximal element” could be the correct order. Such a mixed ordering has occurred in other problems as well (see Engel and Leck [3] and also Duffus, Howard and Leader [2]). We will define the max-lex order on (k)\mathbb{N}^{(k)} to be the order in which A<BA<B if either maxA<maxB\max A<\max B or else maxA=maxB\max A=\max B and min(AΔB)A\min(A\Delta B)\in A. The following lovely conjecture of Roberts [7] includes Conjecture 7 – we strongly believe it is true.

Conjecture 9.

(Roberts [7]) Let kk\in\mathbb{N} and 𝒜(k)\mathcal{A}\subset\mathbb{N}^{(k)}. If \mathcal{B} is the initial segment of max-lex on (k)\mathbb{N}^{(k)} of size |𝒜||\mathcal{A}| then |<𝒜>||<>|.|\text{\textless}\mathcal{A}\text{\textgreater}|\geq|\text{\textless}\mathcal{B}\text{\textgreater}|.

References

  • [1] B. Bollobás. Combinatorics: Set Systems, Hypergraphs, Families of Vectors, and Combinatorial Probability. Cambridge University Press, USA, 1986.
  • [2] D. Duffus, D. Howard, and I. Leader. The width of downsets. European Journal of Combinatorics, 79:46–59, jun 2019.
  • [3] K. Engel and U. Leck. Optimal antichains and ideals in Macaulay posets. Graph Theory and Combinatorial Biology, 7:199–222, 1999.
  • [4] G. Katona. A theorem of finite sets. Classic Papers in Combinatorics, pages 381–401, 1987.
  • [5] J. B Kruskal. The number of simplices in a complex. Mathematical optimization techniques, 10:251–278, 1963.
  • [6] Uwe Leck, Ian Roberts, and Jamie Simpson. Minimizing the weight of the union-closure of families of two-sets. Australasian Journal of Combinatorics, 52(1):67–73, 2012.
  • [7] Ian T Roberts. Extremal problems and designs on finite sets. PhD thesis, Curtin University, 1999.