Complete Traceability Multimedia Fingerprinting Codes Resistant to Averaging Attack and Adversarial Noise with Optimal Rate

Ilya Vorobyev
Skolkovo Institute of Science and Technology

Abstract

In this paper we consider complete traceability multimedia fingerprinting codes resistant to averaging attacks and adversarial noise. Recently it was shown that there are no such codes for the case of an arbitrary linear attack. However, for the case of averaging attacks complete traceability multimedia fingerprinting codes of exponential cardinality resistant to constant adversarial noise were constructed in [1]. We continue this work and provide an improved lower bound on the rate of these codes.

1 Introduction

Multimedia fingerprinting codes are used to protect digital content from illegal copying and redistribution. The key idea of this technique is to embed a unique signal, called a watermark, into every copy, so that each copy can be traced to its buyer [2, 3]. Watermarks should protect the dealer from a collusion attack, in which a coalition of dishonest users (pirates) constructs a new file, for example, by averaging their copies of the same content. By gathering a big enough coalition it is possible to substantially decrease the impact of each individual fingerprint, which makes it hard for the dealer to identify the pirates. In papers [4, 5] the authors propose to use separable (or signature) codes to track all members of the coalition.

A model of multimedia fingerprinting with adversarial noise was proposed in [6]: the coalition of dishonest users can add some noise to the content in order to hide their fingerprints. In [7] it was shown that there are no multimedia codes resistant to a general linear attack and adversarial noise. However, in [1] the authors proved that for the most common case of an averaging attack one can construct multimedia codes with a non-vanishing rate. We continue their research and prove a new lower bound on the rate, which has the same order as the upper bound. A detailed survey of state-of-the-art results can be found in [8].

The rest of the paper is structured as follows. In Section 2 we introduce the required notation and definitions and formally describe the problem. Our main result is proved in Section 3. Section 4 concludes the paper and discusses some open problems.

2 Problem statement

Vectors are denoted by bold letters, such as $\bm{x}$ , and the $i$ th entry is referred to as $x_{i}$ . The set of integers $\{1,\,2,\,\ldots,\,M\}$ is abbreviated by $[M]$ . The sign $\lVert\cdot\rVert$ stands for the Euclidean norm. The support $supp(\bm{x})$ of a vector $\bm{x}$ is the set of coordinates $i$ such that $x_{i}\neq 0$ . The scalar (dot) product of vectors $\bm{x}$ and $\bm{y}$ is denoted by $\langle\bm{x},\,\bm{y}\rangle$ , and the greatest common divisor of integers $a$ and $b$ by $(a,b)$ . For a given binary $n\times M$ matrix $H$ with columns $\bm{h}_{1},\,\ldots,\,\bm{h}_{M}$ and a set $I\subset[M]$ we introduce the following notation for the result of an averaging attack

\sigma(H\mid I)=\lvert I\rvert^{-1}\sum\limits_{i\in I}\bm{h}_{i}.

A binary entropy function $h(x)$ is defined as follows

h(x)=-x\log_{2}x-(1-x)\log_{2}(1-x).
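The two definitions above can be illustrated with a minimal sketch. The function names `sigma` and `binary_entropy`, the 0-based column indices, and the toy matrix are assumptions made for this illustration only; exact rational arithmetic is used so that averaged vectors can be compared without rounding.

```python
from fractions import Fraction
from math import log2

def sigma(H, I):
    """Average the columns of H indexed by the coalition I (0-based indices)."""
    size = len(I)
    return [sum(Fraction(row[i]) for i in I) / size for row in H]

def binary_entropy(x):
    """h(x) = -x log2 x - (1 - x) log2 (1 - x), with h(0) = h(1) = 0 by convention."""
    if x in (0, 1):
        return 0.0
    return -x * log2(x) - (1 - x) * log2(1 - x)

# Toy 3 x 4 binary matrix: rows are coordinates, columns are users.
H = [
    [1, 0, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
]
avg = sigma(H, [0, 2])  # coalition formed by the first and the third user
```

Here `avg` is the vector $(1,\,1/2,\,1/2)$ , the average of the first and third columns.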

Suppose that multimedia content is represented by a vector $\bm{x}\in\mathbb{R}^{N}$ , which is being sold to $M$ users. The vector $\bm{x}$ is often called a host signal. To protect the content from unauthorized copying the dealer constructs a set of watermarks $\bm{w}_{1},\,\ldots,\,\bm{w}_{M}$ , which are also called fingerprints. The dealer fixes $n$ orthonormal vectors $\bm{f}_{1},\,\ldots,\,\bm{f}_{n}\in\mathbb{R}^{N}$ and forms the watermarks $\bm{w}_{i}$ as linear combinations of the vectors $\bm{f}_{j}$ with binary coefficients $h_{ij}\in\{0,\,1\}$

\bm{w}_{i}=\sum\limits_{j=1}^{n}h_{ij}\,\bm{f}_{j}\text{ for }i\in[M].

(1)

Then watermarks are added to the host signal to obtain a final copy $\bm{y}_{i}$ for the $i$ -th user

\bm{y}_{i}=\bm{x}+\bm{w}_{i}.

We assume that $\lVert\bm{w}_{i}\rVert\ll\lVert\bm{x}\rVert$ , so the added watermark doesn’t change the content much.

A coalition of dishonest users $I\subset[M]$ may come together to forge a new copy and redistribute it among other users. They can apply a linear attack, i.e., create a new copy $\bm{y}$ as a linear combination of their copies. In addition, they may add a noise vector $\bm{\varepsilon}$ , $\lVert\bm{\varepsilon}\rVert\ll\lVert\bm{x}\rVert$ , to make it harder for the dealer to identify them.

\bm{y}=\sum\limits_{i\in I}\lambda_{i}\,\bm{y}_{i}+\bm{\varepsilon},

where $\lambda_{i}\in\mathbb{R}$ , $\lambda_{i}>0$ for every $i\in I$ , since each dishonest user in $I$ actually participates in the attack, and $\sum_{i\in I}\lambda_{i}=1$ , so that the multimedia content $\bm{x}$ is not changed. In particular, for an averaging attack $\lambda_{i}=1/\lvert I\rvert$ for every $i\in I$ . The condition $\sum_{i\in I}\lambda_{i}=1$ implies that

\bm{y}=\sum\limits_{i\in I}\lambda_{i}\,\bm{y}_{i}+\bm{\varepsilon}=\bm{x}+\sum\limits_{i\in I}\lambda_{i}\,\bm{w}_{i}+\bm{\varepsilon}.

Note that

\lVert\bm{y}-\bm{x}\rVert=\lVert\sum\limits_{i\in I}\lambda_{i}\,\bm{w}_{i}+\bm{\varepsilon}\rVert\leq\max\lVert\bm{w}_{i}\rVert+\lVert\bm{\varepsilon}\rVert\ll\lVert\bm{x}\rVert,

therefore, $\bm{y}$ is close enough to the original signal $\bm{x}$ .

In order to find the coalition of dishonest users based on the forged copy $\bm{y}$ , the dealer evaluates

\begin{aligned}
s_{k}&=\langle\bm{y}-\bm{x},\bm{f}_{k}\rangle\\
&=\langle\sum\limits_{i\in I}\lambda_{i}\,\sum\limits_{j=1}^{n}h_{ij}\,\bm{f}_{j}+\bm{\varepsilon},\bm{f}_{k}\rangle=\sum\limits_{i\in I}\lambda_{i}\,h_{ik}+e_{k},
\end{aligned}

where $e_{k}=\langle\bm{\varepsilon},\bm{f}_{k}\rangle$ , and forms a syndrome vector $\bm{S}=(s_{1},\,\ldots,\,s_{n})$ . The syndrome vector $\bm{S}$ can be equivalently defined through the matrix equation

\bm{S}=H\bm{\Lambda}^{T}+\bm{e},

where $\bm{\Lambda}=(\lambda_{1},\,\ldots,\,\lambda_{M})$ , $\lambda_{i}=0$ for $i\notin I$ , and $\bm{e}=(e_{1},\,\ldots,\,e_{n})$ , $\lVert\bm{e}\rVert\leq\lVert\bm{\varepsilon}\rVert$ .
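The chain from watermark embedding to the syndrome vector can be traced on a toy instance. The sketch below is only an illustration under stated assumptions: the first $n$ standard basis vectors of $\mathbb{R}^{N}$ play the role of $\bm{f}_{1},\,\ldots,\,\bm{f}_{n}$ , the host signal and the noise vector are hypothetical values, users are indexed from 0, and the matrix is stored with rows as coordinates and columns as users.

```python
from fractions import Fraction

def dot(u, v):
    """Scalar product of two vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))

N, n, M = 5, 3, 4
# Assumption: the first n standard basis vectors of R^N serve as f_1, ..., f_n.
f = [[Fraction(int(j == c)) for c in range(N)] for j in range(n)]
H = [
    [1, 0, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
]
x = [Fraction(v) for v in (10, -7, 4, 3, 8)]  # hypothetical host signal

# Watermark of user i: combination of the f_j with coefficients from column i of H.
w = [[sum(H[j][i] * f[j][c] for j in range(n)) for c in range(N)] for i in range(M)]
y_copies = [[x[c] + w[i][c] for c in range(N)] for i in range(M)]

coalition = [0, 2]
lam = {i: Fraction(1, len(coalition)) for i in coalition}  # averaging attack
eps = [Fraction(1, 10), 0, Fraction(-1, 10), Fraction(1, 5), 0]  # hypothetical noise

forged = [sum(lam[i] * y_copies[i][c] for i in coalition) + eps[c] for c in range(N)]
diff = [forged[c] - x[c] for c in range(N)]
syndrome = [dot(diff, f[k]) for k in range(n)]

# Matrix form S = H Lambda^T + e with e_k = <eps, f_k>.
e = [dot(eps, f[k]) for k in range(n)]
expected = [sum(lam[i] * H[k][i] for i in coalition) + e[k] for k in range(n)]
```

In this instance `syndrome` and `expected` coincide and equal $(11/10,\,1/2,\,2/5)$ . The fourth coordinate of the noise lies outside the span of the chosen $\bm{f}_{k}$ and does not reach the syndrome, in line with $\lVert\bm{e}\rVert\leq\lVert\bm{\varepsilon}\rVert$ .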

The dealer wants to design a matrix $H$ in such a way that, by observing $\bm{S}$ , he can always find the support $supp(\bm{\Lambda})$ whenever the size of the coalition $I$ is at most $t$ . The following definition for the noiseless scenario was introduced in [6].

Definition 1.

A binary $n\times M$ matrix $H$ is called a $t$ -multimedia digital fingerprinting code with complete traceability ( $t$ -MDF code for short) if for any two distinct coalitions $I$ , $I^{\prime}$ , $\lvert I\rvert,\lvert I^{\prime}\rvert\leq t$ , we have

H\bm{\Lambda}^{T}\neq H\bm{\Lambda}^{\prime T}

for any real vectors $\bm{\Lambda}=(\lambda_{1},\,\ldots,\,\lambda_{M})$ and $\bm{\Lambda}^{\prime}=(\lambda_{1}^{\prime},\,\ldots,\,\lambda_{M}^{\prime})$ , such that $\lambda_{i}\geq 0$ , $\lambda_{i}^{\prime}\geq 0$ , $\sum\limits_{i=1}^{M}\lambda_{i}=\sum\limits_{i=1}^{M}\lambda_{i}^{\prime}=1$ , $supp(\bm{\Lambda})=I$ , $supp(\bm{\Lambda}^{\prime})=I^{\prime}$ .

Denote the maximal cardinality and the maximal rate of a $t$ -MDF code of length $n$ by $M(n,t)$ and $R(n,t)=n^{-1}\log_{2}M(n,t)$ , respectively. Denote by $R^{*}(t)$ and $R_{*}(t)$ the upper and lower limits of $R(n,t)$ as $n\to\infty$ . It is known that

\Omega\left(\frac{\log_{2}t}{t}\right)\leq R_{*}(t)\leq R^{*}(t)\leq\frac{\log_{2}t}{2t}(1+o(1)).

(2)

The upper bound of (2) can be derived from an upper bound for a binary adder channel from [9]. The lower bound is based on the following observation from [6]. If any $2t$ columns of a binary matrix $H$ are linearly independent over the field of real numbers $\mathbb{R}$ , then $H$ is a $t$ -MDF code. Since parity check matrices of binary codes with a distance $d>2t$ possess this property, applying Goppa or BCH codes gives an explicit construction with a rate $R_{*}(t)\geq 1/t$ [6]. An improved lower bound $\Omega\left(\frac{\log_{2}t}{t}\right)$ can be derived from the results of the paper [10], where the authors proved the existence of binary $n\times M$ matrices, $n^{-1}\log_{2}M=\Omega(\log_{2}t/t)$ , such that any $2t$ columns are linearly independent over the field $\mathbb{Z}_{p}$ , $p>2t$ . We note that the latter result was proved with a probabilistic method, i.e. it is not explicit.
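The sufficient condition quoted from [6] can be tested by brute force on small matrices. The following sketch is an illustration only: the helper names `rank` and `any_2t_columns_independent` and both toy matrices are hypothetical, the rank is computed over the rationals (which agrees with the rank over $\mathbb{R}$ for binary matrices), and the enumeration of column subsets is exponential in $t$ .

```python
from fractions import Fraction
from itertools import combinations

def rank(vectors):
    """Rank over the rationals via Gaussian elimination on a list of vectors."""
    rows = [[Fraction(v) for v in vec] for vec in vectors]
    rk = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(rk + 1, len(rows)):
            factor = rows[i][col] / rows[rk][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rk])]
        rk += 1
    return rk

def any_2t_columns_independent(H, t):
    """Check that every set of min(2t, M) columns of H is linearly independent."""
    M = len(H[0])
    columns = [[row[i] for row in H] for i in range(M)]
    k = min(2 * t, M)
    return all(rank([columns[i] for i in subset]) == k
               for subset in combinations(range(M), k))

# Columns e1, e2, e3, e4 and the all-ones vector: any 4 of them are independent.
H_good = [
    [1, 0, 0, 0, 1],
    [0, 1, 0, 0, 1],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1],
]
# Columns e1, e2, e3 and e1 + e2: the four columns are dependent.
H_bad = [
    [1, 0, 0, 1],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
```

With these matrices, `H_good` passes the test for $t=2$ , while `H_bad` passes it only for $t=1$ .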

Now we discuss the noisy scenario. In [7] the authors defined $(t,\delta)$ -light complete traceability multimedia digital fingerprinting codes and proved that they don’t exist. Informally, if some coefficient $\lambda_{i}$ is sufficiently small, then the noise can compensate the signal of the $i$ th user, so that it becomes impossible to identify this user. However, for the case of averaging attacks, when all non-zero coefficients $\lambda_{i}$ are equal, the situation is different. Let us give the corresponding definition from [1].

Definition 2.

A binary $n\times M$ matrix $H$ is called a (Euclidean) $(t,\delta)$ -light complete traceability code if for any two distinct coalitions $I_{1}$ , $I_{2}$ , $\lvert I_{1}\rvert,\lvert I_{2}\rvert\leq t$ , we have

\sigma(H\mid I_{1})+\bm{e}_{1}\neq\sigma(H\mid I_{2})+\bm{e}_{2},

for any real vectors $\bm{e}_{1},\bm{e}_{2}\in\mathbb{R}^{n}$ , $\lVert\bm{e}_{1}\rVert,\lVert\bm{e}_{2}\rVert\leq\delta$ .

In other words, Euclidean distance between vectors $\sigma(H\mid I_{1})$ and $\sigma(H\mid I_{2})$ , generated by different coalitions $I_{1}$ and $I_{2}$ , $\lvert I_{1}\rvert,\lvert I_{2}\rvert\leq t$ , should be big, i.e.

\lVert\sigma(H\mid I_{1})-\sigma(H\mid I_{2})\rVert>2\delta.
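For very small parameters the distance condition above can be checked exhaustively. The sketch below is a brute-force illustration, not an efficient test: the function names are hypothetical, coalitions are non-empty sets of 0-based column indices, and the number of pairs grows exponentially with $t$ .

```python
from fractions import Fraction
from itertools import combinations
from math import sqrt

def sigma(H, I):
    """Average of the columns of H indexed by I, as exact fractions."""
    return [Fraction(sum(row[i] for i in I), len(I)) for row in H]

def coalitions(M, t):
    """All non-empty subsets of {0, ..., M-1} of size at most t."""
    for size in range(1, t + 1):
        yield from combinations(range(M), size)

def min_euclidean_distance(H, t):
    """Smallest Euclidean distance between averages of two distinct coalitions."""
    M = len(H[0])
    best = None
    for I1, I2 in combinations(coalitions(M, t), 2):
        a, b = sigma(H, I1), sigma(H, I2)
        d2 = sum((u - v) ** 2 for u, v in zip(a, b))
        if best is None or d2 < best:
            best = d2
    return sqrt(best)

def is_euclidean_code(H, t, delta):
    """True if every pair of distinct coalitions is at distance greater than 2 * delta."""
    return min_euclidean_distance(H, t) > 2 * delta

H = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # toy 3 x 3 identity matrix
```

For the identity matrix and $t=2$ the minimum distance is $\sqrt{1/2}$ , attained for example by the coalitions $\{1\}$ and $\{1,2\}$ , so this matrix tolerates noise only up to $\delta<\sqrt{1/2}/2$ .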

Remark 1.

Although an averaging attack is very restrictive for the coalition, many papers consider only such attacks instead of general linear attacks. One of the arguments is that an averaging attack is the fairest choice, since all members of a coalition contribute the same proportion of data to a forged copy [4, 3]. However, in future research it may be reasonable to study a model with different coefficients $\lambda_{i}$ , which are lower bounded by some constant.

We now define codes for the case of noise vectors whose support has bounded cardinality.

Definition 3.

A binary $n\times M$ matrix $H$ is called a Hamming $(t,T)$ -light complete traceability code if for any two distinct coalitions $I_{1}$ , $I_{2}$ , $\lvert I_{1}\rvert,\lvert I_{2}\rvert\leq t$ , we have

\sigma(H\mid I_{1})+\bm{e}_{1}\neq\sigma(H\mid I_{2})+\bm{e}_{2},

for any real vectors $\bm{e}_{1},\bm{e}_{2}\in\mathbb{R}^{n}$ , $\lvert supp(\bm{e}_{1})\rvert,\lvert supp(\bm{e}_{2})\rvert\leq T$ .

Equivalently, the number of different coordinates of vectors $\sigma(H\mid I_{1})$ and $\sigma(H\mid I_{2})$ , generated by distinct coalitions $I_{1}$ and $I_{2}$ , $\lvert I_{1}\rvert,\lvert I_{2}\rvert\leq t$ , should be big, i.e.

\lvert supp(\sigma(H\mid I_{1})-\sigma(H\mid I_{2}))\rvert>2T.
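The Hamming condition admits the same kind of exhaustive check. In the sketch below the function names and the toy matrix (the $3\times 3$ identity matrix repeated three times) are assumptions made for illustration; the largest admissible $T$ is read off from the minimum number of differing coordinates.

```python
from fractions import Fraction
from itertools import combinations

def sigma(H, I):
    """Average of the columns of H indexed by I, as exact fractions."""
    return [Fraction(sum(row[i] for i in I), len(I)) for row in H]

def min_support_distance(H, t):
    """Minimum number of coordinates in which two distinct coalition averages differ."""
    M = len(H[0])
    subsets = [c for size in range(1, t + 1) for c in combinations(range(M), size)]
    return min(
        sum(1 for u, v in zip(sigma(H, I1), sigma(H, I2)) if u != v)
        for I1, I2 in combinations(subsets, 2)
    )

def max_tolerable_T(H, t):
    """Largest T with minimum distance > 2T, or -1 if two coalitions collide."""
    return (min_support_distance(H, t) - 1) // 2

identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
H = identity * 3  # toy 9 x 3 matrix: three stacked copies of the identity
```

For this matrix and $t=2$ any two distinct coalition averages differ in at least 6 coordinates, so it is a Hamming $(2,2)$ -light complete traceability code but not a Hamming $(2,3)$ one.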

Denote the maximal cardinality of Euclidean and Hamming light complete traceability codes of length $n$ by $M_{E}(n,t,\delta)$ and $M_{H}(n,t,T)$ respectively. Define the rates of these codes as follows

R_{E}(n,t,\delta)=\frac{\log_{2}M_{E}(n,t,\delta)}{n},

R_{H}(n,t,T)=\frac{\log_{2}M_{H}(n,t,T)}{n}.

In the following proposition we show an obvious connection between these two families of codes.

Proposition 1.

1. A Hamming $(t,T)$ -light complete traceability code $H$ is a Euclidean $(t,\delta)$ -light complete traceability code for $\delta=\sqrt{2T}/(2t(t-1))$ .
2. A Euclidean $(t,\delta)$ -light complete traceability code $H$ is a Hamming $(t,T)$ -light complete traceability code for $T=\lfloor 2\delta^{2}\rfloor$ .
3. The rates of these codes are connected as follows

\begin{aligned}
R_{E}(n,t,\sqrt{2T}/(2t(t-1)))&\geq R_{H}(n,t,T),\\
R_{H}(n,t,\lfloor 2\delta^{2}\rfloor)&\geq R_{E}(n,t,\delta).
\end{aligned}

Proof.

1. Assume that a Hamming $(t,T)$ -light complete traceability code $H$ is not a Euclidean $(t,\sqrt{2T}/(2t(t-1)))$ -light complete traceability code, i.e. there exist two coalitions $I_{1}$ and $I_{2}$ , such that

\lVert\bm{\Delta}\rVert\leq 2\delta,\text{ where }\bm{\Delta}=\sigma(H\mid I_{1})-\sigma(H\mid I_{2}),\;\delta=\sqrt{2T}/(2t(t-1)).

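Every coordinate $\Delta_{i}$ is a difference of two fractions with denominators $\lvert I_{1}\rvert,\lvert I_{2}\rvert\leq t$ . If the denominators are equal, a non-zero coordinate has absolute value at least $1/t$ ; otherwise it has absolute value at least $1/(\lvert I_{1}\rvert\lvert I_{2}\rvert)\geq 1/(t(t-1))$ . Since the absolute value of every non-zero coordinate $\Delta_{i}$ is therefore at least $1/(t(t-1))$ , we conclude that there are at most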

4\delta^{2}t^{2}(t-1)^{2}=2T

coordinates, in which $\sigma(H\mid I_{1})$ and $\sigma(H\mid I_{2})$ are different. Hence, there are two vectors $\bm{u}_{1}$ , $\bm{u}_{2}$ , $\lvert supp(\bm{u}_{1})\rvert,\lvert supp(\bm{u}_{2})\rvert\leq T$ , such that

\sigma(H\mid I_{1})+\bm{u}_{1}=\sigma(H\mid I_{2})+\bm{u}_{2}.

Therefore, $H$ is not a Hamming $(t,T)$ -light complete traceability code. This contradiction proves the first claim.

2. Assume that a Euclidean $(t,\delta)$ -light complete traceability code $H$ is not a Hamming $(t,\lfloor 2\delta^{2}\rfloor)$ -light complete traceability code, i.e. there exist two coalitions $I_{1}$ and $I_{2}$ , such that

\lvert supp(\bm{\Delta})\rvert\leq 2T,\text{ where }\bm{\Delta}=\sigma(H\mid I_{1})-\sigma(H\mid I_{2}),\;T=\lfloor 2\delta^{2}\rfloor.

Since the absolute value of every coordinate of the vector $\bm{\Delta}$ is at most 1, we have

\lVert\bm{\Delta}\rVert\leq\sqrt{2T}\leq 2\delta,

which contradicts the definition of Euclidean $(t,\delta)$ -light complete traceability codes.

3. Claim 3 is an obvious corollary of claims 1 and 2. ∎
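Both implications of Proposition 1 can be sanity-checked numerically on a toy matrix. The sketch below is not part of the proof: the matrix (27 rows formed by nine stacked copies of the $3\times 3$ identity matrix), the value $\delta=1.05$ and all function names are assumptions chosen for illustration.

```python
from fractions import Fraction
from itertools import combinations
from math import floor, sqrt

def sigma(H, I):
    """Average of the columns of H indexed by I, as exact fractions."""
    return [Fraction(sum(row[i] for i in I), len(I)) for row in H]

def min_distances(H, t):
    """Return (min support distance, min Euclidean distance) over distinct coalitions."""
    M = len(H[0])
    subsets = [c for size in range(1, t + 1) for c in combinations(range(M), size)]
    d_hamming, d_euclid_sq = None, None
    for I1, I2 in combinations(subsets, 2):
        diff = [u - v for u, v in zip(sigma(H, I1), sigma(H, I2))]
        hd = sum(1 for c in diff if c != 0)
        ed = sum(c * c for c in diff)
        d_hamming = hd if d_hamming is None else min(d_hamming, hd)
        d_euclid_sq = ed if d_euclid_sq is None else min(d_euclid_sq, ed)
    return d_hamming, sqrt(d_euclid_sq)

t = 2
H = [[1, 0, 0], [0, 1, 0], [0, 0, 1]] * 9  # toy 27 x 3 matrix
d_h, d_e = min_distances(H, t)

# Claim 1: Hamming (t, T) implies Euclidean (t, delta), delta = sqrt(2T) / (2t(t-1)).
T = (d_h - 1) // 2  # largest T with d_h > 2T
delta_from_T = sqrt(2 * T) / (2 * t * (t - 1))
claim1_holds = d_e > 2 * delta_from_T

# Claim 2: Euclidean (t, delta) implies Hamming (t, floor(2 * delta^2)).
delta = 1.05  # hypothetical noise level satisfying d_e > 2 * delta
T_from_delta = floor(2 * delta ** 2)
claim2_holds = d_e > 2 * delta and d_h > 2 * T_from_delta
```

Here the minimum support distance is 18 and the minimum Euclidean distance is $\sqrt{4.5}$ . Claim 1 gives $T=8$ and $\delta=1$ , and claim 2 with $\delta=1.05$ gives $T=2$ ; both checks succeed for this matrix.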

In [1] it was proved that $\liminf_{n\to\infty}R_{E}(n,t,\delta)\geq\Omega(1/t)$ for constant $\delta$ . An upper bound is the same as in the noiseless case, $\limsup_{n\to\infty}R_{E}(n,t,\delta)\leq\frac{\log_{2}t}{2t}(1+o(1))$ , since the proof works for an averaging attack. Therefore, there is a $\Theta(\log_{2}t)$ gap between the lower and upper bound. We eliminate this gap in the next section.

3 Lower bound on the rate of light complete traceability codes

In this section we prove

Theorem 1.

For $\tau<1/4$

\liminf_{n\to\infty}R_{H}(n,t,\lfloor\tau n\rfloor)\geq\frac{(1-2\tau)\log_{2}t}{6t}(1+o(1)),\;\;t\to\infty.

(3)

Combining Theorem 1 and Proposition 1 we obtain the following

Corollary 1.

For $\delta^{2}=\alpha n$ , $\alpha<1/(8t^{2}(t-1)^{2})$ , we have

\liminf_{n\to\infty}R_{E}(n,t,\sqrt{\alpha n})\geq\frac{(1-2\tau)\log_{2}t}{6t}(1+o(1)),\;\;t\to\infty,

where $\tau=2\alpha t^{2}(t-1)^{2}.$
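The relation between $\alpha$ and $\tau$ follows from substituting $T=\tau n$ into $\delta=\sqrt{2T}/(2t(t-1))$ from Proposition 1. A small numerical check, with hypothetical values of $n$ , $t$ and $\tau$ and hypothetical function names, could look as follows.

```python
from math import sqrt

def delta_from_T(T, t):
    """Euclidean noise level guaranteed by a Hamming (t, T) code (Proposition 1)."""
    return sqrt(2 * T) / (2 * t * (t - 1))

def tau_from_alpha(alpha, t):
    """tau = 2 * alpha * t^2 * (t - 1)^2, as in Corollary 1."""
    return 2 * alpha * t ** 2 * (t - 1) ** 2

n, t, tau = 10_000, 3, 0.1        # hypothetical parameters
T = int(tau * n)                  # T = floor(tau * n)
delta = delta_from_T(T, t)
alpha = delta ** 2 / n            # delta^2 = alpha * n
recovered_tau = tau_from_alpha(alpha, t)
alpha_limit = 1 / (8 * t ** 2 * (t - 1) ** 2)  # alpha must stay below this value
```

With these values `recovered_tau` agrees with $\tau=0.1$ up to rounding, and `alpha` is below `alpha_limit`, which corresponds to $\tau<1/4$ .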

For the case of small noise $\delta=o(\sqrt{n})$ and $n\to\infty$ the new lower bound has the following form

\liminf_{n\to\infty}R_{E}(n,t,\delta)\geq\lim\limits_{\alpha\to 0}\liminf_{n\to\infty}R_{E}(n,t,\sqrt{\alpha n})\geq\frac{\log_{2}t}{6t}(1+o(1)).

It improves the previous lower bound $\Omega(1/t)$ and has the same order $\Theta(\log_{2}t/t)$ as the upper bound. However, the new bound is not explicit, i.e. no efficient encoding or decoding algorithm is known for the new code.

Proof of Theorem 1.

Consider a random $n\times M$ matrix $H$ , $M=2^{Rn}$ , in which every entry is chosen independently and equals 1 with a probability $1/2$ . The value of $R$ will be specified later. Fix two coalitions $I_{1}$ and $I_{2}$ , $\lvert I_{1}\rvert,\lvert I_{2}\rvert\leq t$ . Call a row $r$ good, if

\frac{\sum\limits_{i_{1}\in I_{1}}h_{r,i_{1}}}{\lvert I_{1}\rvert}\neq\frac{\sum\limits_{i_{2}\in I_{2}}h_{r,i_{2}}}{\lvert I_{2}\rvert}.

Otherwise, we call a row bad. Call a pair of coalitions good, if there are at least $2T+1$ good rows for them. Otherwise, call such a pair bad. Then the condition that $H$ is a Hamming $(t,T)$ -light complete traceability code is equivalent to the absence of bad pairs of coalitions.
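The notions of good rows and good pairs translate directly into a counting routine. The sketch below is illustrative: the function names, the fixed seed and the matrix sizes are assumptions, and coalitions are given as tuples of 0-based column indices.

```python
import random
from fractions import Fraction

def good_rows(H, I1, I2):
    """Count the rows on which the two coalition averages differ."""
    count = 0
    for row in H:
        a = Fraction(sum(row[i] for i in I1), len(I1))
        b = Fraction(sum(row[i] for i in I2), len(I2))
        if a != b:
            count += 1
    return count

def is_good_pair(H, I1, I2, T):
    """A pair of coalitions is good if it has at least 2T + 1 good rows."""
    return good_rows(H, I1, I2) >= 2 * T + 1

# Random matrix with independent uniform binary entries, as in the proof.
rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility
n, M = 200, 8
H = [[rng.randint(0, 1) for _ in range(M)] for _ in range(n)]
random_count = good_rows(H, (0, 1), (2, 3, 4))

# Small deterministic example: rows 1 and 2 are good, row 3 is bad.
H_small = [[1, 0, 0], [1, 1, 0], [0, 0, 0]]
small_count = good_rows(H_small, (0,), (1, 2))
```

In the deterministic example the pair $\{1\}$ , $\{2,3\}$ has two good rows, so it is good for $T=0$ and bad for $T=1$ .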

We say that a bad pair of coalitions $I_{1}$ and $I_{2}$ is minimal if there is no other bad pair of coalitions $I_{1}^{\prime}$ and $I_{2}^{\prime}$ , $I_{1}^{\prime}\cup I_{2}^{\prime}\subset I_{1}\cup I_{2}$ . For example, a bad pair of intersecting coalitions $I_{1}$ and $I_{2}$ with $\lvert I_{1}\rvert=\lvert I_{2}\rvert$ can’t be minimal, since it contains another bad pair $I_{1}\setminus I_{2}$ and $I_{2}\setminus I_{1}$ . Obviously, to prove that $H$ is a Hamming $(t,T)$ -light complete traceability code it is enough to check that there are no minimal bad pairs of coalitions.

We are going to prove that the mathematical expectation of the number of minimal bad pairs of coalitions tends to zero as $n\to\infty$ . By Markov’s inequality this implies that for big enough $n$ there exists a Hamming $(t,T)$ -light complete traceability code with the rate $R$ and $T=\lfloor\tau n\rfloor$ .

Now we estimate the probability that a row $i$ is bad for coalitions $I_{1}$ and $I_{2}$ .

Lemma 2.

The probability that a row is bad for coalitions $I_{1}$ and $I_{2}$ , $\lvert I_{1}\rvert=q$ , $\lvert I_{2}\rvert=r$ , $q>r$ , is upper bounded by $p(q)=q^{-1/3+o(1)}$ , $q\to\infty$ . For non-intersecting coalitions $I_{1}$ and $I_{2}$ , $\lvert I_{1}\rvert=\lvert I_{2}\rvert=q$ , the probability that a row is bad is upper bounded by $p(q)=q^{-1/2+o(1)}$ . Moreover, $p(q)\leq 1/2$ for all $q$ .

Proof of Lemma 2.

For the case $q=r$ , $I_{1}\cap I_{2}=\emptyset$ , probability of a bad row is equal to

\displaystyle 2^{-2q}\sum\limits_{i=0}^{q}\binom{q}{i}\binom{q}{i}=2^{-2q}\binom{2q}{q}=O(q^{-0.5}),

which is not greater than $1/2$ for all $q$ .
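The closed form for disjoint coalitions of equal size can be confirmed by enumerating all rows for small $q$ . The function names below are hypothetical, and the brute-force count is feasible only for small $q$ since it visits $2^{2q}$ rows.

```python
from fractions import Fraction
from itertools import product
from math import comb

def bad_row_probability_formula(q):
    """Closed form 2^(-2q) * C(2q, q)."""
    return Fraction(comb(2 * q, q), 4 ** q)

def bad_row_probability_bruteforce(q):
    """Fraction of binary rows of length 2q whose two halves contain equally many ones."""
    bad = 0
    for bits in product((0, 1), repeat=2 * q):
        if sum(bits[:q]) == sum(bits[q:]):
            bad += 1
    return Fraction(bad, 4 ** q)

values = [bad_row_probability_formula(q) for q in range(1, 6)]
```

The first values are $1/2,\,3/8,\,5/16,\,35/128,\,63/256$ , so the bound $1/2$ is attained at $q=1$ .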

Now assume that $q>r$ . Denote the cardinality of the intersection of $I_{1}$ and $I_{2}$ as $k$ . Consider two cases $q-k>s$ and $q-k\leq s$ , $s=q^{2/3}$ .

The first case $q-k>s$ . Note that for any distribution of zeroes and ones in the columns from $I_{2}$ there exists at most one number of ones in $I_{1}\setminus I_{2}$ which makes the row bad. Hence the probability of obtaining a bad row is upper bounded by

\max\limits_{l}\frac{\binom{q-k}{l}}{2^{q-k}}\leq 1/2.

For $q\to\infty$ this bound looks as follows

\max\limits_{l}\frac{\binom{q-k}{l}}{2^{q-k}}<\frac{1+o(1)}{\sqrt{\pi(q-k)/2}}<\frac{1+o(1)}{\sqrt{\pi s/2}}=O(q^{-1/3}),

where in the first inequality Stirling’s approximation $\binom{q-k}{(q-k)/2}\sim\frac{2^{q-k}}{\sqrt{\pi(q-k)/2}}$ for the maximal binomial coefficient was used.

The second case $q-k\leq s$ . Observe that the greatest common divisor $d=(q,r)$ is at most $s$ , since $d\leq q-r\leq q-k\leq s$ . Since $s_{1}/q=s_{2}/r$ implies $(q/d)\mid s_{1}$ and $(r/d)\mid s_{2}$ , it is readily seen that for a bad row $i$ the $i$ th coordinates of the sums $\sum\limits_{j\in I_{1}}\bm{h}_{j}$ and $\sum\limits_{j\in I_{2}}\bm{h}_{j}$ must be divisible by $q/d$ and $r/d$ respectively. Therefore, the probability of a bad row can be upper bounded by the probability $P$ that a binomial random variable $\xi\sim Bin(q,1/2)$ is divisible by $q/d\geq q^{1/3}$ . One can see that $P\leq 1/2$ for $q/d>1$ . Now we prove that for $q\to\infty$ the probability $P$ is at most $q^{-1/3+o(1)}$ .

By Hoeffding’s inequality [11]

\Pr(\lvert\xi-q/2\rvert>\sqrt{q\ln q})\leq 2e^{-2\ln q}=O(q^{-2}).

Define $S=[\lfloor\frac{q/2-\sqrt{q\ln q}}{q/d}\rfloor,\lfloor\frac{q/2+\sqrt{q\ln q}}{q/d}\rfloor]$ . Then we can estimate $P$ as follows

\begin{aligned}
P&=\sum\limits_{l=0}^{d}\Pr(\xi=l\cdot q/d)\\
&\leq\sum\limits_{l\in S}\Pr(\xi=l\cdot q/d)+\Pr(\lvert\xi-q/2\rvert>\sqrt{q\ln q})\\
&\leq\max\limits_{x}\Pr(\xi=x)\cdot\left\lceil\frac{2\sqrt{q\ln q}+1}{q/d}\right\rceil+O(q^{-2})\\
&\leq\frac{1}{\sqrt{q}}\cdot\left\lceil\frac{2\sqrt{q\ln q}+1}{q/d}\right\rceil+O(q^{-2})\\
&\leq O(q^{-1/3}\sqrt{\ln q})=q^{-1/3+o(1)}.
\end{aligned}

∎
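The probability $P$ from the second case can be computed exactly for small sizes, which gives a feeling for the quantities involved. The function name `divisibility_probability` is hypothetical, and the sketch evaluates only the divisibility event for $\xi\sim Bin(q,1/2)$ , not the full bad-row event.

```python
from fractions import Fraction
from math import comb, gcd

def divisibility_probability(q, r):
    """P(xi is a multiple of q/d) for xi ~ Bin(q, 1/2) and d = gcd(q, r)."""
    step = q // gcd(q, r)
    favourable = sum(comb(q, x) for x in range(0, q + 1, step))
    return Fraction(favourable, 2 ** q)

examples = {
    (2, 1): divisibility_probability(2, 1),  # multiples of 2 among 0..2
    (4, 3): divisibility_probability(4, 3),  # multiples of 4 among 0..4
    (6, 4): divisibility_probability(6, 4),  # multiples of 3 among 0..6
}
```

The three examples evaluate to $1/2$ , $1/8$ and $11/32$ respectively; the value $1/2$ is attained whenever $q/d=2$ , because a $Bin(q,1/2)$ variable is even with probability exactly $1/2$ .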

To estimate the mathematical expectation $E$ of the number of minimal bad pairs of coalitions we iterate over all $<M^{q+r}$ pairs of coalitions having sizes $q$ and $r$ , $q>r$ , over all pairs of non-intersecting coalitions of size $q$ , and over all possible numbers $L<2T+1$ of good rows.

\begin{aligned}
E&<\sum\limits_{0<r\leq q\leq t}M^{q+r}\sum\limits_{L=0}^{2T}\binom{n}{L}(1-p(q))^{L}p(q)^{n-L}\\
&\stackrel{a)}{<}\sum\limits_{q=1}^{t}qM^{2q}(2T+1)\binom{n}{2T}(1-p(q))^{2T}p(q)^{n-2T}\\
&=\sum\limits_{q=1}^{t}2^{2qRn}p(q)^{n}2^{(h(2\tau)+o(1))n}\left(\frac{1-p(q)}{p(q)}\right)^{2\tau n}\\
&=\sum\limits_{q=1}^{t}2^{A(q)n},
\end{aligned}

where

\displaystyle A(q)=2qR+\log_{2}p(q)+h(2\tau)+2\tau\log_{2}\left(\frac{1-p(q)}{p(q)}\right).

In inequality a) we used the fact that

\binom{n}{L}(1-p(q))^{L}p(q)^{n-L}\leq\binom{n}{2T}(1-p(q))^{2T}p(q)^{n-2T},

since $2\tau<1/2\leq 1-p(q)$ and $2T\leq(1-p(q))n$ .

Let $\hat{R}=\min\limits_{q\in[1,t]}-\frac{\log_{2}p(q)+h(2\tau)+2\tau\log_{2}\left(\frac{1-p(q)}{p(q)}\right)}{2q}$ . Note that $\hat{R}>0$ , since $2\tau<1-p(q)$ for all $q$ . For $R<\hat{R}$ the condition $A(q)<0$ holds for every $q\in[1,t]$ , hence $E\to 0$ as $n\to\infty$ , which implies that every rate $R<\hat{R}$ is achievable. For $t\to\infty$ the minimum is attained at $q$ tending to $\infty$ , so

\hat{R}=\frac{(1-2\tau)\log_{2}t}{6t}(1+o(1)),\quad t\to\infty.

Theorem 1 is proved. ∎
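The behaviour of the exponent $A(q)$ and of the threshold $\hat{R}$ can be explored numerically. The sketch below relies on a loudly stated assumption: Lemma 2 specifies $p(q)$ only up to a factor $q^{o(1)}$ , so the stand-in $p(q)=\min(1/2,\,q^{-1/3})$ is used purely for illustration, and the resulting numbers are not the constants of Theorem 1. All function names and the values of $t$ and $\tau$ are hypothetical.

```python
from math import log2

def h(x):
    """Binary entropy function with h(0) = h(1) = 0."""
    return 0.0 if x in (0, 1) else -x * log2(x) - (1 - x) * log2(1 - x)

def p_model(q):
    """Assumed stand-in for the bound p(q) of Lemma 2: min(1/2, q^(-1/3))."""
    return min(0.5, q ** (-1 / 3))

def exponent_A(q, R, tau):
    """A(q) = 2qR + log2 p(q) + h(2 tau) + 2 tau log2((1 - p(q)) / p(q))."""
    p = p_model(q)
    return 2 * q * R + log2(p) + h(2 * tau) + 2 * tau * log2((1 - p) / p)

def rate_threshold(t, tau):
    """Return (R_hat, q*), where q* is a size at which the minimum is attained."""
    return min((-exponent_A(q, 0.0, tau) / (2 * q), q) for q in range(1, t + 1))

t, tau = 1000, 0.05
R_hat, q_star = rate_threshold(t, tau)
asymptotic = (1 - 2 * tau) * log2(t) / (6 * t)  # leading term of Theorem 1
```

Under this stand-in model the minimum is attained at $q=t$ , and for any $R<\hat{R}$ all exponents $A(q)$ are negative. The value `R_hat` stays below `asymptotic` at this finite $t$ because of the lower-order term $h(2\tau)$ .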

4 Conclusion

In this paper we proved a new lower bound on the rate of Euclidean $(t,\delta)$ -light complete traceability codes, which shows that the optimal rate has order $\Theta(\log_{2}t/t)$ . However, the proof uses probabilistic arguments and does not provide an explicit construction with efficient encoding and decoding algorithms. A natural open problem is to design a code with an optimal rate and efficient decoding algorithm.

The coefficient $\lambda_{i}$ shows what proportion of the original content was contributed by user $i$ to an illegal copy. It is natural that if the contribution of user $i$ was very small, then it will be hard for a dealer to identify such a user. So, another open task is to design a code capable of finding, under adversarial noise and a linear attack whose coefficients $\lambda_{i}$ are lower bounded by some constant, all members of a coalition, i.e. all users whose contribution was big enough.

Acknowledgment

The reported study was supported by RFBR and National Science Foundation of Bulgaria (NSFB), project number 20-51-18002, and by RFBR, grant no. 20-01-00559.

References

[1] E. Egorova, M. Fernandez, G. Kabatiansky, and Y. Miao, “Existence and construction of complete traceability multimedia fingerprinting codes resistant to averaging attack and adversarial noise,” Problems of Information Transmission, vol. 56, no. 4, pp. 388–398, 2020.
[2] K. R. Liu, W. Trappe, Z. J. Wang, M. Wu, and H. Zhao, Multimedia fingerprinting forensics for traitor tracing. EURASIP Book Series on Signal Processing and Communications: Hindawi Publishing Corporation, 2005.
[3] W. Trappe, M. Wu, Z. J. Wang, and K. R. Liu, “Anti-collusion fingerprinting for multimedia,” IEEE Transactions on Signal Processing, vol. 51, no. 4, pp. 1069–1087, 2003.
[4] M. Cheng and Y. Miao, “On anti-collusion codes and detection algorithms for multimedia fingerprinting,” IEEE Transactions on Information Theory, vol. 57, no. 7, pp. 4843–4851, 2011.
[5] M. Cheng, L. Ji, and Y. Miao, “Separable codes,” IEEE Transactions on Information Theory, vol. 58, no. 3, pp. 1791–1803, 2011.
[6] E. Egorova, M. Fernandez, G. Kabatiansky, and M. H. Lee, “Signature codes for weighted noisy adder channel, multimedia fingerprinting and compressed sensing,” Designs, Codes and Cryptography, vol. 87, no. 2, pp. 455–462, 2019.
[7] J. Fan, Y. Gu, M. Hachimori, and Y. Miao, “Signature codes for weighted binary adder channel and multimedia fingerprinting,” IEEE Transactions on Information Theory, vol. 67, no. 1, pp. 200–216, 2020.
[8] E. E. Egorova and G. A. Kabatiansky, “Separable collusion-secure multimedia codes,” Problems of Information Transmission, vol. 57, no. 2, pp. 178–198, 2021.
[9] A. G. D’yachkov and V. V. Rykov, “On a coding model for a multiple-access adder channel,” Problemy Peredachi Informatsii, vol. 17, no. 2, pp. 26–38, 1981.
[10] N. H. Bshouty and H. Mazzawi, “On parity check (0, 1)-matrix over $\mathbb{Z}_{p}$,” in Proceedings of the Twenty-Second Annual ACM-SIAM Symposium on Discrete Algorithms. SIAM, 2011, pp. 1383–1394.
[11] W. Hoeffding, “Probability inequalities for sums of bounded random variables,” in The collected works of Wassily Hoeffding. New York: Springer, 1994, pp. 409–426.