This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

11institutetext: University of Ottawa, Ottawa, Canada
11email: {tbardini,lmoura,cadams}@uottawa.ca

Modification tolerant signature schemes: location and correction

Thais Bardini Idalino Funding granted from CNPq-Brazil [233697/2014-4] and OGS.    Lucia Moura    Carlisle Adams
Abstract

This paper considers malleable digital signatures, for situations where data is modified after it is signed. They can be used in applications where either the data can be modified (collaborative work), or the data must be modified (redactable and content extraction signatures) or we need to know which parts of the data have been modified (data forensics). A classical digital signature is valid for a message only if the signature is authentic and not even one bit of the message has been modified. We propose a general framework of modification tolerant signature schemes (MTSS), which can provide either location only or both location and correction, for modifications in a signed message divided into nn blocks. This general scheme uses a set of allowed modifications that must be specified. We present an instantiation of MTSS with a tolerance level of dd, indicating modifications can appear in any set of up to dd message blocks. This tolerance level dd is needed in practice for parametrizing and controlling the growth of the signature size with respect to the number nn of blocks; using combinatorial group testing (CGT) the signature has size O(d2logn)O(d^{2}\log n) which is close to the best known lower bound of Ω(d2logd(logn))\Omega(\frac{d^{2}}{\log d}(\log n)). There has been work in this very same direction using CGT by Goodrich et al. (ACNS 2005) and Idalino et al. (IPL 2015). Our work differs from theirs in that in one scheme we extend these ideas to include corrections of modification with provable security, and in another variation of the scheme we go in the opposite direction and guarantee privacy for redactable signatures, in this case preventing any leakage of redacted information.

Keywords:
M

October 2019

odification tolerant signature; redactable signature; content extraction signature; malleable signature; modification localization; modification correction; digital signature; combinatorial group testing; cover-free family.

1 Introduction

Classical digital signature schemes (CDSS) are used to guarantee that a document was created by the sender (authenticity) and has not been modified along the way (integrity); they also help to prevent the signer from claiming s/he did not sign a message (non-repudiation). The verification algorithm has a boolean output: a successful outcome is achieved if and only if both the signature is valid and the document has not been modified. In this paper, we consider more general digital signature schemes which we call modification-tolerant signature schemes (MTSS), which go beyond the ability of detecting modifications provided by CDSS, and have the ability of locating modifications or locating and correcting modifications. We discuss two types of modification-tolerant signature schemes: a general MTSS that allows the location of modified blocks of the data, and an MTSS with correction capability, that allows the correction of the modified blocks, recovering the original message. We give three instantiations of the scheme for the purpose of location, correction, and redaction.

In which situations can modifications be allowed or even desired in the context of digital signatures? One situation where modifications are desirable is the so-called redactable signatures [13], also called content extraction signatures [21]. In redactable signature schemes [13, 18, 21], some blocks of a document are redacted (blacked out) for privacy purposes, without interacting with the original signer and without compromising the validity of the signature. Related to these are the “sanitizable” signatures introduced by Ateniese et al. [1] which are also used for the purpose of privacy of (parts of) the original message, but the modifications are done by a semi-trusted third party (the sanitizer) who can modify blocks of the document and sign the modified version, without the need of intervention by the original signer. Thus, in both redactable and sanitizable signature schemes, the privacy of the (redacted or modified parts of the) original message must be preserved. For MTSS, privacy of the original data that has been modified is not required. An MTSS with only location capability that guarantees privacy can be used for implementing redactable signatures, but that is not an intrinsic requirement for MTSS. Indeed, as pointed out in [18] the scheme provided in [12] does not meet standard privacy requirements. In the case of MTSS with correction capability, privacy cannot be guaranteed by definition, since the method permits the recovery of the original document.

A different scenario where a moderate amount of modification is desirable involves collaborative work. The authors of a document can use MTSS to allow further modifications as long as the original parts can be identified as their own. Other collaborators may apply modifications to the original document and append the original document’s signature, which provides information about which blocks were modified, as well as a guarantee of integrity of the original blocks. MTSS can separate the original blocks from the modified ones, while correction capability further provides retrieval of the original version of the document.

Locating modifications has also been considered in situations where modifications are not desired, but their localization is used as a mechanism to mitigate damage. Indeed, in the context of message authentication codes for data structures, Goodrich et al. [11] propose a message authentication code (MAC) with modification locating capabilities. They propose their use in data forensics applications since the identification of which information was modified can be used to identify the perpetrator of the crime (for example: the salary of a person or the grade of a student was modified on a database). In [12], in the context of MTSS with only location capability, the authors mention the advantage of being able to guarantee the integrity of part of the data instead of the all-or-nothing situation given by a CDSS boolean outcome. For example, the integrity of 95% of a document or database may contain enough information needed for a specific application, whereas it would have been considered completely corrupted and unusable in the case of CDSS. In the case of MTSS with correction capability, we can go beyond mitigating the damage, and actually recover the originally signed document.

The mechanism behind the MTSS schemes instantiated here, like in [11, 12], is the use of cover-free families (CFF) in the same way as it is traditionally employed in combinatorial group testing. Combinatorial group testing has been used in cryptography in the context of digital signatures [23], broadcast communication [8, 20], and many others. The main idea is to test tt groups, which are subsets of the nn blocks, together (with enough redundancy and intersections between groups), and each group is used to produce a distinct hash. The tuple of tt hashes is then signed and provided that no more than dd blocks were modified, it is possible to identify precisely which blocks have been modified. The main efficiency concern is the compression ratio: the order of growth of n/tn/t as nn grows, for a fixed modification tolerance dd. Using cover-free families, it is possible to achieve a compression ratio of O(n/(d2logn))O(n/(d^{2}\log n)), which is not much worse than the O(n)O(n) compression ratio given by modification intolerant schemes such as CDSS.

Our contributions: In the present paper, we propose a general framework for MTSS, and a specific MTSS scheme for modification correction and another for redacted signatures. Both schemes are based on a modification tolerance dd, indicating the maximum number of modifications or redactions. The security of the schemes rely only on an underlying classical digital signature scheme and on collision resistant hash functions; therefore the schemes can be used in the context of postquantum cryptography as long as these underlying building blocks are postquantum secure.

First, we extend methods that use cover-free families for modification location [11, 12] to further provide modification correction (Scheme 2 in Section 4.2), provided that the size of the blocks are small enough, say bounded by a constant ss where an algorithm with 2s2^{s} steps can run in an acceptable amount of time. In short, the localization of the modified blocks is done using CFF similarly to [11, 12] with an extra constraint on blocks of size at most ss, and an exhaustive search is used to correct the block to a value that makes the concatenation of a specific group of blocks match its original hash. The assumption that a collision resistant hash function does not cause collisions for messages of small size up to ss is not only reasonable, but can be tested before employing it in the method.

Second, we propose a variation of the scheme for modification location to ensure total privacy of the modified blocks in order to extend it for the purpose of redactable signatures (Scheme 3 in Section 6). In this case, a block modification can be a redaction to hide private information. Unlike the modification correction scheme, this scheme does not need a restriction on block size.

Paper Organization: In Section 2, we discuss related work. In Section 3, we give a general framework for modification tolerant signature schemes with and without modification correction. In Section 4, we instantiate the schemes for location and/or correction that allows any modifications in up to dd blocks using cover-free families. In Section 5, we prove the unforgeability of the schemes under the adaptive chosen message attack. In Section 6, we extend and instantiate the modification location scheme for redactable signatures, and prove it guarantees total privacy. In Section 7, we discuss implementation issues of the schemes proposed in this paper. In Section 8, we consider the relationship of parameters and the impact on the size of the signature and the ability to locate and correct modifications. A conclusion is given in Section 9.

2 Related Work

The idea of error detection and correction of corrupted data is well established in coding theory. Combinatorial group testing can help locating where errors (modification of data) occurred, which can be considered an intermediate goal, which is stronger than error detection and more efficient than error correction. In the context of cryptography, localization of modified portions of data has been studied in the context of hash functions [4, 5], digital signatures [3, 12], and message authentication codes [7, 11]. Correction of modifications is proposed by some schemes, but they have severe limitations on the correction capability [3, 7]. Schemes such as the ones in [4, 5, 11, 12] use techniques related to cover-free families to generate redundant information, which is used later to locate the modified portions of the data. The benefit of cover-free families is that they require a small amount of redundant data for a fixed threshold dd in location capability.

The methods that provide location at the hash level [4, 5] are based on superimposed codes, which are binary codes equivalent to dd-cover-free families (dd-CFF). These codes are used in two steps of the process: to generate a small set of hash tags from an input message, and to verify the integrity of the message using these tags. Because of the dd-CFF property, the scheme allows the identification of up to dd corrupted data segments [4, 5]. A digital signature scheme with location of modifications is presented in [12]. Their scheme uses dd-CFF matrices to generate a digital signature scheme that carries extra information for location of modifications. In this case, data is divided into nn blocks which are concatenated and hashed according to the rows of the dd-CFF. This set of hash values is signed using any classical digital signature algorithm, and both the signature and the set of hashes form the new signature of the message. The verification algorithm uses this information to precisely locate up to dd blocks that contain modified content [12]. This is presented in detail in Section 4.2 as Scheme 1. In [11], the authors propose a solution for locating the items that were modified inside a data structure. They compute a small set of message authentication tags based on the rows of a dd-disjunct matrix, which is equivalent to dd-CFFs. These tags are stored within the topology of the data structure, and an auditor can later use these tags to determine exactly which pieces of data were modified and perform forensics investigation.

The methods described above have a common approach: they all compute extra integrity information based on combinatorial group testing techniques and use this information to locate up to dd modified pieces of data. In our work, we show that if these pieces are small enough, it is possible to actually correct them to the originally-signed values. In the literature, we find signature and message authentication schemes that provide the correction of modifications, such as in [3, 7]. These schemes use different approaches than the ones previously discussed, and the capacity of location and correction is very limited. In [3] only a single error can be corrected, while in [7] the data is divided into blocks, and only one bit per block can be corrected.

Error correcting codes (dd-ECC) are related to this work, as we could use them to provide authentication and correction of dd modifications as we explain next. For a message mm formed by blocks of up to log2q\log_{2}q bits each, sign it using a CDSS and compute the corresponding codeword cmc_{m} using a dd-ECC over alphabet qq. Send the codeword cmc_{m} and the signature σ\sigma. The receiver obtains cc^{\prime}, which may differ from cmc_{m} in at most dd blocks, and by decoding cc^{\prime}, obtains cmc_{m} and thus the original message mm. Then, the verifier can check its authenticity and integrity by verifying (m,σ(m,\sigma). We call this scheme Scheme E. In Section 4.3, we compare Scheme E and Scheme 2 using dd-CFF, and find that they have very similar compression ratios. Scheme 2 has the advantage of giving more information in the failing case, of being more efficient when we do not need correction, and of being a simple variation of Schemes 1 and 3, facilitating comparison among the different approaches.

There is also a solution proposed in [2] for location of modifications in images with digital signatures, using a different approach than dd-CFFs. Their scheme consists of partitioning the image into nn blocks, and generating one digital signature per block (with dependency on neighbours to avoid attacks). Although we can use signature schemes with very small signature sizes, the total number of signatures in this scheme is linear with the number of blocks nn, while in [12], for example, the amount of extra information produced is logarithmic in nn because of the dd-CFF.

Redactable or content extraction digital signatures [6, 13, 21] are used when the owner of a document signed by another party, needs to show parts of that document to a third party, but wants to keep parts of the document private. In Section 6, we give a variation of an MTSS scheme that implements redactable signatures. More on related work involving redactable signatures is discussed in that section.

3 Framework for modification-tolerant digital signatures

3.1 Classical Digital Signature Schemes (CDSS)

Classical digital signature schemes are based on public-key cryptography and consist of three algorithms: KeyGeneration, Sign, and Verify. We consider that any document or piece of data to be signed is a sequence of bits called a message.

Definition 1

A classical digital signature scheme (CDSS) is a tuple Σ\Sigma of three algorithms:

  • KeyGeneration()(\ell) generates a pair of keys (secret and public) (SK,PKSK,PK) for a given security parameter \ell.

  • Sign(m,SK)(m,SK) receives the message mm to be signed, the secret key SKSK, and outputs the signature σ\sigma.

  • Verify(m,σ,PK)(m,\sigma,PK) takes as input a message mm, signature σ\sigma, and public key PKPK. Using PKPK, outputs 11 if the pair (m,σm,\sigma) is valid (as in Definition 2), and outputs 0 otherwise.

Definition 2

Let Σ\Sigma be a CDSS as defined above, and let (SK,PK)(SK,PK) be a pair of secret and public keys. A pair of message and signature (m,σ)(m,\sigma) is valid if σ\sigma is a valid output111We use “valid output” instead of Σ\Sigma.Sign(m,SK)=σ(m,SK)=\sigma because the signing algorithm does not need to be deterministic. of Σ\Sigma.Sign(m,SK)(m,SK).

Generally speaking, we say CDSS is unforgeable if a signature that verifies using PKPK can only be generated by the signer who has SKSK. In more detail, we consider the model of adaptive chosen message attack, where the attacker 𝒜\mathcal{A} adaptivelly chooses a list of messages m1,,mqm_{1},\ldots,m_{q}, and requests the respective signatures σ1,,σq\sigma_{1},\ldots,\sigma_{q} from an oracle 𝒪\mathcal{O}. Given this information, we say the attacker performs an existential forgery if he is able to produce a valid signature σ\sigma on a new message mm chosen by him, mmi,1iqm\neq m_{i},1\leq i\leq q [15, 22]. Because with unlimited time an adversary can perform all possible computations, we limit the computational power of the attacker by requiring an efficient algorithm to be one that runs in probabilistic polynomial time (PPT). Moreover, we say a function ff is negligible if for every polynomial pp there exists an NN such that for all n>Nn>N we have that f(n)<1/p(n)f(n)<1/p(n) [14].

Definition 3 (Unforgeability)

A digital signature scheme is existentially unforgeable under an adaptive chosen message attack if there is no PPT adversary 𝒜\mathcal{A} that can create an existential forgery with non-negligible probability.

3.2 Description of MTSS

Now we define the algorithms of the modification tolerant signature scheme (MTSS). We assume that the message mm to be signed has size |m||m| in bits, and is split into nn blocks, not necessarily of the same size. A message split into blocks is represented as a sequence m=(m[1],,m[n])m=(m[1],\ldots,m[n]), where m[i]m[i] represents the iith block of message mm. For two messages mm and mm^{\prime}, each split into nn blocks, we denote diff(m,m)={i{1,,n}:m[i]m[i]}(m,m^{\prime})=\{i\in\{1,\ldots,n\}:m[i]\neq m^{\prime}[i]\}. An authorized modification structure is a collection 𝒮P({1,,n})\mathcal{S}\subseteq P(\{1,\ldots,n\}), where PP is the power set of {1,,n}\{1,\ldots,n\}, that contains each set of blocks that the signer allows to be modified. The idea is that if σ\sigma is a valid signature for mm, then σ\sigma is a valid signature for mm^{\prime} if and only if diff(m,m)𝒮(m,m^{\prime})\in\mathcal{S}. Of course, to prevent inconsistencies we must have 𝒮\emptyset\in\mathcal{S}. Indeed, an MTSS is equivalent to a CDSS if 𝒮={}\mathcal{S}=\{\emptyset\}. A more general example is 𝒮={,{2},{3},{5},{2,3},{2,5},{3,5}}\mathcal{S}=\{\emptyset,\{2\},\{3\},\{5\},\{2,3\},\{2,5\},\{3,5\}\} for n=5n=5 blocks, which specifies blocks 11 and 44 cannot be changed and any change of at most two other blocks is allowed. The authorized modification structure is used to provide flexibility of modifications while providing control for the signer. In practice, we do not expect 𝒮\mathcal{S} to be stored explicitly, but instead to be implicitly enforced by the scheme.

Definition 4

A modification tolerant signature scheme (MTSS) for authorized modification structure 𝒮P({1,,n})\mathcal{S}\subseteq P(\{1,\ldots,n\}) on messages with nn blocks is a tuple Σ\Sigma of three algorithms:

  • MTSS-KeyGeneration()(\ell) generates a pair of secret and public keys (SK,PKSK,PK) for a given security parameter \ell.

  • MTSS-Sign(m,SK)(m,SK) receives the message mm to be signed, the secret key SKSK, and outputs the signature σ\sigma.

  • MTSS-Verify(m,σ,PK)(m,\sigma,PK) takes as input a message mm, signature σ\sigma, and public key PKPK. Outputs (1,I)(1,I) if (m,σm,\sigma) is valid for modification set II (as in Definition 5), and outputs 0 otherwise.

Definition 5

Let Σ\Sigma be an MTSS for authorized modification structure 𝒮P({1,,n})\mathcal{S}\subseteq P(\{1,\ldots,n\}), and let (SK,PK)(SK,PK) be a pair of secret and public keys. A pair (m,σ)(m,\sigma) of message and signature is valid if there exists mm^{\prime} such that σ\sigma is a valid output222We use “valid output” instead of Σ\Sigma.MTSS-Sign(m,SKm^{\prime},SK) = σ\sigma because the signing algorithm does not need to be deterministic. of Σ\Sigma.MTSS-Sign(m,SKm^{\prime},SK) and diff(m,m)𝒮(m,m^{\prime})\in\mathcal{S}. In this case, we say (m,σ)(m,\sigma) is valid for modification set II (where I=diff(m,m)I=\text{diff}(m,m^{\prime})).

The definition of unforgeability of an MTSS scheme is exactly like Definition 3, but the existential forgery now needs to produce a valid signature as given in Definition 5. This is in alignment with the notions introduced in [6].

Definition 6

An MTSS Σ\Sigma for authorized modification structure 𝒮P({1,,n})\mathcal{S}\subseteq P(\{1,\ldots,n\}) has correction capability if it is a tuple Σ\Sigma of four algorithms, which, in addition to the algorithms in Definition 4, has the following algorithm:

  • MTSS-Verify&Correct(m,σ,PKm,\sigma,PK): takes as input a message mm, signature σ\sigma, and public key PKPK. Outputs a pair (ver,m)(ver,m^{\prime}) where:

    1. 1.

      ver=Σver=\Sigma.MTSS-Verify(m,σ,PK)(m,\sigma,PK).

    2. 2.

      mm^{\prime} is a message with mλm^{\prime}\neq\lambda (the corrected message) if ver=(1,I)ver=(1,I), I=diff(m,m)I=\text{diff}(m,m^{\prime}), and (m,σ)(m^{\prime},\sigma) is a valid pair for modification set I=I^{\prime}=\emptyset; in all other cases m=λm^{\prime}=\lambda, which indicates failure to correct.

Location of modifications with MTSS would be trivial for any 𝒮P({1,,n})\mathcal{S}\subseteq P(\{1,\ldots,n\}) if the signer simply produced σ\sigma as a tuple of nn signatures, one for each block. However, this would be extremely inefficient for a large number of blocks nn. We must reconcile the objectives of having a large 𝒮\mathcal{S} and having a compact signature. Of course, the signature size depends on the security parameter, but once this is fixed, we would like the signature to grow moderately as a function of nn. This motivates the following definition of compression ratio.

Definition 7

An MTSS Σn\Sigma^{n} for messages with nn blocks and signature σ\sigma with |σ|s(n)|\sigma|\leq s(n) has compression ratio ρ(n)\rho(n) if ns(n)\frac{n}{s(n)} is O(ρ(n))O(\rho(n)).

The compression ratio measures how efficient our signature is with respect to the trivial scheme of keeping one signature per block, with ρ(n)=O(1)\rho(n)=O(1), supporting 𝒮=P({1,,n})\mathcal{S}=P(\{1,\ldots,n\}). Classical signatures have ρ(n)=n\rho(n)=n (best possible), but 𝒮={}\mathcal{S}=\{\emptyset\}. In the next section we present a tradeoff, where ρ(n)=nlogn\rho(n)=\frac{n}{\log n} and 𝒮\mathcal{S} is the set of all sets of up to dd modifications, for fixed dd. Indeed, when using cover-free families it is possible to have a compression ratio of O(nd2logn)O(\frac{n}{d^{2}\log n}) using known CFF constructions [19], while a lower bound on s(n)s(n) [10] tells us we cannot have ρ(n)\rho(n) larger than Θ(n(d2/logd)logn)\Theta(\frac{n}{(d^{2}/\log d)\log n}).

4 dd-Modification Tolerant MTSS Based on Combinatorial Group Testing

Here we propose an MTSS that allows the modification of any set of up to dd blocks in the message, for a constant dd which we call a tolerance level. In other words, the authorized modification structure is 𝒮={T{1,,n}:|T|d}\mathcal{S}=\{T\subseteq\{1,\ldots,n\}:|T|\leq d\}. To obtain a compact signature size, we rely on combinatorial group testing, which we summarize in Section 4.1 before we describe the scheme in Section 4.2. Similar modification location methods based on combinatorial group testing have been proposed in [11, 12], and the instantiation we propose for the first three algorithms of MTSS (Section 4.2) are based on [12]. Our new contributions in this section include proof of security for the MTSS scheme based on cover-free families and the addition of error correction capability by proposing algorithm MTSS-Verify&Correct that corrects modifications in this context.

4.1 Cover-Free Families and Group Testing

Combinatorial group testing deals with discovering dd defective items in a set of nn items, via testing various groups of items for the presence of defects in each group. In nonadaptive combinatorial group testing, the groups are determined before the testing process starts. For the problems considered in this paper, a modified block is a defective item, and the groups are sets of blocks combined and hashed together. In our case, we must use nonadaptive combinatorial group testing, as the tests are generated at time of signing, while verification is done later. Cover-free families allow for a small number of groups that help to identify the modified blocks.

Recall that a permutation matrix of dimension ll is an l×ll\times l matrix that is obtained from permuting rows of the identity matrix.

Definition 8

A dd-cover-free family (dd-CFF) is a t×nt\times n binary matrix \mathcal{M} such that any set of d+1d+1 columns contains a permutation submatrix of dimension d+1d+1.

Each column of \mathcal{M} corresponds to a block of the message, and each row corresponds to a group of blocks that will be tested together. A test fails if a group contains a modified block and passes if every block in a group is unchanged. The definition of dd-CFF guarantees that for any column index jj and any other set of dd column indices j1,,jdj_{1},\ldots,j_{d}, there will be a row ii s.t. i,j=1\mathcal{M}_{i,j}=1, and i,j1==i,jd=0\mathcal{M}_{i,j_{1}}=\ldots=\mathcal{M}_{i,j_{d}}=0. In other words, for any unchanged block, there exists a test that contains that block but none of the up to dd modified blocks.

Figure 1 shows an example of how to use a 22-CFF(9,129,12) to test a message with 1212 blocks (represented by the columns) by testing 99 groups of blocks (the rows). Every unchanged block is in at least one group that passes the test (with result 0), and every group that failed the test (result 1) contains at least one modified block, which in this example are blocks 33 and 1212. We note that column “result” is the bitwise-or of the columns corresponding to modified blocks 3 and 12.

blocks 1 2 3 4 5 6 7 8 9 10 11 12
X X result:
t1t_{1} 1 0 0 1 0 0 1 0 0 1 0 0 0
t2t_{2} 1 0 0 0 1 0 0 1 0 0 1 0 0
t3t_{3} 1 0 0 0 0 1 0 0 1 0 0 1 1
t4t_{4} 0 1 0 1 0 0 0 0 1 0 1 0 0
t5t_{5} 0 1 0 0 1 0 1 0 0 0 0 1 1
t6t_{6} 0 1 0 0 0 1 0 1 0 1 0 0 0
t7t_{7} 0 0 1 1 0 0 0 1 0 0 0 1 1
t8t_{8} 0 0 1 0 1 0 0 0 1 1 0 0 1
t9t_{9} 0 0 1 0 0 1 1 0 0 0 1 0 1
Figure 1: Example of a 22-CFF(9,129,12).

The next theorem is important for the efficiency of MTSS-Verify&Correct. Indeed, it ensures that for each modified block, we can find a test that contains it together with only other unchanged blocks. Therefore, an exhaustive trial-and-error can be used to guess this block until the hash of this group of blocks matches the original hash.

Theorem 4.1

Let j1,,jdj_{1},\ldots,j_{d} be the column indices that represent invalid elements. There is a row ii such that i,j1=1,i,j2==i,jd=0\mathcal{M}_{i,j_{1}}=1,\mathcal{M}_{i,j_{2}}=\ldots=\mathcal{M}_{i,j_{d}}=0.

Proof

Since \mathcal{M} is a dd-CFF, it is also a (d1)(d-1)-CFF, therefore one of the rows in the permutation submatrix indexed by j1,,jdj_{1},\ldots,j_{d} is as stated.

4.2 Description of dd-Modification Tolerant Signature Scheme

The main idea of a dd-modification tolerant signature scheme is to sign a message split into blocks by concatenating the hashes of these blocks according to a dd-CFF matrix. This allows us to locate up to dd modified blocks in the signed message, and correct these modifications. In this context, we represent a concatenation of two strings aa and bb as a||ba||b, and λ\lambda represents an empty string.

Definition 9

A dd-modification tolerant signature scheme (dd-MTSS) is an MTSS with authorized modification structure 𝒮={T{1,,n}:|T|d}\mathcal{S}=\{T\subseteq\{1,\ldots,n\}:|T|\leq d\}.

We now give an instantiation of dd-MTSS using dd-cover-free families based on [12].

Scheme 1: A dd-Modification Tolerant Signature Scheme

The scheme requires an underlying CDSS Σ\Sigma, a public hash function hh, and a dd-CFF(t,nt,n) matrix \mathcal{M}. The algorithms are given next:

  • MTSS-KeyGeneration()(\ell): generates a key pair (SK,PKSK,PK) using algorithm
    Σ\Sigma.KeyGeneration()(\ell).

  • MTSS-sign(m,SKm,SK): Takes as input a secret key SKSK and a message
    m=(m[1],,m[n])m=(m[1],\ldots,m[n]), and proceeds as follows.

    1. 1.

      Calculate hj=h(m[j]),1jnh_{j}=h(m[j]),1\leq j\leq n.

    2. 2.

      For each 1it1\leq i\leq t, compute cic_{i} as the concatenation of all hjh_{j} such that i,j=1,1jn\mathcal{M}_{i,j}=1,1\leq j\leq n. Set T[i]=h(ci)T[i]=h(c_{i}).

    3. 3.

      Compute h=h(m)h^{*}=h(m) and set T=(T[1],T[2],,T[t],h)T=(T[1],T[2],\ldots,T[t],h^{*}).

    4. 4.

      Calculate σ=Σ\sigma^{\prime}=\Sigma.Sign(T,SKT,SK). Output signature σ=(σ,T)\sigma=(\sigma^{\prime},T).

  • MTSS-verify(m,σ,PKm,\sigma,PK): takes as input a message m=(m[1],,m[n])m=(m[1],\ldots,m[n]), signature σ=(σ,T)\sigma=(\sigma^{\prime},T) for T=(T[1],T[2],,T[t],h)T=(T[1],T[2],\ldots,T[t],h^{*}), and public key PKPK, and proceeds as follows.

    1. 1.

      Ensure that Σ\Sigma.Verify(T,σ,PK)=1(T,\sigma^{\prime},PK)=1, otherwise stop and output (0,)(0,-).

    2. 2.

      Check if h=h(m)h^{*}=h(m). Stop and output (1,)(1,\emptyset) if that is the case, continue otherwise.

    3. 3.

      Use \mathcal{M} and mm and do the same process as in steps 1 and 2 of MTSS-sign to produce hashes T[1],,T[t]T^{\prime}[1],\ldots,T^{\prime}[t].

    4. 4.

      Start with an empty set VV, and for each 1it1\leq i\leq t such that T[i]=T[i]T[i]=T^{\prime}[i], compute the set of indices of unmodified blocks Vi={j:i,j=1}V_{i}=\{j:\mathcal{M}_{i,j}=1\}, and accumulate these values in the set of all indices of unmodified blocks V=VViV=V\cup V_{i}. Compute I={1,,n}VI=\{1,\ldots,n\}\setminus V. If |I|d|I|\leq d output (1,I)(1,I), else output (0,I)(0,I).

The correctness of the scheme is shown next.

Theorem 4.2

Consider a valid signature σ\sigma generated by MTSS-Sign for a message mm and key pair (SK,PKSK,PK), and let mm^{\prime} be a possibly modified message with |diff(m,m)|d|\text{diff}(m,m^{\prime})|\leq d. Then, MTSS-Verify(m,σ,PKm^{\prime},\sigma,PK) = (1,diff(m,m))(1,\text{diff}(m,m^{\prime})).

Proof

Since σ\sigma is valid, MTSS-Verify(m,σ,PKm^{\prime},\sigma,PK) does not stop at step 1. If m=mm=m^{\prime}, then h(m)=h(m)h(m)=h(m^{\prime}) and the algorithm will stop in step 2, with (1,diff(m,m)=)(1,\text{diff}(m,m^{\prime})=\emptyset). It remains to check the case it stops in step 4. The dd-CFF property guarantees that if a block has not been modified, it is contained in at least one valid concatenation with T[i]=T[i]T[i]=T^{\prime}[i], since there is a row ii that avoids all modified blocks. Therefore, this block is contained in ViV_{i}. For each row ii that contains a modified block, we have T[i]T[i]T[i]\neq T^{\prime}[i], so modified blocks are not contained in any ViV_{i}. Therefore II consists of precisely the modified blocks. Thus, |I|d|I|\leq d, and the algorithm outputs (1,I=diff(m,m))(1,I=\text{diff}(m,m^{\prime})). ∎

The next theorem shows that when step 4 outputs (0,I)(0,I), Scheme 1 may give more information than required in Definition 4, as it may identify some unmodified blocks, even if not all.

Theorem 4.3

Consider a valid signature σ\sigma generated by MTSS-Sign for a message mm and key pair (SK,PKSK,PK), and let mm^{\prime} be a modified message with |diff(m,m)||\text{diff}(m,m^{\prime})| >d>d. Then, MTSS-Verify(m,σ,PKm^{\prime},\sigma,PK) = (0,I)(0,I), and for any i{1,,n}Ii\in\{1,\ldots,n\}\setminus I, block m[i]m[i] is guaranteed to be unmodified.

Proof

This case will lead MTSS-Verify to step 4, and since |diff(m,m)|>d|\text{diff}(m,m^{\prime})|>d, the output will be (0,I)(0,I). Any block in {1,,n}I\{1,\ldots,n\}\setminus I is guaranteed to be part of matching row ii, and must be unmodified, even though not every unmodified block will necessarily be placed in {1,,n}I\{1,\ldots,n\}\setminus I. ∎

Scheme 1 has been proposed in [12]. One of our main contributions here is to add correcting capability to dd-MTSS. We require the size of each block to be upper bounded by a value ss that is small enough that guessing each of the (up to) dd modified blocks is “computationally feasible”. Basically, by brute force we compute up to O(d2s+n)O(d2^{s}+n) hashes to accomplish the modification correction (see the algorithm under Scheme 2). We use the indices of the modified blocks in II and do an exhaustive search to recover their original values.

Scheme 2: A dd-MTSS with Correction Capability

The scheme requires an underlying CDSS Σ\Sigma, a public hash function hh, and a dd-CFF(t,nt,n) matrix \mathcal{M}. It further requires that the message is divided into nn blocks of size at most ss. The scheme has the three algorithms from Scheme 1, and additionally the algorithm below:

  • MTSS-Verify&Correct(m,σ,PKm,\sigma,PK): receives as input a message
    m=(m[1],,m[n])m=(m[1],\ldots,m[n]), a signature σ=(σ,T)\sigma=(\sigma^{\prime},T) where T=(T[1],T[2],,T[t],h)T=(T[1],T[2],\ldots,T[t],h^{*}), and a public key PKPK, and proceeds as follows.

    1. 1.

      Compute result = MTSS-verify(m,σ,PK{\color[rgb]{0,0,0}{m}},\sigma,PK). If result = (0,X)(0,X), then stop and output (0,X)(0,X), otherwise result = (1,I1,I). If |I|=0|I|=0 go to step 6, otherwise run steps 252-5 for each kIk\in I.

    2. 2.

      Identify a row ii in \mathcal{M} such that i,k=1\mathcal{M}_{i,k}=1 and i,j=0\mathcal{M}_{i,j}=0, for all jI{k}j\in I\setminus\{k\}.

    3. 3.

      Compute hj=h(m[j])h_{j}=h(m[j]) for all jj such that i,j=1,jk\mathcal{M}_{i,j}=1,j\neq k, ii from step 2. Set corrected[k] = false.

    4. 4.

      For every possible binary string bb of size s\leq s, proceed as follows:

      • Compute hk=h(b)h_{k}=h(b).

      • For ii obtained in step 2 and 1jn1\leq j\leq n, compute cic_{i} as the concatenation of every hjh_{j} such that i,j=1\mathcal{M}_{i,j}=1 and set T[i]=h(ci)T^{\prime}[i]=h(c_{i}).

      • If T[i]=T[i]T^{\prime}[i]=T[i] and corrected[k] = false, set corrected[k] = true and correct the block m[k]=bm[k]=b.

      • Else, if T[i]=T[i]T^{\prime}[i]=T[i] and corrected[k] = true, stop and output (1,I,λ1,I,\lambda).

    5. 5.

      Return to step 2 with the next kIk\in I.

    6. 6.

      Output (1,I,m)(1,I,m).

We note that the flag corrected[k]corrected[k] is used to identify a possible collision of two different bit strings bb giving the same hash value h(m[k])h(m[k]). Since the correct block cannot be determined, we exit with λ\lambda indicating failure. The next proposition has details on the correctness of this algorithm.

Proposition 1

Let (m,σ)(m,\sigma) be a valid pair of message and signature produced by MTSS-Sign(m,SKm,SK), using a hash function hh and with m=(m[1],,m[n])m=(m[1],\ldots,m[n]). Let mm^{\prime} be a message and let I=diff(m,m)I=\text{diff}(m,m^{\prime}) with |I|d|I|\leq d. If for every kIk\in I, h(m[k])h(m[k]) has no other preimage of size up to ss, then
MTSS-Verify&Correct(m,σ,PKm^{\prime},\sigma,PK) = (1,I,m)(1,I,m).

Proof

As seen in Theorem 4.2, the set II contains precisely the indices of the modified blocks, and Theorem 4.1 guarantees that such a row ii in step 22 of the algorithm exists. Finally, if for every kIk\in I the hash h(m[k])h(m[k]) has no second preimage, then step 44 of the algorithm computes a unique replacement for each modified block, and the algorithm outputs the corrected message in step 66. ∎

Now we prove that when selecting a good hash function, we can always guarantee the correction of any up to dd modified blocks.

Theorem 4.4

Consider Scheme 2 restricted to messages with blocks of size at most ss, and such that hh is a hash function where no two inputs of size up to ss have the same hash value. Then, MTSS-Verify&Correct(m,σ,SKm^{\prime},\sigma,SK) can always correct a message with up to dd modified blocks.

Proof

Easily obtained by Proposition 1 since no matter what is the value of the block, no other block of size up to ss can have the same image under hh. ∎

Next, we show that the assumption of existence of such hash function is realistic, and in fact very easy to find.

Proposition 2

Consider a family of hash functions h:XYh:X\rightarrow Y where |Y|=2l|Y|=2^{l} and a subset of the inputs SXS\subseteq X where |S|2s+1|S|\leq 2^{s+1}. The probability that there is collision in SS, i.e. the probability that there exists x,z in S with h(x)=h(z)h(x)=h(z), is approximately ϵ=1e22sl+1\epsilon=1-e^{-2^{2s-l+1}}.

Proof

This comes from the fact that the probability of finding at least one collision after QQ hash calculations is approximately 1eQ22l+11-e^{-\frac{Q^{2}}{2^{l+1}}}, and in our application Q=|S|=2s+1Q=|S|=2^{s+1}.

In practice, we will set 2sl2s\ll l. This ensures via Proposition 2 that hh is very likely to have the desired property of being injective on SS, the set of binary strings with size at most ss. In the unlikely event that hh fails this property, we try another hash function until we succeed. The expected number of attempts will be very close to 1. For example, if we consider SHA-256256 as the hash function, |Y|=2256|Y|=2^{256}, and for s=20s=20, the probability of a collision within the set of size at most 2020 is ϵ=1e240256+13.70×1068\epsilon=1-e^{-2^{40-256+1}}\approx 3.70\times 10^{-68}. Indeed, we experimentally verified that SHA-256256 has no collisions for s=20s=20.

Theorem 4.5

Consider Scheme 2 with tolerance level dd, and let mm be a message split into nn blocks, each block of size at most ss. Let σ\sigma = MTSS-Sign(m,SKm,SK) and mm^{\prime} be a message with I=diff(m,m)I=\text{diff}(m,m^{\prime}) with |I|d|I|\leq d. Then, MTSS-Verify&Cor- rect(m,σ,PKm^{\prime},\sigma,PK) returns (1,I,m)(1,I,m) and uses O(n+d2logn+d2s)O(n+d^{2}\log n+d2^{s}) hash computations.

Proof

By choosing a suitable hash function, Theorem 4.4 guarantees that MTSS-Verify&Correct returns (1,I,m)(1,I,m). The algorithm starts with the location of the modified blocks. This step uses a dd-CFF for which we can use a construction with t=O(d2logn)t=O(d^{2}\log n) rows [19], and therefore a total of n+tn+t hash calculations are required to locate these modifications. After locating up to dd modified blocks, we need to perform the following computations for each one of them. We compute every possible block of size up to ss and their corresponding hash values (total of 2s+112^{s+1}-1), and according to the row of the CFF matrix, we compute a few extra hashes (in total not more than nn, if storing hih_{i} instead of recomputing among different runs of of line 3). This gives a total of O(d2s+n)O(d2^{s}+n) hash computations for the correction step.

4.3 Comparing Scheme 2 with Scheme E

Let an (l,n,Dl,n,D)q-ECC 𝒞\mathcal{C} be an error correcting code with minimum distance DD, codewords of length ll that encode messages of size nn over an alphabet of size qq. The alphabet must be so that it can distinguish each possible block considered in Scheme 2, so q2sq\geq 2^{s}. For simplification, assume cm=(m,bm)c_{m}=(m,b_{m}) where bmb_{m} is a tuple of lnl-n letters (“check bits”). This code can correct d=D12d=\lfloor\frac{D-1}{2}\rfloor errors (modifications in the message). Next, we describe signature and verification using Scheme E. Using the same inputs as MTSS-Sign, algorithm ECC-Sign(m,SKm,SK) do the following steps: 1) Compute σ=Σ\sigma^{\prime}=\Sigma.Sign(m,SKm,SK); 2) Compute cm=(m,bm)c_{m}=(m,b_{m}) according to ECC 𝒞\mathcal{C} and return σ=(bm,σ)\sigma=(b_{m},\sigma^{\prime}). Then, algorithm ECC-Verify&Correct(m,σ=(b,σ),PKm,\sigma=(b,\sigma^{\prime}),PK) do the following steps: 1) Decode c=(m,b)c=(m,b) to mm^{\prime} using 𝒞\mathcal{C}; 2) If Σ\Sigma.Verify(m,σ,PKm^{\prime},\sigma,PK) = 1 then return (1,diff(m,m),m)(1,\text{diff}(m,m^{\prime}),m^{\prime}), else return 0.

Since 𝒞\mathcal{C} can correct up to dd errors, if the signature σ\sigma is valid (σ\sigma^{\prime} is authentic for mm^{\prime} and |diff(m,m)|d|\text{diff}(m,m^{\prime})|\leq d), then ECC-Verify&Correct will behave in the same way as MTSS-Verify&Correct and will return (1,diff(m,m),m)(1,\text{diff}(m,m^{\prime}),m^{\prime}). However, when ECC-Verify&Correct returns 0, it does not distinguish between the two failing cases obtained by MTSS-Verify&Correct, namely: Case 1) output (0,)(0,-), which means the CDSS signature σ\sigma^{\prime} is not authentic; Case 2) output (0,I)(0,I), which means σ\sigma^{\prime} is authentic and message mm differs from the signed mm^{\prime} in more than dd blocks, and also if |I|<n|I|<n, then |{1,,n}I|>0|\{1,\ldots,n\}\setminus I|>0 and we are sure of at least one unmodified block. Therefore, while the Scheme E is an MTSS scheme according to Definition 4, it provides less information than Scheme 2.

A comparison of Scheme 2 with Scheme E shows they have similar compression ratios. Indeed, we first note that given an (l,n,D)q(l,n,D)_{q}-ECC, it is possible to construct a dd-CFF(lq,qnl\cdot q,q^{n}), with d=(l1)/(lD)d=\lfloor(l-1)/(l-D)\rfloor [16]. Then, we consider some families of error correcting codes, and for each family we restrict Scheme 2 to only use CFFs constructed using codes from this family. In the next table, we summarize the compression ratios obtained for 3 families of codes, but we omit the details of our calculations due to lack of space.

code: Hamming, n=22r1n=2^{2^{r}-1} Reed-Solomon, n=q(q1)d+1n=q^{\frac{(q-1)}{d}+1} PR-code[19]
Scheme 2 22r2r2\approx 2^{2^{r}-2r-2} q(q1)d1q^{\frac{(q-1)}{d}-1} Θ(nd2logn)\Theta(\frac{n}{d^{2}\log n})
Scheme E 22r2r1\approx 2^{2^{r}-2r-1} q(q1)d+12d+1\frac{q^{\frac{(q-1)}{d}+1}}{2d+1} O((logd)nd2logn)O(\frac{(\log d)n}{d^{2}\log n})

In conclusion, both schemes serve the same purpose with similar compression ratios, but Scheme 2 has several advantages. First, Scheme 2 gives more information in the failing case where ECC returns 0, as discussed above. Second, Scheme 2 provides a non-correcting version of the verification algorithm (MTSS-Verify) which in the case of unmodified messages is basically as time efficient as Σ\Sigma.Verify (see step 1 of MTSS-Verify); in this case, Scheme E still needs to run a decoding algorithm, with complexity influenced by q2sq\geq 2^{s}, where ss is the largest size of a block. Finally, Scheme 2 is part of a family of similar schemes presented here (Schemes 1-3), which can be more easily compared.

5 Security

In this section, we present the security proof of dd-MTSS for Schemes 1 and 2. In order to do this, we need to check that the security of the hash function hh and the unforgeability of the underlying CDSS can together ensure unforgeability of dd-MTSS. Note that although Scheme 1 appeared in [12], no security proof has been given in that paper. For the next proof we assume a collision-resistant hash function, i.e. a hash function in which a collision cannot be efficiently found [22].

When we consider dd-MTSS using dd-CFFs, a valid (m,σ)(m,\sigma) as in Definition 5 implies that there exists mm^{\prime} such that σ\sigma is an output of MTSS-Sign(m,SKm^{\prime},SK) and |diff(m,m)|d|\text{diff}(m,m^{\prime})|\leq d. In the next theorem we suppose there is a valid (m,σ)(m,\sigma) as a forgery to our scheme. We consider two types of forgery: a strong forgery consists of (m,σ)(m,\sigma) such that MTSS-Verify(m,σ,PKm,\sigma,PK) = (1,I),|I|d(1,I),|I|\leq d; a weak forgery consists of (m,σ)(m,\sigma) such that MTSS-Verify(m,σ,PKm,\sigma,PK) = (1,)(1,\emptyset).

Theorem 5.1

Let X be a dd-MTSS as described in Scheme 1 based on an existentially unforgeable CDSS Σ\Sigma and on a collision resistant hash function hh. Then, X is existentially unforgeable.

Proof

(By contradiction) Suppose X is not existentially unforgeable. Then, there is an adversary 𝒜\mathcal{A} that, after qq adaptive queries to a signing oracle 𝒪\mathcal{O}, obtains pairs (m1,σ1),,(mq,σq)(m_{1},\sigma_{1}),\ldots,\\ (m_{q},\sigma_{q}), with σi=(σi,Ti),1iq\sigma_{i}=(\sigma^{\prime}_{i},T_{i}),1\leq i\leq q, and with non-negligible probability, outputs a valid pair (m,σ)(m,\sigma), with σ=(σ,T)\sigma=(\sigma^{\prime},T), T=(T[1],T[2],,T[t],h)T=(T[1],T[2],\ldots,T[t],h^{*}), and |diff(m,mi)|>d|\text{diff}(m,m_{i})|>d, for all 1iq1\leq i\leq q.

We show that if such 𝒜\mathcal{A} exists, then we can build a probabilistic polynomial time algorithm 𝒜\mathcal{A}^{\prime} which has the following input and output:

Input: an existentially unforgeable CDSS Σ\Sigma and a collision-resistant hash function hh, both with security parameter \ell.

Output: either a existential forgery (T,σ)(T,\sigma^{\prime}) of Σ\Sigma or a collision pair for hh.

Next we describe the steps of such 𝒜\mathcal{A}^{\prime}.

  1. 1.

    Simulate the probabilistic polynomial time adversary 𝒜\mathcal{A} forging an MTSS based on Σ\Sigma and hh using queries mentioned above. With non-negligible probability, this will produce a forgery (m,σ=(σ,T))(m,\sigma=(\sigma^{\prime},T)) in X, as described above. Otherwise, return “FAIL”.

  2. 2.

    If TTiT\neq T_{i}, for all 1iq1\leq i\leq q, 𝒜\mathcal{A}^{\prime} presents (T,σT,\sigma^{\prime}) as a forgery in Σ\Sigma, for σ\sigma^{\prime} the corresponding signature of TT.

  3. 3.

    If T=TiT=T_{i}, for some 1iq1\leq i\leq q, first calculate II by computing MTSS-Verify(m,σ,PKm,\sigma,PK). Then, we have:

    • In the case of weak forgery, we must have I=I=\emptyset, and so the final element of TT is h(m)h(m) and the final element of TiT_{i} is h(mi)h(m_{i}), and since mmim\neq m_{i}, 𝒜\mathcal{A}^{\prime} presents (m,mim,m_{i}) as a collision pair for hh.

    • Otherwise, we must have a strong forgery with 1|I|d1\leq|I|\leq d. So, there exists mm^{\prime} such that MTSS-Verify(m,σ,PKm^{\prime},\sigma,PK) = (1,)(1,\emptyset) and |diff(m,m)|d|\text{diff}(m,m^{\prime})|\leq d. Since |diff(m,mi)|>d|\text{diff}(m,m_{i})|>d, there must be a block m[k],k{1,,n}Im[k],k\in\{1,\ldots,n\}\setminus I, that is considered valid in mm but m[k]mi[k]m[k]\neq m_{i}[k]. Let pp be any row of the CFF matrix {\cal M} with p,k=1{\cal M}_{p,k}=1. Because T=TiT=T_{i}, this implies T[p]=h(c)=h(c)=Ti[p]T[p]=h(c)=h(c^{\prime})=T_{i}[p], for c=h(m[j1])||h(m[j2])||||h(m[js])c=h(m[j_{1}])||h(m[j_{2}])||\ldots||h(m[j_{s}]) and c=h(mi[j1])||h(mi[j2])||||h(mi[js])c^{\prime}=h(m_{i}[j_{1}])||h(m_{i}[j_{2}])||\ldots||h(m_{i}[j_{s}]), where k{j1,j2,,js}k\in\{j_{1},j_{2},\ldots,j_{s}\}. If h(m[k])=h(mi[k])h(m[k])=h(m_{i}[k]), then 𝒜\mathcal{A}^{\prime} presents (m[k],mi[k]m[k],m_{i}[k]) as a collision pair for hh. Otherwise, 𝒜\mathcal{A}^{\prime} presents (c,cc,c^{\prime}) as a collision pair for hh.

The probability p()p(\ell) that 𝒜\mathcal{A}^{\prime} succeeds is the same as the probability that 𝒜\mathcal{A} succeeds, where \ell is the security parameter. Whenever 𝒜\mathcal{A}^{\prime} succeeds, we have either an existential forgery of Σ\Sigma or a collision for hh, corresponding to steps 2 and 3, respectively. For each \ell, one of these steps is at least as likely to occur as the other, so it has probability at least 1/21/2. So, for any security parameter \ell, use one of two algorithms (an existential forger for Σ\Sigma, or a collision finder for hh) that runs in probabilistic polynomial time and succeeds with probability at least p()/2p(\ell)/2. In other words, we can technically design two adversaries 𝒜1\mathcal{A}^{\prime}_{1} and 𝒜2\mathcal{A}^{\prime}_{2} based on 𝒜\mathcal{A}^{\prime}. 𝒜1\mathcal{A}^{\prime}_{1} forges Σ\Sigma by proceeding as 𝒜\mathcal{A}^{\prime} but returning “FAIL” if it falls in the case of step 3. Similarly, 𝒜2\mathcal{A}^{\prime}_{2} finds a collision for hh or returns “FAIL” if it falls in the case of step 2. Then, either 𝒜1\mathcal{A}^{\prime}_{1} or 𝒜2\mathcal{A}^{\prime}_{2} will succeed with probability at least p()/2p(\ell)/2 for infinitely many \ell. This contradicts either the unforgeability of Σ\Sigma or the collision resistance of hh. ∎

6 Using MTSS for Redactable Signatures

Now we turn our attention to using MTSS in general and using similar algorithms to our proposed dd-MTSS for redactable signature schemes. Redactable and sanitizable signature schemes allow for parts of a message to be removed or modified while still having a valid signature without the intervention of the signer. Sanitizable signatures usually requires the existence of a semi-trusted party called the sanitizer who is entrusted to do the modifications and recomputation of the signature, in some schemes done in a transparent way (see [18]). Our proposed scheme does not deal with sanitizable, but rather with redactable signatures.

Redactable signatures have been proposed in several variants also under the names of content extraction signatures [21] and homomorphic signatures [13]. Redactable signature schemes (RSS) “allow removing parts of a signed message by any party without invalidating the respective signature” [6] and without interaction with the signer. In [21], the authors mention content extraction for privacy protection or bandwidth savings. Suppose Bob is the owner of a document signed by Alice and does not want to send the whole document to a third verifying party Cathy but only some extracted parts of the document; however, Cathy still needs to verify that Alice is the signer of the original document. Alice is agreeable with future redactions when she signed the document. An example given in [21] is that Bob has his university transcripts signed by the issuing university (Alice) and wants to submit the transcripts to a prospect employer (Cathy) without revealing some of his private information such as his date of birth. Cathy must be able to verify that the signature came from the university in spite of the redaction. In addition to the privacy application, content extraction can be used in a similar way when only part of a large document needs to be passed by Bob to Cathy for the purpose of reducing the communication bandwidth [21].

The notion of redactable signatures we consider next is in line with our general definition of MTSS, but differs in that we add another algorithm called MTSS-Redact and we require total privacy of the redacted parts. We give next a dd-MTSS version that is redactable based on ideas similar to Scheme 1 but modifying it to guarantee total privacy of redacted blocks. As mentioned in [18], the scheme proposed in [12] which was presented as Scheme 1 does not meet standard privacy requirements and in particular leaks the original message’s hash value. The scheme we propose below addresses this issue by individually signing each of the hashes of the groups of blocks, and at the time of redaction of blocks, also redacting from the signature tuple any hashes involving concatenations of the modified blocks. To avoid more complex forms of forgery, that could take advantage of combining individual signatures of concatenations coming from different signed messages, we add the same random string to the various hashes of concatenations to link the individual parts that are signed at the same time; in addition, to avoid reordering of blocks within the same message via reordering the groups, we add a group counter to these hashes before signing.

In the description below, a redaction is represented by the symbol \blacksquare.

Scheme 3: A Redactable dd-MTSS with Privacy Protection

The scheme requires an underlying CDSS Σ\Sigma, a public hash function hh, a dd-CFF(t,nt,n) matrix \mathcal{M} and a random number generator rand. The algorithms are given next:

  • MTSS-KeyGeneration()(\ell): generates a key pair (SK,PKSK,PK) using algorithm
    Σ\Sigma.KeyGeneration()(\ell).

  • MTSS-sign(m,SKm,SK): Takes as input SKSK and a message m=(m[1],,m[n])m=(m[1],\ldots,m[n]), and proceeds as follows.

    1. 1.

      Calculate hj=h(m[j]),1jnh_{j}=h(m[j]),1\leq j\leq n. Compute a random string r=r=rand().

    2. 2.

      For each 1it1\leq i\leq t, compute cic_{i} as the concatenation of all hjh_{j} such that i,j=1,1jn\mathcal{M}_{i,j}=1,1\leq j\leq n, and set T[i]=h(ci)rid(i,t+1)T[i]=h(c_{i})||r||id(i,t+1), where id(i,t+1)id(i,t+1) encodes the numbers ii and t+1t+1.

    3. 3.

      Compute h=h(m)h^{*}=h(m), set T[t+1]=hrid(t+1,t+1)T[t+1]=h^{*}||r||id(t+1,t+1) and set T=(T[1],,T[t+1])T=(T[1],\ldots,T[t+1]).

    4. 4.

      Calculate σ[i]=Σ\sigma^{\prime}[i]=\Sigma.Sign(T[i],SKT[i],SK), for each 1it+11\leq i\leq t+1 and set σ=(σ[1],σ[2],,σ[t+1])\sigma^{\prime}=(\sigma^{\prime}[1],\sigma^{\prime}[2],\ldots,\sigma^{\prime}[t+1]). Output signature σ=(σ,r)\sigma=(\sigma^{\prime},r).

  • MTSS-verify(m,σ,PKm,\sigma,PK): takes as input a message m=(m[1],,m[n])m=(m[1],\ldots,m[n]), a signature σ=(σ,r)\sigma=(\sigma^{\prime},r), and a public key PKPK, and proceed as follows.

    1. 1.

      Check if Σ\Sigma.Verify(h(m)rid(t+1,t+1),σ[t+1],PK)=1(h(m)||r||id(t+1,t+1),\sigma^{\prime}[t+1],PK)=1. Stop and output (1,)(1,\emptyset) if that is the case; continue otherwise.

    2. 2.

      Use \mathcal{M}, mm, rr and do the same process as in steps 1-3 of MTSS-sign to produce tuple T=(T[1],,T[t])T^{\prime}=(T^{\prime}[1],\ldots,T^{\prime}[t]).

    3. 3.

      For each 1it1\leq i\leq t such that Σ\Sigma.Verify(T[i],σ[i],PK)=1(T^{\prime}[i],\sigma^{\prime}[i],PK)=1, compute the set of indices of intact blocks Vi={j:i,j=1}V_{i}=\{j:\mathcal{M}_{i,j}=1\} and do V=VViV=V\cup V_{i}. Compute I={1,,n}VI=\{1,\ldots,n\}\setminus V. Output (1,I)(1,I) if |I|d|I|\leq d; output 0 otherwise.

  • MTSS-redact(m,σ,Rm,\sigma,R): takes as input a message m=(m[1],,m[n])m=(m[1],\ldots,m[n]), a signature σ=(σ,r)\sigma=(\sigma^{\prime},r), and a set R{1,,n}R\subseteq\{1,\ldots,n\} of blocks to be redacted, with |R|d|R|\leq d, and proceeds as follows.

    1. 1.

      If R=R=\emptyset, then stop and output (m,σ)(m,\sigma).

    2. 2.

      Create copies of the message and signature: m¯=m\overline{m}=m, σ¯=σ\overline{\sigma}={\sigma}, so that σ¯=(σ¯,r)\overline{\sigma}=(\overline{\sigma}^{\prime},r).

    3. 3.

      Set σ¯[t+1]=\overline{\sigma}^{\prime}[t+1]=\blacksquare.

    4. 4.

      For each jRj\in R: set m¯[j]=\overline{m}[j]=\blacksquare and for each index ii such that i,j=1\mathcal{M}_{i,j}=1 set σ¯[i]=\overline{\sigma}^{\prime}[i]=\blacksquare

    5. 5.

      Output (m¯,σ¯=(σ¯,r))(\overline{m},\overline{\sigma}=(\overline{\sigma}^{\prime},r)).

The correctness of the first three algorithms follows similar reasoning using the CFF properties as argued in Theorem 4.2, taking into account the different approach used here where tt CDSS signatures and tt CDSS verifications are required. The redaction offers total privacy in the sense of information theory, as no information related to the redacted blocks is kept in the signature, which are erased in steps 3 and 4 of MTSS-redact. The redacted blocks and redacted parts of the signature will only affect the verification of the parts of the signatures involving redacted blocks; all other block indices will be indicated as unmodified in the output of line 3 of MTSS-verify, as long as no more than dd blocks have been redacted.

In the next theorem, we establish the unforgeability of this scheme.

Theorem 6.1

Let X be a redactable dd-MTSS given in Scheme 3 based on an existentially unforgeable CDSS Σ\Sigma, on a collision-resistant hash function hh, and on a random Oracle. Then, X is existentially unforgeable.

Sketch of the proof. Similar reasoning as in Theorem 5.1 can be used to argue that the unforgeability of Σ\Sigma and the collision resistance of the hash function are sufficient to protect against forgery at the level of individual parts of the tuple signature. However, we need to rule out more complex forgery attempts that could try to combine individual signatures in the signature tuple in different ways as well as different blocks from different messages. The use of a common random string in the creation of each T[i]T[i] in lines 2 and 3 of algorithm MLSS-Sign prevents an adversary from trying to combine message blocks and signature parts of (m1,σ1),,(mq,σq)(m_{1},\sigma_{1}),\ldots,(m_{q},\sigma_{q}) coming from different calls to the random Oracle to create a new valid signature since, by definition, two different calls to the oracle will, with high probability, not produce the same random string. Furthermore, the concatenation of the encoding of the test index and the number of tuple elements, id(i,t+1)id(i,t+1), within each tuple position T[i]T[i] in line 2 and 3, makes it highly unlikely that blocks are added, removed or reordered, since this would amount to finding a collision in the hash function. ∎

7 Implementation Issues

When implementing Schemes 1-3, there are a few details that need to be considered regarding efficiency, security, and even flexibility on the inputs or outputs of the algorithms. For example, we consider a message divided into blocks that we represented as a sequence, but blocks may be represented using different data structures depending on the application and the type of data. Pohls [18] discusses sequence, set and tree data structures for blocks, which could be employed.

When signing a message using MTSS-Sign as described in Schemes 1-3, we could consider not hashing every block before concatenation, especially if the sizes of the blocks are small enough that a hash function will end up increasing the size of the data being concatenated. This approach needs to be carefully considered, since a simple concatenation does not insert a delimitation on where the individual blocks start or end, and therefore blocks with a shared suffix or prefix may lead to wrong identifications of valid/invalid. To be safe, we can hash each block before concatenating them as presented in our schemes, or ensure the concatenations use delimiters to separate blocks, or require blocks of fixed size.

The correction algorithm MTSS-Verify&Correct computes the set of indices of modified blocks II and tries to correct these blocks to their original value. If there is at least one position where two different blocks match the hash of the original one (a second preimage on the hash function), the algorithm aborts the correction process and outputs an empty string λ\lambda to represent correction error. As seen in Proposition 2 and the discussion that follows, it is very unlikely for such event to occur, and we could always choose another hash function with no collisions for certain block sizes. However, if some specific application is not flexible in the choice of hash functions and such an event happens, an interesting approach can be to correct as many modified blocks as possible, and return some information regarding the ones with collisions (such as a fail message or even a list of possible values for those specific blocks). This approach allows for partial correction of modified data, that may be interesting for some applications. Moreover, if we already know that the chosen hash function has no collisions for blocks of size up to ss, we do not need to run step 4 for every possible block of size s\leq s, and we could stop this loop as soon as we find the first match, improving the efficiency of the method.

Regarding the multiple hash computations that happen in the correction algorithm, one could consider some improvements. Note that we already compute the hashes of all unmodified blocks when we do the call of MTSS-Verify to obtain the set II, so these hashes of unmodified blocks could be reused in step 2. Moreover, we repeat for every modified block in II the same process of computing all blocks of size s\leq s and their corresponding hash values. For the cases where we have a big set of modified blocks (big dd), one may consider to pre-compute all these values, in case the application can handle the storage that this will require.

8 Parameter Relations

For MTSS with modification location only (Scheme 1), a message mm is split into nn blocks of no specific size, and a location capability parameter dd needs to be decided. These parameters are used to construct a dd-CFF(t,nt,n) with number of rows t=Θ(d2logn)t=\Theta(d^{2}\log n) if using [19], t=dnt=d\sqrt{n} if using [17], and t=q2t=q^{2} when using [9], for any prime power qq and positive integer kk that satisfy qdk+1q\geq dk+1. The signature size is impacted by dd and nn, since there are t+1t+1 hashes as part of the signature. Therefore, while smaller sizes of blocks and larger dd give a more precise location of modifications, they also increase the size of the signature.

Now consider the exact same message mm, but for the case of an MTSS with correction capabilities (Scheme 2). This scheme requires that the blocks have a small size of up to ss bits, which implies that the very same message mm now has many more blocks nnn^{\prime}\gg n. A larger number of blocks directly increases the size of the signature. But now, the dd that was enough to locate the modifications before may not be enough anymore, since modifications that were in one big block before now may be spread over several small blocks. When locating the modifications, the algorithm aborts the process if the number of modified blocks is larger than the expected location capability dd, which may cause the scheme to fail to correct more often if this value is not increased as nn^{\prime} increases.

The size of the signature σ\sigma in Schemes 1 and 2 is |σ|=|h(.)|×(t+1)+|σ||\sigma|=|h(.)|\times({\color[rgb]{0,0,0}{t}}+1)+|\sigma^{\prime}|, and in Scheme 3 is |σ|=|σ|×(t+1)+|r||\sigma|=|\sigma^{\prime}|\times({\color[rgb]{0,0,0}{t}}+1)+|r|, for σ\sigma^{\prime} a classical digital signature, rr a random bit string, h(.)h(.) a traditional hash function, and tt the number of rows of a dd-CFF(t,nt,n). The input consists of a message of size |m|=N|m|=N in bits, divided into nn blocks, each of size at most ss, so Nn×sN\approx n\times s. In Scheme 2, given NN, we need to wisely choose ss so it is small enough to allow corrections of blocks, while guaranteeing nn is small enough to have a reasonable |σ||\sigma|. We cannot expect our signature to be as small as the ones from classical digital signature schemes, since we require extra information in order to be able to locate and correct portions of the data. In summary what we want is n×s|σ|n\times s\gg{\color[rgb]{0,0,0}{|\sigma|}}.

The next examples show that even for small s=8s=8, we still have a reasonably small signature.

Example 1: N=1N=1 GB = 2302^{30} bytes = 23023=2332^{30}2^{3}=2^{33} bits, h=h= SHA-256256, s=log|h|=log28=8s=\log|h|=\log 2^{8}=8 bits, d=4d=4, and we use RSA with a 20482048-bit modulus. Then we have n=N/s=233/23=230n=N/s=2^{33}/2^{3}=2^{30} blocks, with 88 bits each. Consider the dd-CFF(q2,qk+1q^{2},q^{k+1}) from [9], with k=6,q=25,t=252,n=257k=6,q=25,t=25^{2},n=25^{7}. Now, since |σ|=|h(.)|×(t+1)+|σ||\sigma|=|h(.)|\times({\color[rgb]{0,0,0}{t}}+1)+|\sigma^{\prime}| in Schemes 1 and 2, we have |σ|=28×(252+1)+2048=162304|\sigma|=2^{8}\times({\color[rgb]{0,0,0}{25^{2}}}+1)+2048={\color[rgb]{0,0,0}{162304}} bits, which is 20288{\color[rgb]{0,0,0}{20288}} bytes, or 20{\color[rgb]{0,0,0}{\sim 20}} KB.

Example 2: For the same message and parameters as in Example 1, now we use a random bit string rr of size 128128 bits and create a signature as in Scheme 3. Since |σ|=|σ|×(t+1)+|r||\sigma|=|\sigma^{\prime}|\times({\color[rgb]{0,0,0}{t}}+1)+|r|, we have |σ|=2048×(252+1)+128=1282176|\sigma|=2048\times({\color[rgb]{0,0,0}{25}}^{2}+1)+128={\color[rgb]{0,0,0}{1282176}} bits, which is 160272{\color[rgb]{0,0,0}{160272}} bytes, or 160\sim{\color[rgb]{0,0,0}{160}} KB.

For Scheme 3, small blocks like in Example 2 are not required, so we can get a much smaller signature. The choice of ss and nn in this case depends on the application, and on whether we wish to correct non-redactable blocks.

9 Conclusion

We introduce modification tolerant signature schemes (MTSS) as a general framework to allow localization and correction of modifications in digitally signed data. We propose a scheme based on dd-CFFs that allows correction of modifications, and a variation without correction that gives redactable signatures. The presented schemes are provably secure and present a digital signature of size close to the best known lower bound for dd-CFFs.

Interesting future research includes an implementation of the schemes proposed here, with practical analysis of parameter selection for specific applications. In addition, new solutions for MTSS beyond dd-CFFs can also be of interest. The dd-CFF treats every block the same and allows for any combination of up to dd blocks to be modified. Specific applications can have different requirements about what combinations of blocks can be modified by specifying a different authorized modification structure. Moreover, other hypotheses about more likely distribution of modifications throughout the document can be used for efficiency; for example, modified blocks may be concentrated close to each other. One idea in this direction is to use blocks with sub-blocks for increasing granularity of modification location and to aid correction. This would involve matrices with a smaller dd for the bigger blocks and a larger dd for sub-blocks of a block, picking these parameters with the aim of decreasing tt and consequently signature size. It is out of the scope of this paper to go into detailed studies of other types of modification location matrices, which we leave for future work.

References

  • [1] G. Ateniese, D. H. Chou, B. de Medeiros, and G. Tsudik. Sanitizable Signatures. In European Symposium on Research in Computer Security (ESORICS 2005), Lecture Notes in Computer Science, 3679, pages 159–177.
  • [2] P. S. L. M. Barreto, H. Y. Kim and V. Rijmen. Toward secure public-key blockwise fragile authentication watermarking. In IEE Proceedings - Vision, Image and Signal Processing, 149(2):57–62, 2002.
  • [3] R. G. Biyashev and S. E. Nyssanbayeva. Algorithm for creating a digital signature with error detection and correction. Cybernetics and Systems Analysis, 48(4):489–497, 2012.
  • [4] A. De Bonis and G. Di Crescenzo. Combinatorial group testing for corruption localizing hashing. In Computing and Combinatorics, pages 579–591, Berlin, Heidelberg, 2011. Springer Berlin Heidelberg.
  • [5] A. De Bonis and G. Di Crescenzo. A group testing approach to improved corruption localizing hashing. Cryptology ePrint Archive, Report 2011/562, 2011. https://eprint.iacr.org/2011/562.
  • [6] D. Derler, H. C. Pöhls, K. Samelin, and D. Slamanig. A general framework for redactable signatures and new constructions. In International Conference on Information Security and Cryptology (ICISC 2015), Lecture Notes in Computer Science, 9558, pages 3–19. Springer, 2015.
  • [7] G. Di Crescenzo, R. Ge, and G. R. Arce. Design and analysis of dbmac, an error localizing message authentication code. In IEEE Global Telecommunications Conference, 2004. GLOBECOM’04., volume 4, pages 2224–2228, 2004.
  • [8] P. D’Arco and D. Stinson. Fault tolerant and distributed broadcast encryption. In Proceedings of the 2003 RSA Conference on The Cryptographers’ Track, pages 263–280. Springer-Verlag, 2003.
  • [9] P. Erdös, P. Frankl and Z. Füredi, Families of finite sets in which no set is covered by the union of r others, Israel J. Math., 51 (1985), 79–89.
  • [10] Z. Füredi, On r-Cover-free Families, Journal of Combinatorial Theory, 73 (1996), 172–173.
  • [11] M. T. Goodrich, M. J. Atallah, and R. Tamassia. Indexing information for data forensics. In ACNS, 2005.
  • [12] T. B. Idalino, L. Moura, R. F. Custódio, and D. Panario. Locating modifications in signed data for partial data integrity. Information Processing Letters, 115(10):731 – 737, 2015.
  • [13] R. Johnson, D. Molnar, D. Song, and D. Wagner. Homomorphic signature schemes. In Topics in Cryptology - Cryptographers Track - RSA 2002., pages 244–262. Springer, 2002.
  • [14] J. Katz, Y. Lindell. Introduction to Modern Cryptography. Chapman & Hall/CRC, 2007.
  • [15] A. J. Menezes, S. A. Vanstone, and P. C. Van Oorschot. Handbook of Applied Cryptography. CRC Press, Inc., Boca Raton, FL, USA, 1st edition, 1996.
  • [16] H. Niederreiter, H. Wang, and C. Xing. Function Fields Over Finite Fields and Their Applications to Cryptography. In: Garcia A., Stichtenoth H. (eds) Topics in Geometry, Coding Theory and Cryptography. Algebra and Applications, 2006, 59–104.
  • [17] J. Pastuszak, J. Pieprzyk, and J. Seberry. Codes identifying bad signature in batches. In INDOCRYPT, volume 1977 of Lecture Notes in Computer Science, pages 143–154. Springer, 2000.
  • [18] H. C. Pöhls. Increasing the Legal Probative Value of Cryptographically Private Malleable Signatures. Ph.D. Thesis, University of Passau, 2018.
  • [19] E. Porat, A. Rothschild. Explicit nonadaptive combinatorial group testing schemes. IEEE Transactions on Information Theory 57 (2011) 7982– 7989.
  • [20] R. Safavi-Naini and H. Wang. New results on multi-receiver authentication codes. In Kaisa Nyberg, editor, Advances in Cryptology — EUROCRYPT’98, pages 527–541. Springer Berlin Heidelberg, 1998.
  • [21] R. Steinfeld, L. Bull, and Y. Zheng. Content Extraction Signatures. In Information Security and Cryptology — ICISC 2001. Lecture Notes in Computer Science, vol 2288, pages 285–304
  • [22] D. Stinson and M. Paterson. Cryptography: Theory and Practice, Fourth Edition. CRC Press, 4th edition, 2018.
  • [23] G. M. Zaverucha and D. Stinson. Group testing and batch verification. In Proceedings of the 4th International Conference on Information Theoretic Security, ICITS’09, pages 140–157, 2009.