This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

11institutetext: Sun Yat-Sen University22institutetext: Huawei Noah’s Ark Lab

Secure Linear Aggregation Using Decentralized Threshold Additive Homomorphic Encryption For Federated Learning

Haibo Tian 11    Fangguo Zhang 11    Yunfeng Shao 22    Bingshuai Li Dr. Tian and Prof. Zhang were with the GuangDong Province Key Laboratory of Information Security Technology, School of Data and Computer Science, Sun Yat-Sen University, Guangzhou, Guangdong, 510275, P. R. China e-mail: {tianhb, isszhfg}@mail.sysu.edu.cn. Dr. Shao and Dr. Li were with Huawei Noah’s Ark Lab. e-mail: {shaoyunfeng, libingshuai}@huawei.com. 22
Abstract

Secure linear aggregation is to linearly aggregate private inputs of different users with privacy protection. The server in a federated learning (FL) environment can fulfill any linear computation on private inputs of users through the secure linear aggregation. At present, based on pseudo-random number generator and one-time padding technique, one can efficiently compute the sum of user inputs in FL, but linear calculations of user inputs are not well supported. Based on decentralized threshold additive homomorphic encryption (DTAHE) schemes, this paper provides a secure linear aggregation protocol, which allows the server to multiply the user inputs by any coefficients and to sum them together, so that the server can build a full connected layer or a convolution layer on top of user inputs. The protocol adopts the framework of Bonawitz et al. to provide fault tolerance for user dropping out, and exploits a blockchain smart contract to encourage the server honest. The paper gives a security model, security proofs and a concrete lattice based DTAHE scheme for the protocol. It evaluates the communication and computation costs of known DTAHE construction methods. The evaluation shows that an elliptic curve based DTAHE is friendly to users and the lattice based version leads to a light computation on the server.

Keywords:
P

rivacy Protection, Secure Linear Aggregation, Additive Homomorphic Encryption, Smart Contract.

1 Introduction

Federated learning (FL) is intended to train better machine learning models on decentralized real-world data. The models then could be used to build more intelligent equipments for people, such as cars, wearable devices or browsers. McMahan et al. [1] proposed a well known FL protocol. The players in their protocol include users who owned data and a parameter server that aggregates model information of users. The protocol runs periodically. In each period, the parameter server randomly selects some users to upload their local model parameters, and averages the parameters to update a global learning model. A user in a period downloads the global learning model, feeds their local data, runs a deep learning network locally and gets updated local model parameters, the information of which is sent to the parameter server. In different periods the server may select different users, and within a period some of the selected users may drop out.

As real-world data are usually sensitive, an important problem in FL is data privacy. Although user data are not directly sent to the parameter server in FL, information of local model parameters may leak the raw data of users. Fredrikson et al. [2] show how to recover train samples from prediction results of a model. Rubaie and Chang [3] exploit feature vectors to reconstruct raw input data of a model. Chai et al. [4] show how to recover preferences of users by model gradient data. Bonawitz et al. [5] believe that recent updated local model parameters of a user may leak raw data of the user.

There are mainly two approaches to solve the data privacy problem in FL. One uses differential privacy and the other uses cryptographic tools. The work of Martin et al. [6] is an earlier report of the differential privacy approach. Wei et al. [7] point out a tradeoff between the convergence performance and privacy protection levels of the differential privacy method. For a fixed privacy protection level, the number of users increases, the convergence performance behaves better. However, if the number of users in each period is limited, one may use the cryptographic tools based approach. Bonawitz et al. [5] propose an elegant solution based on one-time padding and a secure pseudorandom generator.

To the best of our knowledge, the solution of Bonawitz et al. [5] is the only work suitable for the FL using cryptographic tools. They take the data privacy problem in FL as an secure aggregation problem. And they show some new requirements of a secure aggregation protocol for the FL. Except a security requirement, other requirements are as follow:

  1. 1.

    The protocol should operates on a high-dimensional vectors;

  2. 2.

    The protocol should tolerate users dropping out;

  3. 3.

    The protocol should be communication efficient even with a new set of uses on each period.

With these requirements, Bonawitz et al. [5] show that previous works are unsatisfactory which include some works based on homomorphic encryption schemes. In detail, they believe that solutions based on Paillier scheme [8, 9, 10, 11] are either computationally expensive or require additional trusted dealer, and solutions based on ElGamal scheme [12, 13, 14, 15] need a high expansion factor considering the size of the group elements and that of the model parameters.

Considering the development of the communication technology, we believe that a moderate expansion factor is acceptable and the functionality of a protocol is more important. Liu et al. [16] proposed a federated forest where a parameter server should find the maximal value of user inputs. Zhuo et al. [17] proposed a federated deep reinforcement learning model where a parameter server needs to build a multi-layer perception on user inputs. The users in these scenarios are usually not mobile users so that the communication cost is not the dominate factor. However, to protect the privacy of users, a sum only secure aggregation is not enough.

We provide a secure linear aggregation protocol to enrich the parameter server. It could naturally be used in the federated averaging algorithm [1] in the same way as the secure sum aggregation [5]. It also could be used in [17] to build a linear multi-layer perception to get a reinforcement learning model. It may be adapted in [16] with the homomorphic encryption [18] to find the maximal value of user inputs. For simplicity, the solution here is only based on decentralized threshold additive homomorphic encryption (DTAHE) schemes.

Currently, there are three known methods to construct a DTAHE scheme. The first method is to distribute many secret shares to a user. Bendlin and Damgård [19] proposed a threshold homomorphic encryption scheme in this way. It relies on secret sharing schemes with general access structure. Boneh et al. [20] proposed such a scheme based on secret sharing schemes constructed by a monotone formulae. The second method is to use a large modulus for coefficients of an element in a polynomial ring. Boneh et al. [20] propose to use Shamir secret sharing scheme in this way. The last method is to use the ElGamal encryption as a basic building tool [21]. We exclude the Paillier encryption based construction method since it is hard to produce a shared key pair in a distributed manner without a trusted dealer. According to our evaluation, in the secure linear aggregation protocol, the communication overhead of the first method is too high and the computation overhead of the third method is a bit high. The second method also has a drawback since a large modulus means a higher polynomial degree when the noise bound is fixed [22]. So we provide a new method to construct a DTAHE scheme with security proofs. It does not increase the modulus size and polynomial degree.

The secure linear aggregation protocol is against an active adversary [5]. An active adversary could corrupt a parameter server and ask tt users to decrypt a cipher of a target user. It is not easy to defend against the attack. Note that Bonawitz et al. [5] add a consistency check round to solve a similar problem in their protocol. However, even we add such a round, the problem still exists since the target user may have dropped out before the consistency check round. To solve the problem, we introduce a blockchain system. We design a smart contract to record ciphers of users and to check the evaluation process of the parameter server. If the parameter server deviates its expected behaviours in a period, the parameter server could be punished and the users in that period could get compensation.

There exists a lot of works to introduce a blockchain into the FL. A top complained problem in FL is the incentive of users to participate a learning process. Researchers propose blockchain enabled models [23, 24, 25, 26, 27] to give rewards to data owners. Basically, model initializer proposes model parameters and rewards in the blockchain, data owners choose their interested model to download the current model parameters, to train an updated model by their local private inputs and to update information of model parameters to the blockchain, and miners of the blockchain aggregate inputs of users to get a new global model parameter and rewards relevant users. Pokhrel and Choi [28] and Kim et al. [29] show some theoretical results about the performance of blockchain enabled FL considering the delays in a blockchain. A simulation result in [30] shows that the FL without a blockchain is most efficient. Another motivation to introduce a blockchain system to the FL is about the trustiness of a learning model. Sarpatwar et al. [31] propose to model and capture provenance of an overall learning process for verification. Awan et al. [32] propose to record data produced in a learning task in a blockchain for verification. We use a blockchain as a trusted third party to verify the behaviours of the parameter server. Since a verification process usually could be separated from a learning process, the FL and the blockchain could work at their own paces.

In summary, our contributions are as follows.

  • We give a definition of DTAHE for FL and provide a basic linear aggregation protocol based on the definition. The basic protocol supports a parameter server to linearly operate ciphers of model parameters from different users.

  • We provide a secure linear aggregation protocol against an active adversary with security model and proofs. It shows how to use a blockchain smart contract to help a secure linear aggregation protocol.

  • We provides a new method to construct a lattice based instance of the DTAHE scheme with security proofs. Evaluations show that the DTAHE scheme leads to a lightweight computation on the server side.

2 Preliminaries

2.1 Basic Notations

For any set XX, we denote by |X||X| the number of elements of the set XX. If xx is a string, |x||x| denotes its bit length. And if xx is a vector, |x||x| denotes the dimension of the vector. x||yx||y denotes the bit catenation of two strings xx and yy.

Let R=[x]/(f(x))R=\mathbb{Z}[x]/(f(x)) be a polynomial ring where f(x)f(x) is a monic irreducible polynomial of degree dd. Elements of the ring RR is denoted by vectors. For aR\vec{a}\in R, the coefficients of a\vec{a} is denoted by aia_{i} such that a=i=0d1aixi\vec{a}=\sum_{i=0}^{d-1}a_{i}\cdot x^{i}. The infinity norm of a||\vec{a}|| is defined as maxi|ai|max_{i}|a_{i}| and the expansion factor of R is defined as δR=max{||ab||/(||a||||b||):a,bR}\delta_{R}=max\{||\vec{a}\cdot\vec{b}||/(||\vec{a}||\cdot||\vec{b}||):\vec{a},\vec{b}\in R\}.

Let h>1h>1 be an integer. Then h\mathbb{Z}_{h} denotes a set of integers (h2,h2](-\frac{h}{2},\frac{h}{2}]. The symbol /q\mathbb{Z}/q\mathbb{Z} denotes a ring on integers {0,,q1}\{0,\ldots,q-1\}. For xx\in\mathbb{Z}, [x]h[x]_{h} denotes the unique integer in h\mathbb{Z}_{h} with [x]h=xmodh[x]_{h}=x\bmod h. For xR\vec{x}\in R, [x]h[\vec{x}]_{h} denotes the element in RR obtained by applying []h[\cdot]_{h} to all its coefficients. For xx\in\mathbb{R}, x\lfloor x\rceil denotes rounding to the nearest integer and x\lfloor x\rfloor, x\lceil x\rceil denote rounding up or down.

Let λ\lambda be an integer as the security parameter. A function negl(λ)negl(\lambda) is negligible in λ\lambda if negl(λ)=o(1/λc)negl(\lambda)=o(1/\lambda^{c}) for every cc\in\mathbb{N}. An event occurs with negligible probability if the probability of the event is negl(λ)negl(\lambda). An event occurs with overwhelming probability if its complement occurs with negligible probability.

Given a probability distribution 𝒟\mathcal{D}, we use x𝒟x\leftarrow\mathcal{D} to denote that xx is sampled from 𝒟\mathcal{D}. For a set XX, xXx\leftarrow X denotes that xx is sampled uniformly from XX. A distribution χ\chi over integers is called BB-bounded if it is supported on [B,B][-B,B].

2.2 Federated Learning

McMahan et al. [1] described the FL framework. We give a briefly review to see their federated averaging algorithm. As shown in Fig. 1, there is a parameter server S1S_{1} and many users. S1S_{1} exchanges model parameters with users to collaboratively build a better learning model. The federated averaging algorithm runs periodically. In each period, S1S_{1} selects nn users randomly. In Fig. 1, we intended to show two periods with totally different users. Next we focus on a period ee where users in a set UeU_{e}.

Refer to caption
Figure 1: The framework of FL

The FL in a period is as follows.

  1. 1.

    A user uUeu\in U_{e} is activated to download the global parameters from the parameter server.

  2. 2.

    The user uu feeds their local data samples to get updated model parameters. And the parameters or their gradients ωu\omega_{u} with the number of local data samples nun_{u} should be sent to S1S_{1}.

  3. 3.

    The parameter server S1S_{1} compute the updated global model parameters ω=1uUenuuUenuωu\omega=\frac{1}{\sum_{u\in U_{e}}n_{u}}\sum_{u\in U_{e}}n_{u}\omega_{u}.

The parameter server S1S_{1} continues to randomly activate a set of users in the next period until the parameters ω\omega converge to that of the global optimal learning model.

Users could upload the nun_{u} together with ωu\omega_{u} so that the server could get the value uUenu\sum_{u\in U_{e}}n_{u} and compute the updated ω\omega. So a sum only aggregation is enough for the federated averaging algorithm. However, a parameter server may execute more computations in other federated learning models [16, 17], where a sum only aggregation is not enough.

2.3 The Model Of Secure Aggregation For FL

We adapt the notion of message-driven entities in [33] to the FL security model [5] to describe the participants in a FL scenario. A message-driven entity is initially invoked by an environment process. Each entity has their initial states. Once invoked, an entity waits for an activation that can happen for a message from the network or an environment process. On activation, the entity processes the incoming message or the arguments from a process with its current internal state, and generates a new internal state, outgoing messages and a cumulative output. Once an activation is completed, the entity waits for the next activation if it does not stop.

A protocol π\pi consists of a server side progress πS\pi_{S} and a client side progress πu\pi_{u}. An FL process in S1S_{1} invokes πS\pi_{S}, which runs periodically until πS\pi_{S} is stopped by the FL process. In each period, πS\pi_{S} randomly selects a set of users UU and activates their client side progresses. A user uu should have voluntarily invoked their πu\pi_{u} before S1S_{1} activates them. On activation, πu\pi_{u} exchanges messages with πS\pi_{S} until πS\pi_{S} finishes a period or πu\pi_{u} is stopped by the user. There may be some auxiliary entities for the security of the protocol π\pi. A common setup includes a certificate authority (CA) [5]. A CACA is invoked and runs permanently to receive certificate requests from users and issue certificates to users. In our protocol, we also introduce blockchain miners BMBM. Miners are invoked to receive transactions from users and the server, and to execute smart contracts on commonly consented transactions.

The server and each user have a bi-directional secure channel between them. Practically, it reuses a transport layer secure channel established by FL processes. Users do not have direct links in the protocol π\pi. The communications of users are transferred by the server. Users and the CACA may exchange messages at an initial stage. When the protocol π\pi runs, the CACA could be offline. Blockchain miners usually have a point-to-point (P2P) network to deliver transactions and blocks. When a blockchain is used in π\pi, FL entities should have the ability to send and receive transactions to and from the P2P network.

An active adversary 𝒜\mathcal{A} could corrupt the server and users in π\pi. For each period, the number of corrupted parties is at most ncn_{c} [5]. When 𝒜\mathcal{A} corrupts an entity, it knows the internal states and long-term secrets of the entity, and controls all the behaviours of the entity from that point. Obviously, the server S1S_{1} could be corrupted to act as a malicious server. When 𝒜\mathcal{A} corrupts a user, it could send transactions on behalf of the user.

We does not allow an adversary to corrupt CACA or BMBM since we do not try to model the security of a CA system or a blockchain system. We simply make assumptions about them:

  • A transaction from a user or the server will be executed and recorded correctly by blockchain miners within a bounded delay.

  • A certificate from a CACA proves the binding relation of a verification key and the real identity of a user.

We define GOπ,e,{S1}Ue,𝒜(x,r)GO_{\pi,e,\{S_{1}\}\cup U_{e},\mathcal{A}}(\vec{x},\vec{r}) as a global output for π\pi in a period ee where x={xS}{xu}uUe\vec{x}=\{x_{S}\}\cup\{x_{u}\}_{u\in U_{e}} denotes the inputs of the server S1S_{1} and the inputs of users in the set UeU_{e} of the period ee, and r={r0,rS}{ru}uUe\vec{r}=\{r_{0},r_{S}\}\cup\{r_{u}\}_{u\in U_{e}} denotes the random inputs of the adversary 𝒜\mathcal{A}, server S1S_{1} and users in UeU_{e}. Let

GOπ,e,{S1}Ue,𝒜(x)GO_{\pi,e,\{S_{1}\}\cup U_{e},\mathcal{A}}(\vec{x})

be the random variable about GOπ,e,{S1}Ue,𝒜(x,r)GO_{\pi,e,\{S_{1}\}\cup U_{e},\mathcal{A}}(\vec{x},\vec{r}). The randomness comes from the adversary 𝒜\mathcal{A}, server S1S_{1} and users in UeU_{e}.

We define the input privacy of a user [5] as follows.

Definition 1

(Input Privacy of a User). For any uncorrupted user uUeu\in U_{e} in a period ee, the privacy of its input is assured if

GOπ,e,{S1}Ue,𝒜(x)GOπ,e,{S1}Ue,𝒜(x)GO_{\pi,e,\{S_{1}\}\cup U_{e},\mathcal{A}}(\vec{x})\approx GO_{\pi,e,\{S_{1}\}\cup U_{e},\mathcal{A}}(\vec{x}^{\prime})

where x={xS}{xv}vUe\{u}{0}\vec{x}^{\prime}=\{x_{S}\}\cup\{x_{v}\}_{v\in U_{e}\backslash\{u\}}\cup\{0\} and “\approx” means computationally indistinguishable.

The security goal in [5] is input privacy of all uncorrupted users. Apparently, if the privacy of all users are protected, the privacy of a user is protected. If the privacy of each uncorrupted user in protected, the privacy of all uncorrupted users are protected. So the two definitions are equivalent.

2.4 DTAHE

Boneh et al. [20] give a definition of decentralized threshold fully homomorphic encryption (DTFHE). We refine it as a DTAHE definition for the FL scenario. A DTAHE scheme is a tuple of probabilistic polynomial time (PPT) algorithms DTAHE=(SetupDTAHE=(Setup, KeyGenKeyGen, ShareShare, CombKeyCombKey, EncEnc, EvalEval, ParDecParDec, FinDec)FinDec).

  1. 1.

    Setup(1λ)parmSetup(1^{\lambda})\rightarrow parm: It takes as input a security parameter λ\lambda, outputs system parameters parmparm.

  2. 2.

    KeyGen(parm)(sku,pku)KeyGen(parm)\rightarrow(sk_{u},pk_{u}): It takes as input the system parameter parmparm to produce public and private keys for a user uu.

  3. 3.

    Share(parm,{pkv}vU\{u},t,sku){ev,u}vU\{u}Share(parm,\{pk_{v}\}_{v\in U\backslash\{u\}},t,sk_{u})\rightarrow\{e_{v,u}\}_{v\in U\backslash\{u\}}: It takes as input the system parameter parmparm, public keys of users in a set UU excluding the user uu, a threshold value tt and private keys skusk_{u} of the user uu, to produce encrypted shares ev,ue_{v,u} for each user vU\{u}v\in U\backslash\{u\}.

  4. 4.

    CombKey(parm,{pku}uU)pkCombKey(parm,\{pk_{u}\}_{u\in U})\rightarrow pk: It takes as input the system parameter parmparm, public keys of a set of users in UU, and produces an encryption key pkpk.

  5. 5.

    Enc(parm,pk,mu)cuEnc(parm,pk,m_{u})\rightarrow c_{u}: It takes as input the system parameter parmparm, a public key pkpk and a message mum_{u} from a user uu, and produces a ciphertext cuc_{u}.

  6. 6.

    Eval(parm,{cu}uU,{αu}uU)c^Eval(parm,\{c_{u}\}_{u\in U},\{\alpha_{u}\}_{u\in U})\rightarrow\hat{c}: It takes as the system parameter parmparm, ciphers {cu}uU\{c_{u}\}_{u\in U} and coefficients {αu}uU\{\alpha_{u}\}_{u\in U}, and produces an evaluated cipher c^=uUαucu\hat{c}=\sum_{u\in U}\alpha_{u}\cdot c_{u}.

  7. 7.

    ParDec(parm,c^,{eu,v}vU)m^uParDec(parm,\hat{c},\{e_{u,v}\}_{v\in U})\rightarrow\hat{m}_{u}: It takes as input the system parameter parmparm, the cipher c^\hat{c}, and a set of encrypted shares {eu,v}vU\{u}\{e_{u,v}\}_{v\in U\backslash\{u\}} to the user uu, and produces a partially decrypted value m^u\hat{m}_{u}.

  8. 8.

    FinDec(parm,t,c^,{m^u}uV)mFinDec(parm,t,\hat{c},\{\hat{m}_{u}\}_{u\in V})\rightarrow m: It takes as input the system parameter parmparm, the threshold value tt, the cipher c^\hat{c} and partially decrypted ciphers {m^u}uV\{\hat{m}_{u}\}_{u\in V} from users in a set VV with |V|t|V|\geq t, and produces a plaintext mm.

One could simply give an ElGamal based DTAHE instance following the constructions in [21], which justifies the correctness of the DTAHE definition.

2.5 DTAHE Model

We adapt the model of DTFHE in [20] for the DTAHE. The first definition is evaluation correctness.

Definition 2

(Evaluation Correctness). A DTAHE scheme for a set of users UU satisfies evaluation correctness if for all λ\lambda and tt, the following holds:

For an evaluated cipher

c^Eval(parm,{cu}uU,{αu}uU)\hat{c}\leftarrow Eval(parm,\{c_{u}\}_{u\in U},\{\alpha_{u}\}_{u\in U})

the probability

Pr[FinDec(parm,t,c^,{ParDec(parm,c^,{eu,v}vU)}uV)=uUαumu]Pr\left[\begin{gathered}FinDec\left(\begin{gathered}parm,t,\hat{c},\hfill\\ {\{ParDec(parm,\hat{c},{\{{e_{u,v}}\}_{v\in U}})\}_{u\in V}}\hfill\\ \end{gathered}\right)\hfill\\ =\sum\limits_{u\in U}{{\alpha_{u}}}\cdot{m_{u}}\hfill\\ \end{gathered}\right]

is overwhelming where

cuEnc(parm,pk,mu),c_{u}\leftarrow Enc(parm,pk,m_{u}),
({ev,u}vU)Share(parm,{pkv}vU\{u},t,sku),(\{e_{v,u}\}_{v\in U})\leftarrow Share(parm,\{pk_{v}\}_{v\in U\backslash\{u\}},t,sk_{u}),
(sku,pku)KeyGen(parm),(sk_{u},pk_{u})\leftarrow KeyGen(parm),
pkCombKey(parm,{pku}uU),pk\leftarrow CombKey(parm,\{pk_{u}\}_{u\in U}),

and

parmSetup(1λ).parm\leftarrow Setup(1^{\lambda}).

The second definition is sematic security. It captures the privacy of messages.

Definition 3

(Sematic Security). We say that a DTAHE scheme for a user set UU satisfies sematic security if for all λ\lambda, the following holds:

For any PPT adversary 𝒜\mathcal{A}, the following experiments Expt𝒜,Sem(1λ)Expt_{\mathcal{A},Sem}(1^{\lambda}) outputs 11 with probability 12+negl(λ)\frac{1}{2}+negl(\lambda):

  • Expt𝒜,Sem(1λ)Expt_{\mathcal{A},Sem}(1^{\lambda}):

    1. 1.

      The adversary outputs UU and VV where |U|=n|U|=n and |V|=t|V|=t specify an access structure.

    2. 2.

      The challenger runs

      parmSetup(1λ),parm\leftarrow Setup(1^{\lambda}),
      (sku,pku)KeyGen(parm),(sk_{u},pk_{u})\leftarrow KeyGen(parm),
      {ev,u}vU\{u}Share(parm,{pkv}vU\{u},t,sku),{\{{e_{v,u}}\}_{v\in U\backslash\{u\}}}\leftarrow Share\left(\begin{gathered}parm,{\{p{k_{v}}\}_{v\in U\backslash\{u\}}},\hfill\\ t,s{k_{u}}\hfill\\ \end{gathered}\right),
      pkCombKey(parm,{pku}uU),pk\leftarrow CombKey(parm,\{pk_{u}\}_{u\in U}),

      and provides (parm,pk,{{ev,u}vU\{u}}uU)(parm,pk,\{\{e_{v,u}\}_{v\in U\backslash\{u\}}\}_{u\in U}) to 𝒜\mathcal{A}.

    3. 3.

      𝒜\mathcal{A} outputs a set SUS\subseteq U such that |S|<t|S|<t. It submits message vectors {mu,0,mu,1}uU\{m_{u,0},m_{u,1}\}_{u\in U} and SS to the challenger.

    4. 4.

      The challenger provides 𝒜\mathcal{A} the shares {{su,v}vU}uS\{\{s_{u,v}\}_{v\in U}\}_{u\in S} and a cipher set

      {cuEnc(parm,pk,mu,b)}uU, for b{0,1}.\{c_{u}\leftarrow Enc(parm,pk,m_{u,b})\}_{u\in U}\text{, for }b\in\{0,1\}.
    5. 5.

      𝒜\mathcal{A} outputs a guess bit bb^{\prime}. The experiment outputs 11 if b=bb=b^{\prime}.

The last definition is simulation security. It captures the privacy of shared secrets and private keys of users.

Definition 4

(Simulation Security). A DTAHE scheme satisfies simulation security if for all λ\lambda, the following holds:

There is a stateful PPT algorithm 𝒞=(𝒞1,𝒞2)\mathcal{C}=(\mathcal{C}_{1},\mathcal{C}_{2}) such that for any PPT adversary 𝒜\mathcal{A}, the following experiments Expt𝒜,Real(1λ)Expt_{\mathcal{A},Real}(1^{\lambda}) and Expt𝒜,Ideal(1λ)Expt_{\mathcal{A},Ideal}(1^{\lambda}) are indistinguishable:

  • Expt𝒜,Real(1λ)Expt_{\mathcal{A},Real}(1^{\lambda}):

    1. 1.

      The adversary outputs UU and VV where |U|=n|U|=n and |V|=t|V|=t specify an access structure.

    2. 2.

      The challenger runs

      parmSetup(1λ),parm\leftarrow Setup(1^{\lambda}),
      (sku,pku)KeyGen(parm),(sk_{u},pk_{u})\leftarrow KeyGen(parm),
      ({ev,u}vU)Share(parm,{pkv}vU\{u},t,sku),({\{{e_{v,u}}\}_{v\in U}})\leftarrow Share\left(\begin{gathered}parm,{\{p{k_{v}}\}_{v\in U\backslash\{u\}}},\hfill\\ t,s{k_{u}}\hfill\\ \end{gathered}\right),
      pkCombKey(parm,{pku}uU),pk\leftarrow CombKey(parm,\{pk_{u}\}_{u\in U}),

      and provides (parm,pk,{{ev,u}vU}uU)(parm,pk,\{\{e_{v,u}\}_{v\in U}\}_{u\in U}) to 𝒜\mathcal{A}.

    3. 3.

      𝒜\mathcal{A} outputs a set SUS^{*}\subseteq U with |S|=t1|S^{*}|=t-1 and messages {mu}uU\{m_{u}\}_{u\in U}.

    4. 4.

      The challenger provides 𝒜\mathcal{A} the shares {{su,v}vU}uS\{\{s_{u,v}\}_{v\in U}\}_{u\in S^{*}} in {{eu,v}vU}uS\{\{e_{u,v}\}_{v\in U}\}_{u\in S^{*}} and a cipher set

      {cuEnc(parm,pk,mu)}uU.\{c_{u}\leftarrow Enc(parm,pk,m_{u})\}_{u\in U}.
    5. 5.

      𝒜\mathcal{A} issues a polynomial number of adaptive queries of the form (SU,{cu}uU,{αu}uU)(S\subseteq U,\{c_{u}\}_{u\in U^{*}},\{\alpha_{u}\}_{u\in U^{*}}) where UUU^{*}\subseteq U. For each query, the challenger computes c^Eval(parm,{cu}uU,{αu}uU)\hat{c}\leftarrow Eval(parm,\{c_{u}\}_{u\in U^{*}},\{\alpha_{u}\}_{u\in U^{*}}) and provides 𝒜\mathcal{A}

      {m^uParDec(parm,c^,{eu,v}vU)}uS.\{\hat{m}_{u}\leftarrow ParDec(parm,\hat{c},\{e_{u,v}\}_{v\in U})\}_{u\in S}.
    6. 6.

      At the end of the experiment, 𝒜\mathcal{A} outputs a distinguishing bit bb.

  • Expt𝒜,Ideal(1λ)Expt_{\mathcal{A},Ideal}(1^{\lambda}):

    1. 1.

      The adversary outputs UU and VV where |U|=n|U|=n and |V|=t|V|=t specify an access structure.

    2. 2.

      The challenger runs

      (parm,pk,{{ev,u}vU}uU,st)𝒞1(1λ,U,t)(parm,pk,\{\{e_{v,u}\}_{v\in U}\}_{u\in U},st)\leftarrow\mathcal{C}_{1}(1^{\lambda},U,t)

      and provides (parm,pk,{{ev,u}vU}uU)(parm,pk,\{\{e_{v,u}\}_{v\in U}\}_{u\in U}) to 𝒜\mathcal{A}.

    3. 3.

      𝒜\mathcal{A} outputs a set SUS^{*}\subseteq U with |S|=t1|S^{*}|=t-1 and messages {mu}uU\{m_{u}\}_{u\in U}.

    4. 4.

      The challenger provides 𝒜\mathcal{A} shares {{su,v}vU}uS\{\{s_{u,v}\}_{v\in U}\}_{u\in S^{*}} and ciphers

      {cuEnc(parm,pk,mu)}uU.\{c_{u}\leftarrow Enc(parm,pk,m_{u})\}_{u\in U}.
    5. 5.

      𝒜\mathcal{A} issues a polynomial number of adaptive queries of the form (SU,{cu}uU,{αu}uU)(S\subseteq U,\{c_{u}\}_{u\in U^{*}},\{\alpha_{u}\}_{u\in U^{*}}) where UUU^{*}\subseteq U. For each query, the challenger runs

      {m^u}uS𝒞2(S,{cu}uU,{αu}uU,{mu}uU,{cu}uU,st){\{{\hat{m}_{u}}\}_{u\in S}}\leftarrow{\mathcal{C}_{2}}\left(\begin{gathered}S,{\{{c_{u}}\}_{u\in{U^{*}}}},{\{{\alpha_{u}}\}_{u\in{U^{*}}}},\hfill\\ {\{{m_{u}}\}_{u\in U}},{\{{c_{u}}\}_{u\in U}},st\hfill\\ \end{gathered}\right)

      and provides {m^u}uS\{\hat{m}_{u}\}_{u\in S} to 𝒜\mathcal{A}.

    6. 6.

      At the end of the experiment, 𝒜\mathcal{A} outputs a distinguishing bit bb.

3 Secure Linear Aggregation Protocol

We provides two protocols in this section. The first is a basic protocol showing how to embed a DTAHE scheme to the secure aggregation framework in [5]. The second is the secure linear aggregation protocol against an active adversary.

3.1 A Basic Protocol

As shown in Fig. 2, initially a server S1S_{1} runs parmSetup(1λ)parm\leftarrow Setup(1^{\lambda}) to provide a system-wide parameters for all users, and each user runs (sku,pku)KeyGen(parm)(sk_{u},pk_{u})\leftarrow KeyGen(parm) to produce their key pairs. The server and users then runs a four-round protocol π\pi with an agreed threshold value tt. We next focus on a period ee to show the protocol.

Refer to caption
Figure 2: Basic Aggregation Protocol from DTAHE
  1. 1.

    Round 1 (AdvertiseKeys). When a user uu is active, it packs a message as mu,1=(u,pku)m_{u,1}=(u,pk_{u}) and sends the message to the server S1S_{1}. The user then waits for the first response from S1S_{1} or stops.

    The server S1S_{1} makes a set Ue1=U_{e}^{1}=\emptyset. After it receives a message mu,1=(u,pku)m_{u,1}=(u,pk_{u}), it sets Ue1=Ue1{u}U_{e}^{1}=U_{e}^{1}\cup\{u\}. When |Ue1|>t|U_{e}^{1}|>t, S1S_{1} packs a message as mS,1={mu,1}uUe1m_{S,1}=\{m_{u,1}\}_{u\in U_{e}^{1}}. The message mS,1m_{S,1} is broadcasted to all the users in Ue1U_{e}^{1} as their responses.

  2. 2.

    Round 2 (ShareKeys). When a user uu receives mS,1m_{S,1}, it makes a Ue1U_{e}^{1} set from mS,1m_{S,1} and checks |Ue1|t|U_{e}^{1}|\geq t. If the verification passes, the user executes as follows:

    1. (a)

      It makes a public key set {pkv}vUe1\{u}\{pk_{v}\}_{v\in U_{e}^{1}\backslash\{u\}} from mS,1m_{S,1} and produce encrypted shares by

      {ev,u}vUe1Share(parm,{pkv}vUe1\{u},t,sku).{\{{e_{v,u}}\}_{v\in U_{e}^{1}}}\leftarrow Share\left(\begin{gathered}parm,{\{p{k_{v}}\}_{v\in U_{e}^{1}\backslash\{u\}}},\hfill\\ t,s{k_{u}}\hfill\\ \end{gathered}\right).
    2. (b)

      It packs mu,2=(u,{(v,ev,u)}vUe1\{u})m_{u,2}=(u,\{(v,e_{v,u})\}_{v\in U_{e}^{1}\backslash\{u\}}) and sends the message to S1S_{1}. Then uu waits for the second response or stops.

    The server S1S_{1} builds a set Ue2U_{e}^{2} in the same way as Ue1U_{e}^{1} satisfying Ue2Ue1U_{e}^{2}\subseteq U_{e}^{1}. When |Ue2|>t|U_{e}^{2}|>t, S1S_{1} packs a message mS,2,v=({(u,ev,u)}uUe2\{v})m_{S,2,v}=(\{(u,e_{v,u})\}_{u\in U_{e}^{2}\backslash\{v\}}) for each user vUe2v\in U_{e}^{2}, and sends them to the users in Ue2U_{e}^{2}.

  3. 3.

    Round 3 (CipherCollection). When a user uu receives a message mS,2,um_{S,2,u}, it makes a set Ue2U_{e}^{2} from mS,2,um_{S,2,u}. If |Ue2|t|U_{e}^{2}|\geq t, uu executes as follows:

    • It runs pkCombKey(parm,{pku}uUe2)pk\leftarrow CombKey(parm,\{pk_{u}\}_{u\in U_{e}^{2}}).

    • It runs cuEnc(parm,pk,mu)c_{u}\leftarrow Enc(parm,pk,m_{u}) to compute the cipher of its private input mum_{u}.

    • It packs a message mu,3=(u,cu)m_{u,3}=(u,c_{u}) and sends the message to S1S_{1}. Then uu waits for the third response or stops.

    The server S1S_{1} builds a set Ue3U_{e}^{3} in the same way as Ue1U_{e}^{1} satisfying Ue3Ue2U_{e}^{3}\subseteq U_{e}^{2}. It runs

    c^Eval(parm,{cu}uUe3,{αu}uUe3)\hat{c}\leftarrow Eval(parm,\{c_{u}\}_{u\in U_{e}^{3}},\{\alpha_{u}\}_{u\in U_{e}^{3}})

    and sends mS,3=c^m_{S,3}=\hat{c} to users in Ue3U_{e}^{3}.

  4. 4.

    Round 4 (Decryption). When a user uu receives mS,3m_{S,3}, it runs

    m^uParDec(parm,c^,{eu,v}vUe2).\hat{m}_{u}\leftarrow ParDec(parm,\hat{c},\{e_{u,v}\}_{v\in U_{e}^{2}}).

    uu then packs a message mu,4=(u,m^u)m_{u,4}=(u,\hat{m}_{u}) and sends the message to the server S1S_{1}. The user uu finishes the client protocol πu\pi_{u} for the period ee and stops.

    The server S1S_{1} builds a set Ue4U_{e}^{4} in the same way as Ue1U_{e}^{1} satisfying Ue4Ue3U_{e}^{4}\subseteq U_{e}^{3}. If |Ue4|t|U_{e}^{4}|\geq t, it runs

    mFinDec(parm,c^,{m^u}uUe4)m\leftarrow FinDec(parm,\hat{c},\{\hat{m}_{u}\}_{u\in U_{e}^{4}})

    to get mm as an aggregated value. The server finishes the server side protocol πS\pi_{S} for the period ee and waits for the next period.

Remark 1

The basic protocol uses the framework in [5] to tolerate user dropping out. It needs the sever to be honest. A malicious server may simply return a cipher cuc_{u} as mS,3m_{S,3} to decrypt the cipher.

3.2 The Secure Linear Aggregation Protocol

Bonawitz et al. [5] use a certificate authority (CA) and an extra consistency round to defend against an active adversary. We also use a CA but keep our protocol four rounds. We introduce a smart contract in a blockchain to encourage the server S1S_{1} and users to be honest.

The blockchain is described as σλt+1=Υ(σλt,Tx)\vec{\sigma}_{\lambda_{t}+1}=\Upsilon(\vec{\sigma}_{\lambda_{t}},Tx) [34] where Υ\Upsilon is a state transition function, σλt\vec{\sigma}_{\lambda_{t}} is the system state in a block height λt\lambda_{t}, and TxTx is a transaction in the system. A transaction could be described as

Tx=(from,to,value,data,aux,sig)Tx=(from,to,value,data,aux,sig)

where fromfrom and toto fields are the sender and receiver accounts of the transaction, valuevalue field is the values to be transferred from the sender to the receiver, datadata field is arbitrary data of the sender, auxaux field is the auxiliary information used in the blockchain and sigsig field is the sender’s signature. The instance of the blockchain certainly could be the Ethereum [34]. Any other blockchain system supporting state transition is fine.

Initially, the server S1S_{1} has an account accSacc_{S} and a user uu has an account accuacc_{u} in the blockchain. The server should deploy a smart contract EncCheckEncCheck on the blockchain which will have an account accEacc_{E} after deployment.

Definition 5

(EncCheckEncCheck). The smart contract includes three functions. A function InitInit is for the server S1S_{1} to deposit values and set parameters. A function RecordRecord is for a user to record their encrypted inputs. A function CheckCheck is for the server S1S_{1} and users to check the correctness of a protocol transcript.

  • InitInit: If the transaction TxTx is signed by the server S1S_{1}, the datadata field of the transaction includes “InitInit” as the function name, and a threshold value tt and the number of periods prdsprds supported by the contract as the arguments, the function checks a deposit variable DepDep of the server S1S_{1}. If Dep<MinValueprdsDep<MinValue*prds, it checks that Dep+valueMinValueprdsDep+value\geq MinValue*prds where MinValueMinValue is the smallest deposit value for a period. If the check fails, it stops. Otherwise, it transfers values from the server account to the contract account, and stores (accS,t,Dep,prds)(acc_{S},t,Dep,prds) in the contract.

  • RecordRecord: If the datadata field of the transaction includes “RecordRecord” as the function name, and an epoch session id esiduesid_{u} and a cipher caccuc_{acc_{u}} as the arguments, the function stores (esidu,accu,caccu)(esid_{u},acc_{u},c_{acc_{u}}) in the contract.

  • CheckCheck: If the transaction TxTx is singed by the server S1S_{1}, the datadata field of the transaction includes “CheckCheck” as the function name, and a termination flag drawdraw, an epoch session id esidSesid_{S}, a cipher cSc_{S}, a list of user accounts LAL_{A}, and coefficients {αaccu}accuLA\{\alpha_{acc_{u}}\}_{acc_{u}\in L_{A}} as the arguments, the function checks whether esidSesid_{S} is new. If the esidSesid_{S} is used before in another transaction TxTx^{\prime}, it checks whether the two transactions are the same. If they are different, the check fails. If they are the same, the function stops. The function forms a set

    AA={accu:accuLAesidu=esidS}.A_{A}=\{acc_{u}:acc_{u}\in LA\wedge esid_{u}=esid_{S}\}.

    If esidSesid_{S} is new, it checks that the size |AA|t|A_{A}|\geq t, and

    cS=Eval(parm,{caccu}accuAA,{αaccu}accuAA).c_{S}=Eval(parm,\{c_{acc_{u}}\}_{acc_{u}\in A_{A}},\{\alpha_{acc_{u}}\}_{acc_{u}\in A_{A}}).

    If any check fails, the value MinValueMinValue is shared by users in the AAA_{A} set. Otherwise, it updates prds=prds1prds=prds-1, and if drawdraw is truetrue, it transfers the deposit back to the account of the server after six new blocks. When the deposit is cleared, the contract suicides.

Now the server S1S_{1} has two tasks to initialize the protocol. At first, the server S1S_{1} runs parmSetup(1λ)parm\leftarrow Setup(1^{\lambda}) to provide system-wide parameters for all users. Secondly, the server S1S_{1} produces a transaction as

TxS,1=(accS,accE,value,Inittprds,aux,sig)Tx_{S,1}=(acc_{S},acc_{E},value,Init||t||prds,aux,sig)

to initialize the deployed smart contract.

A user uu has three tasks to participate in the protocol II. At first, it runs (sku,pku)KeyGen(parm)(sk_{u},pk_{u})\leftarrow KeyGen(parm) to produce protocol keys. Secondly, it produces a transaction

Txu,1=(accu,accu,0,pku,aux,sig)Tx_{u,1}=(acc_{u},acc_{u},0,pk_{u},aux,sig)

and sends the transaction to the blockchain to store its public keys. Thirdly, the user uu should produce a signature key pair (skSigu,vkSigu)SIG.Gen(1λ)(sk_{Sigu},vk_{Sigu})\leftarrow SIG.Gen(1^{\lambda}) using a signature scheme (SIG.Gen,SIG.Sign,SIG.Ver)(SIG.Gen,SIG.Sign,SIG.Ver) that is existential unforgeable against adaptive chosen messages (EUF-CMA). And then uu applies for a certificate certucert_{u} of the verifying key vkSiguvk_{Sigu} from a CA.

The CA and miners are auxiliary entities in the protocol. A CA is used to defend against a “Sybil” attack where an adversary may create many blockchain accounts to act as users. The miners of a blockchain receive transactions, pack them into blocks, reach a consensus on a block and update the global state. The public keys of users, parameters of the server S1S_{1}, and deposited values of the server S1S_{1} are states of the blockchain maintained by the miners. The whole picture of the protocol is in Fig. 3. We next focus on a period ee to show the protocol.

Refer to caption
Figure 3: The Secure Linear Aggregation Protocol
  1. 1.

    Round 1 (AdvertiseAccounts). When a user uu is active, it looks up the deposit DepDep, supported periods prdsprds and the threshold tt in the smart contract account accEacc_{E} indexed by the server account accSacc_{S}. If DepDep, tt, or prdsprds are too small for the user’s local policy, the user simply stops. Otherwise, it produces a signature

    σuSIG.Sign(skSigu,accu||Tu)\sigma_{u}\leftarrow SIG.Sign(sk_{Sigu},acc_{u}||T_{u})

    where TuT_{u} is a time stamp of the user. It packs a message as mu,1=(u,accu,Tu,σu,certu)m_{u,1}=(u,acc_{u},T_{u},\sigma_{u},cert_{u}) and sends the message to the server S1S_{1}. The user uu then waits for the first response from S1S_{1} or stops.

    The server S1S_{1} builds a set Ue1U_{e}^{1} as in the basic protocol. It then packs a message as mS,1=({mu,1}uUe1,esid)m_{S,1}=(\{m_{u,1}\}_{u\in U_{e}^{1}},esid) where esidesid is a session number for the period. The message mS,1m_{S,1} is broadcasted to all the users in Ue1U_{e}^{1} set as their response.

  2. 2.

    Round 2 (ShareKeys). When a user uu receives mS,1m_{S,1}, it parses mS,1m_{S,1} to extract the identities of users which form a set Ue1U_{e}^{1}. If checks that |Ue1|t|U_{e}^{1}|\geq t, and the time stamps, signatures and certificates are correct. If the verifications pass, uu looks up the blockchain to find public keys of users in Ue1U_{e}^{1} by their accounts which form a set {pkv}vUe1\{u}\{pk_{v}\}_{v\in U_{e}^{1}\backslash\{u\}}, and executes as in the basic protocol.

    The server S1S_{1} executes in the same way as in the basic protocol to produce mS,2,vm_{S,2,v} for a user vv.

  3. 3.

    Round 3 (CipherCollection). When a user uu receives a message mS,2,um_{S,2,u}, it makes a set Ue2U_{e}^{2} from mS,2,um_{S,2,u}. If |Ue2|t|U_{e}^{2}|\geq t, it executes in the same way as in the basic protocol to compute pkpk and cuc_{u}. Then uu creates a transaction

    Txu,2=(accu,accE,0,Recordesidcu,aux,sig)Tx_{u,2}=(acc_{u},acc_{E},0,Record||esid||c_{u},aux,sig)

    and sends the transaction to the blockchain. The user uu then waits for a response transaction of the server S1S_{1} from the blockchain network or stops.

    The server S1S_{1} makes a set Ue3=U_{e}^{3}=\emptyset. After it receives a transaction Txu,2Tx_{u,2}, if the transaction includes the same esidesid as the current session number, S1S_{1} sets Ue3=Ue3{u}U_{e}^{3}=U_{e}^{3}\cup\{u\} where the identity uu is indexed by the accuacc_{u}. It then runs

    c^Eval(parm,{cu}uUe3,{αu}uUe3),\hat{c}\leftarrow Eval(parm,\{c_{u}\}_{u\in U_{e}^{3}},\{\alpha_{u}\}_{u\in U_{e}^{3}}),

    and packs a transaction

    TxS,2=(accS,accE,0,Checkfalseesidc^{accu}uUe3,aux,sig)T{x_{S,2}}=\left(\begin{gathered}ac{c_{S}},ac{c_{E}},0,Check||false||\hfill\\ esid||\hat{c}||{\{ac{c_{u}}\}_{u\in U_{e}^{3}}},aux,sig\hfill\\ \end{gathered}\right)

    and sends the transaction to the blockchain.

  4. 4.

    Round 4 (Decryption). When a user uu receives TxS,2Tx_{S,2}, it builds a Ue3U_{e}^{3} set indexed by {accu}uUe3\{acc_{u}\}_{u\in U_{e}^{3}}. If Ue3Ue2U_{e}^{3}\subseteq U_{e}^{2}, |Ue3|t|U_{e}^{3}|\geq t, and the esidesid field is the current session number, uu then executes in the same way as in the basic protocol to interact with the server S1S_{1}.

    The server S1S_{1} executes in the same way as in the basic protocol to get an aggregated result.

When the FL process is to stop, S1S_{1} may send a TxS,2Tx_{S,2} with the truetrue termination flag. The EncCheckEncCheck contract will delay the withdraw action until six blocks are generated. This is intentionally for users to check the correctness of the server. Before the deposit is given back to the server, a user could send TxS,2Tx_{S,2} received in an period to the contract. If there are different TxS,2Tx_{S,2} transactions with the same esidesid field, the deposit will be shared by users in that period.

Remark 2

In a period, two transactions are transferred by the blockchain network. Miners, the parameter server and users will receive transactions, parse them, and process them independently.

3.3 Security Analysis

We prove the security of the secure linear aggregation protocol against an active adversary. In a high level, we divide the global outputs into four views, and analyze the advantages of the adversary as a new view appears.

Theorem 3.1

Suppose that a DTAHE scheme is sematic and simulation secure, a signature scheme SIGSIG is EUF-CMA secure, and the secret sharing scheme in the DTAHE has a privacy property [20]. Let tt be the threshold value, ncn_{c} be the maximal number of corrupted users in a period, and λm\lambda_{m} be the minimal number of private inputs to make the final aggregated value hide one input. For any uncorrupted user uu, against an adversary 𝒜\mathcal{A}, if λmtnc\lambda_{m}\geq t-n_{c} and the deposit of the server S1S_{1} in the smart contract EncCheckEncCheck has not lost, the protocol satisfies GOπ,e,{S1}Ue,𝒜(x)GOπ,e,{S1}Ue,𝒜(x)GO_{\pi,e,\{S_{1}\}\cup U_{e},\mathcal{A}}(\vec{x})\approx GO_{\pi,e,\{S_{1}\}\cup U_{e},\mathcal{A}}(\vec{x}^{\prime}).

Proof: A global output GOπ,e,{S1}Ue,𝒜(x)GO_{\pi,e,\{S_{1}\}\cup U_{e},\mathcal{A}}(\vec{x}) consists of four views {viewi}i{1,4}\{view_{i}\}_{i\in\{1,...4\}}. The view1view_{1} includes the outputs of the initialization procedures and the first round of the server S1S_{1} and users. The views {view2,view3,view4}\{view_{2},view_{3},view_{4}\} include the outputs of their corresponding rounds. We consider two global outputs GOGO and GOGO^{\prime} with inputs x\vec{x} and x\vec{x}^{\prime}. Note that the only difference of the two inputs are the inputs of the user uu. We denote the four views of GOGO^{\prime} as {viewi}i{1,4}\{view_{i}^{\prime}\}_{i\in\{1,...4\}}.

  • Initially, view1view_{1} and view1view_{1}^{\prime} include the same certificates, public keys and accounts in the initialization procedure of the protocol π\pi.

  • When π\pi executes, view1view_{1} and view1view_{1}^{\prime} include message-signature pairs of users and the first response of the server. view2view_{2} and view2view_{2}^{\prime} include encrypted shares and the second response of the server. Since private inputs of the user uu are not involved, the two global outputs are indistinguishable until now.

  • The view3view_{3} and view3view_{3}^{\prime} consist of two kinds of transactions, separately. Transactions of users in Ue3U_{e}^{3} could form a ciphertext set {cu}uUe3\{c_{u}\}_{u\in U_{e}^{3}}. The transaction of the server S1S_{1} includes c^\hat{c}. In the view3view_{3}, the ciphertext of the target user uu is cuEnc(parm,pk,mu)c_{u}\leftarrow Enc(parm,pk,m_{u}), and the evaluated ciphertext of the server S1S_{1} is c^Eval(parm,{cu}uUe3,{αu}uUe3)\hat{c}\leftarrow Eval(parm,\{c_{u}\}_{u\in U_{e}^{3}},\{\alpha_{u}\}_{u\in U_{e}^{3}}). In the view3view_{3}^{\prime}, the ciphertext of the target user uu is cuEnc(parm,pk,0)c_{u}\leftarrow Enc(parm,pk,0), and the evaluated ciphertext of the server is produced in the same way. Now the chances of 𝒜\mathcal{A} increases.

    • 𝒜\mathcal{A} could exploit the differences of the two views view3view_{3} and view3view_{3}^{\prime}. Since the only difference is the message mum_{u} and the message 0, if the DTAHE is sematic secure, 𝒜\mathcal{A} has only negligible advantages to distinguish the two views.

    • 𝒜\mathcal{A} could exploit the combined views of view3view2view_{3}\cup view_{2} and view3view2view_{3}^{\prime}\cup view_{2}^{\prime}. Suppose that 𝒜\mathcal{A} could get the shared secrets in view2view_{2} and view2view_{2}^{\prime}. Then 𝒜\mathcal{A} could decrypt cuc_{u} by the ParDecParDec and FinDecFinDec algorithms. However, since the DTAHE is simulation secure, the secret shares are protected well. 𝒜\mathcal{A} has only negligible advantages to distinguish the two combined views.

    • 𝒜\mathcal{A} could exploit the combined views of view3view2view1view_{3}\cup view_{2}\cup view_{1} and view3view2view1view_{3}^{\prime}\cup view_{2}^{\prime}\cup view_{1}^{\prime} in the two global outputs.

      At first, Suppose that 𝒜\mathcal{A} corrupts users in CC^{*} and |C|=nc|C^{*}|=n_{c}. It then knows secret keys {sku}uC\{sk_{u}\}_{u\in C^{*}} which could be used to decrypt ciphers in view2view_{2}. The decrypted shares may help 𝒜\mathcal{A} to decrypt cuc_{u}. However, since we assume nc<tn_{c}<t and the secret sharing scheme has a privacy property[20], this event happens negligibly.

      Secondly, 𝒜\mathcal{A} may make special view1view_{1} and view1view_{1}^{\prime}. 𝒜\mathcal{A} could establish tt blockchain accounts and store public keys in each account. Then 𝒜\mathcal{A} acts as users to cheat the target user uu. However, since messages in view1view_{1} and view1view_{1}^{\prime} are signed by users who have certificates from a CA, the accounts of 𝒜\mathcal{A} could not be identified by the user uu. And since we explicitly include a time stamp in the first message of a user, 𝒜\mathcal{A} could not replay messages of users in other periods. To satisfy the claims, the signature scheme SIGSIG should be EUF-CMA secure.

  • view4view_{4} and view4view_{4}^{\prime} consists of partially decrypted ciphers and the final aggregated value. 𝒜\mathcal{A} has more chances:

    • 𝒜\mathcal{A} could exploit the partially decrypted ciphers to discover secret shares. Since the DTAHE is simulation secure, 𝒜\mathcal{A} has only negligible advantages.

    • 𝒜\mathcal{A} could exploit the final aggregated value. An obvious strategy is that in the transaction of the server TxS,2Tx_{S,2}, only cuc_{u} is included. To make the strategy more practical, 𝒜\mathcal{A} may aggregate some ciphers of corrupted users and the target user. Then the final output is the sum of nc+1n_{c}+1 inputs.

      The strategy is a possible way to get inputs of a user at the cost of the deposit in the smart contract. Since TxS,2Tx_{S,2} comes from the blockchain network, miners will execute the CheckCheck function in the EncCheckEncCheck smart contract. The function takes {accu}uUe3\{acc_{u}\}_{u\in U_{e}^{3}} as LAL_{A}, and checks the correctness of the evaluation of the server S1S_{1}. The smart contract builds a AALAA_{A}\subseteq L_{A} and checks that |AA|t|A_{A}|\geq t. So if nc+1<tn_{c}+1<t, the check fails and the server S1S_{1} will be punished. To avoid the punishment, 𝒜\mathcal{A} should prepare a LAL_{A} set with |LA|nc+λmt|L_{A}|\geq n_{c}+\lambda_{m}\geq t.

      𝒜\mathcal{A} may try to withdraw their deposit before the penalty is paid. However, the smart contract needs extra six blocks to confirm the request of the server. The six-block waiting time is the last chance of users to check the behaviours of the server. Users are encouraged to send the received transaction of the server to the blockchain since they are the possible beneficiaries.

In summary, if the cryptographic primitives are secure, λmtnc\lambda_{m}\geq t-n_{c}, and the server pays no penalties, the adversary has only negligible advantage to distinguish the two outputs GOGO and GOGO^{\prime}.

Remark 3

We explicitly introduce a parameter λm\lambda_{m} to describe the number of private inputs of uncorrupted users in a period. The parameter is in fact used in [5] implicitly. The consistency check round [5] makes sure that at least tt private inputs are summed. If ncn_{c} inputs are known by an adversary 𝒜\mathcal{A}, then in their protocol λmtnc\lambda_{m}\geq t-n_{c}.

4 A DTAHE Instance

Since all known construction methods of DTAHE schemes have some weaknesses. We provide a new construction method to produce a lattice based DTAHE scheme for an interactive protocol.

4.1 The Instance

We use the BFV scheme [22, 35] as a basic building block which is implemented in the SEAL library [36]. We need a sematic secure hybrid encryption scheme [37] HPKE=(HPKE.Gen,HPKE.Enc,HPKE.Ver)HPKE=(HPKE.Gen,HPKE.Enc,HPKE.Ver) to encrypt shares. A lattice based HPKEHPKE instance will give us a fully lattice DHAHE instance that may be post quantum secure. We also need the Shamir secret sharing scheme denoted by SS=(SS.Split,SS.Recover)SS=(SS.Split,SS.Recover).

  1. 1.

    Setup(1λ)parmSetup(1^{\lambda})\rightarrow parm: It takes as input a security parameter λ\lambda, produces a parameter set parm=(d,f(x),h,R,Rh,χ,μ,a,l,λ)parm=(d,f(x),h,R,R_{h},\chi,\mu,\vec{a},l,\lambda), where dd is the degree depending on λ\lambda of a cyclotomic polynomial f(x)f(x), h2h\geq 2 is an integer depending on λ\lambda, RR is a ring R=[x]/(f(x))R=\mathbb{Z}[x]/(f(x)), RhR_{h} is the set of polynomials in RR with coefficients in h\mathbb{Z}_{h}, χ\chi here is in fact defined as discrete Gaussian distribution, μ\mu is a uniform distribution, a\vec{a} is uniformly selected from RhR_{h} as aRh\vec{a}\leftarrow R_{h}, and ll is an integer depending on λ\lambda.

  2. 2.

    KeyGen(parm)(sku,pku)KeyGen(parm)\rightarrow(sk_{u},pk_{u}): It selects suR3\vec{s}_{u}\leftarrow R_{3} and samples euχ\vec{e}_{u}\leftarrow\chi. It sets sku,0=susk_{u,0}=\vec{s}_{u} and pku,0=[(asu+eu)]hpk_{u,0}=[-(\vec{a}\cdot\vec{s}_{u}+\vec{e}_{u})]_{h}. It runs (pku,1,sku,1)HPKE.Gen(1λ)(pk_{u,1},sk_{u,1})\leftarrow HPKE.Gen(1^{\lambda}). The output is sku=(sku,0,sku,1)sk_{u}=(sk_{u,0},sk_{u,1}) and pku=(pku,0,pku,1)pk_{u}=(pk_{u,0},pk_{u,1}).

  3. 3.

    Share(parm,{pkv}vU\{u},t,sku){ev,u}vU\{u}Share(parm,\{pk_{v}\}_{v\in U\backslash\{u\}},t,sk_{u})\rightarrow\{e_{v,u}\}_{v\in U\backslash\{u\}}: It samples eu,2χ\vec{e}_{u,2}\leftarrow\chi, computes n=|{pkv}vU\{u}|+1n=|\{pk_{v}\}_{v\in U\backslash\{u\}}|+1, sets ssu={sku,0,eu,2}ss_{u}=\{sk_{u,0},\vec{e}_{u,2}\}, and for each coefficient ssu,iss_{u,i} of the elements in ssuss_{u}, computes {sv,u,i}vUSS.Split(ssu,i,n,t,h)\{\vec{s}_{v,u,i}\}_{v\in U}\leftarrow SS.Split(ss_{u,i},n,t,h). For all the shares to vU\{u}v\in U\backslash\{u\}, it computes a cipher ev,u=HPKE.Enc(pkv,1,{sv,u,i}i|ssu|d)e_{v,u}=HPKE.Enc(pk_{v,1},\{\vec{s}_{v,u,i}\}_{i\in|ss_{u}|*d}).

  4. 4.

    CombKey(parm,{pku}uU)pkCombKey(parm,\{pk_{u}\}_{u\in U})\rightarrow pk: It computes pk=[uUpku,0]hpk=[\sum_{u\in U}pk_{u,0}]_{h}.

  5. 5.

    Enc(parm,pk,mu)cuEnc(parm,pk,m_{u})\rightarrow c_{u}: It selects uuR3\vec{u}_{u}\leftarrow R_{3}, eu,0,eu,1χ\vec{e}_{u,0},\vec{e}_{u,1}\leftarrow\chi, computes cu,0=[auu+eu,0]hc_{u,0}=[\vec{a}\cdot\vec{u}_{u}+\vec{e}_{u,0}]_{h} and cu,1=[pkuu+eu,1+h/lmu]hc_{u,1}=[pk\cdot\vec{u}_{u}+\vec{e}_{u,1}+\lfloor h/l\rfloor\cdot m_{u}]_{h}. It sets cu=(cu,0,cu,1)c_{u}=(c_{u,0},c_{u,1}).

  6. 6.

    Eval(parm,{cu}uU,{αu}uU)c^Eval(parm,\{c_{u}\}_{u\in U},\{\alpha_{u}\}_{u\in U})\rightarrow\hat{c}: It computes c^0=[uUαucu,0]h\hat{c}_{0}=[\sum_{u\in U}\alpha_{u}c_{u,0}]_{h}, c^1=[uUαucu,1]h\hat{c}_{1}=[\sum_{u\in U}\alpha_{u}c_{u,1}]_{h} and sets c^=(c^0,c^1)\hat{c}=(\hat{c}_{0},\hat{c}_{1}).

  7. 7.

    ParDec(parm,c^,{eu,v}vU)m^uParDec(parm,\hat{c},\{e_{u,v}\}_{v\in U})\rightarrow\hat{m}_{u}: It decrypts the shares for the user uUu\in U from the user vUv\in U as su,vHPKC.Dec(sku,1,eu,v)\vec{s}_{u,v}\leftarrow HPKC.Dec(sk_{u,1},e_{u,v}). It parses the shares of coefficients as shares of elements in RR, sets (seu,v,ssku,v)=su,v(\vec{se}_{u,v},\vec{ssk}_{u,v})=\vec{s}_{u,v}, then computes m^u=[c^0vUssku,v+vUseu,v]h\hat{m}_{u}=[\hat{c}_{0}\cdot\sum_{v\in U}\vec{ssk}_{u,v}+\sum_{v\in U}\vec{se}_{u,v}]_{h}.

  8. 8.

    FinDec(parm,t,c^,{m^u}uV)mFinDec(parm,t,\hat{c},\{\hat{m}_{u}\}_{u\in V})\rightarrow m: It recovers cs=[uVlium^u]hcs=[\sum_{u\in V}li_{u}\hat{m}_{u}]_{h} where liuli_{u} is the Lagrange coefficient of the user uu with respect to the user set VV, and computes the final output m=[l[c^1+cs]hh]lm=[\lfloor\frac{l\cdot[\hat{c}_{1}+cs]_{h}}{h}\rceil]_{l}.

Remark 4

If the data dimension of the message mum_{u} is greater than dd, the number of noise samples eu,2R\vec{e}_{u,2}\in R increases.

4.2 Security Analysis

The security of our scheme could be reduced to a variant of the classical ring version decisional learning with errors (RLWE) problem [38, 22]. The variant is named nn-Decision-RLWE problem.

Definition 6

(nn-Decision-RLWE). For a random set {siRh}i{1,,n}\{\vec{s}_{i}\in R_{h}\}_{i\in\{1,\ldots,n\}} and a distribution χ\chi over RR, denote with A{siRh}i{1,,n},χ,μA_{\{\vec{s}_{i}\in R_{h}\}_{i\in\{1,\ldots,n\}},\chi,\mu} the distribution by choosing a uniformly random element aRh\vec{a}\leftarrow R_{h} and nn noise term {eiχ}i{1,,n}\{\vec{e}_{i}\leftarrow\chi\}_{i\in\{1,\ldots,n\}} and outputting (a,{[asi+ei]h}i{1,,n})(\vec{a},\{[\vec{a}\cdot\vec{s}_{i}+\vec{e}_{i}]_{h}\}_{i\in\{1,\ldots,n\}}). The problem is then to distinguish between the distribution A{siRh}i{1,,n},χ,μA_{\{\vec{s}_{i}\in R_{h}\}_{i\in\{1,\ldots,n\}},\chi,\mu} and a uniform distribution μ\mu over Rhn+1R_{h}^{n+1}.

By a hybrid argument, one could conclude that if an adversary has an advantage at least ϵn-RLWE\epsilon_{n\text{-}RLWE} to solve the nn-Decision-RLWE problem, the adversary has an advantage at least 1nϵn-RLWE\frac{1}{n}\epsilon_{n\text{-}RLWE} to solve the classical RLWE problem in [38, 22].

The first proof is about the evaluation correctness in the definition 2.

Theorem 4.1

Assume that UU is the user set, χ\chi is BB-bounded and the maximal infinity norm of elements in the set {αu}uU\{\alpha_{u}\}_{u\in U} is AA, the evaluation of the DTAHE is correct with probability 11 if |U|B(1+δRA(1+2δR|U|))<h2l|U|B(1+\delta_{R}A(1+2\delta_{R}|U|))<\frac{h}{2l}.

Proof

Since

m^u=[c^0vUssku,v+vUseu,v]h,\hat{m}_{u}=[\hat{c}_{0}\cdot\sum_{v\in U}\vec{ssk}_{u,v}+\sum_{v\in U}\vec{se}_{u,v}]_{h},

we have the equation (1) due to the Lagrange interpolation.

cs=[uVliu(c^0vUssku,v+vUseu,v)]h=[uVliu(c^0vUssku,v)+uVliu(vUseu,v)]h=[c^0(uVliu(vUssku,v))+uUeu,2]h=[c^0uUsku,0+uUeu,2]h\begin{split}cs&=[\sum_{u\in V}li_{u}\cdot(\hat{c}_{0}\cdot\sum_{v\in U}\vec{ssk}_{u,v}+\sum_{v\in U}\vec{se}_{u,v})]_{h}\\ &=[\sum_{u\in V}li_{u}\cdot(\hat{c}_{0}\cdot\sum_{v\in U}\vec{ssk}_{u,v})\\ &+\sum_{u\in V}li_{u}\cdot(\sum_{v\in U}\vec{se}_{u,v})]_{h}\\ &=[\hat{c}_{0}\cdot(\sum_{u\in V}li_{u}\cdot(\sum_{v\in U}\vec{ssk}_{u,v}))+\sum_{u\in U}\vec{e}_{u,2}]_{h}\\ &=[\hat{c}_{0}\cdot\sum_{u\in U}sk_{u,0}+\sum_{u\in U}\vec{e}_{u,2}]_{h}\\ \end{split} (1)

Let X=[c^1+cs]hX=[\hat{c}_{1}+cs]_{h}. Since

c^0=[uUαucu,0]h,\hat{c}_{0}=[\sum_{u\in U}\alpha_{u}c_{u,0}]_{h},
c^1=[uUαucu,1]h,\hat{c}_{1}=[\sum_{u\in U}\alpha_{u}c_{u,1}]_{h},
cu,0=[auu+eu,0]h,c_{u,0}=[\vec{a}\cdot\vec{u}_{u}+\vec{e}_{u,0}]_{h},
cu1=[pkuu+eu,1+h/lmu]h,c_{u1}=[pk\cdot\vec{u}_{u}+\vec{e}_{u,1}+\lfloor h/l\rfloor\cdot m_{u}]_{h},
pk=[uUpku,0]h,pk=[\sum_{u\in U}pk_{u,0}]_{h},

and

pku=[(asku,0+eu)]h,pk_{u}=[-(\vec{a}\cdot sk_{u,0}+\vec{e}_{u})]_{h},

we have the equation 2.

X=[c^1+c^0uUsku,0+uUeu,2]h=[uUαu(pkuu+eu,1+h/lmu)+uUαu(auu+eu,0)(uUsku,0)+uUeu,2]h\begin{split}X&=[\hat{c}_{1}+\hat{c}_{0}\cdot\sum_{u\in U}sk_{u,0}+\sum_{u\in U}\vec{e}_{u,2}]_{h}\\ &=[\sum_{u\in U}\alpha_{u}(pk\cdot\vec{u}_{u}+\vec{e}_{u,1}+\lfloor h/l\rfloor\cdot m_{u})\\ &+\sum_{u\in U}\alpha_{u}(\vec{a}\cdot\vec{u}_{u}+\vec{e}_{u,0})\cdot(\sum_{u\in U}sk_{u,0})\\ &+\sum_{u\in U}\vec{e}_{u,2}]_{h}\\ \end{split} (2)

Let

NS0=uUαueu,1+(uUαueu,0)(uUsku,0)+uUeu,2,NS_{0}=\sum_{u\in U}\alpha_{u}\vec{e}_{u,1}+(\sum_{u\in U}\alpha_{u}\vec{e}_{u,0})\cdot(\sum_{u\in U}sk_{u,0})+\sum_{u\in U}\vec{e}_{u,2},

and

MP=uUαuh/lmu.MP=\sum_{u\in U}\alpha_{u}\lfloor h/l\rfloor\cdot m_{u}.

Let

Y=[XNS0MP]h.Y=[X-NS_{0}-MP]_{h}.

We then have the equation 3.

Y=[uUαu(pkuu)+uUαu(auu)(uUsku,0)]h=[uUαu(uu(uU(asku0+eu)))+a(uUαuuu)(uUsku,0)]h=[uUαuuuuUeu]h\begin{split}Y&=[\sum_{u\in U}\alpha_{u}(pk\cdot\vec{u}_{u})+\sum_{u\in U}\alpha_{u}(\vec{a}\cdot\vec{u}_{u})\cdot(\sum_{u\in U}sk_{u,0})]_{h}\\ &=[\sum_{u\in U}\alpha_{u}(\vec{u}_{u}\cdot(\sum_{u\in U}-(\vec{a}\cdot sk_{u0}+\vec{e}_{u})))\\ &+\vec{a}\cdot(\sum_{u\in U}\alpha_{u}\vec{u}_{u})\cdot(\sum_{u\in U}sk_{u,0})]_{h}\\ &=[-\sum_{u\in U}\alpha_{u}\vec{u}_{u}\cdot\sum_{u\in U}\vec{e}_{u}]_{h}\\ \end{split} (3)

Let NS=NS0uUαuuuuUeuNS=NS_{0}-\sum_{u\in U}\alpha_{u}\vec{u}_{u}\cdot\sum_{u\in U}\vec{e}_{u}, then X=[NS+MP]hX=[NS+MP]_{h}. We then have the equation 4.

m=[l[c^1+cs]hh]l=[lXh]l=[l(NS+uUαuh/lmu)h]l=[uUαumu]l+[lNSh]l\begin{split}m&=[\lfloor\frac{l\cdot[\hat{c}_{1}+cs]_{h}}{h}\rceil]_{l}\\ &=[\lfloor\frac{l\cdot X}{h}\rceil]_{l}\\ &=[\lfloor\frac{l\cdot(NS+\sum_{u\in U}\alpha_{u}\lfloor h/l\rfloor\cdot m_{u})}{h}\rceil]_{l}\\ &=[\sum_{u\in U}\alpha_{u}m_{u}]_{l}+[\lfloor\frac{l\cdot NS}{h}\rceil]_{l}\\ \end{split} (4)

If NS<h2lNS<\frac{h}{2l}, the above decryption is correct. Since uu,sku0R3\vec{u}_{u},sk_{u0}\leftarrow R_{3}, eu0\vec{e}_{u0}, eu1\vec{e}_{u1}, eu\vec{e}_{u}, eu2χ\vec{e}_{u2}\leftarrow\chi, the maximal infinity norm of elements in the set {αu}uU\{\alpha_{u}\}_{u\in U} is AA, the infinity norm of NSNS is

NSδRA|U|B+δR2A|U|2B+|U|B+δR2A|U|2B=|U|B(1+δRA(1+2δR|U|))\begin{split}||NS||&\leq\delta_{R}A|U|B+\delta_{R}^{2}A|U|^{2}B+|U|B+\delta_{R}^{2}A|U|^{2}B\\ &=|U|B(1+\delta_{R}A(1+2\delta_{R}|U|))\\ \end{split} (5)

The second proof is for the privacy of messages in the definition 3.

Theorem 4.2

If there is an adversary 𝒜\mathcal{A} with advantage ϵsem\epsilon_{sem} to make the experiment Expt𝒜,Sem(1λ)Expt_{\mathcal{A},Sem}(1^{\lambda}) output 11, one could construct a challenger to break the nn-decision-RLWE problem with an advantage 12ϵsem\frac{1}{2}\epsilon_{sem} under the condition that the secret sharing scheme SSSS has the privacy property [20] and the hybrid encryption scheme HPKEHPKE is sematic secure.

Proof

With |U||U| and tt, the challenger samples a |U||U|-decision-RLWE instance (x0,{xu}uU)(x_{0},\{x_{u}\}_{u\in U}). It embeds the problem instance into the DTAHE instance as follows:

parmSetup(1λ),parm\leftarrow Setup(1^{\lambda}),
parm=parm\{a}{x0},parm=parm\backslash\{\vec{a}\}\cup\{x_{0}\},
(sku,pku)KeyGen(parm),(sk_{u},pk_{u})\leftarrow KeyGen(parm),
sku,0=0;pku,0=xu,sk_{u,0}=0;pk_{u,0}=x_{u},
{ev,u}vU\{u}Share(parm,{pkv}vU\{u},t,sku),\{e_{v,u}\}_{v\in U\backslash\{u\}}\leftarrow Share(parm,\{pk_{v}\}_{v\in U\backslash\{u\}},t,sk_{u}),
pkCombKey(parm,{pku}uU).pk\leftarrow CombKey(parm,\{pk_{u}\}_{u\in U}).

It then provides (parm,pk,{{ev,u}vU\{u}}uU)(parm,pk,\{\{e_{v,u}\}_{v\in U\backslash\{u\}}\}_{u\in U}) to 𝒜\mathcal{A}.

The challenger plays with 𝒜\mathcal{A} by {sku,1}uU\{sk_{u,1}\}_{u\in U} until 𝒜\mathcal{A} outputs bb^{\prime}.

If the HPKEHPKE scheme is sematic secure, the ciphers {ev,u}vU\{u}\{e_{v,u}\}_{v\in U\backslash\{u\}} leak nothing about shares. Then from the privacy property of the SSSS scheme, if |S|<t|S|<t, 𝒜\mathcal{A} can not distinguish a secret sku,0sk_{u,0} from zero. So 𝒜\mathcal{A} should produce an educated guess bb^{\prime}.

The strategy of the challenger is to use the guess of 𝒜\mathcal{A}. If the Expt𝒜,Sem(1λ)Expt_{\mathcal{A},Sem}(1^{\lambda}) outputs 0, the challenger believes that the |U||U|-decision-RLWE instance is a uniform random sample from Rhn+1R_{h}^{n+1}.

When the input is indeed a uniform random sample from Rhn+1R_{h}^{n+1}, the advantage of 𝒜\mathcal{A} is simple negligible since the messages are masked by random values. Otherwise, the adversary has an advantage ϵsem\epsilon_{sem} by assumption. So the advantage of the challenger is 12ϵsem\frac{1}{2}\epsilon_{sem}.

The third proof is for the privacy of secret keys and shares in the definition 4.

Theorem 4.3

If the secret sharing scheme SSSS has the privacy property [20] and the hybrid encryption scheme HPKEHPKE is sematic secure, the adversary 𝒜\mathcal{A} has negligible advantage to distinguish the two experiments Exp𝒜,Real(1λ)Exp_{\mathcal{A},Real}(1^{\lambda}) and Expt𝒜,Ideal(1λ)Expt_{\mathcal{A},Ideal}(1^{\lambda}).

Proof

The proof needs a serial of hybrid experiments between an adversary 𝒜\mathcal{A} and a challenger.

  • H0H_{0}: This is the experiment Exp𝒜,Real(1λ)Exp_{\mathcal{A},Real}(1^{\lambda}) in the definition 4.

  • H1H_{1}:Same as H0H_{0}, except that the challenger simulates the ParDecParDec algorithm to produce m^u\hat{m}_{u} for queries of 𝒜\mathcal{A}. Note that 𝒜\mathcal{A} has given the challenger a set SS^{*} with the size |S|=t1|S^{*}|=t-1. From SS^{*}, the challenger could construct a set SC=S\SS_{C}=S\backslash S^{*}. For each party uSCu\in S_{C}, |S{u}|=t|S^{*}\cup\{u\}|=t. The challenger sets m^u\hat{m}_{u} as

    m~u=[liu1(h/lvUαvmv+NSc^1vSlivm^v)]h\tilde{m}_{u}=[li_{u}^{-1}(\lfloor{h/l}\rfloor\sum_{v\in U}\alpha_{v}m_{v}+NS-\hat{c}_{1}-\sum_{v\in S^{*}}li_{v}\hat{m}_{v})]_{h}

    where NSNS is defined in the theorem 4.1. If uSu\in S^{*}, the challenger computes m^u\hat{m}_{u} as in the game H0H_{0}.

    The correctness of the simulation is obviously since

    m~u=[liu1(MP+NS(c^1+vSlivm^v))]h=[liu1(MP+NS(c^1+cslium^u))]h=[liu1(MP+NSX+lium^u)]h=m^u\begin{split}\tilde{m}_{u}&=[li_{u}^{-1}(MP+NS-(\hat{c}_{1}+\sum_{v\in S^{*}}li_{v}\hat{m}_{v}))]_{h}\\ &=[li_{u}^{-1}(MP+NS-(\hat{c}_{1}+cs-li_{u}\hat{m}_{u}))]_{h}\\ &=[li_{u}^{-1}(MP+NS-X+li_{u}\hat{m}_{u})]_{h}\\ &=\hat{m}_{u}\\ \end{split} (6)
  • H2H_{2}: Same as H1H_{1}, except that the challenger shares zero as

    {ev,u}vU\{u}Share(parm,{pkv}vU\{u},t,0).\{e_{v,u}\}_{v\in U\backslash\{u\}}\leftarrow Share(parm,\{pk_{v}\}_{v\in U\backslash\{u\}},t,0).

    By the privacy property [20] of the SSSS scheme and the sematic security of the HPKEHPKE scheme, H2H_{2} and H1H_{1} are indistinguishable.

  • H3H_{3}: Same as H2H_{2}, except that NSNS is replaced by NS~\tilde{NS} as

    NS~=NS(uUsku,0)(uUαueu,0)+(uUuu,0)(uUαueu,0)\begin{split}\tilde{NS}&=NS-(\sum_{u\in U}sk_{u,0})(\sum_{u\in U}\alpha_{u}\vec{e}_{u,0})\\ &+(\sum_{u\in U}\vec{u}_{u,0})(\sum_{u\in U}\alpha_{u}\vec{e}_{u,0})\\ \end{split} (7)

    where uu,0R3\vec{u}_{u,0}\leftarrow R_{3}.

    Since sku,0R3sk_{u,0}\leftarrow R_{3}, H3H_{3} and H2H_{2} have the same distribution. In fact, NS~\tilde{NS} may appear in an experiment when the {uu,0}uU\{\vec{u}_{u,0}\}_{u\in U} happens to be part of the secret keys of users. Now the challenger does not use the private keys of users {sku}uU\{sk_{u}\}_{u\in U} or secret shares of users in UU. So the ideal experiment Expt𝒜,Ideal(1λ)Expt_{\mathcal{A},Ideal}(1^{\lambda}) could be simulated indistinguishably.

5 Performance

We evaluate the communication and computation costs of the secure linear aggregation protocol.

5.1 Communication

We have stated that there are mainly three methods [21, 19, 20] to construct a DTAHE. We concrete the method in [21] based on an elliptic curve version ElGamal (EC-ElGamal) scheme, and other methods based on the BFV [35, 13] scheme. The user side communication overhead of the secure linear aggregation protocol is calculated as follows:

i{1,,4}mu,i=(n+2)|u|+(n1)|ev,u|+|cu|+|m^u|+3|accu|+|Tu|+|σu|+|Sig|+|certu|+|aux|+|Record|+|esid|\begin{split}\sum_{i\in\{1,\ldots,4\}}m_{u,i}&=(n+2)|u|+(n-1)|e_{v,u}|+|c_{u}|\\ &+|\hat{m}_{u}|+3|acc_{u}|+|T_{u}|+|\sigma_{u}|+|Sig|\\ &+|cert_{u}|+|aux|+|Record|+|esid|\end{split} (8)

where nn is the number of users. Table I shows the main components in the equation (8). The security parameter λ\lambda is 128128. An element in EC-ElGamal takes 3333 bytes where one byte is for yy-coordinate. LRLR denotes the size of a ring element in RhR_{h}, SNSN the number of shares to each user and LNLN the number of ciphers to encrypt the user input mum_{u}. LRLR^{\prime} and LNLN^{\prime} have the same meaning as LRLR and LNLN with LRLRLR\neq LR^{\prime} and LNLNLN\neq LN^{\prime}. The method to use Shamir secret sharing in [20] is denoted as BGGJK-2, and the other is denoted as BGGJK-1.

Table 1: Communication Overheads of Main Components in the Protocol
|ev,u||e_{v,u}| |cu||c_{u}| |m^u||\hat{m}_{u}|
Pedersen[21] 33+3233+32 66|mu|66*|m_{u}| 33|mu|33*|m_{u}|
BD[19] 33+LRSN33+LR*SN 2LRLN2*LR*LN LRLNSNLR*LN*SN
BGGJK-1[20] 33+LRn433+LR*n^{4} 2LRLN2*LR*LN LRLNn4LR*LN*n^{4}
BGGJK-2[20] 33+LR33+LR^{\prime} 2LRLN2*LR^{\prime}*LN^{\prime} LRLNLR^{\prime}*LN^{\prime}
Ours 33+LR(1+LN)33+LR(1+LN) 2LRLN2*LR*LN LRLNLR*LN

Fig. 4 shows the main communication overhead of the protocol on user side with different DTAHE constructions when the number of user increases. We mainly consider the components in Table I. We set |mu|=105|m_{u}|=10^{5} for all instances, and set d=2048d=2048 and |h|=54|h|=54 so that LR=d|h|/8=13824LR=d*|h|/8=13824 bytes and LN=|mu|/d=49LN=\lceil|m_{u}|/d\rceil=49. We set t=n2/3t=\lceil n*2/3\rceil and compute SN=(n1t1)SN=\left(\begin{array}[]{l}n-1\\ t-1\end{array}\right). The values of LRLR^{\prime} and LNLN^{\prime} are relative to the number of users since the noise element in the ParDecParDec algorithm should be multiplied by (n!)2(n!)^{2}, which are limited by the Theorem 4.1 and the equation (6) in [22].

From Fig. 4, we exclude the BD[19] and BGGJK-1[20] methods to distribute many shares to a user. Our method is better than the BGGJK-2[20] when the user number is greater than 2626. The EC-ElGamal method has the best communication performance when the number of users is greater than 2020.

Refer to caption
Figure 4: Main Communication Overheads of Different DTAHE Schemes in the Protocol

5.2 Computation

Since the computation costs of the protocol are dominated by a DTAHE scheme. We give Table II to show the time cost of some DTAHE algorithms. We set the user number as n=35n=35 and implement three DTAHE schemes for comparison. The schemes are implemented in Python. The “pyOpenSSL” is used to implement the EC-ElGamal based DTAHE. The small discrete logarithm of a group element is found by the well-known “Baby-Step-Giant-Step” method. An open source library “bfv-python” is adapted to implement our scheme and the BGGJK-2 method in [20]. For simplicity, we use a public python module “eciespy” as an instance of the HPKEHPKE scheme. The CPUs are Intel(R) Core(TM) i7-8550U (1.80GHz, 1.99GHz) and the RAM is 16GB.

We use the “secp256r1” curve as the parameter set to implement the EC-ElGamal based DTAHE. For the security parameter λ=128\lambda=128, we set d=2048d=2048, |h|=54|h|=54 and |l|=17|l|=17 for our DTAHE scheme. When n=35n=35, the scheme constructed by the BGGJK-2 method in [20] requires |h|426|h|\geq 426. So we set d=16384d=16384 according to the parameter table in [39]. Each data element in mum_{u} in our test occupies 88 bits. The time costs of the three implementations are listed in the table II, which are measured in seconds. Apparently, a secure linear aggregation protocol with the EC-ElGamal based DTAHE takes the least time on user side. The protocol with our DTAHE scheme takes the least time on server side.

Table 2: Computation Costs of Some DTAHE Algorithms with Fixed Number of Users
ShareShare CombKeyCombKey and EncEnc EvalEval ParDecParDec FinDecFinDec
[21] 0.030.03 6.956.95 14.6214.62 6.326.32 441.25441.25
[20]-2 4.034.03 15.4815.48 2.402.40 17.2917.29 2.982.98
Ours 17.4617.46 7.507.50 1.521.52 51.1751.17 1.641.64

6 Conclusion

This paper shows a secure linear protocol mainly for complex federated learning models. When the communication cost is not the dominate factor, the protocol could be deployed in a federated learning model to protect the private inputs of users. The DTAHE schemes in the protocol may be further optimized to reduce the computation time. For example, parallel computing technologies may reduce the time cost of the EC-ElGamal based DTAHE, and multi-secret sharing schemes may reduce the communication and computation costs of our DTAHE scheme.

Acknowledgment

This work is supported by the National Key R&D Program of China (2017YFB0802500), Guangdong Major Project of Basic and Applied Basic Research(2019B030302008) and Natural Science Foundation of Guangdong Province of China (2018A0303130133).

References

  • [1] H. B. McMahan, E. Moore, D. Ramage, and B. A. y Arcas, “Federated learning of deep networks using model averaging,” CoRR, vol. abs/1602.05629, 2016. [Online]. Available: http://arxiv.org/abs/1602.05629
  • [2] M. Fredrikson, S. Jha, and T. Ristenpart, “Model inversion attacks that exploit confidence information and basic countermeasures,” in Proceedings of the 22Nd ACM SIGSAC Conference on Computer and Communications Security, ser. CCS ’15.   New York, NY, USA: ACM, 2015, pp. 1322–1333. [Online]. Available: http://doi.acm.org/10.1145/2810103.2813677
  • [3] M. Al-Rubaie and J. M. Chang, “Reconstruction attacks against mobile-based continuous authentication systems in the cloud,” IEEE Transactions on Information Forensics and Security, vol. 11, no. 12, pp. 2648–2663, Dec 2016.
  • [4] C. Di, W. Leye, C. Kai, and Y. Qiang, “Secure federated matrix factorization,” in FML 2019 : The 1st International Workshop on Federated Machine Learning for User Privacy and Data Confidentiality, 2019.
  • [5] K. Bonawitz, V. Ivanov, B. Kreuter, A. Marcedone, H. B. McMahan, S. Patel, D. Ramage, A. Segal, and K. Seth, “Practical secure aggregation for privacy-preserving machine learning,” in Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, ser. CCS’17.   New York, NY, USA: Association for Computing Machinery, 2017, pp. 1175–1191. [Online]. Available: https://doi.org/10.1145/3133956.3133982
  • [6] M. Abadi, A. Chu, I. Goodfellow, H. B. McMahan, I. Mironov, K. Talwar, and L. Zhang, “Deep learning with differential privacy,” in Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, ser. CCS ’16.   New York, NY, USA: Association for Computing Machinery, 2016, pp. 308–318. [Online]. Available: https://doi.org/10.1145/2976749.2978318
  • [7] K. Wei, J. Li, M. Ding, C. Ma, H. H. Yang, F. Farokhi, S. Jin, T. Q. S. Quek, and H. Vincent Poor, “Federated learning with differential privacy: Algorithms and performance analysis,” IEEE Transactions on Information Forensics and Security, vol. 15, pp. 3454–3469, 2020.
  • [8] V. Rastogi and S. Nath, “Differentially private aggregation of distributed time-series with transformation and encryption,” in Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data, ser. SIGMOD ’10.   New York, NY, USA: Association for Computing Machinery, 2010, p. 735–746. [Online]. Available: https://doi.org/10.1145/1807167.1807247
  • [9] M. Jawurek and F. Kerschbaum, “Fault-tolerant privacy-preserving statistics,” in Privacy Enhancing Technologies, S. Fischer-Hübner and M. Wright, Eds.   Berlin, Heidelberg: Springer Berlin Heidelberg, 2012, pp. 221–238.
  • [10] M. Joye and B. Libert, “A scalable scheme for privacy-preserving aggregation of time-series data,” in Financial Cryptography and Data Security, A.-R. Sadeghi, Ed.   Berlin, Heidelberg: Springer Berlin Heidelberg, 2013, pp. 111–125.
  • [11] I. Leontiadis, K. Elkhiyaoui, and R. Molva, “Private and dynamic time-series data aggregation with trust relaxation,” in Proceedings of the 13th International Conference on Cryptology and Network Security - Volume 8813.   Berlin, Heidelberg: Springer-Verlag, 2014, p. 305–320. [Online]. Available: https://doi.org/10.1007/978-3-319-12280-9_20
  • [12] E. Shi, T.-H. Chan, E. Rieffel, R. Chow, and D. Song, “Privacy-preserving aggregation of time-series data,” vol. 2, 01 2011.
  • [13] T. H. H. Chan, E. Shi, and D. Song, “Privacy-preserving stream aggregation with fault tolerance,” in Financial Cryptography and Data Security, A. D. Keromytis, Ed.   Berlin, Heidelberg: Springer Berlin Heidelberg, 2012, pp. 200–214.
  • [14] I. Leontiadis, K. Elkhiyaoui, M. Önen, and R. Molva, “Puda – privacy and unforgeability for data aggregation,” Cryptology ePrint Archive, Report 2015/562, 2015, https://eprint.iacr.org/2015/562.
  • [15] Q. Li and G. Cao, “Efficient privacy-preserving stream aggregation in mobile sensing with low aggregation error,” in Privacy Enhancing Technologies, E. De Cristofaro and M. Wright, Eds.   Berlin, Heidelberg: Springer Berlin Heidelberg, 2013, pp. 60–81.
  • [16] Y. Liu, Y. Liu, Z. Liu, Y. Liang, C. Meng, J. Zhang, and Y. Zheng, “Federated forest,” IEEE Transactions on Big Data, pp. 1–1, 2020.
  • [17] H. H. Zhuo, W. Feng, Y. Lin, Q. Xu, and Q. Yang, “Federated deep reinforcement learning,” 2020.
  • [18] J. L. H. Crawford, C. Gentry, S. Halevi, D. Platt, and V. Shoup, “Doing real work with FHE: the case of logistic regression,” in Proceedings of the 6th Workshop on Encrypted Computing & Applied Homomorphic Cryptography, WAHC@CCS 2018, Toronto, ON, Canada, October 19, 2018, 2018, pp. 1–12. [Online]. Available: https://doi.org/10.1145/3267973.3267974
  • [19] R. Bendlin and I. Damgård, “Threshold decryption and zero-knowledge proofs for lattice-based cryptosystems,” in Theory of Cryptography, D. Micciancio, Ed.   Berlin, Heidelberg: Springer Berlin Heidelberg, 2010, pp. 201–218.
  • [20] D. Boneh, R. Gennaro, S. Goldfeder, A. Jain, S. Kim, P. M. R. Rasmussen, and A. Sahai, “Threshold cryptosystems from threshold fully homomorphic encryption,” in Advances in Cryptology – CRYPTO 2018, H. Shacham and A. Boldyreva, Eds.   Cham: Springer International Publishing, 2018, pp. 565–596.
  • [21] T. P. Pedersen, “A threshold cryptosystem without a trusted party,” in Advances in Cryptology — EUROCRYPT ’91, D. W. Davies, Ed.   Berlin, Heidelberg: Springer Berlin Heidelberg, 1991, pp. 522–526.
  • [22] J. Fan and F. Vercauteren, “Somewhat practical fully homomorphic encryption,” Cryptology ePrint Archive, Report 2012/144, 2012, https://eprint.iacr.org/2012/144.
  • [23] X. Zhu, H. Li, and Y. Yu, “Blockchain-based privacy preserving deep learning,” in Information Security and Cryptology, F. Guo, X. Huang, and M. Yung, Eds.   Cham: Springer International Publishing, 2019, pp. 370–383.
  • [24] X. Bao, C. Su, Y. Xiong, W. Huang, and Y. Hu, “Flchain: A blockchain for auditable federated learning with trust and incentive,” in 2019 5th International Conference on Big Data Computing and Communications (BIGCOM), 2019, pp. 151–159.
  • [25] Y. Lu, X. Huang, K. Zhang, S. Maharjan, and Y. Zhang, “Blockchain empowered asynchronous federated learning for secure data sharing in internet of vehicles,” IEEE Transactions on Vehicular Technology, vol. 69, no. 4, pp. 4298–4311, 2020.
  • [26] Y. Qu, L. Gao, T. H. Luan, Y. Xiang, S. Yu, B. Li, and G. Zheng, “Decentralized privacy using blockchain-enabled federated learning in fog computing,” IEEE Internet of Things Journal, vol. 7, no. 6, pp. 5171–5183, 2020.
  • [27] J. Weng, J. Weng, J. Zhang, M. Li, Y. Zhang, and W. Luo, “Deepchain: Auditable and privacy-preserving deep learning with blockchain-based incentive,” IEEE Transactions on Dependable and Secure Computing, pp. 1–1, 2019.
  • [28] S. R. Pokhrel and J. Choi, “Federated learning with blockchain for autonomous vehicles: Analysis and design challenges,” IEEE Transactions on Communications, pp. 1–1, 2020.
  • [29] H. Kim, J. Park, M. Bennis, and S. Kim, “Blockchained on-device federated learning,” IEEE Communications Letters, vol. 24, no. 6, pp. 1279–1283, 2020.
  • [30] Q. Wang, Y. Guo, X. Wang, T. Ji, L. Yu, and P. Li, “Ai at the edge: Blockchain-empowered secure multiparty learning with heterogeneous models,” IEEE Internet of Things Journal, pp. 1–1, 2020.
  • [31] K. Sarpatwar, R. Vaculin, H. Min, G. Su, T. Heath, G. Ganapavarapu, and D. Dillenberger, Towards Enabling Trusted Artificial Intelligence via Blockchain.   Cham: Springer International Publishing, 2019, pp. 137–153. [Online]. Available: https://doi.org/10.1007/978-3-030-17277-0_8
  • [32] S. Awan, F. Li, B. Luo, and M. Liu, “Poster: A reliable and accountable privacy-preserving federated learning framework using the blockchain,” in Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security, ser. CCS ’19.   New York, NY, USA: Association for Computing Machinery, 2019, p. 2561–2563. [Online]. Available: https://doi.org/10.1145/3319535.3363256
  • [33] M. Di Raimondo and R. Gennaro, “New approaches for deniable authentication,” in Proceedings of the 12th ACM Conference on Computer and Communications Security, ser. CCS’ 05.   New York, NY, USA: Association for Computing Machinery, 2005, pp. 112–121. [Online]. Available: https://doi.org/10.1145/1102120.1102137
  • [34] D. G. Wood, “Ethereum: a secure decentralised g generalised transaction ledger homestead,” http://gavwood.com/paper.pdf, 2014, [Online, Accessed 26-June-2020].
  • [35] Z. Brakerski, “Fully homomorphic encryption without modulus switching from classical gapsvp,” in Advances in Cryptology – CRYPTO 2012, R. Safavi-Naini and R. Canetti, Eds.   Berlin, Heidelberg: Springer Berlin Heidelberg, 2012, pp. 868–886.
  • [36] “Microsoft SEAL (release 3.5),” https://github.com/Microsoft/SEAL, Apr. 2020, microsoft Research, Redmond, WA.
  • [37] J. Herranz, D. Hofheinz, and E. Kiltz, “Some (in)sufficient conditions for secure hybrid encryption,” Inf. Comput., vol. 208, no. 11, p. 1243–1257, Nov. 2010. [Online]. Available: https://doi.org/10.1016/j.ic.2010.07.002
  • [38] V. Lyubashevsky, C. Peikert, and O. Regev, “On ideal lattices and learning with errors over rings,” in Advances in Cryptology – EUROCRYPT 2010, H. Gilbert, Ed.   Berlin, Heidelberg: Springer Berlin Heidelberg, 2010, pp. 1–23.
  • [39] M. Albrecht, M. Chase, H. Chen, J. Ding, S. Goldwasser, S. Gorbunov, S. Halevi, J. Hoffstein, K. Laine, K. Lauter, S. Lokam, D. Micciancio, D. Moody, T. Morrison, A. Sahai, and V. Vaikuntanathan, “Homomorphic encryption security standard,” HomomorphicEncryption.org, Toronto, Canada, Tech. Rep., November 2018.