¹ Sun Yat-Sen University; ² Huawei Noah’s Ark Lab

Secure Linear Aggregation Using Decentralized Threshold Additive Homomorphic Encryption For Federated Learning

Haibo Tian¹, Fangguo Zhang¹, Yunfeng Shao², Bingshuai Li²

Dr. Tian and Prof. Zhang were with the GuangDong Province Key Laboratory of Information Security Technology, School of Data and Computer Science, Sun Yat-Sen University, Guangzhou, Guangdong, 510275, P. R. China, e-mail: {tianhb, isszhfg}@mail.sysu.edu.cn. Dr. Shao and Dr. Li were with Huawei Noah’s Ark Lab, e-mail: {shaoyunfeng, libingshuai}@huawei.com.

Abstract

Secure linear aggregation linearly aggregates the private inputs of different users while protecting their privacy. With it, the server in a federated learning (FL) environment can perform any linear computation on the private inputs of users. At present, techniques based on pseudo-random number generators and one-time padding can efficiently compute the sum of user inputs in FL, but general linear computations on user inputs are not well supported. Based on decentralized threshold additive homomorphic encryption (DTAHE) schemes, this paper provides a secure linear aggregation protocol that allows the server to multiply the user inputs by arbitrary coefficients and sum them, so that the server can build a fully connected layer or a convolution layer on top of user inputs. The protocol adopts the framework of Bonawitz et al. to tolerate users dropping out, and exploits a blockchain smart contract to encourage the server to be honest. The paper gives a security model, security proofs and a concrete lattice based DTAHE scheme for the protocol. It evaluates the communication and computation costs of known DTAHE construction methods. The evaluation shows that an elliptic curve based DTAHE is friendly to users and that the lattice based version leads to light computation on the server.

Keywords:

Privacy Protection, Secure Linear Aggregation, Additive Homomorphic Encryption, Smart Contract.

1 Introduction

Federated learning (FL) is intended to train better machine learning models on decentralized real-world data. The models could then be used to build more intelligent equipment for people, such as cars, wearable devices or browsers. McMahan et al. [1] proposed a well known FL protocol. The players in their protocol include users who own data and a parameter server that aggregates the model information of users. The protocol runs periodically. In each period, the parameter server randomly selects some users to upload their local model parameters, and averages the parameters to update a global learning model. A user in a period downloads the global learning model, feeds it their local data, runs a deep learning network locally and gets updated local model parameters, information about which is sent to the parameter server. In different periods the server may select different users, and within a period some of the selected users may drop out.

As real-world data are usually sensitive, an important problem in FL is data privacy. Although user data are not directly sent to the parameter server in FL, information about local model parameters may leak the raw data of users. Fredrikson et al. [2] show how to recover training samples from the prediction results of a model. Rubaie and Chang [3] exploit feature vectors to reconstruct the raw input data of a model. Chai et al. [4] show how to recover the preferences of users from model gradient data. Bonawitz et al. [5] believe that recently updated local model parameters of a user may leak the raw data of the user.

There are mainly two approaches to solve the data privacy problem in FL. One uses differential privacy and the other uses cryptographic tools. The work of Martin et al. [6] is an early report of the differential privacy approach. Wei et al. [7] point out a tradeoff between the convergence performance and the privacy protection level of the differential privacy method. For a fixed privacy protection level, the convergence performance improves as the number of users increases. However, if the number of users in each period is limited, one may use the approach based on cryptographic tools. Bonawitz et al. [5] propose an elegant solution based on one-time padding and a secure pseudorandom generator.

To the best of our knowledge, the solution of Bonawitz et al. [5] is the only work using cryptographic tools that is suitable for FL. They treat the data privacy problem in FL as a secure aggregation problem, and they identify some new requirements of a secure aggregation protocol for FL. Apart from the security requirement, the other requirements are as follows:

1.

The protocol should operate on high-dimensional vectors;
2.

The protocol should tolerate users dropping out;
3.

The protocol should be communication efficient even with a new set of users in each period.

Against these requirements, Bonawitz et al. [5] argue that previous works, including some based on homomorphic encryption schemes, are unsatisfactory. In detail, they believe that solutions based on the Paillier scheme [8, 9, 10, 11] are either computationally expensive or require an additional trusted dealer, and that solutions based on the ElGamal scheme [12, 13, 14, 15] need a high expansion factor considering the size of the group elements and that of the model parameters.

Considering the development of communication technology, we believe that a moderate expansion factor is acceptable and that the functionality of a protocol is more important. Liu et al. [16] proposed a federated forest where a parameter server should find the maximal value of user inputs. Zhuo et al. [17] proposed a federated deep reinforcement learning model where a parameter server needs to build a multi-layer perceptron on user inputs. The users in these scenarios are usually not mobile users, so the communication cost is not the dominant factor. However, a sum-only secure aggregation is not enough to protect the privacy of users there.

We provide a secure linear aggregation protocol to enrich the parameter server. It could naturally be used in the federated averaging algorithm [1] in the same way as the secure sum aggregation [5]. It could also be used in [17] to build a linear multi-layer perceptron to get a reinforcement learning model. It may be adapted in [16] with the homomorphic encryption [18] to find the maximal value of user inputs. For simplicity, the solution here is based only on decentralized threshold additive homomorphic encryption (DTAHE) schemes.

Currently, there are three known methods to construct a DTAHE scheme. The first method is to distribute many secret shares to a user. Bendlin and Damgård [19] proposed a threshold homomorphic encryption scheme in this way. It relies on secret sharing schemes with a general access structure. Boneh et al. [20] proposed such a scheme based on secret sharing schemes constructed from monotone formulae. The second method is to use a large modulus for the coefficients of an element in a polynomial ring. Boneh et al. [20] propose to use the Shamir secret sharing scheme in this way. The last method is to use the ElGamal encryption as a basic building tool [21]. We exclude the Paillier encryption based construction method since it is hard to produce a shared key pair in a distributed manner without a trusted dealer. According to our evaluation, in the secure linear aggregation protocol, the communication overhead of the first method is too high and the computation overhead of the third method is somewhat high. The second method also has a drawback, since a large modulus means a higher polynomial degree when the noise bound is fixed [22]. So we provide a new method to construct a DTAHE scheme with security proofs. It does not increase the modulus size or the polynomial degree.

The secure linear aggregation protocol is secure against an active adversary [5]. An active adversary could corrupt a parameter server and ask $t$ users to decrypt a cipher of a target user. It is not easy to defend against this attack. Note that Bonawitz et al. [5] add a consistency check round to solve a similar problem in their protocol. However, even if we add such a round, the problem still exists, since the target user may have dropped out before the consistency check round. To solve the problem, we introduce a blockchain system. We design a smart contract to record the ciphers of users and to check the evaluation process of the parameter server. If the parameter server deviates from its expected behaviour in a period, the parameter server could be punished and the users in that period could get compensation.

There are many works that introduce a blockchain into FL. A frequently raised problem in FL is the incentive for users to participate in a learning process. Researchers propose blockchain enabled models [23, 24, 25, 26, 27] to give rewards to data owners. Basically, a model initializer publishes model parameters and rewards in the blockchain; data owners choose a model of interest, download the current model parameters, train an updated model on their local private inputs and upload information about the model parameters to the blockchain; and miners of the blockchain aggregate the inputs of users to get new global model parameters and reward the relevant users. Pokhrel and Choi [28] and Kim et al. [29] show some theoretical results about the performance of blockchain enabled FL considering the delays in a blockchain. A simulation result in [30] shows that FL without a blockchain is the most efficient. Another motivation to introduce a blockchain system into FL is the trustworthiness of a learning model. Sarpatwar et al. [31] propose to model and capture the provenance of an overall learning process for verification. Awan et al. [32] propose to record the data produced in a learning task in a blockchain for verification. We use a blockchain as a trusted third party to verify the behaviour of the parameter server. Since a verification process can usually be separated from a learning process, the FL and the blockchain could work at their own paces.

In summary, our contributions are as follows.

•

We give a definition of DTAHE for FL and provide a basic linear aggregation protocol based on the definition. The basic protocol supports a parameter server to linearly operate ciphers of model parameters from different users.
•

We provide a secure linear aggregation protocol against an active adversary with security model and proofs. It shows how to use a blockchain smart contract to help a secure linear aggregation protocol.
•

We provide a new method to construct a lattice based instance of the DTAHE scheme with security proofs. Evaluations show that the DTAHE scheme leads to lightweight computation on the server side.

2 Preliminaries

2.1 Basic Notations

For any set $X$ , we denote by $|X|$ the number of elements of the set $X$ . If $x$ is a string, $|x|$ denotes its bit length. If $x$ is a vector, $|x|$ denotes the dimension of the vector. $x||y$ denotes the bit concatenation of two strings $x$ and $y$ .

Let $R=\mathbb{Z}[x]/(f(x))$ be a polynomial ring where $f(x)$ is a monic irreducible polynomial of degree $d$ . Elements of the ring $R$ are denoted by vectors. For $\vec{a}\in R$ , the coefficients of $\vec{a}$ are denoted by $a_{i}$ such that $\vec{a}=\sum_{i=0}^{d-1}a_{i}\cdot x^{i}$ . The infinity norm $||\vec{a}||$ is defined as $\max_{i}|a_{i}|$ and the expansion factor of $R$ is defined as $\delta_{R}=\max\{||\vec{a}\cdot\vec{b}||/(||\vec{a}||\cdot||\vec{b}||):\vec{a},\vec{b}\in R\}$ .
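As a small illustration of these definitions, the following sketch multiplies two elements of $R$ and computes their infinity norms. It assumes the specific choice $f(x)=x^{d}+1$ (the text only requires a monic irreducible $f$ ), and the helper names `ring_mul` and `inf_norm` are hypothetical.

```python
def ring_mul(a, b):
    """Multiply two coefficient lists in Z[x]/(x^d + 1), with d = len(a) = len(b).

    Assumption: f(x) = x^d + 1, so x^d is replaced by -1 during reduction.
    """
    d = len(a)
    out = [0] * d
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            k = i + j
            if k < d:
                out[k] += ai * bj
            else:
                out[k - d] -= ai * bj  # x^k = -x^(k-d) because x^d = -1
    return out


def inf_norm(a):
    """Infinity norm: the largest absolute coefficient."""
    return max(abs(c) for c in a)


a = [1, 1, 1, 1]          # 1 + x + x^2 + x^3 with d = 4
prod = ring_mul(a, a)     # product reduced modulo x^4 + 1
ratio = inf_norm(prod) / (inf_norm(a) * inf_norm(a))
```

For this pair the ratio is 4, so under the stated assumption $\delta_{R}\geq 4$ when $d=4$ .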

Let $h>1$ be an integer. Then $\mathbb{Z}_{h}$ denotes the set of integers in $(-\frac{h}{2},\frac{h}{2}]$ . The symbol $\mathbb{Z}/q\mathbb{Z}$ denotes the ring of integers modulo $q$ with representatives $\{0,\ldots,q-1\}$ . For $x\in\mathbb{Z}$ , $[x]_{h}$ denotes the unique integer in $\mathbb{Z}_{h}$ with $[x]_{h}=x\bmod h$ . For $\vec{x}\in R$ , $[\vec{x}]_{h}$ denotes the element in $R$ obtained by applying $[\cdot]_{h}$ to all its coefficients. For $x\in\mathbb{R}$ , $\lfloor x\rceil$ denotes rounding to the nearest integer, and $\lfloor x\rfloor$ , $\lceil x\rceil$ denote rounding down and up, respectively.
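The centered reduction $[\cdot]_{h}$ can be made concrete with a short sketch. The function names `centered_mod` and `centered_mod_vec` are hypothetical, and ring elements are assumed to be stored as plain coefficient lists.

```python
def centered_mod(x: int, h: int) -> int:
    """Return the representative of x modulo h in the interval (-h/2, h/2]."""
    r = x % h                      # Python's % yields a value in [0, h-1] for h > 0
    return r - h if 2 * r > h else r


def centered_mod_vec(coeffs, h):
    """Apply the centered reduction to every coefficient of a ring element."""
    return [centered_mod(c, h) for c in coeffs]
```

For example, with $h=8$ the value $5$ maps to $-3$ , while $4$ and $-4$ both map to $4$ , since $4$ is the upper end of the half-open interval.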

Let $\lambda$ be an integer as the security parameter. A function $negl(\lambda)$ is negligible in $\lambda$ if $negl(\lambda)=o(1/\lambda^{c})$ for every $c\in\mathbb{N}$ . An event occurs with negligible probability if the probability of the event is $negl(\lambda)$ . An event occurs with overwhelming probability if its complement occurs with negligible probability.

Given a probability distribution $\mathcal{D}$ , we use $x\leftarrow\mathcal{D}$ to denote that $x$ is sampled from $\mathcal{D}$ . For a set $X$ , $x\leftarrow X$ denotes that $x$ is sampled uniformly from $X$ . A distribution $\chi$ over integers is called $B$ -bounded if it is supported on $[-B,B]$ .

2.2 Federated Learning

McMahan et al. [1] described the FL framework. We give a brief review of their federated averaging algorithm. As shown in Fig. 1, there is a parameter server $S_{1}$ and many users. $S_{1}$ exchanges model parameters with users to collaboratively build a better learning model. The federated averaging algorithm runs periodically. In each period, $S_{1}$ selects $n$ users randomly. In Fig. 1, we show two periods with entirely different users. Next we focus on a period $e$ in which the selected users form a set $U_{e}$ .

Figure 1: The framework of FL

The FL in a period is as follows.

1.

A user $u\in U_{e}$ is activated to download the global parameters from the parameter server.
2.

The user $u$ feeds their local data samples to get updated model parameters. And the parameters or their gradients $\omega_{u}$ with the number of local data samples $n_{u}$ should be sent to $S_{1}$ .
3.

The parameter server $S_{1}$ computes the updated global model parameters $\omega=\frac{1}{\sum_{u\in U_{e}}n_{u}}\sum_{u\in U_{e}}n_{u}\omega_{u}$ .

The parameter server $S_{1}$ continues to randomly activate a set of users in the next period until the parameters $\omega$ converge to that of the global optimal learning model.

Users could upload the $n_{u}$ together with $\omega_{u}$ so that the server could get the value $\sum_{u\in U_{e}}n_{u}$ and compute the updated $\omega$ . So a sum only aggregation is enough for the federated averaging algorithm. However, a parameter server may execute more computations in other federated learning models [16, 17], where a sum only aggregation is not enough.
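The weighted average computed by $S_{1}$ can be sketched in a few lines. This is a plaintext illustration only, with no privacy protection; the function name `federated_average` and the representation of each update as a pair $(n_{u},\omega_{u})$ are assumptions made for the example.

```python
def federated_average(updates):
    """Compute omega = (sum_u n_u * omega_u) / (sum_u n_u).

    `updates` is a list of (n_u, omega_u) pairs, where omega_u is a list of floats.
    """
    total = sum(n for n, _ in updates)
    dim = len(updates[0][1])
    return [sum(n * w[i] for n, w in updates) / total for i in range(dim)]


# Two users: one with 1 local sample, one with 3 local samples.
omega = federated_average([(1, [0.0, 2.0]), (3, [4.0, 6.0])])
```

Here the second user has three times the weight of the first, so the result is $[3.0, 5.0]$ . The server only needs the two sums, which is why a sum-only aggregation suffices for this algorithm.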

2.3 The Model Of Secure Aggregation For FL

We adapt the notion of message-driven entities in [33] to the FL security model [5] to describe the participants in an FL scenario. A message-driven entity is initially invoked by an environment process. Each entity has its own initial state. Once invoked, an entity waits for an activation, which can be triggered by a message from the network or by an environment process. On activation, the entity processes the incoming message or the arguments from a process with its current internal state, and generates a new internal state, outgoing messages and a cumulative output. Once an activation is completed, the entity waits for the next activation if it does not stop.

A protocol $\pi$ consists of a server side process $\pi_{S}$ and a client side process $\pi_{u}$ . An FL process in $S_{1}$ invokes $\pi_{S}$ , which runs periodically until $\pi_{S}$ is stopped by the FL process. In each period, $\pi_{S}$ randomly selects a set of users $U$ and activates their client side processes. A user $u$ should have voluntarily invoked their $\pi_{u}$ before $S_{1}$ activates them. On activation, $\pi_{u}$ exchanges messages with $\pi_{S}$ until $\pi_{S}$ finishes a period or $\pi_{u}$ is stopped by the user. There may be some auxiliary entities for the security of the protocol $\pi$ . A common setup includes a certificate authority (CA) [5]. A $CA$ is invoked and runs permanently to receive certificate requests from users and issue certificates to users. In our protocol, we also introduce blockchain miners $BM$ . Miners are invoked to receive transactions from users and the server, and to execute smart contracts on commonly consented transactions.

The server and each user have a bi-directional secure channel between them. Practically, it reuses a transport layer secure channel established by FL processes. Users do not have direct links in the protocol $\pi$ . The communications of users are transferred by the server. Users and the $CA$ may exchange messages at an initial stage. When the protocol $\pi$ runs, the $CA$ could be offline. Blockchain miners usually have a point-to-point (P2P) network to deliver transactions and blocks. When a blockchain is used in $\pi$ , FL entities should have the ability to send and receive transactions to and from the P2P network.

An active adversary $\mathcal{A}$ could corrupt the server and users in $\pi$ . For each period, the number of corrupted parties is at most $n_{c}$ [5]. When $\mathcal{A}$ corrupts an entity, it knows the internal states and long-term secrets of the entity, and controls all the behaviours of the entity from that point. Obviously, the server $S_{1}$ could be corrupted to act as a malicious server. When $\mathcal{A}$ corrupts a user, it could send transactions on behalf of the user.

We do not allow an adversary to corrupt $CA$ or $BM$ since we do not try to model the security of a CA system or a blockchain system. We simply make assumptions about them:

•

A transaction from a user or the server will be executed and recorded correctly by blockchain miners within a bounded delay.
•

A certificate from a $CA$ proves the binding relation of a verification key and the real identity of a user.

We define $GO_{\pi,e,\{S_{1}\}\cup U_{e},\mathcal{A}}(\vec{x},\vec{r})$ as a global output for $\pi$ in a period $e$ where $\vec{x}=\{x_{S}\}\cup\{x_{u}\}_{u\in U_{e}}$ denotes the inputs of the server $S_{1}$ and the inputs of users in the set $U_{e}$ of the period $e$ , and $\vec{r}=\{r_{0},r_{S}\}\cup\{r_{u}\}_{u\in U_{e}}$ denotes the random inputs of the adversary $\mathcal{A}$ , server $S_{1}$ and users in $U_{e}$ . Let

$GO_{\pi,e,\{S_{1}\}\cup U_{e},\mathcal{A}}(\vec{x})$

be the random variable about $GO_{\pi,e,\{S_{1}\}\cup U_{e},\mathcal{A}}(\vec{x},\vec{r})$ . The randomness comes from the adversary $\mathcal{A}$ , server $S_{1}$ and users in $U_{e}$ .

We define the input privacy of a user [5] as follows.

Definition 1

(Input Privacy of a User). For any uncorrupted user $u\in U_{e}$ in a period $e$ , the privacy of its input is assured if

$GO_{\pi,e,\{S_{1}\}\cup U_{e},\mathcal{A}}(\vec{x})\approx GO_{\pi,e,\{S_{1}\}\cup U_{e},\mathcal{A}}(\vec{x}^{\prime})$

where $\vec{x}^{\prime}=\{x_{S}\}\cup\{x_{v}\}_{v\in U_{e}\backslash\{u\}}\cup\{0\}$ and “ $\approx$ ” means computationally indistinguishable.

The security goal in [5] is input privacy of all uncorrupted users. Clearly, if the privacy of all uncorrupted users is protected, the privacy of each such user is protected. Conversely, if the privacy of each uncorrupted user is protected, the privacy of all uncorrupted users is protected. So the two definitions are equivalent.

2.4 DTAHE

Boneh et al. [20] give a definition of decentralized threshold fully homomorphic encryption (DTFHE). We refine it as a DTAHE definition for the FL scenario. A DTAHE scheme is a tuple of probabilistic polynomial time (PPT) algorithms $DTAHE=(Setup$ , $KeyGen$ , $Share$ , $CombKey$ , $Enc$ , $Eval$ , $ParDec$ , $FinDec)$ .

1.

$Setup(1^{\lambda})\rightarrow parm$ : It takes as input a security parameter $\lambda$ , outputs system parameters $parm$ .
2.

$KeyGen(parm)\rightarrow(sk_{u},pk_{u})$ : It takes as input the system parameter $parm$ to produce public and private keys for a user $u$ .
3.

$Share(parm,\{pk_{v}\}_{v\in U\backslash\{u\}},t,sk_{u})\rightarrow\{e_{v,u}\}_{v\in U\backslash\{u\}}$ : It takes as input the system parameter $parm$ , public keys of users in a set $U$ excluding the user $u$ , a threshold value $t$ and the private key $sk_{u}$ of the user $u$ , to produce encrypted shares $e_{v,u}$ for each user $v\in U\backslash\{u\}$ .
4.

$CombKey(parm,\{pk_{u}\}_{u\in U})\rightarrow pk$ : It takes as input the system parameter $parm$ , public keys of a set of users in $U$ , and produces an encryption key $pk$ .
5.

$Enc(parm,pk,m_{u})\rightarrow c_{u}$ : It takes as input the system parameter $parm$ , a public key $pk$ and a message $m_{u}$ from a user $u$ , and produces a ciphertext $c_{u}$ .
6.

$Eval(parm,\{c_{u}\}_{u\in U},\{\alpha_{u}\}_{u\in U})\rightarrow\hat{c}$ : It takes as input the system parameter $parm$ , ciphers $\{c_{u}\}_{u\in U}$ and coefficients $\{\alpha_{u}\}_{u\in U}$ , and produces an evaluated cipher $\hat{c}=\sum_{u\in U}\alpha_{u}\cdot c_{u}$ .
7.

$ParDec(parm,\hat{c},\{e_{u,v}\}_{v\in U})\rightarrow\hat{m}_{u}$ : It takes as input the system parameter $parm$ , the cipher $\hat{c}$ , and a set of encrypted shares $\{e_{u,v}\}_{v\in U\backslash\{u\}}$ to the user $u$ , and produces a partially decrypted value $\hat{m}_{u}$ .
8.

$FinDec(parm,t,\hat{c},\{\hat{m}_{u}\}_{u\in V})\rightarrow m$ : It takes as input the system parameter $parm$ , the threshold value $t$ , the cipher $\hat{c}$ and partially decrypted ciphers $\{\hat{m}_{u}\}_{u\in V}$ from users in a set $V$ with $|V|\geq t$ , and produces a plaintext $m$ .

One could simply give an ElGamal based DTAHE instance following the constructions in [21], which justifies the correctness of the DTAHE definition.
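To make the eight algorithms more tangible, the following is a toy sketch of an exponential-ElGamal style instance. It is not the construction of [21] and it is not secure: the group parameters `P`, `Q`, `G` are tiny illustrative values, the Shamir shares are handed out unencrypted (the definition requires encrypted shares $e_{v,u}$ ), and all function names are hypothetical stand-ins for the algorithms above.

```python
import random

# Toy group (assumption): P = 2*Q + 1 with P and Q prime; G generates the order-Q subgroup.
P, Q, G = 2039, 1019, 4


def keygen():
    sk = random.randrange(1, Q)
    return sk, pow(G, sk, P)


def share(sk, ids, t):
    """Shamir-share sk among `ids` with threshold t (shares left unencrypted here)."""
    coeffs = [sk] + [random.randrange(Q) for _ in range(t - 1)]
    return {v: sum(c * pow(v, k, Q) for k, c in enumerate(coeffs)) % Q for v in ids}


def comb_key(pks):
    pk = 1
    for x in pks:
        pk = pk * x % P            # combined key g^(sum of all sk_u)
    return pk


def enc(pk, m):
    r = random.randrange(1, Q)
    return pow(G, r, P), pow(G, m, P) * pow(pk, r, P) % P


def eval_linear(ciphers, alphas):  # named eval_linear because eval is a Python builtin
    c1 = c2 = 1
    for (a, b), alpha in zip(ciphers, alphas):
        c1 = c1 * pow(a, alpha, P) % P
        c2 = c2 * pow(b, alpha, P) % P
    return c1, c2


def par_dec(c_hat, received_shares):
    s = sum(received_shares) % Q   # this user's share of the combined secret key
    return pow(c_hat[0], s, P)


def fin_dec(t, c_hat, partials):
    ids = list(partials)[:t]
    mask = 1
    for u in ids:                  # Lagrange interpolation at 0, in the exponent
        lam = 1
        for v in ids:
            if v != u:
                lam = lam * v * pow(v - u, -1, Q) % Q
        mask = mask * pow(partials[u], lam, P) % P
    gm = c_hat[1] * pow(mask, -1, P) % P
    return next(m for m in range(Q) if pow(G, m, P) == gm)  # brute-force small dlog


def demo():
    users, t = [1, 2, 3, 4], 3
    keys = {u: keygen() for u in users}
    shares = {u: share(keys[u][0], users, t) for u in users}   # shares[u][v]: from u to v
    pk = comb_key(keys[u][1] for u in users)
    msgs = {1: 5, 2: 7, 3: 1, 4: 2}
    alphas = {1: 2, 2: 3, 3: 1, 4: 4}
    c_hat = eval_linear([enc(pk, msgs[u]) for u in users], [alphas[u] for u in users])
    alive = [1, 3, 4]              # user 2 drops out before decryption
    partials = {u: par_dec(c_hat, [shares[v][u] for v in users]) for u in alive}
    return fin_dec(t, c_hat, partials)
```

In `demo`, three of the four users suffice to recover $\sum_{u}\alpha_{u}m_{u}=2\cdot 5+3\cdot 7+1\cdot 1+4\cdot 2=40$ , which illustrates both the linear evaluation and the tolerance of one dropout. The final brute-force discrete logarithm is only feasible because the toy plaintext space is small.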

2.5 DTAHE Model

We adapt the model of DTFHE in [20] for the DTAHE. The first definition is evaluation correctness.

Definition 2

(Evaluation Correctness). A DTAHE scheme for a set of users $U$ satisfies evaluation correctness if for all $\lambda$ and $t$ , the following holds:

For an evaluated cipher

$\hat{c}\leftarrow Eval(parm,\{c_{u}\}_{u\in U},\{\alpha_{u}\}_{u\in U})$

the probability

$\Pr\left[FinDec\left(parm,t,\hat{c},\{ParDec(parm,\hat{c},\{e_{u,v}\}_{v\in U})\}_{u\in V}\right)=\sum_{u\in U}\alpha_{u}\cdot m_{u}\right]$

is overwhelming, where

$c_{u}\leftarrow Enc(parm,pk,m_{u}),$

$(\{e_{v,u}\}_{v\in U})\leftarrow Share(parm,\{pk_{v}\}_{v\in U\backslash\{u\}},t,sk_{u}),$

$(sk_{u},pk_{u})\leftarrow KeyGen(parm),$

$pk\leftarrow CombKey(parm,\{pk_{u}\}_{u\in U}),$

and

$parm\leftarrow Setup(1^{\lambda}).$

The second definition is semantic security. It captures the privacy of messages.

Definition 3

(Semantic Security). We say that a DTAHE scheme for a user set $U$ satisfies semantic security if for all $\lambda$ , the following holds:

For any PPT adversary $\mathcal{A}$ , the following experiment $Expt_{\mathcal{A},Sem}(1^{\lambda})$ outputs $1$ with probability at most $\frac{1}{2}+negl(\lambda)$ :

•

$Expt_{\mathcal{A},Sem}(1^{\lambda})$ :

1.

The adversary outputs $U$ and $V$ where $|U|=n$ and $|V|=t$ specify an access structure.

2.

The challenger runs

$parm\leftarrow Setup(1^{\lambda}),$

$(sk_{u},pk_{u})\leftarrow KeyGen(parm),$

$\{e_{v,u}\}_{v\in U\backslash\{u\}}\leftarrow Share(parm,\{pk_{v}\}_{v\in U\backslash\{u\}},t,sk_{u}),$

$pk\leftarrow CombKey(parm,\{pk_{u}\}_{u\in U}),$

and provides $(parm,pk,\{\{e_{v,u}\}_{v\in U\backslash\{u\}}\}_{u\in U})$ to $\mathcal{A}$ .

3.

$\mathcal{A}$ outputs a set $S\subseteq U$ such that $|S|<t$ . It submits message vectors $\{m_{u,0},m_{u,1}\}_{u\in U}$ and $S$ to the challenger.
4.

The challenger provides $\mathcal{A}$ the shares $\{\{s_{u,v}\}_{v\in U}\}_{u\in S}$ and a cipher set

$\{c_{u}\leftarrow Enc(parm,pk,m_{u,b})\}_{u\in U}\text{, for a random bit }b\leftarrow\{0,1\}.$
5.

$\mathcal{A}$ outputs a guess bit $b^{\prime}$ . The experiment outputs $1$ if $b=b^{\prime}$ .

The last definition is simulation security. It captures the privacy of shared secrets and private keys of users.

Definition 4

(Simulation Security). A DTAHE scheme satisfies simulation security if for all $\lambda$ , the following holds:

There is a stateful PPT algorithm $\mathcal{C}=(\mathcal{C}_{1},\mathcal{C}_{2})$ such that for any PPT adversary $\mathcal{A}$ , the following experiments $Expt_{\mathcal{A},Real}(1^{\lambda})$ and $Expt_{\mathcal{A},Ideal}(1^{\lambda})$ are indistinguishable:

•

$Expt_{\mathcal{A},Real}(1^{\lambda})$ :

1.

The adversary outputs $U$ and $V$ where $|U|=n$ and $|V|=t$ specify an access structure.

2.

The challenger runs

$parm\leftarrow Setup(1^{\lambda}),$

$(sk_{u},pk_{u})\leftarrow KeyGen(parm),$

$(\{e_{v,u}\}_{v\in U})\leftarrow Share(parm,\{pk_{v}\}_{v\in U\backslash\{u\}},t,sk_{u}),$

$pk\leftarrow CombKey(parm,\{pk_{u}\}_{u\in U}),$

and provides $(parm,pk,\{\{e_{v,u}\}_{v\in U}\}_{u\in U})$ to $\mathcal{A}$ .

3.

$\mathcal{A}$ outputs a set $S^{*}\subseteq U$ with $|S^{*}|=t-1$ and messages $\{m_{u}\}_{u\in U}$ .
4.

The challenger provides $\mathcal{A}$ the shares $\{\{s_{u,v}\}_{v\in U}\}_{u\in S^{*}}$ in $\{\{e_{u,v}\}_{v\in U}\}_{u\in S^{*}}$ and a cipher set

$\{c_{u}\leftarrow Enc(parm,pk,m_{u})\}_{u\in U}.$
5.

$\mathcal{A}$ issues a polynomial number of adaptive queries of the form $(S\subseteq U,\{c_{u}\}_{u\in U^{*}},\{\alpha_{u}\}_{u\in U^{*}})$ where $U^{*}\subseteq U$ . For each query, the challenger computes $\hat{c}\leftarrow Eval(parm,\{c_{u}\}_{u\in U^{*}},\{\alpha_{u}\}_{u\in U^{*}})$ and provides $\mathcal{A}$

$\{\hat{m}_{u}\leftarrow ParDec(parm,\hat{c},\{e_{u,v}\}_{v\in U})\}_{u\in S}.$
6.

At the end of the experiment, $\mathcal{A}$ outputs a distinguishing bit $b$ .

•

$Expt_{\mathcal{A},Ideal}(1^{\lambda})$ :

1.

The adversary outputs $U$ and $V$ where $|U|=n$ and $|V|=t$ specify an access structure.
2.

The challenger runs

$(parm,pk,\{\{e_{v,u}\}_{v\in U}\}_{u\in U},st)\leftarrow\mathcal{C}_{1}(1^{\lambda},U,t)$

and provides $(parm,pk,\{\{e_{v,u}\}_{v\in U}\}_{u\in U})$ to $\mathcal{A}$ .
3.

$\mathcal{A}$ outputs a set $S^{*}\subseteq U$ with $|S^{*}|=t-1$ and messages $\{m_{u}\}_{u\in U}$ .
4.

The challenger provides $\mathcal{A}$ shares $\{\{s_{u,v}\}_{v\in U}\}_{u\in S^{*}}$ and ciphers

$\{c_{u}\leftarrow Enc(parm,pk,m_{u})\}_{u\in U}.$

5.

$\mathcal{A}$ issues a polynomial number of adaptive queries of the form $(S\subseteq U,\{c_{u}\}_{u\in U^{*}},\{\alpha_{u}\}_{u\in U^{*}})$ where $U^{*}\subseteq U$ . For each query, the challenger runs

$\{\hat{m}_{u}\}_{u\in S}\leftarrow\mathcal{C}_{2}(S,\{c_{u}\}_{u\in U^{*}},\{\alpha_{u}\}_{u\in U^{*}},\{m_{u}\}_{u\in U},\{c_{u}\}_{u\in U},st)$

and provides $\{\hat{m}_{u}\}_{u\in S}$ to $\mathcal{A}$ .

6.

At the end of the experiment, $\mathcal{A}$ outputs a distinguishing bit $b$ .

3 Secure Linear Aggregation Protocol

We provide two protocols in this section. The first is a basic protocol showing how to embed a DTAHE scheme into the secure aggregation framework in [5]. The second is the secure linear aggregation protocol against an active adversary.

3.1 A Basic Protocol

As shown in Fig. 2, initially a server $S_{1}$ runs $parm\leftarrow Setup(1^{\lambda})$ to provide system-wide parameters for all users, and each user runs $(sk_{u},pk_{u})\leftarrow KeyGen(parm)$ to produce their key pair. The server and users then run a four-round protocol $\pi$ with an agreed threshold value $t$ . We next focus on a period $e$ to show the protocol.

1.

Round 1 (AdvertiseKeys). When a user $u$ is active, it packs a message as $m_{u,1}=(u,pk_{u})$ and sends the message to the server $S_{1}$ . The user then waits for the first response from $S_{1}$ or stops.

The server $S_{1}$ makes a set $U_{e}^{1}=\emptyset$ . After it receives a message $m_{u,1}=(u,pk_{u})$ , it sets $U_{e}^{1}=U_{e}^{1}\cup\{u\}$ . When $|U_{e}^{1}|>t$ , $S_{1}$ packs a message as $m_{S,1}=\{m_{u,1}\}_{u\in U_{e}^{1}}$ . The message $m_{S,1}$ is broadcast to all the users in $U_{e}^{1}$ as their responses.

2.

Round 2 (ShareKeys). When a user $u$ receives $m_{S,1}$ , it makes a $U_{e}^{1}$ set from $m_{S,1}$ and checks $|U_{e}^{1}|\geq t$ . If the verification passes, the user executes as follows:

(a)

It makes a public key set $\{pk_{v}\}_{v\in U_{e}^{1}\backslash\{u\}}$ from $m_{S,1}$ and produces encrypted shares by

$\{e_{v,u}\}_{v\in U_{e}^{1}}\leftarrow Share(parm,\{pk_{v}\}_{v\in U_{e}^{1}\backslash\{u\}},t,sk_{u}).$

(b)

It packs $m_{u,2}=(u,\{(v,e_{v,u})\}_{v\in U_{e}^{1}\backslash\{u\}})$ and sends the message to $S_{1}$ . Then $u$ waits for the second response or stops.

The server $S_{1}$ builds a set $U_{e}^{2}$ in the same way as $U_{e}^{1}$ satisfying $U_{e}^{2}\subseteq U_{e}^{1}$ . When $|U_{e}^{2}|>t$ , $S_{1}$ packs a message $m_{S,2,v}=(\{(u,e_{v,u})\}_{u\in U_{e}^{2}\backslash\{v\}})$ for each user $v\in U_{e}^{2}$ , and sends them to the users in $U_{e}^{2}$ .

3.

Round 3 (CipherCollection). When a user $u$ receives a message $m_{S,2,u}$ , it makes a set $U_{e}^{2}$ from $m_{S,2,u}$ . If $|U_{e}^{2}|\geq t$ , $u$ executes as follows:

•

It runs $pk\leftarrow CombKey(parm,\{pk_{u}\}_{u\in U_{e}^{2}})$ .

•

It runs $c_{u}\leftarrow Enc(parm,pk,m_{u})$ to compute the cipher of its private input $m_{u}$ .

•

It packs a message $m_{u,3}=(u,c_{u})$ and sends the message to $S_{1}$ . Then $u$ waits for the third response or stops.

The server $S_{1}$ builds a set $U_{e}^{3}$ in the same way as $U_{e}^{1}$ satisfying $U_{e}^{3}\subseteq U_{e}^{2}$ . It runs

$\hat{c}\leftarrow Eval(parm,\{c_{u}\}_{u\in U_{e}^{3}},\{\alpha_{u}\}_{u\in U_{e}^{3}})$

and sends $m_{S,3}=\hat{c}$ to users in $U_{e}^{3}$ .
4.

Round 4 (Decryption). When a user $u$ receives $m_{S,3}$ , it runs

$\hat{m}_{u}\leftarrow ParDec(parm,\hat{c},\{e_{u,v}\}_{v\in U_{e}^{2}}).$

$u$ then packs a message $m_{u,4}=(u,\hat{m}_{u})$ and sends the message to the server $S_{1}$ . The user $u$ finishes the client protocol $\pi_{u}$ for the period $e$ and stops.

The server $S_{1}$ builds a set $U_{e}^{4}$ in the same way as $U_{e}^{1}$ satisfying $U_{e}^{4}\subseteq U_{e}^{3}$ . If $|U_{e}^{4}|\geq t$ , it runs

$m\leftarrow FinDec(parm,t,\hat{c},\{\hat{m}_{u}\}_{u\in U_{e}^{4}})$

to get $m$ as an aggregated value. The server finishes the server side protocol $\pi_{S}$ for the period $e$ and waits for the next period.
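The dropout handling across the four rounds amounts to set bookkeeping on the server side, which can be sketched as follows. The function name `surviving_sets` is hypothetical, and the sketch assumes as a simplification that the server aborts a period as soon as fewer than $t$ users remain in any round; it models only the sets $U_{e}^{1}\supseteq U_{e}^{2}\supseteq U_{e}^{3}\supseteq U_{e}^{4}$ , not the cryptographic messages.

```python
def surviving_sets(responders_per_round, t):
    """Track U^1, U^2, ... for one period and report whether the period completes.

    `responders_per_round` lists, for each round, the users whose message arrived.
    A user only counts in round k if it also counted in every earlier round.
    """
    history = []
    alive = None
    for responders in responders_per_round:
        current = set(responders) if alive is None else alive & set(responders)
        if len(current) < t:
            return history, False   # assumption: abort when below the threshold
        history.append(current)
        alive = current
    return history, True


# User 5 skips round 2 and user 3 skips round 3; three users remain for decryption.
rounds = [[1, 2, 3, 4, 5], [1, 2, 3, 4], [1, 2, 4, 5], [1, 2, 4]]
history, completed = surviving_sets(rounds, t=3)
```

With $t=3$ the period completes with $U_{e}^{4}=\{1,2,4\}$ . User 5 reappears in round 3 but is ignored, since it is no longer in $U_{e}^{2}$ . With $t=4$ the same trace would abort in round 3.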

Remark 1

The basic protocol uses the framework in [5] to tolerate users dropping out. It needs the server to be honest. A malicious server may simply return the cipher $c_{u}$ of a single user as $m_{S,3}$ , so that the users' partial decryptions reveal the input of that user.

3.2 The Secure Linear Aggregation Protocol

Bonawitz et al. [5] use a certificate authority (CA) and an extra consistency round to defend against an active adversary. We also use a CA but keep our protocol at four rounds. We introduce a smart contract in a blockchain to encourage the server $S_{1}$ and users to be honest.

The blockchain is described as $\vec{\sigma}_{\lambda_{t}+1}=\Upsilon(\vec{\sigma}_{\lambda_{t}},Tx)$ [34] where $\Upsilon$ is a state transition function, $\vec{\sigma}_{\lambda_{t}}$ is the system state in a block height $\lambda_{t}$ , and $Tx$ is a transaction in the system. A transaction could be described as

$Tx=(from,to,value,data,aux,sig)$

where the $from$ and $to$ fields are the sender and receiver accounts of the transaction, the $value$ field is the value to be transferred from the sender to the receiver, the $data$ field is arbitrary data of the sender, the $aux$ field is the auxiliary information used in the blockchain and the $sig$ field is the sender’s signature. The blockchain could be instantiated with Ethereum [34]. Any other blockchain system supporting state transition is also suitable.
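The transaction tuple and the state transition function $\Upsilon$ can be pictured with a deliberately minimal sketch. It models only a balance transfer between accounts; a real system such as Ethereum also handles signatures, fees and contract execution, none of which is modelled here. The names `Tx` and `apply_tx` are hypothetical, and the field `sender` stands for $from$ because `from` is a reserved word in Python.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tx:
    sender: str        # the `from` field of the transaction
    to: str
    value: int
    data: bytes = b""
    aux: bytes = b""
    sig: bytes = b""   # carried along but NOT verified in this sketch


def apply_tx(state: dict, tx: Tx) -> dict:
    """Return the next state; an invalid transfer leaves the balances unchanged."""
    if tx.value < 0 or state.get(tx.sender, 0) < tx.value:
        return dict(state)
    new_state = dict(state)
    new_state[tx.sender] = new_state.get(tx.sender, 0) - tx.value
    new_state[tx.to] = new_state.get(tx.to, 0) + tx.value
    return new_state


state0 = {"acc_S": 10}
state1 = apply_tx(state0, Tx(sender="acc_S", to="acc_E", value=4))
```

In this toy run the server account `acc_S` moves a value of 4 to the contract account `acc_E`, which mirrors the deposit step of the contract described next.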

Initially, the server $S_{1}$ has an account $acc_{S}$ and a user $u$ has an account $acc_{u}$ in the blockchain. The server should deploy a smart contract $EncCheck$ on the blockchain which will have an account $acc_{E}$ after deployment.

Definition 5

( $EncCheck$ ). The smart contract includes three functions. A function $Init$ is for the server $S_{1}$ to deposit values and set parameters. A function $Record$ is for a user to record their encrypted inputs. A function $Check$ is for the server $S_{1}$ and users to check the correctness of a protocol transcript.

•

$Init$ : If the transaction $Tx$ is signed by the server $S_{1}$ , the $data$ field of the transaction includes “ $Init$ ” as the function name, and a threshold value $t$ and the number of periods $prds$ supported by the contract as the arguments, the function checks a deposit variable $Dep$ of the server $S_{1}$ . If $Dep<MinValue*prds$ , it checks that $Dep+value\geq MinValue*prds$ where $MinValue$ is the smallest deposit value for a period. If the check fails, it stops. Otherwise, it transfers values from the server account to the contract account, and stores $(acc_{S},t,Dep,prds)$ in the contract.
•

$Record$ : If the $data$ field of the transaction includes “ $Record$ ” as the function name, and an epoch session id $esid_{u}$ and a cipher $c_{acc_{u}}$ as the arguments, the function stores $(esid_{u},acc_{u},c_{acc_{u}})$ in the contract.
•

$Check$ : If the transaction $Tx$ is signed by the server $S_{1}$ , the $data$ field of the transaction includes “ $Check$ ” as the function name, and a termination flag $draw$ , an epoch session id $esid_{S}$ , a cipher $c_{S}$ , a list of user accounts $L_{A}$ , and coefficients $\{\alpha_{acc_{u}}\}_{acc_{u}\in L_{A}}$ as the arguments, the function checks whether $esid_{S}$ is new. If $esid_{S}$ was used before in another transaction $Tx^{\prime}$ , it checks whether the two transactions are the same. If they are different, the check fails. If they are the same, the function stops. The function forms a set

$A_{A}=\{acc_{u}:acc_{u}\in L_{A}\wedge esid_{u}=esid_{S}\}.$

If $esid_{S}$ is new, it checks that the size $|A_{A}|\geq t$ , and

$c_{S}=Eval(parm,\{c_{acc_{u}}\}_{acc_{u}\in A_{A}},\{\alpha_{acc_{u}}\}_{acc_{u}\in A_{A}}).$

If any check fails, the value $MinValue$ is shared among the users in the set $A_{A}$ . Otherwise, it updates $prds=prds-1$ , and if $draw$ is $true$ , it transfers the deposit back to the account of the server after six new blocks. When the deposit is cleared, the contract self-destructs.
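The contract logic of Definition 5 can be sketched as a small in-memory state machine. This is only an illustrative model under stated assumptions, not the deployed contract: the class name `EncCheckSketch`, the method signatures, the string return values and the `toy_eval` function (a plain weighted sum standing in for the DTAHE $Eval$ algorithm) are all hypothetical, and signature verification, account balances and the six-block delay are deliberately left out.

```python
class EncCheckSketch:
    """In-memory model of the Init / Record / Check logic (no signatures, no blocks)."""

    def __init__(self, eval_fn, min_value):
        self.eval_fn = eval_fn        # stand-in for the DTAHE Eval algorithm
        self.min_value = min_value    # MinValue: smallest deposit per period
        self.dep = 0                  # Dep: current deposit of the server
        self.t = self.prds = 0
        self.records = {}             # (esid, account) -> recorded cipher
        self.seen = {}                # esid -> arguments of the first Check call
        self.payouts = {}             # account -> penalty value received
        self.refund_pending = False   # set when a correct Check carries draw=True

    def init(self, value, t, prds):
        if self.dep + value < self.min_value * prds:
            return False              # deposit too small: stop
        self.dep += value
        self.t, self.prds = t, prds
        return True

    def record(self, account, esid, cipher):
        self.records[(esid, account)] = cipher

    def check(self, draw, esid, c_s, accounts, alphas):
        args = (draw, c_s, tuple(accounts), tuple(alphas[a] for a in accounts))
        a_a = [a for a in accounts if (esid, a) in self.records]   # the set A_A
        if esid in self.seen:
            # Same transaction again: nothing to do. A different one: equivocation.
            return "duplicate" if self.seen[esid] == args else self._penalize(a_a)
        self.seen[esid] = args
        expected = self.eval_fn([self.records[(esid, a)] for a in a_a],
                                [alphas[a] for a in a_a])
        if len(a_a) < self.t or c_s != expected:
            return self._penalize(a_a)
        self.prds -= 1
        self.refund_pending = bool(draw)   # the six-block delay is not modelled
        return "ok"

    def _penalize(self, a_a):
        if a_a:
            share = self.min_value // len(a_a)   # MinValue split among A_A
            for a in a_a:
                self.payouts[a] = self.payouts.get(a, 0) + share
            self.dep -= share * len(a_a)
        return "penalty"


def toy_eval(ciphers, coeffs):
    """Hypothetical stand-in for Eval: a weighted sum of integer 'ciphers'."""
    return sum(c * k for c, k in zip(ciphers, coeffs)) % 1_000_003
```

In this model a second, different $Check$ call for an already used session id is treated as a failed check, which mirrors how the contract discourages the server from publishing two conflicting evaluations for one period.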

Now the server $S_{1}$ has two tasks to initialize the protocol. At first, the server $S_{1}$ runs $parm\leftarrow Setup(1^{\lambda})$ to provide system-wide parameters for all users. Secondly, the server $S_{1}$ produces a transaction as

Tx_{S,1}=(acc_{S},acc_{E},value,Init||t||prds,aux,sig)

to initialize the deployed smart contract.

A user $u$ has three tasks to participate in the protocol II. At first, it runs $(sk_{u},pk_{u})\leftarrow KeyGen(parm)$ to produce protocol keys. Secondly, it produces a transaction

Tx_{u,1}=(acc_{u},acc_{u},0,pk_{u},aux,sig)

and sends the transaction to the blockchain to store its public keys. Thirdly, the user $u$ should produce a signature key pair $(sk_{Sigu},vk_{Sigu})\leftarrow SIG.Gen(1^{\lambda})$ using a signature scheme $(SIG.Gen,SIG.Sign,SIG.Ver)$ that is existentially unforgeable under adaptive chosen-message attacks (EUF-CMA). Then $u$ applies for a certificate $cert_{u}$ of the verifying key $vk_{Sigu}$ from a CA.

The CA and miners are auxiliary entities in the protocol. A CA is used to defend against a “Sybil” attack where an adversary may create many blockchain accounts to act as users. The miners of a blockchain receive transactions, pack them into blocks, reach a consensus on a block and update the global state. The public keys of users, parameters of the server $S_{1}$ , and deposited values of the server $S_{1}$ are states of the blockchain maintained by the miners. The whole picture of the protocol is in Fig. 3. We next focus on a period $e$ to show the protocol.

1.

Round 1 (AdvertiseAccounts). When a user $u$ is active, it looks up the deposit $Dep$ , supported periods $prds$ and the threshold $t$ in the smart contract account $acc_{E}$ indexed by the server account $acc_{S}$ . If $Dep$ , $t$ , or $prds$ are too small for the user’s local policy, the user simply stops. Otherwise, it produces a signature

$\sigma_{u}\leftarrow SIG.Sign(sk_{Sigu},acc_{u}||T_{u})$

where $T_{u}$ is a time stamp of the user. It packs a message as $m_{u,1}=(u,acc_{u},T_{u},\sigma_{u},cert_{u})$ and sends the message to the server $S_{1}$ . The user $u$ then waits for the first response from $S_{1}$ or stops.

The server $S_{1}$ builds a set $U_{e}^{1}$ as in the basic protocol. It then packs a message as $m_{S,1}=(\{m_{u,1}\}_{u\in U_{e}^{1}},esid)$ where $esid$ is a session number for the period. The message $m_{S,1}$ is broadcasted to all the users in $U_{e}^{1}$ set as their response.
2.

Round 2 (ShareKeys). When a user $u$ receives $m_{S,1}$ , it parses $m_{S,1}$ to extract the identities of users which form a set $U_{e}^{1}$ . It checks that $|U_{e}^{1}|\geq t$ , and that the time stamps, signatures and certificates are correct. If the verifications pass, $u$ looks up the blockchain to find public keys of users in $U_{e}^{1}$ by their accounts which form a set $\{pk_{v}\}_{v\in U_{e}^{1}\backslash\{u\}}$ , and executes as in the basic protocol.

The server $S_{1}$ executes in the same way as in the basic protocol to produce $m_{S,2,v}$ for a user $v$ .
3.

Round 3 (CipherCollection). When a user $u$ receives a message $m_{S,2,u}$ , it makes a set $U_{e}^{2}$ from $m_{S,2,u}$ . If $|U_{e}^{2}|\geq t$ , it executes in the same way as in the basic protocol to compute $pk$ and $c_{u}$ . Then $u$ creates a transaction

Tx_{u,2}=(acc_{u},acc_{E},0,Record||esid||c_{u},aux,sig)

and sends the transaction to the blockchain. The user $u$ then waits for a response transaction of the server $S_{1}$ from the blockchain network or stops.

The server $S_{1}$ makes a set $U_{e}^{3}=\emptyset$ . After it receives a transaction $Tx_{u,2}$ , if the transaction includes the same $esid$ as the current session number, $S_{1}$ sets $U_{e}^{3}=U_{e}^{3}\cup\{u\}$ where the identity $u$ is indexed by the $acc_{u}$ . It then runs

\hat{c}\leftarrow Eval(parm,\{c_{u}\}_{u\in U_{e}^{3}},\{\alpha_{u}\}_{u\in U_{e}^{3}}),

and packs a transaction

Tx_{S,2}=(acc_{S},acc_{E},0,Check||false||esid||\hat{c}||\{acc_{u}\}_{u\in U_{e}^{3}},aux,sig)

and sends the transaction to the blockchain.

4.

Round 4 (Decryption). When a user $u$ receives $Tx_{S,2}$ , it builds a $U_{e}^{3}$ set indexed by $\{acc_{u}\}_{u\in U_{e}^{3}}$ . If $U_{e}^{3}\subseteq U_{e}^{2}$ , $|U_{e}^{3}|\geq t$ , and the $esid$ field is the current session number, $u$ then executes in the same way as in the basic protocol to interact with the server $S_{1}$ .

The server $S_{1}$ executes in the same way as in the basic protocol to get an aggregated result.

When the FL process is to stop, $S_{1}$ may send a $Tx_{S,2}$ with the termination flag set to $true$ . The $EncCheck$ contract delays the withdrawal until six blocks have been generated. The delay is intentional: it gives users time to check the behaviour of the server. Before the deposit is returned to the server, a user can send a $Tx_{S,2}$ received in a period to the contract. If there are different $Tx_{S,2}$ transactions with the same $esid$ field, the deposit will be shared by users in that period.

Remark 2

In a period, two transactions are transferred by the blockchain network. Miners, the parameter server and users will receive transactions, parse them, and process them independently.

3.3 Security Analysis

We prove the security of the secure linear aggregation protocol against an active adversary. At a high level, we divide the global outputs into four views, and analyze how the advantage of the adversary changes as each new view appears.

Theorem 3.1

Suppose that a DTAHE scheme is semantically secure and simulation secure, a signature scheme $SIG$ is EUF-CMA secure, and the secret sharing scheme in the DTAHE has a privacy property [20]. Let $t$ be the threshold value, $n_{c}$ be the maximal number of corrupted users in a period, and $\lambda_{m}$ be the minimal number of private inputs needed for the final aggregated value to hide one input. For any uncorrupted user $u$ , against an adversary $\mathcal{A}$ , if $\lambda_{m}\geq t-n_{c}$ and the deposit of the server $S_{1}$ in the smart contract $EncCheck$ has not been forfeited, the protocol satisfies $GO_{\pi,e,\{S_{1}\}\cup U_{e},\mathcal{A}}(\vec{x})\approx GO_{\pi,e,\{S_{1}\}\cup U_{e},\mathcal{A}}(\vec{x}^{\prime})$ .

Proof: A global output $GO_{\pi,e,\{S_{1}\}\cup U_{e},\mathcal{A}}(\vec{x})$ consists of four views $\{view_{i}\}_{i\in\{1,...4\}}$ . The $view_{1}$ includes the outputs of the initialization procedures and the first round of the server $S_{1}$ and users. The views $\{view_{2},view_{3},view_{4}\}$ include the outputs of their corresponding rounds. We consider two global outputs $GO$ and $GO^{\prime}$ with inputs $\vec{x}$ and $\vec{x}^{\prime}$ . Note that the only difference between the two inputs is the input of the user $u$ . We denote the four views of $GO^{\prime}$ as $\{view_{i}^{\prime}\}_{i\in\{1,...4\}}$ .

•

Initially, $view_{1}$ and $view_{1}^{\prime}$ include the same certificates, public keys and accounts in the initialization procedure of the protocol $\pi$ .
•

When $\pi$ executes, $view_{1}$ and $view_{1}^{\prime}$ include message-signature pairs of users and the first response of the server. $view_{2}$ and $view_{2}^{\prime}$ include encrypted shares and the second response of the server. Since private inputs of the user $u$ are not involved, the two global outputs are indistinguishable until now.
•
The $view_{3}$ and $view_{3}^{\prime}$ each consist of two kinds of transactions. Transactions of users in $U_{e}^{3}$ could form a ciphertext set $\{c_{u}\}_{u\in U_{e}^{3}}$ . The transaction of the server $S_{1}$ includes $\hat{c}$ . In the $view_{3}$ , the ciphertext of the target user $u$ is $c_{u}\leftarrow Enc(parm,pk,m_{u})$ , and the evaluated ciphertext of the server $S_{1}$ is $\hat{c}\leftarrow Eval(parm,\{c_{u}\}_{u\in U_{e}^{3}},\{\alpha_{u}\}_{u\in U_{e}^{3}})$ . In the $view_{3}^{\prime}$ , the ciphertext of the target user $u$ is $c_{u}\leftarrow Enc(parm,pk,0)$ , and the evaluated ciphertext of the server is produced in the same way. Now the chances of $\mathcal{A}$ increase.
- –
  
  $\mathcal{A}$ could exploit the differences of the two views $view_{3}$ and $view_{3}^{\prime}$ . Since the only difference is the message $m_{u}$ and the message $0$ , if the DTAHE is semantically secure, $\mathcal{A}$ has only negligible advantage to distinguish the two views.
- –
  
  $\mathcal{A}$ could exploit the combined views of $view_{3}\cup view_{2}$ and $view_{3}^{\prime}\cup view_{2}^{\prime}$ . Suppose that $\mathcal{A}$ could get the shared secrets in $view_{2}$ and $view_{2}^{\prime}$ . Then $\mathcal{A}$ could decrypt $c_{u}$ by the $ParDec$ and $FinDec$ algorithms. However, since the DTAHE is simulation secure, the secret shares are protected well. $\mathcal{A}$ has only negligible advantages to distinguish the two combined views.
- –
  
  $\mathcal{A}$ could exploit the combined views of $view_{3}\cup view_{2}\cup view_{1}$ and $view_{3}^{\prime}\cup view_{2}^{\prime}\cup view_{1}^{\prime}$ in the two global outputs.
  
  First, suppose that $\mathcal{A}$ corrupts users in $C^{*}$ and $|C^{*}|=n_{c}$ . It then knows secret keys $\{sk_{u}\}_{u\in C^{*}}$ which could be used to decrypt ciphers in $view_{2}$ . The decrypted shares may help $\mathcal{A}$ to decrypt $c_{u}$ . However, since we assume $n_{c}<t$ and the secret sharing scheme has a privacy property [20], this event happens with negligible probability.
  
  Secondly, $\mathcal{A}$ may make special $view_{1}$ and $view_{1}^{\prime}$ . $\mathcal{A}$ could establish $t$ blockchain accounts and store public keys in each account. Then $\mathcal{A}$ acts as users to cheat the target user $u$ . However, since messages in $view_{1}$ and $view_{1}^{\prime}$ are signed by users who have certificates from a CA, the uncertified accounts of $\mathcal{A}$ would be rejected by the user $u$ . And since we explicitly include a time stamp in the first message of a user, $\mathcal{A}$ could not replay messages of users in other periods. To satisfy the claims, the signature scheme $SIG$ should be EUF-CMA secure.
•
$view_{4}$ and $view_{4}^{\prime}$ consist of partially decrypted ciphers and the final aggregated value. $\mathcal{A}$ has more chances:
- –
  
  $\mathcal{A}$ could exploit the partially decrypted ciphers to discover secret shares. Since the DTAHE is simulation secure, $\mathcal{A}$ has only negligible advantages.
- –
  
  $\mathcal{A}$ could exploit the final aggregated value. An obvious strategy is that in the transaction of the server $Tx_{S,2}$ , only $c_{u}$ is included. To make the strategy more practical, $\mathcal{A}$ may aggregate some ciphers of corrupted users and the target user. Then the final output is the sum of $n_{c}+1$ inputs.
  
  The strategy is a possible way to get inputs of a user at the cost of the deposit in the smart contract. Since $Tx_{S,2}$ comes from the blockchain network, miners will execute the $Check$ function in the $EncCheck$ smart contract. The function takes $\{acc_{u}\}_{u\in U_{e}^{3}}$ as $L_{A}$ , and checks the correctness of the evaluation of the server $S_{1}$ . The smart contract builds a set $A_{A}\subseteq L_{A}$ and checks that $|A_{A}|\geq t$ . So if $n_{c}+1<t$ , the check fails and the server $S_{1}$ will be punished. To avoid the punishment, $\mathcal{A}$ should prepare an $L_{A}$ set with $|L_{A}|\geq n_{c}+\lambda_{m}\geq t$ .
  
  $\mathcal{A}$ may try to withdraw their deposit before the penalty is paid. However, the smart contract needs extra six blocks to confirm the request of the server. The six-block waiting time is the last chance of users to check the behaviours of the server. Users are encouraged to send the received transaction of the server to the blockchain since they are the possible beneficiaries.

In summary, if the cryptographic primitives are secure, $\lambda_{m}\geq t-n_{c}$ , and the server pays no penalties, the adversary has only negligible advantage to distinguish the two outputs $GO$ and $GO^{\prime}$ .

Remark 3

We explicitly introduce a parameter $\lambda_{m}$ to describe the number of private inputs of uncorrupted users in a period. The parameter is in fact used in [5] implicitly. The consistency check round [5] makes sure that at least $t$ private inputs are summed. If $n_{c}$ inputs are known by an adversary $\mathcal{A}$ , then in their protocol $\lambda_{m}\geq t-n_{c}$ .

4 A DTAHE Instance

Since all known construction methods of DTAHE schemes have some weaknesses, we provide a new construction method to produce a lattice based DTAHE scheme for an interactive protocol.

4.1 The Instance

We use the BFV scheme [22, 35] as a basic building block which is implemented in the SEAL library [36]. We need a semantically secure hybrid encryption scheme [37] $HPKE=(HPKE.Gen,HPKE.Enc,HPKE.Dec)$ to encrypt shares. A lattice based $HPKE$ instance will give us a fully lattice based DTAHE instance that may be post-quantum secure. We also need the Shamir secret sharing scheme denoted by $SS=(SS.Split,SS.Recover)$ .

1.

$Setup(1^{\lambda})\rightarrow parm$ : It takes as input a security parameter $\lambda$ , produces a parameter set $parm=(d,f(x),h,R,R_{h},\chi,\mu,\vec{a},l,\lambda)$ , where $d$ is the degree depending on $\lambda$ of a cyclotomic polynomial $f(x)$ , $h\geq 2$ is an integer depending on $\lambda$ , $R$ is a ring $R=\mathbb{Z}[x]/(f(x))$ , $R_{h}$ is the set of polynomials in $R$ with coefficients in $\mathbb{Z}_{h}$ , $\chi$ here is in fact defined as discrete Gaussian distribution, $\mu$ is a uniform distribution, $\vec{a}$ is uniformly selected from $R_{h}$ as $\vec{a}\leftarrow R_{h}$ , and $l$ is an integer depending on $\lambda$ .
2.

$KeyGen(parm)\rightarrow(sk_{u},pk_{u})$ : It selects $\vec{s}_{u}\leftarrow R_{3}$ and samples $\vec{e}_{u}\leftarrow\chi$ . It sets $sk_{u,0}=\vec{s}_{u}$ and $pk_{u,0}=[-(\vec{a}\cdot\vec{s}_{u}+\vec{e}_{u})]_{h}$ . It runs $(pk_{u,1},sk_{u,1})\leftarrow HPKE.Gen(1^{\lambda})$ . The output is $sk_{u}=(sk_{u,0},sk_{u,1})$ and $pk_{u}=(pk_{u,0},pk_{u,1})$ .
3.

$Share(parm,\{pk_{v}\}_{v\in U\backslash\{u\}},t,sk_{u})\rightarrow\{e_{v,u}\}_{v\in U\backslash\{u\}}$ : It samples $\vec{e}_{u,2}\leftarrow\chi$ , computes $n=|\{pk_{v}\}_{v\in U\backslash\{u\}}|+1$ , sets $ss_{u}=\{sk_{u,0},\vec{e}_{u,2}\}$ , and for each coefficient $ss_{u,i}$ of the elements in $ss_{u}$ , computes $\{\vec{s}_{v,u,i}\}_{v\in U}\leftarrow SS.Split(ss_{u,i},n,t,h)$ . For all the shares to $v\in U\backslash\{u\}$ , it computes a cipher $e_{v,u}=HPKE.Enc(pk_{v,1},\{\vec{s}_{v,u,i}\}_{i\in|ss_{u}|*d})$ .
4.

$CombKey(parm,\{pk_{u}\}_{u\in U})\rightarrow pk$ : It computes $pk=[\sum_{u\in U}pk_{u,0}]_{h}$ .
5.

$Enc(parm,pk,m_{u})\rightarrow c_{u}$ : It selects $\vec{u}_{u}\leftarrow R_{3}$ , $\vec{e}_{u,0},\vec{e}_{u,1}\leftarrow\chi$ , computes $c_{u,0}=[\vec{a}\cdot\vec{u}_{u}+\vec{e}_{u,0}]_{h}$ and $c_{u,1}=[pk\cdot\vec{u}_{u}+\vec{e}_{u,1}+\lfloor h/l\rfloor\cdot m_{u}]_{h}$ . It sets $c_{u}=(c_{u,0},c_{u,1})$ .
6.

$Eval(parm,\{c_{u}\}_{u\in U},\{\alpha_{u}\}_{u\in U})\rightarrow\hat{c}$ : It computes $\hat{c}_{0}=[\sum_{u\in U}\alpha_{u}c_{u,0}]_{h}$ , $\hat{c}_{1}=[\sum_{u\in U}\alpha_{u}c_{u,1}]_{h}$ and sets $\hat{c}=(\hat{c}_{0},\hat{c}_{1})$ .
7.

$ParDec(parm,\hat{c},\{e_{u,v}\}_{v\in U})\rightarrow\hat{m}_{u}$ : It decrypts the shares for the user $u\in U$ from the user $v\in U$ as $\vec{s}_{u,v}\leftarrow HPKE.Dec(sk_{u,1},e_{u,v})$ . It parses the shares of coefficients as shares of elements in $R$ , sets $(\vec{se}_{u,v},\vec{ssk}_{u,v})=\vec{s}_{u,v}$ , then computes $\hat{m}_{u}=[\hat{c}_{0}\cdot\sum_{v\in U}\vec{ssk}_{u,v}+\sum_{v\in U}\vec{se}_{u,v}]_{h}$ .
8.

$FinDec(parm,t,\hat{c},\{\hat{m}_{u}\}_{u\in V})\rightarrow m$ : It recovers $cs=[\sum_{u\in V}li_{u}\hat{m}_{u}]_{h}$ where $li_{u}$ is the Lagrange coefficient of the user $u$ with respect to the user set $V$ , and computes the final output $m=[\lfloor\frac{l\cdot[\hat{c}_{1}+cs]_{h}}{h}\rceil]_{l}$ .
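The eight algorithms above can be exercised end to end with a toy sketch. The code below is an illustration under loudly stated assumptions, not the authors' implementation and not a secure parameter choice: the degree `D = 8`, the prime modulus `H = 2**31 - 1` (chosen prime so that Shamir sharing and Lagrange coefficients work over $\mathbb{Z}_{h}$), the plaintext modulus `L = 257`, a ternary sampler standing in for both $R_{3}$ and $\chi$, and all function names are hypothetical. The $HPKE$ encryption of shares is omitted, so shares are handed to parties in the clear.

```python
import random

D, H, L = 8, 2**31 - 1, 257      # toy degree d, prime modulus h, plaintext modulus l
DELTA = H // L                   # floor(h / l)

def mul(a, b):
    """Multiply two elements of Z_H[x]/(x^D + 1)."""
    out = [0] * D
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            sign = 1 if i + j < D else -1          # x^D wraps around to -1
            out[(i + j) % D] = (out[(i + j) % D] + sign * ai * bj) % H
    return out

def add(*polys):
    return [sum(cs) % H for cs in zip(*polys)]

def scale(c, a):
    return [c * x % H for x in a]

def small(rng):
    """Ternary sample; an assumed stand-in for R_3 and for the noise chi."""
    return [rng.choice((-1, 0, 1)) for _ in range(D)]

def keygen(a, rng):
    s, e = small(rng), small(rng)
    return s, scale(-1, add(mul(a, s), e))         # sk_{u,0}, pk_{u,0}

def split(poly, n, t, rng):
    """Shamir-share every coefficient; returns {party x: share polynomial}."""
    shares = {x: [] for x in range(1, n + 1)}
    for secret in poly:
        coeffs = [secret % H] + [rng.randrange(H) for _ in range(t - 1)]
        for x in shares:
            shares[x].append(sum(c * pow(x, k, H) for k, c in enumerate(coeffs)) % H)
    return shares

def enc(a, pk, m, rng):
    u, e0, e1 = small(rng), small(rng), small(rng)
    return add(mul(a, u), e0), add(mul(pk, u), e1, scale(DELTA, m))

def evaluate(cts, alphas):
    c0 = add(*[scale(al, c[0]) for al, c in zip(alphas, cts)])
    c1 = add(*[scale(al, c[1]) for al, c in zip(alphas, cts)])
    return c0, c1

def pardec(c_hat, ssk_u, se_u):
    return add(mul(c_hat[0], ssk_u), se_u)

def findec(c_hat, partials):
    """partials: {party x: m_hat_x} from at least t parties."""
    cs = [0] * D
    for x, m_hat in partials.items():
        li = 1                                     # Lagrange coefficient at 0
        for y in partials:
            if y != x:
                li = li * y % H * pow((y - x) % H, -1, H) % H
        cs = add(cs, scale(li, m_hat))
    return [((L * v + H // 2) // H) % L for v in add(c_hat[1], cs)]

def demo(n=4, t=3, seed=0):
    rng = random.Random(seed)
    a = [rng.randrange(H) for _ in range(D)]
    keys = [keygen(a, rng) for _ in range(n)]
    pk = add(*[k[1] for k in keys])                # CombKey
    sk_shares = [split(k[0], n, t, rng) for k in keys]
    e2_shares = [split(small(rng), n, t, rng) for _ in keys]
    msgs = [[rng.randrange(10) for _ in range(D)] for _ in range(n)]
    alphas = list(range(1, n + 1))
    c_hat = evaluate([enc(a, pk, m, rng) for m in msgs], alphas)
    partials = {}
    for x in range(1, t + 1):                      # only t of the n parties respond
        ssk = add(*[s[x] for s in sk_shares])
        se = add(*[s[x] for s in e2_shares])
        partials[x] = pardec(c_hat, ssk, se)
    expected = [sum(al * m[i] for al, m in zip(alphas, msgs)) % L for i in range(D)]
    return findec(c_hat, partials), expected
```

Calling `demo()` returns the decrypted aggregate together with the plain weighted sum $[\sum_{u}\alpha_{u}m_{u}]_{l}$ computed directly, so the two lists can be compared. The sketch uses only $t$ partial decryptions, which illustrates the tolerance to dropped users.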

Remark 4

If the data dimension of the message $m_{u}$ is greater than $d$ , the number of noise samples $\vec{e}_{u,2}\in R$ increases.

4.2 Security Analysis

The security of our scheme could be reduced to a variant of the classical ring version decisional learning with errors (RLWE) problem [38, 22]. The variant is named $n$ -Decision-RLWE problem.

Definition 6

( $n$ -Decision-RLWE). For a random set $\{\vec{s}_{i}\in R_{h}\}_{i\in\{1,\ldots,n\}}$ and a distribution $\chi$ over $R$ , denote with $A_{\{\vec{s}_{i}\in R_{h}\}_{i\in\{1,\ldots,n\}},\chi,\mu}$ the distribution obtained by choosing a uniformly random element $\vec{a}\leftarrow R_{h}$ and $n$ noise terms $\{\vec{e}_{i}\leftarrow\chi\}_{i\in\{1,\ldots,n\}}$ and outputting $(\vec{a},\{[\vec{a}\cdot\vec{s}_{i}+\vec{e}_{i}]_{h}\}_{i\in\{1,\ldots,n\}})$ . The problem is then to distinguish between the distribution $A_{\{\vec{s}_{i}\in R_{h}\}_{i\in\{1,\ldots,n\}},\chi,\mu}$ and a uniform distribution $\mu$ over $R_{h}^{n+1}$ .

By a hybrid argument, one could conclude that if an adversary has an advantage at least $\epsilon_{n\text{-}RLWE}$ to solve the $n$ -Decision-RLWE problem, the adversary has an advantage at least $\frac{1}{n}\epsilon_{n\text{-}RLWE}$ to solve the classical RLWE problem in [38, 22].
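The hybrid argument can be sketched as follows. The hybrid distributions $H_{j}$ and the probabilities $p_{j}$ are notation introduced here for illustration only. Let $H_{j}$ output $(\vec{a},b_{1},\ldots,b_{n})$ where $b_{1},\ldots,b_{j}$ are RLWE samples $[\vec{a}\cdot\vec{s}_{i}+\vec{e}_{i}]_{h}$ with independent secrets and $b_{j+1},\ldots,b_{n}$ are uniform in $R_{h}$ , and let $p_{j}$ be the probability that the adversary outputs $1$ on input $H_{j}$ . Then $H_{n}$ is the $n$ -Decision-RLWE distribution and $H_{0}$ is uniform, so by the triangle inequality:

```latex
\epsilon_{n\text{-}RLWE}\leq|p_{n}-p_{0}|\leq\sum_{j=1}^{n}|p_{j}-p_{j-1}|\leq n\cdot\max_{j\in\{1,\ldots,n\}}|p_{j}-p_{j-1}|
```

Hence some adjacent pair $H_{j-1},H_{j}$ is distinguished with advantage at least $\frac{1}{n}\epsilon_{n\text{-}RLWE}$ . A classical RLWE challenge $(\vec{a},b)$ can be placed in position $j$ , with the first $j-1$ positions filled by self-generated RLWE samples under the same $\vec{a}$ and the remaining positions filled uniformly, which turns the distinguisher into a solver for the classical problem.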

The first proof concerns the evaluation correctness in Definition 2.

Theorem 4.1

Assume that $U$ is the user set, $\chi$ is $B$ -bounded, and the maximal infinity norm of elements in the set $\{\alpha_{u}\}_{u\in U}$ is $A$ . Then the evaluation of the DTAHE is correct with probability $1$ if $|U|B(1+\delta_{R}A(1+2\delta_{R}|U|))<\frac{h}{2l}$ .

Proof

Since

\hat{m}_{u}=[\hat{c}_{0}\cdot\sum_{v\in U}\vec{ssk}_{u,v}+\sum_{v\in U}\vec{se}_{u,v}]_{h},

we have the equation (1) due to the Lagrange interpolation.

\begin{split}cs&=[\sum_{u\in V}li_{u}\cdot(\hat{c}_{0}\cdot\sum_{v\in U}\vec{ssk}_{u,v}+\sum_{v\in U}\vec{se}_{u,v})]_{h}\\ &=[\sum_{u\in V}li_{u}\cdot(\hat{c}_{0}\cdot\sum_{v\in U}\vec{ssk}_{u,v})\\ &+\sum_{u\in V}li_{u}\cdot(\sum_{v\in U}\vec{se}_{u,v})]_{h}\\ &=[\hat{c}_{0}\cdot(\sum_{u\in V}li_{u}\cdot(\sum_{v\in U}\vec{ssk}_{u,v}))+\sum_{u\in U}\vec{e}_{u,2}]_{h}\\ &=[\hat{c}_{0}\cdot\sum_{u\in U}sk_{u,0}+\sum_{u\in U}\vec{e}_{u,2}]_{h}\\ \end{split}

(1)

Let $X=[\hat{c}_{1}+cs]_{h}$ . Since

\hat{c}_{0}=[\sum_{u\in U}\alpha_{u}c_{u,0}]_{h},

\hat{c}_{1}=[\sum_{u\in U}\alpha_{u}c_{u,1}]_{h},

c_{u,0}=[\vec{a}\cdot\vec{u}_{u}+\vec{e}_{u,0}]_{h},

c_{u,1}=[pk\cdot\vec{u}_{u}+\vec{e}_{u,1}+\lfloor h/l\rfloor\cdot m_{u}]_{h},

pk=[\sum_{u\in U}pk_{u,0}]_{h},

and

pk_{u,0}=[-(\vec{a}\cdot sk_{u,0}+\vec{e}_{u})]_{h},

we have the equation 2.

\begin{split}X&=[\hat{c}_{1}+\hat{c}_{0}\cdot\sum_{u\in U}sk_{u,0}+\sum_{u\in U}\vec{e}_{u,2}]_{h}\\ &=[\sum_{u\in U}\alpha_{u}(pk\cdot\vec{u}_{u}+\vec{e}_{u,1}+\lfloor h/l\rfloor\cdot m_{u})\\ &+\sum_{u\in U}\alpha_{u}(\vec{a}\cdot\vec{u}_{u}+\vec{e}_{u,0})\cdot(\sum_{u\in U}sk_{u,0})\\ &+\sum_{u\in U}\vec{e}_{u,2}]_{h}\\ \end{split}

(2)

Let

NS_{0}=\sum_{u\in U}\alpha_{u}\vec{e}_{u,1}+(\sum_{u\in U}\alpha_{u}\vec{e}_{u,0})\cdot(\sum_{u\in U}sk_{u,0})+\sum_{u\in U}\vec{e}_{u,2},

and

MP=\sum_{u\in U}\alpha_{u}\lfloor h/l\rfloor\cdot m_{u}.

Let

Y=[X-NS_{0}-MP]_{h}.

We then have the equation 3.

\begin{split}Y&=[\sum_{u\in U}\alpha_{u}(pk\cdot\vec{u}_{u})+\sum_{u\in U}\alpha_{u}(\vec{a}\cdot\vec{u}_{u})\cdot(\sum_{u\in U}sk_{u,0})]_{h}\\ &=[\sum_{u\in U}\alpha_{u}(\vec{u}_{u}\cdot(\sum_{u\in U}-(\vec{a}\cdot sk_{u,0}+\vec{e}_{u})))\\ &+\vec{a}\cdot(\sum_{u\in U}\alpha_{u}\vec{u}_{u})\cdot(\sum_{u\in U}sk_{u,0})]_{h}\\ &=[-\sum_{u\in U}\alpha_{u}\vec{u}_{u}\cdot\sum_{u\in U}\vec{e}_{u}]_{h}\\ \end{split}

(3)

Let $NS=NS_{0}-\sum_{u\in U}\alpha_{u}\vec{u}_{u}\cdot\sum_{u\in U}\vec{e}_{u}$ , then $X=[NS+MP]_{h}$ . We then have the equation 4.

\begin{split}m&=[\lfloor\frac{l\cdot[\hat{c}_{1}+cs]_{h}}{h}\rceil]_{l}\\ &=[\lfloor\frac{l\cdot X}{h}\rceil]_{l}\\ &=[\lfloor\frac{l\cdot(NS+\sum_{u\in U}\alpha_{u}\lfloor h/l\rfloor\cdot m_{u})}{h}\rceil]_{l}\\ &=[\sum_{u\in U}\alpha_{u}m_{u}]_{l}+[\lfloor\frac{l\cdot NS}{h}\rceil]_{l}\\ \end{split}

(4)

If $||NS||<\frac{h}{2l}$ , the above decryption is correct. Since $\vec{u}_{u},sk_{u,0}\leftarrow R_{3}$ , $\vec{e}_{u,0}$ , $\vec{e}_{u,1}$ , $\vec{e}_{u}$ , $\vec{e}_{u,2}\leftarrow\chi$ , and the maximal infinity norm of elements in the set $\{\alpha_{u}\}_{u\in U}$ is $A$ , the infinity norm of $NS$ is bounded as

\begin{split}||NS||&\leq\delta_{R}A|U|B+\delta_{R}^{2}A|U|^{2}B+|U|B+\delta_{R}^{2}A|U|^{2}B\\ &=|U|B(1+\delta_{R}A(1+2\delta_{R}|U|))\\ \end{split}

(5)
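The bound in equation (5) and the condition of Theorem 4.1 can be checked numerically. The following is a minimal sketch; the function names and the sample values in the usage note are hypothetical and are not parameters from the paper. It sums the four terms of $||NS||$ separately and compares the result with the closed form.

```python
def noise_bound_terms(num_users, B, A, delta_R):
    """Sum the four terms on the first line of equation (5)."""
    term_e1 = delta_R * A * num_users * B              # sum of alpha_u * e_{u,1}
    term_e0_sk = delta_R**2 * A * num_users**2 * B     # (sum alpha_u e_{u,0}) * (sum sk_{u,0})
    term_e2 = num_users * B                            # sum of e_{u,2}
    term_u_e = delta_R**2 * A * num_users**2 * B       # (sum alpha_u u_u) * (sum e_u)
    return term_e1 + term_e0_sk + term_e2 + term_u_e


def noise_bound_closed(num_users, B, A, delta_R):
    """Closed form |U| B (1 + delta_R A (1 + 2 delta_R |U|))."""
    return num_users * B * (1 + delta_R * A * (1 + 2 * delta_R * num_users))


def evaluation_correct(num_users, B, A, delta_R, h, l):
    """Condition of Theorem 4.1, bound < h / (2 l), in integer arithmetic."""
    return 2 * l * noise_bound_closed(num_users, B, A, delta_R) < h
```

For example, with the illustrative values `num_users=4`, `B=1`, `A=4`, `delta_R=8`, both functions give the same bound, and `evaluation_correct` then reports whether a chosen pair `(h, l)` leaves enough room for the noise.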

The second proof is for the privacy of messages in Definition 3.

Theorem 4.2

If there is an adversary $\mathcal{A}$ with advantage $\epsilon_{sem}$ to make the experiment $Expt_{\mathcal{A},Sem}(1^{\lambda})$ output $1$ , one could construct a challenger to break the $n$ -Decision-RLWE problem with an advantage $\frac{1}{2}\epsilon_{sem}$ under the condition that the secret sharing scheme $SS$ has the privacy property [20] and the hybrid encryption scheme $HPKE$ is semantically secure.

Proof

With $|U|$ and $t$ , the challenger samples a $|U|$ -decision-RLWE instance $(x_{0},\{x_{u}\}_{u\in U})$ . It embeds the problem instance into the DTAHE instance as follows:

parm\leftarrow Setup(1^{\lambda}),

parm=parm\backslash\{\vec{a}\}\cup\{x_{0}\},

(sk_{u},pk_{u})\leftarrow KeyGen(parm),

sk_{u,0}=0;pk_{u,0}=x_{u},

\{e_{v,u}\}_{v\in U\backslash\{u\}}\leftarrow Share(parm,\{pk_{v}\}_{v\in U\backslash\{u\}},t,sk_{u}),

pk\leftarrow CombKey(parm,\{pk_{u}\}_{u\in U}).

It then provides $(parm,pk,\{\{e_{v,u}\}_{v\in U\backslash\{u\}}\}_{u\in U})$ to $\mathcal{A}$ .

The challenger plays with $\mathcal{A}$ by $\{sk_{u,1}\}_{u\in U}$ until $\mathcal{A}$ outputs $b^{\prime}$ .

If the $HPKE$ scheme is semantically secure, the ciphers $\{e_{v,u}\}_{v\in U\backslash\{u\}}$ leak nothing about shares. Then from the privacy property of the $SS$ scheme, if $|S|<t$ , $\mathcal{A}$ cannot distinguish a secret $sk_{u,0}$ from zero. So $\mathcal{A}$ should produce an educated guess $b^{\prime}$ .

The strategy of the challenger is to use the guess of $\mathcal{A}$ . If the $Expt_{\mathcal{A},Sem}(1^{\lambda})$ outputs $0$ , the challenger believes that the $|U|$ -decision-RLWE instance is a uniform random sample from $R_{h}^{n+1}$ .

When the input is indeed a uniform random sample from $R_{h}^{n+1}$ , the advantage of $\mathcal{A}$ is simply negligible since the messages are masked by random values. Otherwise, the adversary has an advantage $\epsilon_{sem}$ by assumption. So the advantage of the challenger is $\frac{1}{2}\epsilon_{sem}$ .

The third proof is for the privacy of secret keys and shares in Definition 4.

Theorem 4.3

If the secret sharing scheme $SS$ has the privacy property [20] and the hybrid encryption scheme $HPKE$ is semantically secure, the adversary $\mathcal{A}$ has negligible advantage to distinguish the two experiments $Expt_{\mathcal{A},Real}(1^{\lambda})$ and $Expt_{\mathcal{A},Ideal}(1^{\lambda})$ .

Proof

The proof needs a series of hybrid experiments between an adversary $\mathcal{A}$ and a challenger.

•

$H_{0}$ : This is the experiment $Expt_{\mathcal{A},Real}(1^{\lambda})$ in Definition 4.

•

$H_{1}$ : Same as $H_{0}$ , except that the challenger simulates the $ParDec$ algorithm to produce $\hat{m}_{u}$ for queries of $\mathcal{A}$ . Note that $\mathcal{A}$ has given the challenger a set $S^{*}$ with the size $|S^{*}|=t-1$ . From $S^{*}$ , the challenger could construct a set $S_{C}=S\backslash S^{*}$ . For each party $u\in S_{C}$ , $|S^{*}\cup\{u\}|=t$ . The challenger sets $\hat{m}_{u}$ as

\tilde{m}_{u}=[li_{u}^{-1}(\lfloor{h/l}\rfloor\sum_{v\in U}\alpha_{v}m_{v}+NS-\hat{c}_{1}-\sum_{v\in S^{*}}li_{v}\hat{m}_{v})]_{h}

where $NS$ is defined in the theorem 4.1. If $u\in S^{*}$ , the challenger computes $\hat{m}_{u}$ as in the game $H_{0}$ .

The correctness of the simulation is evident since

\begin{split}\tilde{m}_{u}&=[li_{u}^{-1}(MP+NS-(\hat{c}_{1}+\sum_{v\in S^{*}}li_{v}\hat{m}_{v}))]_{h}\\ &=[li_{u}^{-1}(MP+NS-(\hat{c}_{1}+cs-li_{u}\hat{m}_{u}))]_{h}\\ &=[li_{u}^{-1}(MP+NS-X+li_{u}\hat{m}_{u})]_{h}\\ &=\hat{m}_{u}\\ \end{split}

(6)

•

$H_{2}$ : Same as $H_{1}$ , except that the challenger shares zero as

$\{e_{v,u}\}_{v\in U\backslash\{u\}}\leftarrow Share(parm,\{pk_{v}\}_{v\in U\backslash\{u\}},t,0).$

By the privacy property [20] of the $SS$ scheme and the semantic security of the $HPKE$ scheme, $H_{2}$ and $H_{1}$ are indistinguishable.

•

$H_{3}$ : Same as $H_{2}$ , except that $NS$ is replaced by $\tilde{NS}$ as

\begin{split}\tilde{NS}&=NS-(\sum_{u\in U}sk_{u,0})(\sum_{u\in U}\alpha_{u}\vec{e}_{u,0})\\ &+(\sum_{u\in U}\vec{u}_{u,0})(\sum_{u\in U}\alpha_{u}\vec{e}_{u,0})\\ \end{split}

(7)

where $\vec{u}_{u,0}\leftarrow R_{3}$ .

Since $sk_{u,0}\leftarrow R_{3}$ , $H_{3}$ and $H_{2}$ have the same distribution. In fact, $\tilde{NS}$ may appear in an experiment when the $\{\vec{u}_{u,0}\}_{u\in U}$ happens to be part of the secret keys of users. Now the challenger does not use the private keys of users $\{sk_{u}\}_{u\in U}$ or secret shares of users in $U$ . So the ideal experiment $Expt_{\mathcal{A},Ideal}(1^{\lambda})$ could be simulated indistinguishably.

5 Performance

We evaluate the communication and computation costs of the secure linear aggregation protocol.

5.1 Communication

We have stated that there are mainly three methods [21, 19, 20] to construct a DTAHE. We instantiate the method in [21] with an elliptic curve version of the ElGamal scheme (EC-ElGamal), and the other methods with the BFV [35, 13] scheme. The user side communication overhead of the secure linear aggregation protocol is calculated as follows:

\begin{split}\sum_{i\in\{1,\ldots,4\}}|m_{u,i}|&=(n+2)|u|+(n-1)|e_{v,u}|+|c_{u}|\\ &+|\hat{m}_{u}|+3|acc_{u}|+|T_{u}|+|\sigma_{u}|+|Sig|\\ &+|cert_{u}|+|aux|+|Record|+|esid|\end{split}

(8)

where $n$ is the number of users. Table 1 shows the main components in the equation (8). The security parameter $\lambda$ is $128$ . An element in EC-ElGamal takes $33$ bytes where one byte is for the $y$ -coordinate. $LR$ denotes the size of a ring element in $R_{h}$ , $SN$ the number of shares to each user and $LN$ the number of ciphers to encrypt the user input $m_{u}$ . $LR^{\prime}$ and $LN^{\prime}$ have the same meaning as $LR$ and $LN$ with $LR\neq LR^{\prime}$ and $LN\neq LN^{\prime}$ . The method to use Shamir secret sharing in [20] is denoted as BGGJK-2, and the other is denoted as BGGJK-1.

Table 1: Communication Overheads of Main Components in the Protocol

Method	$|e_{v,u}|$	$|c_{u}|$	$|\hat{m}_{u}|$
Pedersen[21]	$33+32$	$66*|m_{u}|$	$33*|m_{u}|$
BD[19]	$33+LR*SN$	$2*LR*LN$	$LR*LN*SN$
BGGJK-1[20]	$33+LR*n^{4}$	$2*LR*LN$	$LR*LN*n^{4}$
BGGJK-2[20]	$33+LR^{\prime}$	$2*LR^{\prime}*LN^{\prime}$	$LR^{\prime}*LN^{\prime}$
Ours	$33+LR*(1+LN)$	$2*LR*LN$	$LR*LN$

Fig. 4 shows the main communication overhead of the protocol on the user side with different DTAHE constructions as the number of users increases. We mainly consider the components in Table 1. We set $|m_{u}|=10^{5}$ for all instances, and set $d=2048$ and $|h|=54$ so that $LR=d*|h|/8=13824$ bytes and $LN=\lceil|m_{u}|/d\rceil=49$ . We set $t=\lceil n*2/3\rceil$ and compute $SN=\binom{n-1}{t-1}$ . The values of $LR^{\prime}$ and $LN^{\prime}$ depend on the number of users since the noise element in the $ParDec$ algorithm should be multiplied by $(n!)^{2}$ , which are limited by the Theorem 4.1 and the equation (6) in [22].
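The parameter arithmetic in this paragraph can be reproduced with a few lines of Python. This is a sketch for checking the numbers only; the function names and dictionary keys are hypothetical, and only the row for our scheme in Table 1 is evaluated.

```python
from math import comb


def ring_params(msg_len=10**5, d=2048, h_bits=54):
    """Return (LR, LN): bytes per ring element and ciphers per user input."""
    LR = d * h_bits // 8              # d coefficients of |h| bits each, in bytes
    LN = -(-msg_len // d)             # ceil(|m_u| / d) using integer arithmetic
    return LR, LN


def share_count(n):
    """SN = C(n-1, t-1) with t = ceil(2n/3)."""
    t = -(-2 * n // 3)
    return comb(n - 1, t - 1)


def ours_bytes(LR, LN):
    """Sizes of the three main components for the 'Ours' row of Table 1."""
    return {"e_vu": 33 + LR * (1 + LN), "c_u": 2 * LR * LN, "m_hat_u": LR * LN}
```

With the default arguments, `ring_params()` yields the values $LR=13824$ and $LN=49$ stated in the text. `share_count(n)` grows combinatorially in $n$ , which is why constructions whose share count depends on $SN$ become impractical for larger user sets.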

From Fig. 4, we exclude the BD[19] and BGGJK-1[20] methods because they distribute many shares to each user. Our method is better than the BGGJK-2[20] when the user number is greater than $26$ . The EC-ElGamal method has the best communication performance when the number of users is greater than $20$ .

5.2 Computation

Since the computation costs of the protocol are dominated by the DTAHE scheme, we give Table 2 to show the time cost of some DTAHE algorithms. We set the user number as $n=35$ and implement three DTAHE schemes for comparison. The schemes are implemented in Python. The “pyOpenSSL” is used to implement the EC-ElGamal based DTAHE. The small discrete logarithm of a group element is found by the well-known “Baby-Step-Giant-Step” method. An open source library “bfv-python” is adapted to implement our scheme and the BGGJK-2 method in [20]. For simplicity, we use a public python module “eciespy” as an instance of the $HPKE$ scheme. The CPUs are Intel(R) Core(TM) i7-8550U (1.80GHz, 1.99GHz) and the RAM is 16GB.
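The Baby-Step-Giant-Step search mentioned above can be sketched as follows. As a loudly stated assumption, the sketch works in the multiplicative group modulo a prime $p$ rather than on an elliptic curve, purely to stay self-contained; the function name `bsgs` and its parameters are hypothetical, and this is not the code used in the experiments.

```python
from math import isqrt


def bsgs(g, y, p, bound):
    """Return some x with 0 <= x < m*m (where m*m >= bound) and g**x % p == y, or None."""
    m = isqrt(bound) + 1
    baby = {}
    power = 1
    for j in range(m):                 # baby steps: store g^j -> j
        baby.setdefault(power, j)
        power = power * g % p
    step = pow(g, -m, p)               # g^(-m) mod p (requires Python 3.8+)
    gamma = y % p
    for i in range(m):                 # giant steps: test y * g^(-i*m)
        if gamma in baby:
            return i * m + baby[gamma]
        gamma = gamma * step % p
    return None
```

The search needs about $\sqrt{bound}$ group operations and table entries, so the recovery cost grows with the range of the aggregated plaintext values.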

We use the “secp256r1” curve as the parameter set to implement the EC-ElGamal based DTAHE. For the security parameter $\lambda=128$ , we set $d=2048$ , $|h|=54$ and $|l|=17$ for our DTAHE scheme. When $n=35$ , the scheme constructed by the BGGJK-2 method in [20] requires $|h|\geq 426$ . So we set $d=16384$ according to the parameter table in [39]. Each data element in $m_{u}$ in our test occupies $8$ bits. The time costs of the three implementations, measured in seconds, are listed in Table 2. A secure linear aggregation protocol with the EC-ElGamal based DTAHE takes the least time on the user side. The protocol with our DTAHE scheme takes the least time on the server side.

Table 2: Computation Costs of Some DTAHE Algorithms with Fixed Number of Users (in seconds, $n=35$ )

Scheme	$Share$	$CombKey$ and $Enc$	$Eval$	$ParDec$	$FinDec$
EC-ElGamal [21]	$0.03$	$6.95$	$14.62$	$6.32$	$441.25$
BGGJK-2 [20]	$4.03$	$15.48$	$2.40$	$17.29$	$2.98$
Ours	$17.46$	$7.50$	$1.52$	$51.17$	$1.64$

6 Conclusion

This paper presents a secure linear aggregation protocol mainly for complex federated learning models. When the communication cost is not the dominant factor, the protocol could be deployed in a federated learning model to protect the private inputs of users. The DTAHE schemes in the protocol may be further optimized to reduce the computation time. For example, parallel computing technologies may reduce the time cost of the EC-ElGamal based DTAHE, and multi-secret sharing schemes may reduce the communication and computation costs of our DTAHE scheme.

Acknowledgment

This work is supported by the National Key R&D Program of China (2017YFB0802500), the Guangdong Major Project of Basic and Applied Basic Research (2019B030302008) and the Natural Science Foundation of Guangdong Province of China (2018A0303130133).

References

[1] H. B. McMahan, E. Moore, D. Ramage, and B. A. y Arcas, “Federated learning of deep networks using model averaging,” CoRR, vol. abs/1602.05629, 2016. [Online]. Available: http://arxiv.org/abs/1602.05629
[2] M. Fredrikson, S. Jha, and T. Ristenpart, “Model inversion attacks that exploit confidence information and basic countermeasures,” in Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, ser. CCS ’15. New York, NY, USA: ACM, 2015, pp. 1322–1333. [Online]. Available: http://doi.acm.org/10.1145/2810103.2813677
[3] M. Al-Rubaie and J. M. Chang, “Reconstruction attacks against mobile-based continuous authentication systems in the cloud,” IEEE Transactions on Information Forensics and Security, vol. 11, no. 12, pp. 2648–2663, Dec 2016.
[4] C. Di, W. Leye, C. Kai, and Y. Qiang, “Secure federated matrix factorization,” in FML 2019 : The 1st International Workshop on Federated Machine Learning for User Privacy and Data Confidentiality, 2019.
[5] K. Bonawitz, V. Ivanov, B. Kreuter, A. Marcedone, H. B. McMahan, S. Patel, D. Ramage, A. Segal, and K. Seth, “Practical secure aggregation for privacy-preserving machine learning,” in Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, ser. CCS ’17. New York, NY, USA: Association for Computing Machinery, 2017, pp. 1175–1191. [Online]. Available: https://doi.org/10.1145/3133956.3133982
[6] M. Abadi, A. Chu, I. Goodfellow, H. B. McMahan, I. Mironov, K. Talwar, and L. Zhang, “Deep learning with differential privacy,” in Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, ser. CCS ’16. New York, NY, USA: Association for Computing Machinery, 2016, pp. 308–318. [Online]. Available: https://doi.org/10.1145/2976749.2978318
[7] K. Wei, J. Li, M. Ding, C. Ma, H. H. Yang, F. Farokhi, S. Jin, T. Q. S. Quek, and H. Vincent Poor, “Federated learning with differential privacy: Algorithms and performance analysis,” IEEE Transactions on Information Forensics and Security, vol. 15, pp. 3454–3469, 2020.
[8] V. Rastogi and S. Nath, “Differentially private aggregation of distributed time-series with transformation and encryption,” in Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data, ser. SIGMOD ’10. New York, NY, USA: Association for Computing Machinery, 2010, pp. 735–746. [Online]. Available: https://doi.org/10.1145/1807167.1807247
[9] M. Jawurek and F. Kerschbaum, “Fault-tolerant privacy-preserving statistics,” in Privacy Enhancing Technologies, S. Fischer-Hübner and M. Wright, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 2012, pp. 221–238.
[10] M. Joye and B. Libert, “A scalable scheme for privacy-preserving aggregation of time-series data,” in Financial Cryptography and Data Security, A.-R. Sadeghi, Ed. Berlin, Heidelberg: Springer Berlin Heidelberg, 2013, pp. 111–125.
[11] I. Leontiadis, K. Elkhiyaoui, and R. Molva, “Private and dynamic time-series data aggregation with trust relaxation,” in Proceedings of the 13th International Conference on Cryptology and Network Security - Volume 8813. Berlin, Heidelberg: Springer-Verlag, 2014, pp. 305–320. [Online]. Available: https://doi.org/10.1007/978-3-319-12280-9_20
[12] E. Shi, T.-H. Chan, E. Rieffel, R. Chow, and D. Song, “Privacy-preserving aggregation of time-series data,” vol. 2, 01 2011.
[13] T. H. H. Chan, E. Shi, and D. Song, “Privacy-preserving stream aggregation with fault tolerance,” in Financial Cryptography and Data Security, A. D. Keromytis, Ed. Berlin, Heidelberg: Springer Berlin Heidelberg, 2012, pp. 200–214.
[14] I. Leontiadis, K. Elkhiyaoui, M. Önen, and R. Molva, “Puda – privacy and unforgeability for data aggregation,” Cryptology ePrint Archive, Report 2015/562, 2015, https://eprint.iacr.org/2015/562.
[15] Q. Li and G. Cao, “Efficient privacy-preserving stream aggregation in mobile sensing with low aggregation error,” in Privacy Enhancing Technologies, E. De Cristofaro and M. Wright, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 2013, pp. 60–81.
[16] Y. Liu, Y. Liu, Z. Liu, Y. Liang, C. Meng, J. Zhang, and Y. Zheng, “Federated forest,” IEEE Transactions on Big Data, pp. 1–1, 2020.
[17] H. H. Zhuo, W. Feng, Y. Lin, Q. Xu, and Q. Yang, “Federated deep reinforcement learning,” 2020.
[18] J. L. H. Crawford, C. Gentry, S. Halevi, D. Platt, and V. Shoup, “Doing real work with FHE: the case of logistic regression,” in Proceedings of the 6th Workshop on Encrypted Computing & Applied Homomorphic Cryptography, WAHC@CCS 2018, Toronto, ON, Canada, October 19, 2018, 2018, pp. 1–12. [Online]. Available: https://doi.org/10.1145/3267973.3267974
[19] R. Bendlin and I. Damgård, “Threshold decryption and zero-knowledge proofs for lattice-based cryptosystems,” in Theory of Cryptography, D. Micciancio, Ed. Berlin, Heidelberg: Springer Berlin Heidelberg, 2010, pp. 201–218.
[20] D. Boneh, R. Gennaro, S. Goldfeder, A. Jain, S. Kim, P. M. R. Rasmussen, and A. Sahai, “Threshold cryptosystems from threshold fully homomorphic encryption,” in Advances in Cryptology – CRYPTO 2018, H. Shacham and A. Boldyreva, Eds. Cham: Springer International Publishing, 2018, pp. 565–596.
[21] T. P. Pedersen, “A threshold cryptosystem without a trusted party,” in Advances in Cryptology — EUROCRYPT ’91, D. W. Davies, Ed. Berlin, Heidelberg: Springer Berlin Heidelberg, 1991, pp. 522–526.
[22] J. Fan and F. Vercauteren, “Somewhat practical fully homomorphic encryption,” Cryptology ePrint Archive, Report 2012/144, 2012, https://eprint.iacr.org/2012/144.
[23] X. Zhu, H. Li, and Y. Yu, “Blockchain-based privacy preserving deep learning,” in Information Security and Cryptology, F. Guo, X. Huang, and M. Yung, Eds. Cham: Springer International Publishing, 2019, pp. 370–383.
[24] X. Bao, C. Su, Y. Xiong, W. Huang, and Y. Hu, “Flchain: A blockchain for auditable federated learning with trust and incentive,” in 2019 5th International Conference on Big Data Computing and Communications (BIGCOM), 2019, pp. 151–159.
[25] Y. Lu, X. Huang, K. Zhang, S. Maharjan, and Y. Zhang, “Blockchain empowered asynchronous federated learning for secure data sharing in internet of vehicles,” IEEE Transactions on Vehicular Technology, vol. 69, no. 4, pp. 4298–4311, 2020.
[26] Y. Qu, L. Gao, T. H. Luan, Y. Xiang, S. Yu, B. Li, and G. Zheng, “Decentralized privacy using blockchain-enabled federated learning in fog computing,” IEEE Internet of Things Journal, vol. 7, no. 6, pp. 5171–5183, 2020.
[27] J. Weng, J. Weng, J. Zhang, M. Li, Y. Zhang, and W. Luo, “Deepchain: Auditable and privacy-preserving deep learning with blockchain-based incentive,” IEEE Transactions on Dependable and Secure Computing, pp. 1–1, 2019.
[28] S. R. Pokhrel and J. Choi, “Federated learning with blockchain for autonomous vehicles: Analysis and design challenges,” IEEE Transactions on Communications, pp. 1–1, 2020.
[29] H. Kim, J. Park, M. Bennis, and S. Kim, “Blockchained on-device federated learning,” IEEE Communications Letters, vol. 24, no. 6, pp. 1279–1283, 2020.
[30] Q. Wang, Y. Guo, X. Wang, T. Ji, L. Yu, and P. Li, “AI at the edge: Blockchain-empowered secure multiparty learning with heterogeneous models,” IEEE Internet of Things Journal, pp. 1–1, 2020.
[31] K. Sarpatwar, R. Vaculin, H. Min, G. Su, T. Heath, G. Ganapavarapu, and D. Dillenberger, Towards Enabling Trusted Artificial Intelligence via Blockchain. Cham: Springer International Publishing, 2019, pp. 137–153. [Online]. Available: https://doi.org/10.1007/978-3-030-17277-0_8
[32] S. Awan, F. Li, B. Luo, and M. Liu, “Poster: A reliable and accountable privacy-preserving federated learning framework using the blockchain,” in Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security, ser. CCS ’19. New York, NY, USA: Association for Computing Machinery, 2019, pp. 2561–2563. [Online]. Available: https://doi.org/10.1145/3319535.3363256
[33] M. Di Raimondo and R. Gennaro, “New approaches for deniable authentication,” in Proceedings of the 12th ACM Conference on Computer and Communications Security, ser. CCS ’05. New York, NY, USA: Association for Computing Machinery, 2005, pp. 112–121. [Online]. Available: https://doi.org/10.1145/1102120.1102137
[34] D. G. Wood, “Ethereum: a secure decentralised generalised transaction ledger homestead,” http://gavwood.com/paper.pdf, 2014, [Online, Accessed 26-June-2020].
[35] Z. Brakerski, “Fully homomorphic encryption without modulus switching from classical gapsvp,” in Advances in Cryptology – CRYPTO 2012, R. Safavi-Naini and R. Canetti, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 2012, pp. 868–886.
[36] “Microsoft SEAL (release 3.5),” https://github.com/Microsoft/SEAL, Apr. 2020, Microsoft Research, Redmond, WA.
[37] J. Herranz, D. Hofheinz, and E. Kiltz, “Some (in)sufficient conditions for secure hybrid encryption,” Inf. Comput., vol. 208, no. 11, pp. 1243–1257, Nov. 2010. [Online]. Available: https://doi.org/10.1016/j.ic.2010.07.002
[38] V. Lyubashevsky, C. Peikert, and O. Regev, “On ideal lattices and learning with errors over rings,” in Advances in Cryptology – EUROCRYPT 2010, H. Gilbert, Ed. Berlin, Heidelberg: Springer Berlin Heidelberg, 2010, pp. 1–23.
[39] M. Albrecht, M. Chase, H. Chen, J. Ding, S. Goldwasser, S. Gorbunov, S. Halevi, J. Hoffstein, K. Laine, K. Lauter, S. Lokam, D. Micciancio, D. Moody, T. Morrison, A. Sahai, and V. Vaikuntanathan, “Homomorphic encryption security standard,” HomomorphicEncryption.org, Toronto, Canada, Tech. Rep., November 2018.