BGTplanner: Maximizing Training Accuracy for Differentially Private Federated Recommenders via Strategic Privacy Budget Allocation
Abstract
To mitigate the rising concern about privacy leakage, the federated recommender (FR) paradigm emerges, in which decentralized clients co-train the recommendation model without exposing their raw user-item rating data. The differentially private federated recommender (DPFR) further enhances FR by injecting differentially private (DP) noises into clients. Yet, current DPFRs, suffering from noise distortion, cannot achieve satisfactory accuracy. Various efforts have been dedicated to improving DPFRs by adaptively allocating the privacy budget over the learning process. However, due to the intricate relation between privacy budget allocation and model accuracy, existing works are still far from maximizing DPFR accuracy. To address this challenge, we develop BGTplanner (Budget Planner) to strategically allocate the privacy budget for each round of DPFR training, improving overall training performance. Specifically, we leverage the Gaussian process regression and historical information to predict the change in recommendation accuracy with a certain allocated privacy budget. Additionally, Contextual Multi-Armed Bandit (CMAB) is harnessed to make privacy budget allocation decisions by reconciling the current improvement and long-term privacy constraints. Our extensive experimental results on real datasets demonstrate that BGTplanner achieves an average improvement of 6.76% in training performance compared to state-of-the-art baselines.
Index Terms:
data-sharing platforms, differential privacy, budget allocation, online algorithm.
I Introduction
Nowadays, machine learning has been widely applied in various recommender systems, necessitating the processing of substantial user data. Consequently, there is increasing concern from users about the security and privacy of their data [1, 2]. To protect private raw data, the federated recommender (FR) paradigm [3, 4, 5] has emerged, offering a solution to the conundrum of balancing data-driven progress with the preservation of individual privacy. In this evolving landscape, organizations mainly exchange training parameters with distributed data sources for improving recommendations while respecting user privacy [6].
Nevertheless, revealing training parameters instead of raw data proves inadequate for ensuring privacy protection, prompting the development of differentially private federated recommenders (DPFRs) [6, 7]. Studies indicate that training parameters, including gradient values, can facilitate the reconstruction of a segment of the original data [8], or enable inference about whether particular records pertain to a specific participant [9, 10]. To ensure data privacy, DPFRs involve an additional safeguard, i.e., differential privacy (DP) [11, 12, 13], which injects noises to distort training parameters, and this approach has emerged as a key consideration in privacy-preserving recommender systems.
However, the practical application of DPFRs is impeded by substantially compromised recommendation accuracy due to the distortion caused by DP noise. The challenge is rooted in the complicated relationship between privacy budget consumption and learning improvement over the multiple exposures of training parameters in the DPFR learning process. On the one hand, to guarantee privacy, the total privacy budget is constrained, and only a small fixed amount of privacy budget can be consumed by each exposure [14, 15]. Meanwhile, a smaller privacy budget implies a larger noise scale and a more significant detriment to DPFR accuracy. On the other hand, the FR training process is highly dynamic with heterogeneous data scattered among users, resulting in a dynamic learning process [16]. Consequently, it is difficult to predict the improvement of each learning round in advance.
When facing multiple exposures, the naive approach evenly allocates a fixed amount of privacy budget to each training round regardless of the discrepant learning improvements during the dynamic training process. Such an approach fails to maximize recommendation accuracy because the privacy budget is likely wasted if it is spent on protecting users who bring little learning improvement. Recently, a few works have attempted to tune the privacy budget allocation over multiple learning rounds adaptively. However, most of these works, such as ADPML [17], are heuristics-based algorithms, far from the optimal budget allocation.
TABLE I: Summary of main notations.

| Notation | Definition |
| --- | --- |
| $i$, $\mathcal{N}$ | The client index and the set of all clients. |
| $t$, $T$, $T_{\max}$ | The training round index, the total number of training rounds with early termination, and the maximum number of training rounds. |
| $X_t$, $x_t$ | The original context matrix of training data information and the SVD-reduced context vector in round $t$. |
| $d$ | The hyper-parameter representing the dimension of the vector $x_t$. |
| $\mathbf{E}$, $\boldsymbol{\epsilon}_t$ | The total privacy budget vector and the privacy budget cost vector for all clients in training round $t$. |
| $\epsilon_{i,t}$ | The privacy budget of client $i$ in round $t$. |
| $m_i(\cdot)$ | The budget generating function of DP noise on client $i$. |
| $\mathbf{V}_{i,t}$ / $\mathbf{U}_{i,t}$ / $\Theta_{i,t}$ | The parameters of the local item embedding, the local user embedding, and the recommender model of client $i$ in round $t$. |
| $\nabla\mathbf{V}_{i,t}$ / $\nabla\mathbf{U}_{i,t}$ / $\nabla\Theta_{i,t}$ | The gradients of the local item embedding, the local user embedding, and the recommender model of client $i$ in round $t$. |
| $a_t$, $\mathcal{A}$ | The budget allocation action in round $t$ and the action space. |
| $K$ | The size of the action space. |
| $r_t$ / $\mathbf{r}_{1:t}$ | The reward of model training in round $t$, which can be defined as the change of training loss or recommender accuracy between rounds $t-1$ and $t$. / The vector of all observed rewards until round $t$. |
| $z_t$ | The concatenated input vector of the action and the context vector for the GPR model. |
| $\hat{r}$ or $\hat{r}(z_t)$ | The reward predicted by the GPR model for action $a_t$ and context vector $x_t$. |
| $\mu(\cdot)$ / $\sigma^2(\cdot)$ | The mean function and variance function of the conditional distribution of the GPR model used to predict the reward. |
| $\boldsymbol{\lambda}_t$ | The vector of dual variables (Lagrange multipliers) for the long-term budget constraint in round $t$. |
| $\mathcal{D}$ / $Z$ | The space of dual variables and the radius of the space $\mathcal{D}$. |
| $m$ | The total rounds for playing each action in the initial stage of BGTplanner. |
| $S_t(a)$ | The CMAB score function for action $a$ in round $t$. |
| $p_t(a)$ | The probability for CMAB to choose action $a$ in round $t$. |
To address the challenges in allocating privacy budgets in DPFR, we introduce a novel online algorithm, called BGTplanner (Budget Planner). Different from prior approaches, our method employs the Gaussian process regression (GPR) to predict learning improvement on the fly based on historical information and dynamic contexts (e.g., dynamic information of training items). Next, Contextual Multi-Armed Bandit (CMAB) is leveraged to balance the tradeoff between the current learning improvement and long-term privacy budget constraints, which can avoid greedily depleting the privacy budget in a single round. Our contributions can be summarized as follows:
• We formulate an online privacy budget allocation problem for the DPFR system. Then, we exploit historical information and Gaussian process regression to predict the dynamic learning improvement under different contexts.
• We harness the Contextual Multi-Armed Bandit (CMAB) method to make privacy budget allocation decisions for reconciling the current learning improvement and long-term privacy budget constraints.
• We provide sufficient theoretical analyses for BGTplanner. Furthermore, through extensive experiments, we demonstrate the superiority of our method compared to state-of-the-art baselines, with an average improvement of 6.76% in RMSE and 0.71% in F1-score.
The remainder of this paper is organized as follows. In Section II, we review the related works. Section III provides the necessary background on DPFR and CMAB. Our system model and problem formulation are discussed in Section IV. Section V introduces the design of our algorithm, and Section VI presents the analyses of differential privacy, regret, and overhead. In Section VII, we present comprehensive performance evaluations. Finally, we conclude the paper in Section VIII.
II Related Work
II-A Training Federated Recommender with DP
Training recommender models often relies on sensitive data, such as users’ historical interactions with items. To protect original data from leaking to malicious attackers, FR, a distributed learning framework [18], has been widely adopted for training recommender models [3, 6]. Nevertheless, revealing training parameters rather than raw data has proven inadequate for ensuring privacy protection [8, 9, 10], leading to the development of DPFR. To safeguard data privacy, DPFRs incorporate DP mechanisms [11], which enhance privacy by injecting noises to distort exposed training parameters. This approach has emerged as a key consideration in privacy-preserving recommenders [19], although its adoption in practice is still hindered by lowered recommendation accuracy.
II-B Differential Privacy Budget Allocation
The privacy budget is a non-recoverable resource, which must be split into multiple small portions to restrict the privacy loss per training round [14, 15]. It has been reported in [19] that DP noises generated by a small privacy budget per round can severely impair model accuracy. Existing studies have initially attempted to address the influence of noise on DPFR. For example, [20] explored dynamic techniques for adjusting the learning rate, batch size, noise magnitude, and gradient clipping to improve DPFR accuracy. Yet, this work does not account for dynamic factors in the recommender training process, such as online interactive ratings and randomly generated noises [3]. The dynamic strategy introduced in [17] allocates the privacy budget with a heuristic method. However, when applied to DPFR, it confronts the limitation that a heuristically designed strategy can be far from the optimal privacy budget allocation.
CMAB is an online decision-making scheme applicable in complex dynamic environments [21]. [22] applied CMAB to improve personalized FL, and [23, 21] applied CMAB to improve client selection in FL frameworks under dynamic training environments. Because of the dynamic nature of user interactions with recommenders, we are among the first to utilize CMAB for optimizing the allocation of the privacy budget in DPFR.
III Preliminaries
III-A Differential Privacy and Federated Recommender
Differential privacy (DP) [24, 12] is a mechanism designed to limit an attacker’s ability to gain additional knowledge about individuals in a dataset. A typical approach to achieve DP is by introducing randomness into computations, which conceals the details of individual data records. Formally, DP is characterized by $\epsilon$, which defines the upper bound on the output difference between two neighboring inputs, and $\delta$, which represents the residual probability. The formal definition of classic ($\epsilon$, $\delta$)-DP is provided in Definition 1.
Definition 1 (($\epsilon$, $\delta$)-DP).
For any $\epsilon > 0$ and $\delta \in [0, 1)$, a randomized algorithm $\mathcal{M}: \mathcal{X} \to \mathcal{Y}$ satisfies ($\epsilon$, $\delta$)-DP if, for any two neighboring inputs $D, D' \in \mathcal{X}$ and any subset of outputs $S \subseteq \mathcal{Y}$, the following holds:

$\Pr[\mathcal{M}(D) \in S] \le e^{\epsilon} \Pr[\mathcal{M}(D') \in S] + \delta.$  (1)
In the DP mechanism, revisiting the same dataset increases the risk of privacy leakage compared to a single access [25]. The composition theorem [26] states that the privacy budget per query should decrease as the number of accesses to the dataset grows. The most fundamental form of this is the sequential composition theorem, presented in Theorem 1.
Theorem 1 (Sequential Composition [26]).
Let the randomized algorithms $\mathcal{M}_i: \mathcal{X} \to \mathcal{Y}_i$ satisfy $(\epsilon_i, \delta_i)$-DP for all $i \in \{1, \dots, k\}$. Then, the composition of all algorithms over the dataset $D$, defined by $\mathcal{M}(D) = (\mathcal{M}_1(D), \dots, \mathcal{M}_k(D))$, satisfies $(\sum_{i=1}^{k}\epsilon_i, \sum_{i=1}^{k}\delta_i)$-DP.
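As a concrete illustration of the theorem (the numbers here are chosen for illustration only), suppose a client's total budget is $\epsilon = 10$, each access satisfies pure $\epsilon_t$-DP, and training lasts at most $T_{\max} = 100$ rounds with an even split. Sequential composition then allows each round to consume at most

$\epsilon_t = \frac{\epsilon}{T_{\max}} = \frac{10}{100} = 0.1, \qquad \sum_{t=1}^{100} \epsilon_t = 100 \times 0.1 = 10 = \epsilon,$

so the per-round budget must shrink as the number of accesses grows, which is exactly the tension that motivates strategic budget allocation.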
Following this, in federated learning [18, 27], clients can securely update the recommender models using private datasets and upload the noisy gradients to the server for global aggregation in each training round. Specifically, a client in the differentially private federated recommender (DPFR) framework computes the gradients of the local item embedding, the local user embedding, and other recommender parameters, respectively [3, 28]. Before uploading these local updates to the server, the client distorts the item-embedding and recommender-parameter gradients by adding noise generated by a specific DP noise mechanism (e.g., Gaussian [14] or Laplace [15]) according to the allocated privacy budget. Note that the user embedding is generally stored locally after being updated, and therefore does not require noise addition.
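To make the noise-adding step concrete, the following is a minimal sketch (not the paper's exact implementation): the gradient shapes, the sensitivity bound, and the helper name add_dp_noise are illustrative assumptions.

```python
import numpy as np

def add_dp_noise(grad: np.ndarray, epsilon: float, sensitivity: float = 1.0,
                 mechanism: str = "laplace", delta: float = 1e-5) -> np.ndarray:
    """Distort a clipped gradient with DP noise calibrated to the allocated budget."""
    if mechanism == "laplace":
        # Laplace mechanism: noise scale = sensitivity / epsilon.
        noise = np.random.laplace(loc=0.0, scale=sensitivity / epsilon, size=grad.shape)
    else:
        # Gaussian mechanism: classic calibration sigma = sensitivity * sqrt(2 ln(1.25/delta)) / epsilon.
        sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
        noise = np.random.normal(loc=0.0, scale=sigma, size=grad.shape)
    return grad + noise

# Illustrative client-side step: only item-embedding and model gradients are perturbed;
# the user-embedding gradient stays local and is applied without noise.
item_grad, user_grad, model_grad = (np.random.randn(64) for _ in range(3))
eps_round = 0.1  # privacy budget allocated to this round (assumed value)
noisy_item_grad = add_dp_noise(item_grad, eps_round, mechanism="gaussian")
noisy_model_grad = add_dp_noise(model_grad, eps_round, mechanism="gaussian")
```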
III-B Contextual Multi-Armed Bandit Algorithm
The foundational concept of the Contextual Multi-Armed Bandit (CMAB) algorithm is essential for addressing decision-making challenges that involve balancing exploration and exploitation. Analogous to rows of slot machines (one-armed bandits), the CMAB algorithm deals with selecting actions over successive rounds to maximize cumulative rewards. Let $K$ denote the number of arms (or actions), and $t = 1, 2, \dots, T$ represent discrete time steps. At each time step $t$, the algorithm selects an action $a_t$ from the action space $\mathcal{A}$ based on observed environmental information (e.g., a context vector or matrix $X_t$). The reward $r_t$ corresponding to the chosen action is then observed. The goal of the algorithm is to iteratively refine its action-selection strategy to maximize the expected cumulative reward over time, which can be expressed as:
$\max \ \mathbb{E}\Big[\sum_{t=1}^{T} r_t(a_t, X_t)\Big].$  (2)
IV System Model
This section encompasses preliminary knowledge of DPFR, the problem formulation, and an overview of BGTplanner. To facilitate readability, we have compiled a summary of the notations used in this paper in Table I.
IV-A System Overview
In Fig. 1, we present the workflow of a typical DPFR system. There are two main components: the server and clients. The server serves two functions: aggregating model parameters of the recommender and allocating the privacy budget to each client to determine the noise scale [29, 30]. Clients are responsible for conducting local model updates with private datasets. Before uploading local recommender models to the server, a DP noise adder generates noises according to the allocated privacy budget to distort local models [3, 19, 6].
In the system, there are $N$ clients, denoted by the set $\mathcal{N}$. The DPFR can conduct at most $T_{\max}$ rounds of model training. The total privacy loss of each client $i$ is constrained by a fixed total privacy budget $E_i$ to be consumed throughout the entire training. During the training process, clients and the server need to exchange information multiple times via the following five steps.

Figure 1: Workflow of a typical DPFR system.
In Step ①, all clients upload the ID numbers of the item samples used for local training. The server can reform this ID information into the context matrix denoted by $X_t \in \{0,1\}^{N \times M}$, where $M$ represents the total number of items. $X_t[i,j] = 1$ indicates that item $j$ is used for training on client $i$ in this round, while $X_t[i,j] = 0$ signifies that the item is not used. Note that this information is commonly uploaded together with the updated parameters in Step ④ for aggregating item embeddings in related works [3], and it makes no difference whether it is uploaded in Step ① or Step ④. It is also worth mentioning that the uploaded item IDs can be obfuscated by clients with pseudo records to enhance privacy before uploading. Since our focus is not on how to obfuscate them, we simply assume that existing obfuscation methods [3] will be applied.
In Step ②, the server allocates the privacy budget $\epsilon_{i,t}$ to client $i$ in round $t$. If a client exhausts its privacy budget, it quits the training process. Besides, the server can verify the noise scale [29, 30] and account for the cumulative privacy budget consumption using privacy accounting methods for different DP mechanisms, such as naive composition for Laplace noises [14] and the Moments Accountant for Gaussian noises [31].
In Steps ③ and ④, client $i$ updates the recommender model via the Local Trainer using the latest user-item interactions in its local data. Local updating yields the gradients of the local item embedding, the local user embedding, and other recommender parameters, respectively [3]. After local training, client $i$ uploads the item-embedding and recommender-parameter gradients distorted by DP noises, which are generated by the DP Noise Adder according to the allocated privacy budget $\epsilon_{i,t}$.
In Step ⑤, the Model Aggregator deployed at the server aggregates the recommender parameters and item embedding vectors based on the context matrix $X_t$. For more details, please refer to [3]. Then, the server broadcasts the aggregated parameters and item embedding vectors to all clients.
IV-B Online Privacy Budget Allocation Problem
Our study primarily focuses on designing a more advanced BGTplanner. In our design, we suppose that the server can select an action $a_t$ from the action space $\mathcal{A} = \{1, \dots, K\}$ for privacy budget allocation, where different actions represent different levels of privacy budgets. The action $a_t$ corresponds to the privacy budget, denoted by $\epsilon_{i,t} = m_i(a_t)$, for generating DP noises on client $i$. Our objective is to maximize the final recommendation accuracy, gauged by the cumulative reward. Here, $r_t$ denotes the reward of round $t$, which can be defined as the change of training loss or recommender accuracy between rounds $t-1$ and $t$. The FL training performance is highly dependent on data quality. Thus, $r_t$ should be related to the privacy budget allocated to round $t$ and the training records contributed by clients. According to the process in Fig. 1, the training records are represented by $X_t$. Thus, we write the per-round reward as $r_t(a_t, X_t)$. Given the limited total privacy budget, our problem $\mathcal{P}$ can be formulated as follows:
$\mathcal{P}: \quad \max_{\{a_t\}} \ \sum_{t=1}^{T} r_t(a_t, X_t)$  (3a)
s.t. $\sum_{t=1}^{T} \epsilon_{i,t} \le E_i, \quad \forall i \in \mathcal{N},$  (3b)
$\epsilon_{i,t} = m_i(a_t) \ge 0, \quad \forall i \in \mathcal{N},\ \forall t,$  (3c)
$a_t \in \mathcal{A}, \quad \forall t.$  (3d)
There are two challenges in solving $\mathcal{P}$. First, we need to specify the expression of $r_t(a_t, X_t)$. To the best of our knowledge, an exact expression is unavailable in existing works. Second, $\mathcal{P}$ cannot be maximized by a greedy algorithm. If we only consider the reward of round $t$, we should allocate all the remaining budget to round $t$ such that $r_t$ is maximized, which unfortunately depletes the privacy budget for later training rounds, resulting in poor final recommendation accuracy.
V BGTplanner Design
Our solution for $\mathcal{P}$ is depicted in Fig. 2. Briefly speaking, Predictor employs Gaussian process regression (GPR) to predict the reward based on historical information. To avoid the trap of greedy budget allocation, Constrainer properly penalizes the allocation of each round. Allocator makes the final budget allocation decisions by considering both the current improvement reward and the penalized budget cost.

Figure 2: Overview of the BGTplanner design, consisting of Predictor, Constrainer, and Allocator.
V-A Instant Training Reward Predictor Design
In our problem, accurate prediction of rewards serves as a cornerstone for BGTplanner. Given an action $a$, we employ GPR to model the relationship between the context input and the corresponding reward.
In a typical recommender, $X_t$ is a high-dimensional matrix, making it impossible to directly apply GPR. To reduce the dimension of $X_t$, we conduct a Singular Value Decomposition (SVD) of the matrix. Then, we rank the resulting singular values in descending order and reserve only the top $d$ elements, where $d$ is a tunable hyper-parameter. Let $x_t$ denote the vector containing the reserved elements. Unless otherwise specified, we use $x_t$ to denote the context vector hereafter.
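A minimal sketch of this dimension-reduction step is given below; the matrix sizes, the dimension $d = 8$, and the helper name reduce_context are illustrative assumptions.

```python
import numpy as np

def reduce_context(X_t: np.ndarray, d: int = 8) -> np.ndarray:
    """Compress the binary client-by-item context matrix into a d-dimensional vector
    by keeping the d largest singular values of X_t."""
    # numpy returns singular values already sorted in descending order.
    singular_values = np.linalg.svd(X_t, compute_uv=False)
    x_t = singular_values[:d]
    # Pad with zeros if the matrix has fewer than d singular values.
    if x_t.size < d:
        x_t = np.pad(x_t, (0, d - x_t.size))
    return x_t

# Example: 100 clients, 500 items, roughly 5% of items used this round.
X_t = (np.random.rand(100, 500) < 0.05).astype(float)
x_t = reduce_context(X_t, d=8)   # context vector fed to the GPR predictor
```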
With a much lower dimension, GPR can predict the reward based on historical information. To distinguish it from the observed reward $r_t$, let $\hat{r}$ denote the predicted reward. According to [32], $\hat{r}$ can be modelled as a random variable sampled from a Gaussian Process $\mathcal{GP}(\mu(\cdot), k(\cdot, \cdot))$, where $\mu(\cdot)$ denotes the mean function (for simplicity and practicability, $\mu(\cdot)$ is assumed to be zero [32]) and $k(\cdot, \cdot)$ is the covariance (or kernel) function between an input vector and a historical input vector. To simplify notation, we concatenate any action $a$ (or a historical action $a_\tau$) with the context $x_t$ (or $x_\tau$) to form the input vector $z = [a, x_t]$ (or $z_\tau = [a_\tau, x_\tau]$). Note that $x_t$ is already fixed in round $t$ before the action is chosen.
To adapt the GPR to model the temporal dynamics of observations (i.e., the inputs collected over rounds), we construct a composite kernel that incorporates both the squared exponential kernel function and the Ornstein-Uhlenbeck temporal covariance function [33], yielding

$k(z_t, z_\tau) = \exp\!\left(-\frac{\|z_t - z_\tau\|^2}{2\ell^2}\right)\exp\!\left(-\lambda\,|t - \tau|\right), \quad z_\tau \in \mathcal{Z}_{t-1},$  (4)
where $\mathcal{Z}_{t-1}$ is the set of historical inputs until round $t$, $|t - \tau|$ represents the temporal difference between $z_t$ and the historical input $z_\tau$, and $\lambda$ is a hyper-parameter tuning the weight assigned to temporal correlations. The hyper-parameter $\ell$ represents the length scale for determining the influence of one input on another. Hence, a larger $k(z_t, z_\tau)$ indicates a stronger correlation between $z_t$ and $z_\tau$, implying the effectiveness of the historical observation with input $z_\tau$ in predicting the reward with input $z_t$.
Until round $t$, we can collect the input set $\mathcal{Z}_{t-1} = \{z_1, \dots, z_{t-1}\}$ and the reward vector $\mathbf{r}_{1:t-1}$ containing the observed rewards. Thus, we can establish a GPR over these variables to predict the reward distribution. The detailed joint distribution of the GPR is presented in Appendix B. Using the joint distribution of the GPR and Bayes’ theorem, we can predict the reward for any action $a$ with context $x_t$ by the conditional distribution parameterized by the mean function $\mu_t(z)$ in Eq. (5) and the variance function $\sigma_t^2(z)$ (due to the limited space, the expression of the variance function is presented in Appendix B), where $z = [a, x_t]$.

$\mu_t(z) = \mathbf{k}_t(z)^{\top}\left(\mathbf{K}_t + \sigma_n^2 \mathbf{I}\right)^{-1}\mathbf{r}_{1:t-1}.$  (5)

Here, $\mathbf{k}_t(z)$ is the kernel vector between the historical inputs and the current input, $\mathbf{K}_t$ represents the kernel matrix among the historical inputs, and $\mathbf{I}$ denotes an identity matrix. Additionally, $\sigma_n^2$ is a variance parameter representing the deviation of the zero-mean random noise between the observed reward and the estimated reward [34].
For a given action $a \in \mathcal{A}$, it can be concatenated with the context $x_t$ to form the input $z = [a, x_t]$. By leveraging Eq. (5), we can estimate the expected reward $\mu_t(z)$. By enumerating $a \in \mathcal{A}$, we can predict the reward for choosing different actions. Note that our Predictor design is not based on a fixed context. It can process dynamic contexts for predicting rewards with the evolving dynamics of user-item interactions over time.
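The following sketch illustrates how Predictor could score candidate actions with the GPR posterior mean of Eq. (5); the exact kernel form, the hyper-parameter values, and all variable names are assumptions rather than the paper's implementation.

```python
import numpy as np

def composite_kernel(z1, t1, z2, t2, length_scale=1.0, lam=0.1):
    """Squared-exponential kernel over inputs combined with an exponentially decaying
    (Ornstein-Uhlenbeck-style) temporal term; the exact composite form is an assumption."""
    spatial = np.exp(-np.sum((z1 - z2) ** 2) / (2.0 * length_scale ** 2))
    temporal = np.exp(-lam * abs(t1 - t2))
    return spatial * temporal

def gpr_mean(z_new, t_new, Z_hist, t_hist, r_hist, noise_var=0.01):
    """Posterior mean of the reward for input z_new given historical (input, reward) pairs,
    following the standard GP regression formula in Eq. (5)."""
    n = len(Z_hist)
    K = np.array([[composite_kernel(Z_hist[i], t_hist[i], Z_hist[j], t_hist[j])
                   for j in range(n)] for i in range(n)])
    k_vec = np.array([composite_kernel(Z_hist[i], t_hist[i], z_new, t_new) for i in range(n)])
    alpha = np.linalg.solve(K + noise_var * np.eye(n), r_hist)
    return float(k_vec @ alpha)

# Example: score every candidate action by its predicted reward under the current context x_t.
rng = np.random.default_rng(0)
Z_hist = [rng.normal(size=9) for _ in range(20)]      # past [action, context] inputs
t_hist = list(range(20))
r_hist = rng.normal(size=20)                          # past observed rewards
x_t = rng.normal(size=8)
predicted = {a: gpr_mean(np.concatenate(([a], x_t)), 20, Z_hist, t_hist, r_hist)
             for a in range(1, 6)}                    # K = 5 candidate actions (assumed)
```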
V-B Long-Term Privacy Budget Constrainer Design
We next discuss the design of the long-term privacy budget constrainer in BGTplanner.
In $\mathcal{P}$, the privacy budget is constrained over the entire training process by (3b). To solve it, we decouple the time-dependent privacy budget constraint with an online queue optimization method [35, 36] so that we can consider budget allocation round by round. For training round $t$, considering that the inequality in Eq. (3b) always holds when $\epsilon_{i,t} \le E_i / T_{\max}$ for every round, we can relax the constraint in Eq. (3b) to a per-round constraint on $\epsilon_{i,t}$. The budgets of the other clients can be constrained in a similar way. For brevity, we simply use $\boldsymbol{\epsilon}_t$ to denote the per-round privacy budget vector hereafter.
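Written out with the notation assumed above, the relaxation step amounts to

$\epsilon_{i,t} \le \frac{E_i}{T_{\max}} \ \ \forall t \quad \Longrightarrow \quad \sum_{t=1}^{T_{\max}} \epsilon_{i,t} \le \sum_{t=1}^{T_{\max}} \frac{E_i}{T_{\max}} = E_i,$

so it suffices to compare each round's consumption against the per-round average budget $E_i / T_{\max}$.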
The problem enforcing long-term privacy budget constraints can be transformed into a queue recursion optimization with Lagrange multipliers $\boldsymbol{\lambda}_t$. Therefore, the Lagrange function of the reward in Eq. (3a) can be written as $\mathcal{L}_t(a_t, \boldsymbol{\lambda}_t)$, where

$\mathcal{L}_t(a_t, \boldsymbol{\lambda}_t) = r_t(a_t, X_t) + \Big\langle \boldsymbol{\lambda}_t,\ \frac{\mathbf{E}}{T_{\max}} - \boldsymbol{\epsilon}_t(a_t) \Big\rangle.$  (6)

Here, $\mathbf{E}$ is the total privacy budget vector, $\boldsymbol{\epsilon}_t(a_t)$ is the per-round privacy consumption vector for all clients, and $m(\cdot)$ is the function mapping actions to privacy budgets, i.e., $\boldsymbol{\epsilon}_t(a_t) = m(a_t)$. Additionally, $\langle \cdot, \cdot \rangle$ represents the inner product of two vectors. The problem is converted to maximizing the reward with the constraint penalty by solving $\max_{a_t} \mathcal{L}_t(a_t, \boldsymbol{\lambda}_t)$. By penalizing the privacy budget consumption per training round, we can simplify the problem by only considering how to optimize it for a particular round $t$.
By resorting to the online mirror descent (OMD) algorithm [37, 38], we can tackle the Lagrangian dual problem in every round $t$, formulated as

$\boldsymbol{\lambda}_t = \arg\min_{\boldsymbol{\lambda} \in \mathcal{D}} \ \mathcal{L}_t(a_t, \boldsymbol{\lambda}).$  (7)

Here, $\mathcal{D}$ is the space of the dual variable $\boldsymbol{\lambda}$, and $Z$ is the radius of $\mathcal{D}$.
With the determined space $\mathcal{D}$, Constrainer can update $\boldsymbol{\lambda}_t$ in the remaining training rounds with the following rule:

$\boldsymbol{\lambda}_{t+1} = \arg\min_{\boldsymbol{\lambda} \in \mathcal{D}} \ \eta \Big\langle \frac{\mathbf{E}}{T_{\max}} - \boldsymbol{\epsilon}_t(a_t),\ \boldsymbol{\lambda} \Big\rangle + B_{\psi}(\boldsymbol{\lambda}, \boldsymbol{\lambda}_t),$  (8)

where $B_{\psi}(\cdot, \cdot)$ is the Bregman divergence of the generating function $\psi(\cdot)$ and $\eta$ is the step factor of the OMD algorithm. Here, we select the generating function as the negative entropy function, the same as that in [36].
To sum up, Constrainer enables the integration of long-term privacy budget constraints into instant reward maximization and focuses on dynamically tuning $\boldsymbol{\lambda}_t$ to penalize the privacy budget consumption properly.
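With the negative-entropy generating function, the OMD update takes a multiplicative-weights form. The sketch below is one possible realization under assumed step size, radius, and projection; it is not the paper's exact Constrainer.

```python
import numpy as np

def omd_negative_entropy_update(lmbda, grad, eta=0.05, radius=1.0):
    """One online-mirror-descent step for the dual variables with the negative-entropy
    generating function (multiplicative-weights form), projected back onto the set
    {lambda >= 0, sum(lambda) <= radius}. The clipping-style projection is an assumption."""
    # Multiplicative update: minimizing eta * <grad, lambda> + KL(lambda || lmbda).
    new_lmbda = lmbda * np.exp(-eta * grad)
    total = new_lmbda.sum()
    if total > radius:                      # rescale if the update leaves the radius-Z ball
        new_lmbda *= radius / total
    return new_lmbda

# Example: penalties rise for clients whose per-round consumption exceeds E_i / T_max.
E = np.array([10.0, 10.0, 10.0])            # total budgets (assumed)
T_max = 100
eps_t = np.array([0.2, 0.05, 0.1])           # this round's consumption (assumed)
lmbda = np.full(3, 0.1)
grad = E / T_max - eps_t                      # gradient of the Lagrangian w.r.t. lambda
lmbda = omd_negative_entropy_update(lmbda, grad, eta=0.05, radius=1.0)
```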
V-C Privacy Budget Allocator Design
The last component in BGTplanner is designed by leveraging the Contextual Multi-Armed Bandit (CMAB) to make final budget allocation decisions. Briefly speaking, there are two phases in BGTplanner, i.e., the initial stage and the exploration-exploitation stage, which are described below.
V-C1 Initial Stage
The target of the initial stage is to establish the space $\mathcal{D}$ for Constrainer, so that the privacy budget consumption can be properly penalized in the exploration-exploitation stage. According to [36], we employ a regression-based algorithm to determine the radius $Z$. The initial stage lasts a small number of training rounds (proportional to $mK$) to evaluate $Z$. Here, $K$ is the size of the action space, and $m$ is a hyper-parameter representing the number of times each action is selected.
To ensure that the OMD algorithm achieves optimal regret bounds, our preferred choice is to set $Z$ according to OPT, where OPT is defined in Eq. (14) [36]. However, OPT cannot be computed exactly in an online scenario. Therefore, we employ a regression algorithm over the training rounds of the initial stage to estimate $Z$. Generally speaking, there are two steps in the initial stage.
Step ①: For the first $mK$ rounds, we play the $K$ different actions in CMAB evenly, each for $m$ times, to gather a historical record set of all inputs $z_\tau$ and observed rewards $r_\tau$. With the historical records, we can use the GPR model as an oracle to predict the reward and the mean function of the reward for any input, as used in Eq. (9c).
(9a)–(9f): the linear program used to estimate OPT from the observations collected in the initial stage.
Step ②: For the remaining rounds of the initial stage, we select actions randomly and observe the rewards. Then, we use the observed rewards and the GPR model to estimate OPT by solving the linear programming problem (9), where the GPR model serves as the oracle to predict the reward and its mean. Additionally, $T_{\max}$ is the maximum number of training rounds, $N$ is the total number of clients, $m$ is the hyper-parameter of rounds to select each action, and $\epsilon_{\min}$ is the smallest privacy budget among all clients. Therefore, we can obtain all the parameters needed to solve problem (9) using any general linear programming solver.
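A simplified sketch of the uniform-play part of the initial stage is shown below; the callback run_training_round and the record fields are assumptions, and the linear program (9) for estimating OPT is not reproduced here.

```python
import random

def initial_stage_uniform_play(actions, m, run_training_round):
    """Step 1 of the initial stage: play every action m times in a shuffled order and
    record (round, action, context, reward, cost) tuples as GPR training history.
    `run_training_round(action)` is an assumed callback that performs one DPFR round
    and returns (context_vector, observed_reward, budget_consumed)."""
    history = []
    schedule = [a for a in actions for _ in range(m)]
    random.shuffle(schedule)
    for t, action in enumerate(schedule):
        x_t, reward, eps_cost = run_training_round(action)
        history.append({"round": t, "action": action, "context": x_t,
                        "reward": reward, "cost": eps_cost})
    return history

# Usage (with an assumed environment callback `my_dpfr_round`):
# history = initial_stage_uniform_play(actions=list(range(1, 6)), m=3,
#                                      run_training_round=my_dpfr_round)
# The collected history then feeds the GPR oracle and the linear program (9)
# used to estimate OPT and set the dual-space radius Z.
```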
Upon obtaining the estimated value of OPT, we can set $Z$ according to

(10): the rule for setting $Z$ from the estimated OPT and the quantity defined in Eq. (9d) [36].

Consequently, the error between the estimate and OPT is bounded by the estimation error guarantee, as demonstrated in Lemma 1 in Appendix E, based on the analysis in [36]. The detailed algorithm for determining $Z$ is provided in Alg. 1.
Algorithm 1 — Input: training rounds; exploration rounds $m$; total privacy budget; item embeddings, user embeddings, and recommender parameters. Output: radius $Z$; remaining budget; historical record set.
Algorithm 2 (one round of DPFR training) — Input: training round $t$; client set $\mathcal{N}$; budget allocation $\boldsymbol{\epsilon}_t$. Output: training reward $r_t$; budget consumption.
// Clients //
// Server //
Algorithm 3 (BGTplanner) — Input: training rounds; exploration rounds $m$; total privacy budget.
V-C2 Exploration-Exploitation Stage
In this stage, BGTplanner utilizes the outputs of Predictor and Constrainer, together with the CMAB framework, to allocate the privacy budget. In general, the exploration-exploitation stage includes three steps in each round $t$, described as follows:
Step ① (Generating Action Scores): After observing the context $x_t$, BGTplanner acquires the estimated mean reward $\mu_t([a, x_t])$ for every action $a \in \mathcal{A}$ from Predictor, and calculates the predicted Lagrangian as an appropriate score function by considering both rewards and penalties:

$S_t(a) = \mu_t([a, x_t]) + \Big\langle \boldsymbol{\lambda}_t,\ \frac{\mathbf{E}}{T_{\max}} - \boldsymbol{\epsilon}_t(a) \Big\rangle,$  (11)

where $\boldsymbol{\lambda}_t$ is updated by Constrainer in the previous round.
Step ② (Selecting the Optimal Action): Based on the estimated score $S_t(a)$ for taking action $a$, BGTplanner generates probabilities for all arms corresponding to different privacy budget allocation actions in the CMAB model. Let $a_t^{*} = \arg\max_{a \in \mathcal{A}} S_t(a)$ denote the action with the highest score; the probability to sample each action $a \neq a_t^{*}$ can be calculated as:

$p_t(a) = \frac{1}{K + \gamma\,\big(S_t(a_t^{*}) - S_t(a)\big)},$  (12)

where $\gamma$ is a hyper-parameter representing the extent to which BGTplanner prefers exploitation over exploration. If $\gamma$ is smaller, $p_t(a)$ is closer to $1/K$, implying that BGTplanner samples actions in a more uniform manner, and hence prefers exploration. The probability of $a_t^{*}$ is:

$p_t(a_t^{*}) = 1 - \sum_{a \neq a_t^{*}} p_t(a).$  (13)
The final action $a_t$ is sampled from the probability distribution in Eqs. (12) and (13) under the context $x_t$ and the dual variable $\boldsymbol{\lambda}_t$.
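A small sketch of this inverse-gap-weighting sampling step (Eqs. (12)-(13)) follows; the score values and the hyper-parameter $\gamma = 10$ are illustrative assumptions.

```python
import numpy as np

def sample_action(scores: np.ndarray, gamma: float = 10.0, rng=None) -> int:
    """Sample a budget-allocation action from scores via inverse-gap weighting
    (Eqs. (12)-(13)): near-optimal actions are favored, but every action keeps a
    nonzero exploration probability controlled by gamma."""
    rng = rng or np.random.default_rng()
    K = len(scores)
    best = int(np.argmax(scores))
    probs = np.empty(K)
    for a in range(K):
        if a != best:
            probs[a] = 1.0 / (K + gamma * (scores[best] - scores[a]))
    probs[best] = 1.0 - np.sum(np.delete(probs, best))   # remaining mass to the best action
    return int(rng.choice(K, p=probs))

# Example with assumed Lagrangian scores for K = 5 budget levels.
scores = np.array([0.12, 0.30, 0.25, 0.05, 0.18])
action = sample_action(scores, gamma=10.0)
```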
Step ③ (Updating Penalties): BGTplanner leverages Constrainer to update the new dual variable vector $\boldsymbol{\lambda}_{t+1}$ on the fly, which will be used for budget allocation in the next training round.
V-D Details of BGTplanner
We provide a detailed description of the BGTplanner algorithm in Alg. 3. The algorithm operates in two stages: an initial stage, described in Alg. 1, and a subsequent exploration-exploitation stage, which includes three main steps in each round. In each training round of the second stage, BGTplanner predicts the scores for all actions, selects the optimal action and adjusts the dual variables based on observed rewards and the remaining privacy budgets. With the chosen budget allocation action, the server and all clients execute one round of DPFR training, as detailed in Alg. 2, to update the model and embedding parameters. This process navigates a dynamic training environment, where the privacy budget is strategically allocated to maximize model performance while adhering to privacy constraints. The iterative process continues until the privacy budget is exhausted or the desired training performance is achieved.
VI Analysis and Implementation
In this section, we undertake privacy, regret, and complexity analyses to substantiate the theoretical guarantees of our approach.
VI-A Privacy Analysis.
Given its total privacy budget $E_i$, each client $i$ can allocate its budget as follows. If the Laplace mechanism is adopted, each client can employ the naive composition rule [14] to accumulate the total privacy loss during the multiple rounds. Otherwise, if the Gaussian mechanism is adopted, clients can convert the budget to an RDP (Rényi Differential Privacy) budget, which can then be simply split into multiple training rounds [15]. In this case, Eq. (3b) should be expressed in terms of the RDP budget. It has been proved in [14, 15] that if clients follow these methods to accumulate their privacy loss, the DP guarantee with total budget $E_i$ is satisfied on client $i$.
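A minimal per-client accountant for the Laplace-mechanism path under naive composition might look as follows; the class name and interface are assumptions, and the Gaussian/RDP path would need a moments-accountant-style tracker instead.

```python
class NaiveCompositionAccountant:
    """Track a client's cumulative privacy loss under sequential (naive) composition.
    Only the Laplace-mechanism path is covered; RDP accounting is not sketched here."""

    def __init__(self, total_budget: float):
        self.total_budget = total_budget
        self.spent = 0.0

    def can_spend(self, eps_round: float) -> bool:
        return self.spent + eps_round <= self.total_budget

    def spend(self, eps_round: float) -> None:
        if not self.can_spend(eps_round):
            raise ValueError("client has exhausted its privacy budget and must quit training")
        self.spent += eps_round

# Example: a client with total budget 10 spending 0.1 per round quits after 100 rounds.
acct = NaiveCompositionAccountant(total_budget=10.0)
rounds = 0
while acct.can_spend(0.1):
    acct.spend(0.1)
    rounds += 1
```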
VI-B Overhead Analysis.
BGTplanner incurs additional computation and memory overhead for making budget allocation decisions. Through analysis, we show that the overhead is lightweight and affordable for the server. For Predictor, the cost of calculating the Gaussian kernel scales with the dimension of the input vector, and the cost of estimating the reward in Eq. (5) is dominated by solving the linear system over the historical inputs. Calculating the score vector by Eq. (11) and computing the probability vector via Eqs. (12)-(13) both scale with the number of actions $K$. To conclude, the total per-round computation overhead is small. According to [37], the computational complexity of Constrainer is also small when using the OMD algorithm.
Additionally, the memory overhead is also modest, primarily consisting of the storage requirements for 1) the GP kernels, 2) the estimated means, scores, and probabilities for all actions, and 3) the client budget and penalty information.
VI-C Regret Analysis.
We then discuss the regret performance of BGTplanner. We define the optimal expected reward for one round while ensuring resource utilization remains within a predefined budget as OPT [36], i.e.,
Definition 2.
Let $f^{*}$ denote the optimal regression function characterizing the expectation of the reward distribution, and let the static randomized policy $\pi^{*}$ be the optimal mapping (defined on the context distribution space $\mathcal{X}$) assigning probabilities to actions in $\mathcal{A}$. We have

$\text{OPT} = \max_{\pi}\ \mathbb{E}_{X \sim \mathcal{X}}\Big[\sum_{a \in \mathcal{A}} \pi(a \mid X)\, f^{*}(a, X)\Big] \quad \text{s.t.} \quad \mathbb{E}_{X \sim \mathcal{X}}\Big[\sum_{a \in \mathcal{A}} \pi(a \mid X)\, m(a)\Big] \preceq \frac{\mathbf{E}}{T_{\max}}.$  (14)

Here, $\pi(\cdot \mid X)$ and $\mathcal{X}$ represent the probability distribution over the action set $\mathcal{A}$ and the context space, respectively. $m(\cdot)$ is the mapping from an action to the privacy budget consumption.
With Definition 2, $T \cdot \text{OPT}$ denotes an upper bound of the optimal reward in online scenarios [39]. We can then define the regret between this bound and the actual reward obtained by the BGTplanner online algorithm as:

$\text{Reg}(T) = T \cdot \text{OPT} - \sum_{t=1}^{T} r_t(a_t, X_t).$  (15)
We first give the error guarantee of the estimated radius $Z$.
Lemma 1.
Denote the optimal value of (9) by $\widehat{\text{OPT}}$ and set $Z$ according to Eq. (10). Then, with high probability, the estimation error bound in Eq. (16) holds:

(16): the high-probability bound on the gap between $\widehat{\text{OPT}}$ and OPT (Lemma 4.1 in [36]).
Here, the notation ‘$\lesssim$’ denotes “less than up to a constant multiple”. The comprehensive proof of Lemma 1 can be found in Lemma 4.1 in [36]. With the estimation error guarantee of $Z$ in Lemma 1, the OMD update of $\boldsymbol{\lambda}_t$ achieves optimal regret bounds (Lemma 2.2 in [36]). Therefore, we can obtain the following regret guarantee of Alg. 3, rewritten as the theorem below:
Theorem 2.
Denote by $T$ the total number of training rounds, by $m$ the number of times each action is selected in the initial phase, and by $K$ the number of actions. Then the regret $\text{Reg}(T)$ of BGTplanner satisfies:

(17): the regret bound of BGTplanner, combining the $O(mK)$ regret of the initial stage with the sub-linear regret bound of the exploration-exploitation stage from Theorem 4.1 in [36],

where $\epsilon_{\min}$ is the minimum budget among the clients. Meanwhile, $Z$ is constrained as in Lemma 1, where the bias function of the constraints of the linear programming problem is defined in Appendix C. $\text{Reg}_{\text{GPR}}(T)$ is the regret between the optimal regression function and the GPR-based reward Predictor within $T$ total rounds, with an upper bound proved in [33].
Proof.
We can directly deduce the regret from Theorem 4.1 in [36] by treating the reward regression oracle as the GPR model with the regret upper bound from [33] and the constraint regression oracle as an unbiased estimate. In Alg. 3, since the initial stage lasts $O(mK)$ rounds, the expected regret incurred by Alg. 1 is upper bounded by $O(mK)$, with the guarantee of Lemma 1. For the second stage of Alg. 3, the expected regret of the exploration-exploitation stage is upper bounded by the bound of the CMAB method that employs the GPR model as the reward regression oracle and the OMD method to update the dual variables online over $T$ rounds (Theorem 4.1 in [36]). Then, Theorem 2 holds by adding the regrets of these two stages together. ∎
Theorem 2 guarantees that BGTplanner achieves a regret bound that is sub-linear in $T$. Such a sub-linear regret bound is highly desirable, indicating that BGTplanner will not significantly diverge from the optimal budget allocation solution even under dynamic contexts.

Figure 3: Final recommendation performance (with standard deviations) under various total privacy budgets on Filmtrust, ML-100K, and ML-1M.
TABLE II: Recommendation performance under the Gaussian mechanism (GM) and the Laplace mechanism (LM). Note: lower RMSE (↓) is better; higher F1-score (↑) is better. FedSGD applies no DP noise and DPSGD is defined only for the GM, so a single value per dataset is reported for them.

RMSE (↓):

| Methods | Filmtrust GM | Filmtrust LM | ML-100K GM | ML-100K LM | ML-1M GM | ML-1M LM |
| --- | --- | --- | --- | --- | --- | --- |
| FedSGD | 0.2905 | – | 0.2017 | – | 0.2006 | – |
| OURS | 0.3746 | 0.4114 | 0.3223 | 0.3449 | 0.3135 | 0.3484 |
| ADPML | 0.3898 | 0.4258 | 0.3442 | 0.3815 | 0.3558 | 0.4217 |
| ADAP | 0.3970 | 0.4435 | 0.3553 | 0.4092 | 0.3958 | 0.4621 |
| DPSGD | 0.4093 | – | 0.3356 | – | 0.3210 | – |

F1-score (↑):

| Methods | Filmtrust GM | Filmtrust LM | ML-100K GM | ML-100K LM | ML-1M GM | ML-1M LM |
| --- | --- | --- | --- | --- | --- | --- |
| FedSGD | 0.8354 | – | 0.9018 | – | 0.9137 | – |
| OURS | 0.6685 | 0.6517 | 0.6444 | 0.6421 | 0.6524 | 0.6431 |
| ADPML | 0.6395 | 0.6338 | 0.6358 | 0.6404 | 0.6407 | 0.6417 |
| ADAP | 0.6389 | 0.6194 | 0.6366 | 0.6312 | 0.6404 | 0.6379 |
| DPSGD | 0.6680 | – | 0.6412 | – | 0.6501 | – |
VII Performance Evaluation
In this section, we report the experimental results comparing BGTplanner with state-of-the-art baselines. To facilitate the peer review, we also anonymously release the source code of our system deployment at https://anonymous.4open.science/r/BGTplanner-AAAI.
VII-A Experimental Settings
VII-A1 Datasets
For our experiments, we exploit the public rating datasets Filmtrust [40], MovieLens 100K (ML-100K) [41], and MovieLens 1M (ML-1M) [41], which are commonly used in evaluating recommender systems. Filmtrust includes 18,662 ratings from 874 users, ML-100K includes 100,000 ratings for 1,682 movies from 943 users, and ML-1M includes 1,000,209 ratings for 3,952 movies from 6,040 users. All rating scores are normalized to the range $[0, 1]$. We implement FedGNN [3] as the recommendation model and set the number of clients for FL according to the number of users in Filmtrust and ML-100K, respectively. Due to the larger number of users in ML-1M, we randomly allocate its 6,040 users to 605 clients.
Additionally, we split the training set and test set by a ratio of 4:1 for all rating datasets. To further simulate dynamic training contexts with continuously generated rating records, the training set is randomly divided into two equal-size parts. At the beginning of training, only the ratings in the first part can be used for training. As training progresses, training samples in the second part are randomly and evenly sampled and added to the first part to enhance training. Besides, all clients randomly select pseudo items from the set of non-interacted items for training FedGNN in each round [3], and the number is set to 50, 50, and 400 for Filmtrust, ML-100K, and ML-1M, respectively.
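A sketch of how such a dynamic training context could be simulated is given below; the function name, the array layout, and the release schedule granularity are assumptions.

```python
import numpy as np

def simulate_dynamic_ratings(ratings, n_rounds, rng=None):
    """Split ratings 4:1 into train/test, hold back half of the training set, and release
    the held-back samples evenly across rounds to mimic continuously arriving interactions.
    `ratings` is assumed to be an array of (user, item, score) rows."""
    rng = rng or np.random.default_rng(0)
    idx = rng.permutation(len(ratings))
    split = int(0.8 * len(ratings))
    train_idx, test_idx = idx[:split], idx[split:]

    half = len(train_idx) // 2
    available = list(train_idx[:half])          # usable from the first round
    incoming = list(train_idx[half:])           # revealed gradually
    per_round = max(1, len(incoming) // n_rounds)

    schedule = []
    for _ in range(n_rounds):
        newly_added, incoming = incoming[:per_round], incoming[per_round:]
        available.extend(newly_added)
        schedule.append(np.array(available))    # training indices visible this round
    return schedule, ratings[test_idx]

# Example with synthetic rating triples.
ratings = np.random.rand(1000, 3)
schedule, test_set = simulate_dynamic_ratings(ratings, n_rounds=50)
```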
Figure 4: Recommendation performance under different minimum numbers of training rounds $T_{\min}$ (which controls the per-round budget range) on the three datasets.
VII-A2 System Settings
Referring to [32], we set the length scale and the other hyper-parameters of the squared exponential kernel for GPR. For CMAB [36], the number of arms $K$, the number of initial rounds, and the trade-off parameter $\gamma$ in Eq. (12) are set according to [36]. By default, the overall privacy budget is set to 10 for all clients in all datasets, with the residual probability $\delta$ set following [3]. Additionally, the mapping function $m(\cdot)$ uniformly distributes privacy budget values within a range whose upper end is controlled by the minimum number of training rounds $T_{\min}$, which bounds the maximum budget cost per round. We also conduct sensitivity experiments on these hyper-parameters to further investigate their impact on the performance.
We adopt the Root Mean Square Error (RMSE) and F1-score as evaluation metrics, both of which are widely used for assessing recommenders. For the F1-score, we binarize the ratings and the outputs of the recommenders to calculate the rating classification precision and recall, which are then combined to compute the F1-score; the same binarization threshold is used for all datasets. The reward in round $t$ is quantified by the change of training performance between consecutive rounds. All experiments are implemented in Python and executed on an Intel® server with a Xeon® E5-2660 v4 CPU and 189 GB of RAM.
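For reference, the two metrics can be computed as sketched below; the binarization threshold of 0.5 is an assumed value, since the paper's exact threshold is not reproduced here.

```python
import numpy as np

def rmse(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def binarized_f1(y_true: np.ndarray, y_pred: np.ndarray, threshold: float = 0.5) -> float:
    """Binarize normalized ratings and predictions at `threshold` (assumed value),
    then compute precision/recall and combine them into the F1-score."""
    t, p = y_true >= threshold, y_pred >= threshold
    tp = np.sum(t & p)
    precision = tp / max(np.sum(p), 1)
    recall = tp / max(np.sum(t), 1)
    return float(2 * precision * recall / max(precision + recall, 1e-12))

# Example on normalized ratings in [0, 1].
y_true = np.array([0.9, 0.2, 0.7, 0.4])
y_pred = np.array([0.8, 0.3, 0.6, 0.6])
print(rmse(y_true, y_pred), binarized_f1(y_true, y_pred))
```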
VII-A3 Baselines
To demonstrate the superiority of BGTplanner, we compare it with the following baselines. (1) FedSGD [18], the fundamental FL algorithm without DP noises, serves as the benchmark to establish the upper-bound accuracy for the DP-based baselines. (2) ADPML [17] determines the minimum and maximum privacy budgets before training and gradually increases the privacy budget used by clients during training. We set these bounds and the increasing rate in all datasets following the default settings in [17]. (3) ADAP [42] dynamically adjusts the clipping threshold during training and gradually increases the privacy budget used by clients based on the trend of the test loss. All configurations for this baseline follow the defaults specified in [42]. (4) DPSGD [31] adopts a more advanced Gaussian mechanism (GM) and uses the Moments Accountant (essentially RDP) to accumulate and allocate the privacy loss. We utilize the open-source code with the same settings from [43] for our experiments.
Figure 5: Recommendation performance under different maximum numbers of training rounds $T_{\max}$ on the three datasets.
VII-B Recommendation Performance Results
Table II compares the recommendation performance of BGTplanner and the other baselines under both the Gaussian mechanism (GM) and the Laplace mechanism (LM) with the default total privacy budget. In Fig. 3, we further compare the final recommendation performance (together with standard deviations) under various total privacy budgets. From Table II and Fig. 3, we have the following observations:
1. BGTplanner consistently achieves the recommendation performance closest to FedSGD (which does not incorporate DP noises for gradient protection) across all experimental cases. In Table II, BGTplanner improves RMSE by an average of 3.40% with the GM and 10.12% with the LM. Additionally, in terms of the F1-score, BGTplanner outperforms the second-best baseline by an average of 0.71% across all datasets and noise mechanisms.
2. As shown in Fig. 3, the performance gain diminishes as the total privacy budget increases, because the noise variance shrinks with a larger privacy budget, leaving less room for improvement through tuning the privacy budget consumption. Meanwhile, the results demonstrate the better training performance of BGTplanner under strict budget constraints, indicating that BGTplanner can allocate the privacy budget strategically with the CMAB method and the long-term budget Constrainer.
3. DPSGD achieves the second-best performance among DP-based algorithms in most cases when using the GM. This is because DPSGD, using RDP, can achieve a tighter bound to track the privacy loss during training. However, DPSGD’s uniform noise variance allocation across iterations fails to minimize the noise impact on the training process, allowing the superior performance of BGTplanner.
4. ADPML and ADAP rely on heuristics or prior knowledge to set privacy budgets and adjust hyper-parameters. These constraints are especially problematic when dealing with different privacy requirements and make it difficult for these methods to adapt to the dynamic nature of changing settings, which may lead to worse results.
VII-C Sensitivity Experiments on Hyperparameters
Additionally, we conducted a series of experiments to analyze the sensitivity of the BGTplanner algorithm to key hyperparameters, specifically the minimum number of training rounds $T_{\min}$ and the maximum number of training rounds $T_{\max}$, which control the range of the action space and the length of training, respectively. The results of these experiments are presented in Figs. 4-5, where all other parameters are held constant while varying $T_{\min}$ (or $T_{\max}$).
VII-C1 Overall Observations
In general, the experiments consistently show that as the total privacy budget increases, the model’s training performance improves. Moreover, the robustness of BGTplanner is evident from the relatively small variations in performance across different settings of $T_{\min}$ and $T_{\max}$, compared to the baseline methods. This stability underscores the algorithm’s superiority in various training scenarios. Additionally, as illustrated in Figs. 4-5, BGTplanner enhances the overall training efficiency as the total privacy cost increases for both the Gaussian and Laplace mechanisms. However, this efficiency comes at the expense of greater privacy disclosure, highlighting the need to balance privacy and performance.
VII-C2 Sensitivity to $T_{\min}$
Fig. 4 explores the impact of varying $T_{\min}$, which represents the minimum number of training rounds. A smaller $T_{\min}$ implies a broader range of possible actions for the CMAB model, allowing it to allocate a larger portion of the privacy budget to each training round. Generally, we observe that reducing $T_{\min}$ tends to improve the performance of privacy budget allocation, as it provides more flexibility for distributing the budget effectively across rounds. However, in scenarios where the total privacy budget is limited, an excessively small $T_{\min}$ may lead to rapid consumption of the privacy budget, causing premature termination of training and resulting in suboptimal performance.
VII-C3 Sensitivity to $T_{\max}$
Fig. 5 investigates the influence of the maximum number of training rounds $T_{\max}$ on the algorithm’s performance. The results reveal distinct trends between the Gaussian and Laplace mechanisms. Under the Gaussian mechanism, BGTplanner achieves better training efficiency with a larger $T_{\max}$ in a large proportion of situations, owing to the advanced RDP method, which allows tighter tracking of the privacy loss as the number of rounds increases. Conversely, under the Laplace mechanism, increasing $T_{\max}$ tends to reduce training efficiency. This discrepancy likely arises because the basic composition theorem, used with the Laplace mechanism, is more sensitive to changes in $T_{\max}$, necessitating more careful budget allocation.
Other than the experiments presented above, we also examine the sensitivity of other crucial parameters in Appendix F, which illustrates how the hyperparameters affect the overall allocation performance. Overall, the results underscore the effectiveness of BGTplanner in achieving high model accuracy under various conditions, demonstrating its superiority over the other baseline methods.
VIII Conclusion
In this study, we devise a novel online privacy budget allocation solution, i.e., BGTplanner, for differentially private federated recommenders (DPFRs) under dynamic contexts, solving the challenging problem of privacy budget allocation in federated recommender systems. We have pioneered the representation of privacy budget allocation in federated recommender (FR) as a budget-constrained online problem, making a novel contribution in this field. Furthermore, Gaussian Process Regression (GPR) is employed to predict model learning progress. A Contextual Multi-Armed Bandit (CMAB) algorithm is devised to make final budget allocation decisions on the fly with regret guarantees. The empirical evidence from the extensive experiments demonstrates the superiority of our method over existing privacy budget allocation algorithms. To solidify and extend our findings, more comprehensive and detailed experimental analysis, and the customization of privacy budgets for individual clients deserve exploration in future research.
References
- [1] C. Li, A. He, Y. Wen, G. Liu, and A. T. Chronopoulos, “Optimal Trading Mechanism Based on Differential Privacy Protection and Stackelberg Game in Big Data Market,” IEEE Transactions on Services Computing, vol. 16, no. 5, pp. 3550–3563, 2023.
- [2] M. Hu, D. Wu, R. Wu, Z. Shi, M. Chen, and Y. Zhou, “RAP: A Light-Weight Privacy-Preserving Framework for Recommender Systems,” IEEE Transactions on Services Computing, vol. 15, no. 5, pp. 2969–2981, 2022.
- [3] C. Wu, F. Wu, L. Lyu, T. Qi, Y. Huang, and X. Xie, “A federated graph neural network framework for privacy-preserving personalization,” Nature Communications, vol. 13, no. 1, p. 3091, 2022.
- [4] Y. Guo, H. Xie, Y. Miao, C. Wang, and X. Jia, “FedCrowd: A Federated and Privacy-Preserving Crowdsourcing Platform on Blockchain,” IEEE Transactions on Services Computing, vol. 15, no. 4, pp. 2060–2073, 2022.
- [5] J. Wu, Y. Yang, M. Hu, Y. Zhou, and D. Wu, “FCER: A Federated Cloud-Edge Recommendation Framework with Cluster-based Edge Selection,” IEEE Transactions on Mobile Computing, vol. Early Access, pp. 1–13, 2024.
- [6] N. Agrawal, A. K. Sirohi, S. Kumar, and J. , “No Prejudice! Fair Federated Graph Neural Networks for Personalized Recommendation,” in Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), vol. 38, no. 10, 2024, pp. 10 775–10 783.
- [7] J. Bian, J. Huang, S. Ji, Y. Liao, X. Li, Q. Wang, J. Zhou, D. Dou, Y. Wang, and H. Xiong, “Feynman: Federated Learning-Based Advertising for Ecosystems-Oriented Mobile Apps Recommendation,” IEEE Transactions on Services Computing, vol. 16, no. 5, pp. 3361–3372, 2023.
- [8] L. Melis, C. Song, E. D. Cristofaro, and V. Shmatikov, “Exploiting Unintended Feature Leakage in Collaborative Learning,” in IEEE Symposium on Security and Privacy (S&P). IEEE, 2019, pp. 691–706.
- [9] R. Liu, Y. Cao, Y. Wang, L. Lyu, Y. Chen, and H. Chen, “PrivateRec: Differentially Private Model Training and Online Serving for Federated News Recommendation,” in Proceedings of ACM SIGKDD Conference on Knowledge Discovery and Data Mining (SIGKDD). ACM, 2023, pp. 4539–4548.
- [10] C. Arroyo Arevalo, S. L. Noorbakhsh, Y. Dong, Y. Hong, and B. Wang, “Task-Agnostic Privacy-Preserving Representation Learning for Federated Learning against Attribute Inference Attacks,” in Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), vol. 38, no. 10, 2024, pp. 10 909–10 917.
- [11] G. Cormode, S. Jha, T. Kulkarni, N. Li, D. Srivastava, and T. Wang, “Privacy at scale: Local differential privacy in practice,” in Proceedings of the International Conference on Management of Data (SIGMOD). ACM, 2018, pp. 1655–1658.
- [12] K. Ren, G. Liao, Q. Ma, and X. Chen, “Differentially Private Auction Design for Federated Learning With non-IID Data,” IEEE Transactions on Services Computing, vol. 17, no. 5, pp. 2236–2247, 2024.
- [13] J. He, N. Wang, T. Xiang, Y. Wei, Z. Zhang, M. Li, and L. Zhu, “ABDP: Accurate Billing on Differentially Private Data Reporting for Smart Grids,” IEEE Transactions on Services Computing, vol. 17, no. 5, pp. 1938–1954, 2024.
- [14] K. Wei, J. Li, M. Ding, C. Ma, H. H. Yang, F. Farokhi, S. Jin, T. Q. S. Quek, and H. Vincent Poor, “Federated Learning with Differential Privacy: Algorithms and Performance Analysis,” IEEE Transactions on Information Forensics and Security, vol. 15, pp. 3454–3469, 2020.
- [15] A. M. Girgis, D. Data, S. Diggavi, A. T. Suresh, and P. Kairouz, “On the Rényi Differential Privacy of the Shuffle Model,” in Proceedings of ACM SIGSAC Conference on Computer and Communications Security (CCS). ACM, 2021, p. 2321–2341.
- [16] X. Zhang, Y. Chen, C. Ma, Y. Fang, and I. King, “Influential Exemplar Replay for Incremental Learning in Recommender Systems,” in Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), vol. 38, no. 8, 2024, pp. 9368–9376.
- [17] K. Pan and K. Feng, “Differential Privacy-Enabled Multi-Party Learning with Dynamic Privacy Budget Allocating Strategy,” Electronics, vol. 12, no. 3, p. 658, 2023.
- [18] B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. A. y. Arcas, “Communication-Efficient Learning of Deep Networks from Decentralized Data,” in Proceedings of the International Conference on Artificial intelligence and statistics (AISTATS), vol. 54. PMLR, 2017, pp. 1273–1282.
- [19] C. Chen, H. Wu, J. Su, L. Lyu, X. Zheng, and L. Wang, “Differential Private Knowledge Transfer for Privacy-Preserving Cross-Domain Recommendation,” in Proceedings of the ACM Web Conference (WWW). ACM, 2022, pp. 1455–1465.
- [20] J. Hong, Z. Wang, and J. Zhou, “Dynamic Privacy Budget Allocation Improves Data Efficiency of Differentially Private Gradient Descent,” in ACM Conference on Fairness, Accountability, and Transparency (FAccT). ACM, 2022, pp. 11–35.
- [21] D. B. Ami, K. Cohen, and Q. Zhao, “Client Selection for Generalization in Accelerated Federated Learning: A Multi-Armed Bandit Approach,” 2023. [Online]. Available: https://arxiv.org/abs/2303.10373
- [22] C. Shi, C. Shen, and J. Yang, “Federated Multi-armed Bandits with Personalization,” in Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS). PMLR, 2021, pp. 2917–2925.
- [23] C. Wu, Y. Zhu, R. Zhang, Y. Chen, F. Wang, and S. Cui, “FedAB: Truthful Federated Learning With Auction-Based Combinatorial Multi-Armed Bandit,” IEEE Internet of Things Journal, vol. 10, no. 17, pp. 15 159–15 170, 2023.
- [24] X. Feng, W. Cheng, C. Cao, L. Wang, and V. S. Sheng, “DPFLA: Defending Private Federated Learning Against Poisoning Attacks,” IEEE Transactions on Services Computing, vol. 17, no. 4, pp. 1480–1491, 2024.
- [25] L. Xiao, X. Zhang, D. Wu, M. Hu, Y. Zhou, and S. Yu, “History-Aware Privacy Budget Allocation for Model Training on Evolving Data-Sharing Platforms,” IEEE Transactions on Services Computing, vol. Early Access, pp. 1–15, 2024.
- [26] C. Dwork, A. Roth et al., “The algorithmic foundations of differential privacy,” Foundations and Trends® in Theoretical Computer Science, vol. 9, no. 3–4, pp. 211–407, 2014.
- [27] H. Zhou, H. Wang, Z. Yu, G. Bin, M. Xiao, and J. Wu, “Federated Distributed Deep Reinforcement Learning for Recommendation-enabled Edge Caching,” IEEE Transactions on Services Computing, vol. Early Access, pp. 1–17, 2024.
- [28] T. Qi, F. Wu, C. Wu, Y. Huang, and X. Xie, “Privacy-Preserving News Recommendation Model Learning,” in Findings of the Association for Computational Linguistics: EMNLP 2020. ACL, 2020, pp. 1423–1432.
- [29] M. Nasr, R. Shokri, and A. Houmansadr, “Comprehensive Privacy Analysis of Deep Learning: Passive and Active White-box Inference Attacks against Centralized and Federated Learning,” in IEEE Symposium on Security and Privacy (S&P). IEEE, 2019, pp. 739–753.
- [30] V. Shejwalkar, A. Houmansadr, P. Kairouz, and D. Ramage, “Back to the Drawing Board: A Critical Evaluation of Poisoning Attacks on Production Federated Learning,” in IEEE Symposium on Security and Privacy (S&P). IEEE, 2022, pp. 1354–1371.
- [31] M. Abadi, A. Chu, I. J. Goodfellow, H. B. McMahan, I. Mironov, K. Talwar, and L. Zhang, “Deep Learning with Differential Privacy,” in Proceedings of the ACM SIGSAC Conference on Computer and Communications Security (CCS). ACM, 2016, pp. 308–318.
- [32] Y. Yang, Y. Zhou, M. Hu, D. Wu, and Q. Z. Sheng, “BARA: Efficient Incentive Mechanism with Online Reward Budget Allocation in Cross-Silo Federated Learning,” in Proceedings of International Joint Conference on Artificial Intelligence (IJCAI). Morgan Kaufmann, 2023, pp. 4478–4485.
- [33] N. Srinivas, A. Krause, S. M. Kakade, and M. W. Seeger, “Gaussian Process Optimization in the Bandit Setting: No Regret and Experimental Design,” in Proceedings of International Conference on Machine Learning (ICML). Omnipress, 2010, pp. 1015–1022.
- [34] C. E. Rasmussen and C. K. I. Williams, Gaussian processes for machine learning, ser. Adaptive computation and machine learning. MIT Press, 2006. [Online]. Available: https://www.worldcat.org/oclc/61285753
- [35] T. Ouyang, K. Zhao, X. Zhang, Z. Zhou, and X. Chen, “Dynamic Edge-centric Resource Provisioning for Online and Offline Services Co-location,” in IEEE Conference on Computer Communications (INFOCOM). IEEE, 2023, pp. 1–10.
- [36] Y. Han, J. Zeng, Y. Wang, Y. Xiang, and J. Zhang, “Optimal Contextual Bandits with Knapsacks under Realizability via Regression Oracles,” in International Conference on Artificial Intelligence and Statistics (AISTATS). PMLR, 2023, pp. 5011–5035.
- [37] E. Hazan et al., “Introduction to online convex optimization,” Foundations and Trends® in Optimization, vol. 2, no. 3-4, pp. 157–325, 2016.
- [38] S. Shalev-Shwartz, “Online learning and online convex optimization,” Foundations and Trends in Machine Learning, vol. 4, no. 2, pp. 107–194, 2012.
- [39] S. Agrawal and N. R. Devanur, “Linear contextual bandits with knapsacks,” in Annual Conference on Neural Information Processing Systems (NeurIPS). Curran Associates Inc., 2016, p. 3458–3467.
- [40] F. Monti, M. M. Bronstein, and X. Bresson, “Geometric matrix completion with recurrent multi-graph neural networks,” in Proceedings of International Conference on Neural Information Processing Systems (NIPS). Curran Associates Inc., 2017, p. 3700–3710.
- [41] F. M. Harper and J. A. Konstan, “The MovieLens Datasets: History and Context,” ACM Transactions on Interactive Intelligent Systems, vol. 5, no. 4, 2015.
- [42] J. Fu, Z. Chen, and X. Han, “Adap DP-FL: Differentially Private Federated Learning with Adaptive Noise,” in IEEE International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom). IEEE, 2022, pp. 656–663.
- [43] W. Yang, Y. Zhou, M. Hu, D. Wu, X. Zheng, J. H. Wang, S. Guo, and C. Li, “Gain without pain: Offsetting dp-injected noises stealthily in cross-device federated learning,” IEEE Internet of Things Journal, vol. 9, no. 22, pp. 22 147–22 157, 2022.
Xianzhi Zhang received his B.S. degree from Nanchang University (NCU), Nanchang, China, in 2019 and an M.S. degree from the School of Computer Science and Engineering, Sun Yat-sen University (SYSU), Guangzhou, China, in 2022. He is currently pursuing a Ph.D. degree in the School of Computer Science and Engineering at Sun Yat-sen University, Guangzhou, China. He is also working as a visiting PhD student at the School of Computing, Macquarie University, Sydney, Australia. His current research interests include video caching, privacy protection, federated learning, and edge computing. His research has been published in IEEE TPDS and TSC, and won the Best Paper Award at PDCAT 2021.
Yipeng Zhou is a senior lecturer in computer science with the School of Computing at Macquarie University, and the recipient of an ARC DECRA in 2018. From Aug. 2016 to Feb. 2018, he was a research fellow with the Institute for Telecommunications Research (ITR) of the University of South Australia. From Sept. 2013 to Sept. 2016, he was a lecturer with the College of Computer Science and Software Engineering, Shenzhen University. He was a Postdoctoral Fellow with the Institute of Network Coding (INC) of The Chinese University of Hong Kong (CUHK) from Aug. 2012 to Aug. 2013. He received his PhD and MPhil degrees from the Information Engineering (IE) Department of CUHK, and his Bachelor’s degree in Computer Science from the University of Science and Technology of China (USTC). His research interests lie in federated learning, privacy protection, and caching algorithm design in networks. He has published more than 80 papers, including in IEEE INFOCOM, ICNP, IWQoS, IEEE ToN, JSAC, TPDS, TMC, TMM, etc.
Miao Hu is currently an Associate Professor with the School of Computer Science and Engineering, Sun Yat-Sen University, Guangzhou, China. He received the B.S. degree and the Ph.D. degree in communication engineering from Beijing Jiaotong University, Beijing, China, in 2011 and 2017, respectively. From Sept. 2014 to Sept. 2015, he was a Visiting Scholar with the Pennsylvania State University, PA, USA. His research interests include edge/cloud computing, multimedia communication and software defined networks. |
Di Wu received the B.S. degree from the University of Science and Technology of China, Hefei, China, in 2000, the M.S. degree from the Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China, in 2003, and the Ph.D. degree in computer science and engineering from the Chinese University of Hong Kong, Hong Kong, in 2007. He was a Post-Doctoral Researcher with the Department of Computer Science and Engineering, Polytechnic Institute of New York University, Brooklyn, NY, USA, from 2007 to 2009, advised by Prof. K. W. Ross. Dr. Wu is currently a Professor and the Associate Dean of the School of Computer Science and Engineering with Sun Yat-sen University, Guangzhou, China. He was the recipient of the IEEE INFOCOM 2009 Best Paper Award and the IEEE Jack Neubauer Memorial Award, among others. His research interests include edge/cloud computing, multimedia communication, Internet measurement, and network security.
Pengshan Liao received the BE degree and the MS degree from Sun Yat-sen University (SYSU), Guangzhou, China. His research interests include federated learning and recommendation system. |
Mohsen Guizani received the B.S. (Hons.) and first M.S. degrees in electrical engineering, and the M.S. and Ph.D. degrees in computer engineering from Syracuse University, Syracuse, NY, USA, in 1984, 1986, 1987, and 1990, respectively. He is currently a Professor with the Machine Learning Department, Mohamed Bin Zayed University of Artificial Intelligence, and the Computer Science and Engineering Department, Qatar University, Qatar. Previously, he served in different academic and administrative positions with the University of Idaho, Western Michigan University, the University of West Florida, the University of Missouri-Kansas City, the University of Colorado Boulder, and Syracuse University. He is the author of nine books and more than 600 publications in refereed journals and conferences. His research interests include wireless communications and mobile computing, computer networks, mobile cloud computing, security, and smart grids. Throughout his career, he received three teaching awards and four research awards. He was a recipient of the 2017 IEEE Communications Society Wireless Technical Committee Recognition Award, the 2018 Ad Hoc Technical Committee Recognition Award for his contribution to outstanding research in wireless communications and ad-hoc sensor networks, and the 2019 IEEE Communications and Information Security Technical Recognition Award for outstanding contributions to the technological advancement of security. He has guest edited a number of special issues in IEEE journals and magazines. He is also the Editor-in-Chief of IEEE Network Magazine. He serves on the editorial boards of several international technical journals, and is the Founder and Editor-in-Chief of Wireless Communications and Mobile Computing (Wiley). He has also served as a member, the chair, and the general chair of a number of international conferences. He was the Chair of the IEEE Communications Society Wireless Technical Committee and the Chair of the TAOS Technical Committee. He has served as an IEEE Computer Society Distinguished Speaker and is currently an IEEE ComSoc Distinguished Lecturer. He is a Senior Member of ACM.
Michael Sheng is a Distinguished Professor and Head of School of Computing at Macquarie University. Before moving to Macquarie, Michael spent 10 years at School of Computer Science, the University of Adelaide, serving in senior leadership roles such as Interim Head of School and Deputy Head of School. Michael holds a PhD degree in computer science from the University of New South Wales (UNSW) and did his post-doc as a research scientist at CSIRO ICT Centre. From 1999 to 2001, Sheng also worked at UNSW as a visiting research fellow. Prior to that, he spent 6 years as a senior software engineer in industries. Prof. Michael Sheng’s research interests include Web of Things, Internet of Things, Big Data Analytics, Web Science, Service-oriented Computing, Pervasive Computing, and Sensor Networks. He is ranked by Microsoft Academic as one of the Most Impactful Authors in Services Computing (ranked Top 5 all time worldwide) and in Web of Things (ranked Top 20 all time). He is the recipient of the AMiner Most Influential Scholar Award on IoT (2019), ARC Future Fellowship (2014), Chris Wallace Award for Outstanding Research Contribution (2012), and Microsoft Research Fellowship (2003). Prof Michael Sheng is Vice Chair of the Executive Committee of the IEEE Technical Community on Services Computing (IEEE TCSVC), the Associate Director (Smart Technologies) of Macquarie’s Smart Green Cities Research Centre, and a member of the ACS Technical Advisory Board on IoT. |