

Corresponding author: Hyejun Jeong (e-mail: [email protected]), Tai-Myoung Chung (e-mail: [email protected]).

FedCC: Robust Federated Learning Against Model Poisoning Attacks

HYEJUN JEONG1, HAMIN SON2, SEOHU LEE3, JAYUN HYUN4, and TAI-MYOUNG CHUNG5
1 Department of Computer Science, University of Massachusetts Amherst, MA 01003 USA (e-mail: [email protected])
2 Department of Computer Science, University of California Davis, CA 95616, USA (e-mail: [email protected])
3 School of Medicine Biomedical Informatics and Data Science, Johns Hopkins University, Baltimore, MD 21205, USA (e-mail: [email protected])
4 Hippo T&C Inc., Suwon 16419, South Korea (e-mail: [email protected])
5 Department of Computer Science and Engineering, SungKyunKwan University, Suwon 16419, South Korea (e-mail: [email protected])
Abstract

Federated Learning, designed to address privacy concerns in learning models, introduces a new distributed paradigm that safeguards data privacy but alters the attack surface: the server cannot inspect local datasets, and the protection objective shifts to the integrity of model parameters. Existing approaches, including robust aggregation algorithms, fail to effectively filter out malicious clients, especially in the presence of non-Independently and Identically Distributed (non-IID) data. Furthermore, these approaches often tackle non-IID data and poisoning attacks separately. To address both challenges simultaneously, we present FedCC, a simple yet novel algorithm. It leverages the Centered Kernel Alignment similarity of Penultimate Layer Representations for clustering, allowing it to identify and filter out malicious clients by selectively averaging chosen parameters, even in non-IID data settings. Our extensive experiments demonstrate the effectiveness of FedCC in mitigating untargeted model poisoning and backdoor attacks. FedCC reduces the attack confidence to a consistent zero, unlike existing outlier-detection-based and first-order-statistics-based methods, and reduces the average degradation of global performance by 65.5%. We believe this new perspective on assessing learning models makes FedCC a valuable contribution to FL model security and privacy. The code will be made available upon paper acceptance.

Index Terms:
Backdoor Attack, Defense, Federated Learning, Model Poisoning Attack, Non-IID, Robust Aggregation

I Introduction

Federated Learning (FL) [17] has emerged as a promising privacy-preserving model training approach in response to growing concerns about privacy breaches and data leakage caused by centralized learning. While Machine Learning (ML) and Deep Learning (DL) algorithms are popular for their capabilities, such as personalized recommendations and accurate forecasts, their centralized nature puts data privacy at risk. FL allows data to remain on client devices and exchanges only their local model parameters, avoiding the need for central data collection.

FedAvg [17] is the first algorithm run on the server to aggregate clients' model parameters. It involves a server initializing and advertising a global model to all clients, who then train the model with their private data and upload the computed parameters to the server. The server averages these parameters to update the global model. These steps are repeated until a stopping criterion is met. FL is especially suitable for data-sensitive environments since raw data remains on clients, reducing the risk of data leakage at the server or in transit.
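For concreteness, a minimal sketch of this server-side averaging step is shown below, assuming each client update is a dict of NumPy arrays keyed by layer name (the function name and the optional dataset-size weighting, reflecting FedAvg's n_k/n coefficients, are our illustrative choices rather than the paper's code).

import numpy as np

def fedavg_aggregate(client_weights, client_sizes=None):
    # Average client parameter dicts; optionally weight by local dataset size (n_k / n)
    n_clients = len(client_weights)
    if client_sizes is None:
        coeffs = [1.0 / n_clients] * n_clients
    else:
        total = float(sum(client_sizes))
        coeffs = [s / total for s in client_sizes]
    return {name: sum(c * np.asarray(w[name]) for c, w in zip(coeffs, client_weights))
            for name in client_weights[0]}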

However, due to its distributed nature, FL is susceptible to model poisoning attacks [12]. The server cannot directly examine local datasets’ data quality or model parameter integrity, allowing compromised clients or attackers to manipulate local models. The attacks can degrade global model performance indiscriminately (untargeted attacks) [8, 2, 18] or cause incorrect predictions on specific inputs (targeted attacks) [21, 23, 3, 25]. Backdoor attacks, a type of targeted attack, are stealthier by maintaining overall performance while misclassifying specific inputs [1].

Some preliminary defenses propose robust aggregation methods, but they may compromise privacy by sharing raw data or exposing data distribution to unreliable parties [6, 22]. Existing defenses often target IID data settings, which are rare in practice. Non-IID data, with variations in class distribution, features, or labels across clients [12], can gradually degrade model performance and complicate the detection of malicious clients [28, 27], as the impact of attacks increases with the degree of non-IID [16, 18].

In this study, we present FedCC, a novel defense mechanism for FL that mitigates untargeted model poisoning and targeted backdoor attacks, even with non-IID data. FedCC leverages the Centered Kernel Alignment (CKA) of Penultimate Layer Representations (PLRs) to distinguish between malicious and benign clients while maintaining data privacy by avoiding sharing raw data or additional information between the server and clients. PLRs contain highly distinguishable features, maximizing the differences in CKA scores between benign and malicious clients, even with non-IID data and no server knowledge of local data.

Empirical experiments show that FedCC outperforms existing defenses that rely on first-order statistics. It significantly improves global accuracy under untargeted attacks (by 65.5% on average relative to similar approaches) and reduces attack confidence in targeted backdoor attacks to almost zero while preserving main task accuracy. These results highlight FedCC's effectiveness and robustness in mitigating attacks in FL settings, particularly with non-IID data.

Our contributions are summarized as follows:

  • We propose FedCC, a simple yet novel detection and aggregation method for FL, to defend against model poisoning attacks (untargeted model poisoning and targeted backdoor attacks) by comparing the CKA of PLRs of local and global models and averaging the selected local model parameters.

  • We justify that PLR is the most distinguishable and indicative feature to classify malicious and benign clients by measuring the distance between two clusters of clients.

  • We empirically show that CKA is the most sensitive and accurate metric to measure similarities between differently trained models, especially when data are non-IID, and the server does not have any knowledge about clients or any centralized data.

  • We empirically demonstrate the effectiveness of FedCC in defending against model poisoning attacks. Our experimental results show that comparing the CKA of PLRs is more reliable for filtering out malicious clients than using first-order statistics. FedCC outperforms existing defenses even when data are non-IID.

II Backgrounds and Related Works

II-A Poisoning Attacks in FL

FL is vulnerable to poisoning attacks due to its distributed nature and the server’s inability to examine dataset quality and model parameter integrity directly. Model poisoning is particularly destructive among various poisoning attacks, as adversaries stealthily manipulate local model parameters to maximize overall performance damage [3, 5]. Thus, this paper focuses on model poisoning attacks.

Model poisoning attacks allow client-side adversaries to alter local model parameters before sending them to the server. To keep the global model from diverging, attackers optimize the compromised local model for both the training loss and an adversarial objective, enhancing stealthiness [5, 8]. They can also adjust model hyperparameters, such as the learning rate, local epochs, batch size, and optimization objective, to manipulate training rules before and during local training.

Based on the attacker’s goals, model poisoning attacks can be classified as untargeted or targeted. Untargeted attacks aim to degrade overall global test accuracy by manipulating compromised local models. In contrast, targeted attacks aim to mislead the model into misclassifying specific inputs, leaving other classes unaffected. This paper introduces a backdoor ability into targeted model poisoning attacks, where inputs belonging to a chosen class are intentionally misclassified to a chosen label, whereas the performance of other classes remains unaffected. Its added stealthiness makes the attack more challenging to detect [23, 21].

II-B Robust Aggregation in FL

Several Byzantine-robust aggregation rules based on summary statistics or anomaly detection have been proposed to mitigate model poisoning and backdoor attacks in FL. For example, Krum [4] selects the update with the smallest Euclidean distance to the remaining updates as the new global model, but it ignores outliers and assumes IID data [2]. Median [26] selects the coordinate-wise median of updates, also ignoring outliers. Trimmed Mean [26] excludes extreme values before averaging, and Bulyan [11] combines Krum and Trimmed Mean for selection. These methods rely on knowing the number of attackers, which is often unavailable in practice.

Inspired by the observation that Sybils' crafted model updates point in directions more similar to each other than to benign ones, FoolsGold [9] identifies Sybils by measuring cosine similarity between local model gradients, marking the most similar ones as Sybils. However, it struggles with a single malicious client, where there are no other malicious clients for comparison. Additionally, this approach may overfit to non-IID data, ironically leading to suboptimal performance with IID data. FLTrust [6] uses trust scores based on ReLU-clipped cosine similarities to weigh local clients' parameters differently but risks privacy by requiring raw data to be shared with the server. Furthermore, precise weight scaling is paramount to ensure model convergence.

Previous works often used summary statistics (mean or median), direction (cosine similarity), or distance (Euclidean) of weight vectors, but they struggled to effectively distinguish benign clients with non-IID data from malicious clients, often misclassifying benign clients as malicious. This shortfall might be partially attributed to the inadequacy of Euclidean geometry as a metric for comparing neural network representations. In contrast, our proposed approach utilizes a kernel-based metric (Kernel CKA) to measure the similarity between global and local models, reliably identifying compromised model parameters among the mixture of clients and enhancing defense against model poisoning attacks in FL.

II-C Penultimate Layer Representation in FL

Figure 1: CNN architecture and Penultimate Layer

The penultimate layer is the second-to-last layer (before the softmax layer) of a neural network model, as illustrated in Figure 1. Wang et al. [22] discovered that PLRs are effective in distinguishing malicious models from benign ones: benign PLRs follow the same distribution across multiple datasets and neural network architectures, whereas malicious PLRs do not. Specifically, they demonstrated that distances among benign PLRs are smaller than those between benign and malicious PLRs. Their proposed method, FLARE, assigns a trust score to each client based on pairwise PLR discrepancies defined by Maximum Mean Discrepancy (MMD) among all model updates, allocating lower values to those farther from the benign distribution. The model updates are then scaled and averaged, weighted by each client's trust score. However, FLARE continuously redirects the global model using server-owned raw data, increasing the risk of data leakage and jeopardizing data privacy.

II-D Centered Kernel Alignment

Centered Kernel Alignment (CKA) [13] is a highly accurate similarity metric that measures how similar two differently initialized or trained neural networks are, producing a value between 0 (not similar) and 1 (identical). CKA can be calculated by:

CKA(X,Y) = \frac{HSIC(X,Y)}{\sqrt{HSIC(X,X)\,HSIC(Y,Y)}} = \frac{\lVert K(X^{T})K(Y)\rVert^{2}}{\lVert K(X^{T})K(X)\rVert\,\lVert K(Y^{T})K(Y)\rVert}

where HSIC is the Hilbert-Schmidt Independence Criterion [10] (i.e., an unnormalized version of CKA) and K is the RBF kernel. CKA constructs similarity kernel matrices in the weight space and compares these matrices to characterize the representation space, allowing comparisons between layers with varying widths and different initialization schemes.
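For reference, the following is a minimal NumPy sketch of RBF-kernel CKA in the HSIC form above; it is a sketch rather than the authors' implementation, and the median-heuristic bandwidth and the assumption that X and Y are two-dimensional arrays with one row per example are ours.

import numpy as np

def rbf_kernel(X, sigma=None):
    # Pairwise squared Euclidean distances between rows of X
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    if sigma is None:
        # Median heuristic for the bandwidth (an assumption, not specified in the paper)
        sigma = np.sqrt(np.median(d2[d2 > 0]) / 2.0)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def _centered(K):
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def hsic(K, L):
    # Biased HSIC estimator up to a constant factor, which cancels inside CKA
    return np.sum(_centered(K) * _centered(L))

def kernel_cka(X, Y, sigma=None):
    K, L = rbf_kernel(X, sigma), rbf_kernel(Y, sigma)
    return hsic(K, L) / np.sqrt(hsic(K, K) * hsic(L, L))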

CKA has several properties that are important for similarity measures of neural networks. It is not invariant to invertible linear transformations, meaning similarity scores change if such transformations are applied. This property is vital because gradient descent itself is not invariant to these transformations, so similarity metrics that are invariant to invertible linear transformations are inaccurate on models trained with gradient descent. CKA is, however, invariant to orthogonal transformations and isotropic scaling, which is desirable for neural networks trained with gradient descent owing to their stochastic nature. These properties make CKA a precise and reliable metric for measuring similarities between neural networks, particularly when comparing models that may come from malicious clients.

CKA has been utilized to address data heterogeneity issues, taking into account that the similarity of non-IID data is notably lower than that of IID data [20], particularly in specific layers [19]. However, CKA has yet to be leveraged as a defense mechanism against attacks involving models from malicious clients where the data are distributed across the clients in a non-IID manner.

III Threat Model

III-A Attacker’s Goal

We consider two primary attack strategies in the context of FL: untargeted model poisoning and targeted backdoor attacks. Untargeted attacks (Fang attacks [8]) aim to evade robust aggregation rules such as Krum and Coordinate-wise Median or Trimmed Mean; attackers manipulate the model parameters to degrade the global model's overall accuracy. Targeted backdoor attacks, represented by Bhagoji et al. [3], aim to deceive the global model into misclassifying specific data samples chosen by the attacker. The objective is to assign a different label chosen by the attacker while maintaining high accuracy for the remaining classes, making the attack more inconspicuous.

III-B Attackers’ Capabilities

  • An attacker on the client side can control multiple compromised clients.

  • An attacker has full control over at most k < n/2 clients.

  • An attacker has knowledge about compromised clients’ data samples, current local model parameters, global model parameters of the previous round, and hyperparameters, such as learning rate, optimization method, loss function, batch size, and the number of local epochs.

  • An attacker has no knowledge or control over the server and honest clients.

III-C Attack Strategy

III-C1 Untargeted Model Poisoning Attacks

These attacks, based on [8], specifically aim to break the Krum and Coordinate-wise Median (Coomed) aggregation rules, which are known to be Byzantine-fault tolerant. Their primary objective is to disrupt the model training process, thereby diminishing global test accuracy. The attacks involve manipulating model parameters, such as flipping the sign of malicious parameters, to steer the model in the opposite direction from its uninterrupted trajectory. Specifically, we denote the attack against Krum as Untargeted-Krum and the attack against Coomed as Untargeted-Med.

Untargeted-Krum [8] alters malicious parameters to resemble benign ones, maximizing the chance that a compromised model is selected by Krum. Specifically, the optimization problem is as follows:

\max_{\lambda} \lambda
\text{subject to } w_{1} = Krum(w_{1}, \ldots, w_{m}, w_{m+1}, \ldots, w_{n})
w_{1} = w_{G} - \lambda s
w_{i} = w_{1}, \text{ for } i = 2, 3, \ldots, m

where m is the number of attackers, n is the total number of selected clients, w_G is the global model from the previous iteration, and s is the sign of the global model parameter with no attack. The upper bound of λ is solved as follows:

\lambda \leq \frac{1}{(n-2m-1)\sqrt{d}} \cdot \min_{m+1 \leq i \leq n} \sum_{l \in \Gamma_{w_{i}}^{n-m-2}} D(w_{l}, w_{i}) + \frac{1}{\sqrt{d}} \cdot \max_{m+1 \leq i \leq n} D(w_{i}, w_{G})

where d is the number of parameters in the global model, Γ_{w_i}^{n-m-2} is the set of n-m-2 benign local models having the smallest Euclidean distance to w_i, and D is the Euclidean distance. λ is halved until one of the compromised models is selected, or λ is less than a threshold.

Untargeted-Med [8] manipulates the model parameters based on their maximum and minimum so that the chosen coordinate-wise median values point in the inverse direction. The attack starts by defining w_{max,j} and w_{min,j}, which indicate the maximum and minimum of the j-th local model parameters over the benign clients, respectively.

w_{max,j} = \max\{w_{(m+1),j}, w_{(m+2),j}, \ldots, w_{n,j}\}
w_{min,j} = \min\{w_{(m+1),j}, w_{(m+2),j}, \ldots, w_{n,j}\}

Also, to avoid the sampled m numbers being outliers, if s_j = -1, m numbers are randomly sampled from [w_{max,j}, b·w_{max,j}] (when w_{max,j} > 0) or [w_{max,j}, w_{max,j}/b] (when w_{max,j} ≤ 0). Otherwise, m numbers are randomly sampled from [w_{min,j}/b, w_{min,j}] (when w_{min,j} > 0) or [b·w_{min,j}, w_{min,j}] (when w_{min,j} ≤ 0). Since the attack does not depend on b, we set b = 2, the same as [8].
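To make the sampling rule concrete, the sketch below implements our reading of it in NumPy; benign_w (an (n-m) x d matrix of benign parameter vectors), s (the sign vector of the no-attack global update), and the function name are illustrative placeholders, not artifacts of [8].

import numpy as np

def craft_untargeted_med(benign_w, s, m, b=2.0, rng=None):
    # Craft m malicious parameter vectors per the coordinate-wise median attack (sketch)
    rng = rng or np.random.default_rng()
    w_max = benign_w.max(axis=0)   # coordinate-wise max over benign models
    w_min = benign_w.min(axis=0)   # coordinate-wise min over benign models
    d = benign_w.shape[1]
    malicious = np.empty((m, d))
    for j in range(d):
        if s[j] == -1:   # push coordinate j beyond the benign maximum
            lo, hi = (w_max[j], b * w_max[j]) if w_max[j] > 0 else (w_max[j], w_max[j] / b)
        else:            # push coordinate j below the benign minimum
            lo, hi = (w_min[j] / b, w_min[j]) if w_min[j] > 0 else (b * w_min[j], w_min[j])
        malicious[:, j] = rng.uniform(min(lo, hi), max(lo, hi), size=m)
    return malicious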

III-C2 Targeted Backdoor Attack

This attack is based on [3]. Each malicious client owns one mislabeled image sample and trains its local model on it. The model is trained on the backdoor task along with the main task(s); training proceeds on both the malicious and benign tasks to maintain benign accuracy, so that the backdoor training remains stealthy. The attack then boosts the malicious clients' updates to negate the combined effect of the benign clients:

w_{i}^{t} = w_{G}^{t-1} + \alpha_{m}(w_{i}^{t-1} - w_{G}^{t-1}), \text{ for } i = 1, \ldots, m

where t is the current epoch and α_m is a boosting factor. In CIFAR10, for example, we manipulate the model parameters to misclassify images of 'airplane' as 'bird' while normally classifying images of other classes such as 'cat,' 'horse,' or 'truck.'
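The boosting step above amounts to the following short routine on parameter dicts; the function name is ours, and the concrete value of the boosting factor (commonly chosen in proportion to the number of clients so that averaging does not wash the malicious update out) is an assumption rather than a value taken from this paper.

def boost_malicious_update(global_w, local_w, alpha_m):
    # Explicit boosting: w_i^t = w_G^{t-1} + alpha_m * (w_i^{t-1} - w_G^{t-1})
    return {name: global_w[name] + alpha_m * (local_w[name] - global_w[name])
            for name in global_w}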

IV FedCC: Robust Aggregation Against Poisoning Attacks

Figure 2: An Overall Mechanism of our FedCC.

IV-A Overview of FedCC

Figure 2 depicts an overview of FedCC. (1) A server initializes and advertises a global model to n participating clients, selected out of k total clients with a participation fraction C. (2) The clients individually train their local models using their respective datasets and send the trained weights to the server for aggregation. Our proposed method intervenes at this moment. (3) The server extracts PLRs from the global model and each local model, and (4) compares RBF Kernel CKA values between them. (5) The CKA values are then clustered into two groups, potentially representing benign and malicious clients. (6) Since the server cannot be sure whether the clients with higher similarities are malicious, it performs a coordinate-wise average of the weights from clients belonging to the cluster with the larger number of members, backed by the assumption that the number of malicious clients does not exceed n/2 - 1. (7) Finally, the server updates the global model for the current epoch and proceeds to the next iteration by distributing the updated global model to newly selected n clients.

IV-B Detailed Design

When filtering malicious clients in the presence of non-IID data, a major challenge is the server’s lack of knowledge regarding the data distribution. Specifically, even if the data is benign, local models can exhibit significant angular or magnitude differences if the data is non-IID. We thus require a similarity metric independent of data distribution and not influenced by the distance or direction of model parameters.

Figure 3: Cluster Distance of Each Layer
TABLE I: Comparison of Performance with Various Similarity Metrics
Method Fang-Med Fang-mKrum Targeted
IID NIID IID NIID IID NIID
Kernel CKA 69.20 41.00 70.22 43.24 71.44/6e-07 54.62/0.0118
Linear CKA 10.00 13.13 64.09 39.55 71.02/0.0007 49.53/0.0616
MMD 63.39 40.90 69.69 32.27 70.85/1e-09 50.51/9e-05
Cosine 68.82 33.90 68.81 10.04 69.76/0.0002 53.66/0.0529
Euclidean 69.06 27.82 68.54 41.57 69.17/0.0221 52.20/0.0015

The penultimate layer is the second-to-last layer of a CNN, before the softmax layer, and the PLR is its output. In federated learning, where the server cannot access local training data, it receives only the trained model parameters (weights) from the selected clients. Therefore, the collected weights are the only available information for determining whether a client is malicious. The PLR captures the end result of the convolution and pooling stages that feeds into the classification decision. We thus assume that the computed weights of the second-to-last layer contain the essential information that most strongly influences classification results.
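For illustration, with a PyTorch-style state dict the penultimate-layer weights can be read off as the second-to-last weight tensor; the key-ordering assumption and the function name below are ours, not part of FedCC's published code.

def extract_plr(state_dict):
    # Collect learnable weight tensors in layer order and take the second-to-last one
    weight_keys = [k for k in state_dict if k.endswith("weight")]
    # For the CIFAR10 model in Table VII this picks 'fc_1.weight' (128 x 256)
    plr = state_dict[weight_keys[-2]]
    return plr.detach().cpu().numpy() if hasattr(plr, "detach") else plr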

To justify the utilization of PLRs, we measure the distance between two clusters of clients (benign and malicious) using dendrograms with a single-linkage metric and the correlation distance. Figure 3 presents the cluster distances of each layer under different attack scenarios, such as untargeted-mKrum attacks in IID or non-IID settings. Observably, the cluster distance is largest at the PLR compared to the other layers, indicating the highest degree of separation. Based on this observation, we conclude that the PLR contains the most distinguishable and indicative information. Therefore, we utilize the weights of the penultimate layer to compare their similarity to the global model.
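This layer-wise cluster-distance measurement can be reproduced with SciPy's hierarchical clustering; the sketch below reflects our assumptions, where layer_weights is an n_clients x d matrix of flattened weights of one layer and the height of the final merge is read off as the distance between the two remaining clusters.

import numpy as np
from scipy.cluster.hierarchy import linkage

def layer_cluster_distance(layer_weights):
    # Single-linkage agglomerative clustering with correlation distance, as in Figure 3
    Z = linkage(layer_weights, method="single", metric="correlation")
    # The last row of the linkage matrix merges the final two clusters; column 2 is its height
    return Z[-1, 2]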

Kernel CKA [13] is a similarity metric for assessing how similar two neural networks are. It can compare representations at different layers or of models trained in various ways. This sophisticated similarity-comparison ability makes Kernel CKA a suitable choice for our purpose.

To justify the use of Kernel CKA, we compared its performance with other similarity metrics such as linear CKA, MMD (not normalized CKA), cosine (angle), and Euclidean similarity (distance). We evaluated the test accuracy of FedCC using the CIFAR10 dataset under two untargeted attacks and one targeted attack, separately in IID and non-IID settings. The results are summarized in Table I. In the ’targeted’ column, the values represent the combination of test accuracy and backdoor confidence, with higher test accuracy and lower backdoor confidence indicating better performance. We observed that Kernel CKA consistently yielded the highest performance across multiple experiments. Therefore, we decided to utilize Kernel CKA to compare PLR similarities.

Algorithm 1 FedCC

Input: global_w; n local weights w_1, …, w_n
Output: agg_w

for i < n do                              ▷ Get PLRs of local and global models
    local_plrs[i] ← local_ws[i][second_last_layer]
end for
glob_plr ← global_w[second_last_layer]
for i < n do                              ▷ Apply kernel CKA to the PLRs
    cka[i] ← cka(glob_plr, local_plrs[i])
end for
kmeans ← kmeans(n_clusters=2, cka)        ▷ Apply K-means clustering
labels ← kmeans.labels_
count ← counter(labels)
suspect ← 1                               ▷ Define the suspect cluster as the smaller one
if count[0] ≤ count[1] then
    suspect ← 0
end if
suspects ← where(labels == suspect)
for i in range(n) do
    if i in suspects then                 ▷ Exclude suspects from aggregation
        scale[i] ← 0
    else
        scale[i] ← 1
        selected_param.append(local_ws[i])
    end if
end for
agg_w ← mean(selected_param)              ▷ Average the selected parameters
return agg_w

FedCC is an aggregation method that combines selected weights without scaling, based on Kernel CKA similarities between the PLRs of the global and local models. Pseudocode is given in Algorithm 1. The server first extracts the PLRs of the global and local models and calculates the kernel CKA between them. We specifically focus on comparing PLRs because they exhibit the most distinguishable differences among differently trained models, as demonstrated in previous sections. Additionally, we utilize RBF Kernel CKA, as it effectively captures similarity differences between non-IID and malicious clients. The similarity values obtained are then clustered using a simple K-means clustering algorithm; any clustering method that produces two clusters can be used. Considering the assumption that the number of attackers is less than half of the total number of clients (n/2), we designate the clients in the cluster with fewer members as suspects. The indices of these suspicious clients are stored in a list called 'suspects.' We then selectively average the weights of clients whose indices do not belong to the 'suspects' list, excluding the weights of the suspicious clients from the aggregation. Note that the weights are averaged without scaling, as scaling could impede convergence even when the loss has reached an optimum.
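Putting the pieces together, the following Python sketch mirrors Algorithm 1, reusing the kernel_cka and extract_plr sketches given earlier; scikit-learn's KMeans is one possible two-way clustering choice (an assumption, as any two-cluster method works), and the weights are assumed to be NumPy arrays or CPU tensors.

import numpy as np
from sklearn.cluster import KMeans

def fedcc_aggregate(global_w, local_ws):
    # Compare each local PLR with the global PLR via RBF-kernel CKA
    glob_plr = extract_plr(global_w)
    scores = np.array([[kernel_cka(glob_plr, extract_plr(w))] for w in local_ws])
    # Split clients into two clusters of CKA scores
    labels = KMeans(n_clusters=2, n_init=10).fit(scores).labels_
    # The larger cluster is trusted, assuming attackers are fewer than n/2
    benign_label = np.argmax(np.bincount(labels))
    benign_idx = [i for i, lab in enumerate(labels) if lab == benign_label]
    # Plain (unscaled) coordinate-wise average over the selected clients
    return {name: np.mean([np.asarray(local_ws[i][name]) for i in benign_idx], axis=0)
            for name in global_w}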

Compared to other methods such as Krum, Bulyan, and FLARE, which either require prior knowledge of the number of attackers or allow the server to have access to raw data, FedCC does not rely on any information about the clients or any instances of data. This ensures strong privacy preservation within the federated learning framework. Furthermore, FedCC effectively distinguishes between malicious and benign clients, even in practical but challenging scenarios where benign clients have non-IID data.

V Experiments

V-A Dataset

We use three benchmark vision datasets: Fashion-MNIST (fMNIST) [24], CIFAR10, and CIFAR100 [14].

fMNIST consists of 60,000 training and 10,000 testing samples of 28×28 grayscale images and their associated labels from 10 classes. The labels include T-shirt/top, Trouser, Pullover, Dress, Coat, Sandal, Shirt, Sneaker, Bag, and Ankle boot.

CIFAR10 consists of 50,000 training and 10,000 testing examples of 32×32 color images and their associated labels from 10 classes. The classes include airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck.

CIFAR100 is the same as CIFAR10 except for the categorization. It is labeled with 100 classes, each containing 600 images, so there are 500 training and 100 testing images per class. Each class (i.e., 'fine' label) belongs to one of 20 superclasses.

V-B Non-IID Simulation

Non-IID data is a more realistic assumption considering the diverse and massive nature of data in practice. To standardize non-IID settings, we use the Dirichlet distribution, which is well known for its ability to represent real-world data partitions [15]. A concentration parameter α controls the degree of non-IIDness; the smaller α is, the more likely the clients are to hold examples from only one randomly chosen class.
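As one concrete way to realize this partition (a sketch; the exact sampling procedure used in our experiments may differ in details), per-class client proportions can be drawn from Dir(α) and used to split each class's sample indices across clients:

import numpy as np

def dirichlet_partition(labels, n_clients, alpha=0.2, seed=0):
    # Assign sample indices to clients with class proportions drawn from Dir(alpha)
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(n_clients)]
    for c in np.unique(labels):
        idx_c = rng.permutation(np.where(labels == c)[0])
        proportions = rng.dirichlet(alpha * np.ones(n_clients))
        cuts = (np.cumsum(proportions)[:-1] * len(idx_c)).astype(int)
        for client_id, chunk in enumerate(np.split(idx_c, cuts)):
            client_indices[client_id].extend(chunk.tolist())
    return client_indices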

V-C Model Architecture

Considering that end devices are normally incapable of handling heavy computation due to resource or communication constraints, we used lightweight CNN architectures for the experiments. For fMNIST, we used a 4-layer CNN with two convolutional and two fully connected (FC) layers, incorporating dropout with probabilities of 0.25 and 0.5 before and after the first FC layer, respectively. For CIFAR10, we utilized a 5-layer CNN with three convolutional layers, each followed by a max-pooling layer, and two FC layers, applying dropout with a probability of 0.5 before the first FC layer. Finally, for CIFAR100, we adopted the LeNet architecture. Table VI, Table VII, and Table VIII in the Appendix summarize the model architectures for each benchmark dataset.
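As one concrete instantiation, the CIFAR10 model of Table VII can be written in PyTorch roughly as follows; this is a sketch consistent with the table's shapes, and the class and attribute names are ours.

import torch.nn as nn

class CIFAR10CNN(nn.Module):
    # 5-layer CNN from Table VII: three 3x3 conv + 2x2 max-pool blocks, dropout, two FC layers
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 64, kernel_size=3), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 64, kernel_size=3), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),          # 64 x 2 x 2 = 256 features for 32x32 inputs
            nn.Dropout(0.5),
            nn.Linear(256, 128),   # fc_1: the penultimate layer whose weights FedCC compares
            nn.Linear(128, num_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))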

V-D Setup

We implement two untargeted attacks from [8], one targeted attack [3], the defense baselines Krum [4], Coomed [26], Bulyan [11], and FLARE [22], and FedCC. We have ten clients with a client participation fraction of 1.0 unless stated otherwise. The number of attackers is 30% of the total clients (three) in untargeted attacks and 10% (one) in targeted attacks, as in [22]. We also assume all clients have the same neural network architecture. Under the targeted backdoor attack, benign clients run three local epochs, whereas compromised clients run six. We train each client's local model using an Adam optimizer with a learning rate of 0.001, following the experiments in [22]. All results are averaged over three runs of each experiment.

TABLE II: Accuracy in the no-attack scenario; boldface and underlining indicate the highest and second-highest accuracy, respectively.
Scenarios Dataset FedAvg Krum Coomed Bulyan FLARE FedCC
No Attack non-IID fMNIST 69.68 43.10 76.12 68.78 42.82 71.92
CIFAR10 53.06 16.16 45.55 46.67 27.33 50.80
CIFAR100 17.22 7.27 12.82 15.21 0.99 17.62
No Attack IID fMNIST 89.89 83.81 88.91 89.49 76.86 89.71
CIFAR10 69.17 56.56 67.34 67.81 46.84 69.03
CIFAR100 21.49 7.73 15.56 20.55 7.59 22.05

V-E Evaluation Metrics

We assess the performance of our defense mechanism using two metrics: backdoor confidence (confidence) and global model test accuracy (accuracy). The confidence indicates the likelihood of misclassification, with values ranging from 0 to 1 (lower values indicating better defense performance). Accuracy measures the overall test accuracy of the global model when various defense methods are applied in the presence of attacks, with values ranging from 0 to 100 (higher values indicating better performance). We report both metrics for backdoor attacks while focusing solely on accuracy for untargeted attacks, as the attacker's goal there is to degrade the overall accuracy.
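A minimal sketch of the two metrics under our assumptions: accuracy is standard top-1 test accuracy, and backdoor confidence is read as the mean softmax probability the global model assigns to the attacker-chosen target label on test inputs from the attacker-chosen source class (function and variable names are ours, not the paper's evaluation code).

import torch
import torch.nn.functional as F

@torch.no_grad()
def global_accuracy(model, test_loader, device="cpu"):
    correct, total = 0, 0
    for x, y in test_loader:
        preds = model(x.to(device)).argmax(dim=1).cpu()
        correct += (preds == y).sum().item()
        total += y.numel()
    return 100.0 * correct / total

@torch.no_grad()
def backdoor_confidence(model, source_class_loader, target_label, device="cpu"):
    probs = []
    for x, _ in source_class_loader:
        p = F.softmax(model(x.to(device)), dim=1)[:, target_label]
        probs.append(p.cpu())
    return torch.cat(probs).mean().item()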

VI Results and Discussion

In Table II, which presents the baseline experiments without any attacks, we can observe the accuracy of different methods in both non-IID and IID settings. FedCC achieves accuracy comparable to that of the best-performing method, FedAvg. The slightly higher accuracy of FedAvg over FedCC may be attributed to its averaging over all clients, which leads to a more generalized model when there is no attack. Nonetheless, FedCC surpasses all other baseline methods on CIFAR100, which has more classes, making it harder to achieve high accuracy in FL [7].

TABLE III: Accuracy under untargeted and targeted attacks in Non-IID (top) and IID setting (bottom)
Scenarios Dataset FedAvg Krum Coomed Bulyan FLARE FedCC
Fang-Krum non-IID fMNIST 64.63 16.51 57.12 13.39 49.54 66.27
CIFAR10 33.69 15.38 35.70 19.23 17.03 37.40
CIFAR100 2.27 1.00 4.85 0.98 7.46 14.51
Fang-Med non-IID fMNIST 16.32 49.33 68.90 64.12 52.25 72.76
CIFAR10 10.02 25.06 40.23 32.47 14.59 42.37
CIFAR100 1.00 6.24 10.27 6.91 1.00 16.12
Targeted non-IID fMNIST 79.07 43.46 71.25 73.58 60.57 84.54
CIFAR10 30.28 16.05 28.87 25.77 10.52 36.20
CIFAR100 9.84 7.24 11.02 12.01 1.04 13.81
Fang-Krum IID fMNIST 75.55 31.66 87.62 50.30 79.16 89.23
CIFAR10 49.67 40.86 57.40 12.67 25.77 67.12
CIFAR100 13.72 1.04 6.17 1.59 7.49 18.47
Fang-Med IID fMNIST 20.86 85.33 86.70 87.45 71.08 88.44
CIFAR10 9.51 54.28 59.20 57.69 49.82 64.74
CIFAR100 0.87 12.27 14.43 12.56 5.83 17.83
Targeted IID fMNIST 89.32 85.52 88.45 89.35 67.65 90.16
CIFAR10 52.20 45.37 49.27 55.77 10.09 51.57
CIFAR100 12.73 8.37 15.91 16.59 1.71 18.08

VI-A Non-IID Data Environment

Following the aforementioned non-IID simulation, we set α to 0.2 to create disjoint client training sets and distribute one set to each client.

VI-A1 Untargeted Model Poisoning Attacks

The upper section of Table III provides the global model's accuracy across three datasets under two untargeted attacks (Fang-Krum and Fang-Med) and the Targeted attack in a non-IID environment. One notable finding is that robust aggregation algorithms often yield lower accuracy than simple averaging (no defense). This discrepancy arises from the imperfect identification of malicious clients, leading to the erroneous aggregation of their weights along with benign clients. When applying Krum and Bulyan against Fang-Krum, the accuracy drops by a substantial 59.30% and 65.68%, respectively, compared to FedAvg. This decline can be attributed to Fang-Krum's intentional design to degrade accuracy against Krum. Additionally, Coomed struggles to effectively mitigate this attack, as the median values of weights often fail to accurately represent benign clients, particularly in non-IID settings. Furthermore, the relatively low accuracy of FLARE is due to weight scaling, even when the model already exhibits satisfactory performance. In contrast, FedCC, which selectively averages weights, stands out by neither boosting nor minimizing weights, resulting in higher performance than other methods. Notably, FedCC achieves the highest accuracy (66.27%, 37.40%, and 14.51% for fMNIST, CIFAR10, and CIFAR100, respectively), with a significant leap in the experiments with CIFAR100.

Turning our attention to the Fang-Med attack, the accuracy of FedAvg experiences a substantial drop due to the attack’s tendency to deviate from median values, creating outliers. This behavior explains the relatively high accuracy when Coomed and Bulyan are applied because coordinate-wise median and trimmed mean methods disregard outliers. Meanwhile, FLARE performs poorly for a reason similar to the one mentioned above. For both attacks, we can observe that Krum, which selects a single local model as the global model, is less robust than Coomed, which aggregates multiple local models. Similarly, Bulyan, which takes averages of multiple chosen clients, exhibits less robustness than Coomed, particularly in non-IID data environments. Notably, FedCC achieves the highest accuracy (72.76%, 42.37%, and 16.12% for each dataset) without sharing raw data with the server or other clients with non-IID data. This exceptional performance of FedCC can be attributed to two factors: 1) highly distinguishable information within PLRs, and 2) higher CKA similarity between benign clients than the one between benign and malicious clients.

VI-A2 Targeted Backdoor Attacks

Figure 4: Confidence of Backdoor Task in a Non-IID Setting (top) and an IID setting (bottom)

The fourth row of Table III summarizes accuracy under targeted backdoor attacks in a non-IID setting. To further illustrate the impact of these attacks, we include a visual representation of the backdoor confidence in the first row of Figure 4.

It becomes evident that Krum, which selects a single client’s weights as the global model, exhibits diminished robustness compared to other methods that aggregate the weights of multiple clients; coordinate-wise median (Coomed) and trimmed mean with multi-Krum (Bulyan) demonstrate relatively higher test accuracy. Conversely, FLARE decreases the main task test accuracy for similar reasons observed in the experiments mentioned above. Nonetheless, FedCC achieves the highest test accuracy. This outcome underscores the superiority of filtering malicious clients through CKA similarity comparison, surpassing the effectiveness of first-order statistical methods like mean or median.

Notably, Figure 4 illustrates how FedCC significantly nullifies backdoor confidence. Compared to oscillating values of other methods, the stable low confidence values further emphasize FedCC’s resilience and independence from the specific data distribution across clients. This advantage stems from its effective utilization of CKA, enabling the calculation of similarity values even when models are trained on different datasets with varying data distributions. Whereas it is true that other baselines, such as Bulyan, also reduce confidence, we argue that FedCC not only preserves but potentially increases accuracy by precisely discriminating against malicious clients.

VI-B IID Data Environment

In an IID setting, we evenly divide the training dataset among all clients so that each client’s data distribution is identical and of the same size.

VI-B1 Untargeted Model Poisoning Attacks

The bottom section of Table III (fifth to seventh rows) provides valuable insights into the test accuracy of a global model trained on three datasets under untargeted attacks in IID settings. Although the trends observed in non-IID experiments appear similar, the accuracy values are slightly higher in this case due to the identical nature of the data.

Under the Fang-Krum attack, we can observe relatively low accuracy values for Krum and Bulyan, as the attack explicitly targets Krum and its variations. However, when Coomed is applied, the accuracy increases from 75.55% (FedAvg) to 87.62% (Coomed), indicating that coordinate-wise median values closely align with the weights of the global model with minimal deviation. FLARE's low accuracy is due to the continuous scaling of weights even when the model is already optimized. In contrast, FedCC does not scale weights but focuses solely on optimizing the model by averaging chosen clients. As a result, FedCC achieves the highest accuracy (89.23%, 67.12%, and 18.47% for each dataset) by accurately identifying and filtering out compromised clients.

Under the Fang-Med attack, the accuracy of FedAvg experiences a significant drop (89.89% to 20.86%) due to the peculiarity of outliers when the clients have IID data. Notably, Krum, Coomed, and Bulyan demonstrate similar accuracy values, indicating that coordinate-wise median values indeed represent benign clients’ parameters. Meanwhile, FedCC consistently achieves the highest accuracy, with values of 88.44%, 64.74%, and 17.83% for each dataset. Notably, the remarkable growth observed in experiments with CIFAR100, with a 66% and 58.7% increment under Fang-Krum and Fang-Med attacks, respectively, surpasses the second-highest improvement. These results strongly validate the effectiveness of FedCC in mitigating untargeted model poisoning attacks in a simple IID setting.

VI-B2 Targeted Backdoor Attacks

The last row of Table III and the bottom three panels of Figure 4 show the test accuracy and backdoor confidence under targeted backdoor attacks in an IID setting. In targeted backdoor attacks, the global model is covertly trained to misclassify a specific attacker-chosen class to an attacker-chosen label while leaving the remaining classes intact. As a result, the test accuracy remains relatively high across the different methods.

However, a critical observation is the remarkable confidence reduction achieved by FedCC. The backdoor confidence with FedCC approaches zero, effectively neutralizing the backdoor task. In comparison, the attack confidence of FedAvg and Bulyan, despite exhibiting higher test accuracy for CIFAR10, remains higher than that of FedCC. Note that the confidence of the other methods fluctuates by a large margin. Specifically, FLARE’s backdoor confidence oscillates even though it occasionally records lower confidence than ours (e.g., CIFAR10). This means that its performance is less stable than that of FedCC, which consistently minimizes backdoor confidence. It is crucial to emphasize that reducing attack confidence is as significant as maintaining the main test accuracy since the attacker’s objective is to train the model on backdoor tasks stealthily, but the defender’s goal is to nullify them. Hence, our proposed method achieves the highest accuracy for fMNIST and CIFAR100 and comparable accuracy for CIFAR10 while effectively nullifying the backdoor confidence. This outcome underscores the effectiveness and robustness of our approach in mitigating targeted backdoor attacks.

VII Performance in Varying FL Settings

We additionally measure the effectiveness of FedCC in various non-IID settings, including different numbers of malicious clients, fractions of participating clients, and numbers of local epochs. Since Coomed was as effective as our method and FLARE also uses PLRs, we compare FedCC against FedAvg, Coomed, and FLARE.

VII-A Impact of the Number of Malicious Clients

Figure 5: Accuracy with different numbers of malicious clients

We assess the impact of different numbers of malicious clients under untargeted attacks, specifically Fang-Krum and Fang-Med. In this set of experiments, we fix the total number of clients and participation fraction at 10 and 1, respectively. To adhere to the assumption that the number of attackers is less than half of the participating clients, we investigate the test accuracy considering scenarios involving up to four malicious clients.

Figure 5 illustrates the accuracy in the given experimental settings, and it is evident that FedCC outperforms the other defense methods. We observe a general trend where the accuracy of defense methods decreases as the number of attackers increases. Notably, Coomed experiences the most significant drop in accuracy when transitioning from one to four malicious clients. This can be attributed to the fact that as the participation of malicious clients increases, there is a higher likelihood that the median values will be influenced by the malicious contributions.

In contrast, FedCC does not rely solely on geometric measures such as angles or distances but instead leverages hidden correlations between networks to identify and filter out malicious clients. As a result, FedCC demonstrates superior accuracy in filtering out malicious clients, regardless of the number of attackers involved. It is important to note that the accuracy degradation observed as the number of malicious clients increases is primarily due to aggregating fewer clients rather than to malicious clients being selected. These findings underscore the robustness and effectiveness of FedCC in defending against untargeted attacks, as it consistently outperforms other defenses across varying numbers of malicious clients.

VII-B Impact of Fraction

TABLE IV: Accuracy under untargeted attacks with different fractions of clients being selected. fM refers to fMNIST, C10 to CIFAR10, and Med to the coordinate-wise median (Coomed).
Fang-Krum Fang-Med
Frac Data FedAvg Med FLARE FedCC FedAvg Med FLARE FedCC
0.1 fM 55.31 49.83 34.02 64.83 16.57 66.41 52.24 69.52
C10 10.06 22.55 14.50 20.49 10.00 15.33 10.00 29.81
0.3 fM 64.22 57.52 10.00 73.55 16.26 58.07 10.00 61.12
C10 24.24 12.59 10.00 27.81 10.98 22.61 10.00 38.27
0.5 fM 62.37 58.20 10.00 76.41 18.36 62.49 10.00 69.05
C10 23.99 17.28 10.06 34.32 9.87 27.83 10.00 37.27

We examine the impact of the participating fraction of clients when the total number of clients is fixed at 100. The fractions are 0.1, 0.3, and 0.5, so the numbers of clients being aggregated are 10, 30, and 50, respectively. The proportion of malicious clients is fixed at 0.3, so the numbers of malicious clients are 3, 9, and 15, respectively. Table IV summarizes the accuracy. We observe that the accuracy tends to increase with larger participation fractions under Fang-Krum attacks, implying that, in a non-IID setting, more client participation yields a more accurate global model if the attacks are mitigated. Under Fang-Med attacks, the accuracy is highest with FedCC and second highest with Coomed. Throughout this experiment, FLARE prevented the global model from converging, and the test accuracy fluctuated. The fluctuation is presumably due to weight scaling while randomly selecting clients: distinct from previous experiments, the server chooses a fraction of the clients every round, meaning not-yet-trained models could be selected, and the weight scaling leveraged by FLARE then repeatedly scales the local weights improperly, causing divergence and fluctuation. These experimental results indicate that selectively choosing clients based on CKA similarity is more reliable than taking the median value of each coordinate in a non-IID setting. We also observed that the accuracy grows as the participation fraction grows. This is due to the rising number of aggregated local models; the more models are aggregated, the more general the resulting global model.

VII-C Impact of the Number of Local Epochs

TABLE V: Accuracy under Backdoor Attack with doubled numbers of local epochs
Targeted (Diff Local Epoch)
Dataset FedAvg Coomed FLARE FedCC
fMNIST 69.40 72.11 72.32 82.31
CIFAR10 53.85 49.18 21.61 54.94
Figure 6: Backdoor confidence when the numbers of local epochs are doubled.

To examine the impact of the number of local epochs, we doubled it so that local models are trained for six epochs on benign clients and twelve epochs on malicious clients. The main task accuracy and confidence are summarized in Table V and Figure 6. As expected, increasing the number of local epochs does not materially affect the accuracy or confidence but brings convergence forward while better maintaining test accuracy. Coomed and FLARE significantly reduce the backdoor confidence, as does FedCC; however, the test accuracy even increases with FedCC due to its precise discrimination of malicious clients.

VIII Conclusion and Future Work

FL has emerged in response to rising concerns about privacy breaches when employing AI techniques. FL, however, is vulnerable to model poisoning attacks because the server is blind to local datasets, making it challenging to reason about data distribution and model parameter integrity. In line with this, existing robust aggregation algorithms and defense approaches address only particular attacks and lack consideration of non-IIDness. Through exhaustive experiments, we showed that FedCC mitigates both untargeted and targeted (backdoor) attacks while demonstrating its effectiveness in non-IID data environments. We leave theoretical guarantees for FedCC and experiments where clients have distinct neural network architectures or tasks as future work.

Appendix

VIII-A Dendrograms to Measure Cluster Distance

Figure 7: Dendrogram in a non-IID setting

Figure 7 illustrates the distance between two clusters of clients using dendrograms with a single-linkage metric and the correlation distance. The first two columns correspond to the untargeted attacks (Fang-Krum, Fang-Med) and the last column to the targeted attack. The total number of clients is set to 10, and the participation fraction is 1.0. The number of malicious clients is 3 for untargeted attacks and 1 for the targeted attack, with FedAvg as the aggregator. Observably, the height difference between the roots of the clusters is largest in the second-to-last layer (PLR).

VIII-B Model Architectures

The tables below summarize model architectures for each benchmark dataset (fMNIST, CIFAR10, CIFAR100). Considering that the end devices are generally incapable of handling heavy computation due to resource or communication constraints, we used lightweight models for the experiments.

TABLE VI: CNN Architecture for fMNIST dataset
Layer In Out Ker / Str / Pad Activation
conv2d_1 1 64 5×5 / 1 / 0 ReLU
conv2d_2 64 64 5×5 / 1 / 0 ReLU
dropout - - 0.25 -
flatten - 25600 - -
fc_1 25600 128 - -
dropout - - 0.5 -
fc_2 128 10 - -
TABLE VII: CNN Architecture for CIFAR10 dataset
Layer In Out Ker / Str / Pad Activation
conv2d_1 3 64 3×3 / 1 / 0 ReLU
maxpool2d - - 2×2 / - / 0 -
conv2d_2 64 64 3×3 / 1 / 0 ReLU
maxpool2d - - 2×2 / - / 0 -
conv2d_3 64 64 3×3 / 1 / 0 ReLU
maxpool2d - - 2×2 / - / 0 -
flatten - 256 - -
dropout - - 0.5 -
fc_1 256 128 - -
fc_2 128 10 - -
TABLE VIII: LeNet Architecture for CIFAR100 dataset
Layer In Out Ker / Str / Pad Activation
conv2d_1 3 6 5×5 / 1 / 0 ReLU
maxpool2d - - 2×2 / 2 / 0 -
conv2d_2 6 16 5×5 / 1 / 2 ReLU
maxpool2d - - 2×2 / 2 / 0 -
flatten - 44944 - -
fc_1 44944 120 - ReLU
fc_2 120 84 - ReLU
fc_3 84 100 - -


References

  • [1] Eugene Bagdasaryan, Andreas Veit, Yiqing Hua, Deborah Estrin, and Vitaly Shmatikov. How to backdoor federated learning. In International Conference on Artificial Intelligence and Statistics, pages 2938–2948. PMLR, 2020.
  • [2] Gilad Baruch, Moran Baruch, and Yoav Goldberg. A little is enough: Circumventing defenses for distributed learning. Advances in Neural Information Processing Systems, 32, 2019.
  • [3] Arjun Nitin Bhagoji, Supriyo Chakraborty, Prateek Mittal, and Seraphin Calo. Analyzing federated learning through an adversarial lens. In International Conference on Machine Learning, pages 634–643. PMLR, 2019.
  • [4] Peva Blanchard, El Mahdi El Mhamdi, Rachid Guerraoui, and Julien Stainer. Machine learning with adversaries: Byzantine tolerant gradient descent. Advances in Neural Information Processing Systems, 30, 2017.
  • [5] Nader Bouacida and Prasant Mohapatra. Vulnerabilities in federated learning. IEEE Access, 9:63229–63249, 2021.
  • [6] Xiaoyu Cao, Minghong Fang, Jia Liu, and Neil Zhenqiang Gong. Fltrust: Byzantine-robust federated learning via trust bootstrapping. arXiv preprint arXiv:2012.13995, 2020.
  • [7] Chen Chen, Yuchen Liu, Xingjun Ma, and Lingjuan Lyu. Calfat: Calibrated federated adversarial training with label skewness. In Advances in Neural Information Processing Systems, 2022.
  • [8] Minghong Fang, Xiaoyu Cao, Jinyuan Jia, and Neil Gong. Local model poisoning attacks to Byzantine-robust federated learning. In 29th USENIX Security Symposium (USENIX Security 20), pages 1605–1622, 2020.
  • [9] Clement Fung, Chris JM Yoon, and Ivan Beschastnikh. The limitations of federated learning in sybil settings. In 23rd International Symposium on Research in Attacks, Intrusions and Defenses (RAID 2020), pages 301–316, 2020.
  • [10] Arthur Gretton, Olivier Bousquet, Alex Smola, and Bernhard Schölkopf. Measuring statistical dependence with hilbert-schmidt norms. In International conference on algorithmic learning theory, pages 63–77. Springer, 2005.
  • [11] Rachid Guerraoui, Sébastien Rouault, et al. The hidden vulnerability of distributed learning in byzantium. In International Conference on Machine Learning, pages 3521–3530. PMLR, 2018.
  • [12] Peter Kairouz, H Brendan McMahan, Brendan Avent, Aurélien Bellet, Mehdi Bennis, Arjun Nitin Bhagoji, Kallista Bonawitz, Zachary Charles, Graham Cormode, Rachel Cummings, et al. Advances and open problems in federated learning. Foundations and Trends® in Machine Learning, 14(1–2):1–210, 2021.
  • [13] Simon Kornblith, Mohammad Norouzi, Honglak Lee, and Geoffrey Hinton. Similarity of neural network representations revisited. In International Conference on Machine Learning, pages 3519–3529. PMLR, 2019.
  • [14] Alex Krizhevsky, Geoffrey Hinton, et al. Learning multiple layers of features from tiny images. 2009.
  • [15] Qinbin Li, Yiqun Diao, Quan Chen, and Bingsheng He. Federated learning on non-iid data silos: An experimental study. In 2022 IEEE 38th International Conference on Data Engineering (ICDE), pages 965–978. IEEE, 2022.
  • [16] Xinda Li. Improved model poisoning attacks and defenses in federated learning with clustering. Master’s thesis, University of Waterloo, 2022.
  • [17] Brendan McMahan, Eider Moore, Daniel Ramage, Seth Hampson, and Blaise Aguera y Arcas. Communication-efficient learning of deep networks from decentralized data. In Artificial intelligence and statistics, pages 1273–1282. PMLR, 2017.
  • [18] Virat Shejwalkar and Amir Houmansadr. Manipulating the byzantine: Optimizing model poisoning attacks and defenses for federated learning. In NDSS, 2021.
  • [19] Ha Min Son, Moon Hyun Kim, and Tai-Myoung Chung. Comparisons where it matters: Using layer-wise regularization to improve federated learning on heterogeneous data. Applied Sciences, 12(19):9943, 2022.
  • [20] Saeed Vahidian, Mahdi Morafah, Chen Chen, Mubarak Shah, and Bill Lin. Rethinking data heterogeneity in federated learning: Introducing a new notion and standard benchmarks. In Workshop on Federated Learning: Recent Advances and New Challenges (in Conjunction with NeurIPS 2022).
  • [21] Hongyi Wang, Kartik Sreenivasan, Shashank Rajput, Harit Vishwakarma, Saurabh Agarwal, Jy-yong Sohn, Kangwook Lee, and Dimitris Papailiopoulos. Attack of the tails: Yes, you really can backdoor federated learning. Advances in Neural Information Processing Systems, 33:16070–16084, 2020.
  • [22] Ning Wang, Yang Xiao, Yimin Chen, Yang Hu, Wenjing Lou, and Y Thomas Hou. Flare: Defending federated learning against model poisoning attacks via latent space representations. In Proceedings of the 2022 ACM on Asia Conference on Computer and Communications Security, pages 946–958, 2022.
  • [23] Binhan Xi, Shaofeng Li, Jiachun Li, Hui Liu, Hong Liu, and Haojin Zhu. Batfl: Backdoor detection on federated learning in e-health. In 2021 IEEE/ACM 29th International Symposium on Quality of Service (IWQOS), pages 1–10. IEEE, 2021.
  • [24] Han Xiao, Kashif Rasul, and Roland Vollgraf. Fashion-mnist: a novel image dataset for benchmarking machine learning algorithms. arXiv preprint arXiv:1708.07747, 2017.
  • [25] Chulin Xie, Keli Huang, Pin-Yu Chen, and Bo Li. Dba: Distributed backdoor attacks against federated learning. In International conference on learning representations, 2020.
  • [26] Dong Yin, Yudong Chen, Ramchandran Kannan, and Peter Bartlett. Byzantine-robust distributed learning: Towards optimal statistical rates. In International Conference on Machine Learning, pages 5650–5659. PMLR, 2018.
  • [27] Naoya Yoshida, Takayuki Nishio, Masahiro Morikura, Koji Yamamoto, and Ryo Yonetani. Hybrid-fl for wireless networks: Cooperative learning mechanism using non-iid data. In ICC 2020-2020 IEEE International Conference On Communications (ICC), pages 1–7. IEEE, 2020.
  • [28] Lin Zhang, Yong Luo, Yan Bai, Bo Du, and Ling-Yu Duan. Federated learning for non-iid data via unified feature learning and optimization objective alignment. In Proceedings of the IEEE/CVF international conference on computer vision, pages 4420–4428, 2021.
Hyejun Jeong received the B.S. degree in computer science from Stony Brook University in 2020 and the M.S. degree in computer science from SungKyunKwan University in 2023, and is currently pursuing a Ph.D. degree in Computer Science at the University of Massachusetts Amherst, USA. Her research interests include trustworthy AI, privacy and security in AI, and AI safety.
Ha Min Son received B.S. and M.S. degrees in Computer Science and Engineering from Sungkyunkwan University, Seoul, South Korea, in 2020 and 2022, respectively. He is currently pursuing a Ph.D. degree in Computer Science and Engineering at the University of California, Davis, CA, USA. From 2021 to 2023, he was a research engineer with the digital therapeutics startup Hippo T&C. His research interests include the development of effective screening tools for medical diseases/disorders and semi-supervised domain generalization.
Seohu Lee received B.S. degrees in Biomechatronic Engineering and Electronic and Electrical Engineering from Sungkyunkwan University, Suwon, Korea, in 2020, and an M.S. degree in Artificial Intelligence from Sungkyunkwan University, Suwon, Korea, in 2023. She is currently a research master's student in Biomedical Informatics and Data Science at Johns Hopkins University, Baltimore, MD, USA. Her research interests include reliable machine learning, explainable AI, clinical decision support, and healthcare.
Jayun Hyun (M'21) received the B.S. degree in Electrical and Electronics Engineering from Dankook University, Yongin, South Korea, in 2018, and the M.S. degree in Computer Software Engineering from Sungkyunkwan University, Suwon, South Korea, in 2021. She is currently a Research and Development Engineer with Hippo T&C, Suwon, South Korea. Her research interests include AI-based healthcare solutions, digital therapeutics, and synthetic data generation for medical applications. She has published several works in these areas, including an article on a synthetic data generation system for AI-based diabetic foot diagnosis in SN Computer Science (September 2021), a conference paper on minimizing data loss in BLE networks for healthcare services presented at the 2022 ICTC (October 2022), and a book chapter on a synthetic data generation model for diabetic foot treatment (November 2020). Ms. Hyun is focused on advancing digital health technologies and continues to contribute to the field through her innovative research and development work.
Tai-Myoung Chung received his first B.S. degree in Electrical Engineering from Yonsei University, Korea in 1981 and his second B.S. degree in Computer Science from University of Illinois, Chicago, USA in 1984. He received an M.S. in Computer Engineering from the University of Illinois in 1987 and a Ph.D. in Computer Engineering from Purdue University, W. Lafayette, USA, in 1995. He is currently a professor at Sungkyunkwan University, Suwon, Korea. He is now CEO of Hippo T&C, Inc., Suwon, Korea. His research interests are information security, information management, digital therapeutics, artificial intelligence, and natural language processing.