
Efficient Privacy-Preserving KAN Inference Using Homomorphic Encryption

Zhizheng Lai, Yufei Zhou, Peijia Zheng, Lin Chen (Corresponding author)
Abstract

The recently proposed Kolmogorov-Arnold Networks (KANs) offer enhanced interpretability and greater model expressiveness. However, KANs also present challenges related to privacy leakage during inference. Homomorphic encryption (HE) facilitates privacy-preserving inference for deep learning models, enabling resource-limited users to benefit from deep learning services while ensuring data security. Yet, the complex structure of KANs, incorporating nonlinear elements like the SiLU activation function and B-spline functions, renders existing privacy-preserving inference techniques inadequate. To address this issue, we propose an accurate and efficient privacy-preserving inference scheme tailored for KANs. Our approach introduces a task-specific polynomial approximation for the SiLU activation function, dynamically adjusting the approximation range to ensure high accuracy on real-world datasets. Additionally, we develop an efficient method for computing B-spline functions within the HE domain, leveraging techniques such as repeat packing, lazy combination, and comparison functions. We evaluate the effectiveness of our privacy-preserving KAN inference scheme on both symbolic formula evaluation and image classification. The experimental results show that our model achieves accuracy comparable to plaintext KANs across various datasets and outperforms plaintext MLPs. Additionally, on the CIFAR-10 dataset, our inference latency achieves over $7\times$ speedup compared to the naive method.

Introduction

In recent years, deep learning has made notable advancements. However, training a model requires substantial data and computational power, which is often beyond the reach of resource-limited companies and individuals. Additionally, model owners are frequently hesitant to share their deep models due to potential breaches of intellectual property or privacy concerns (Jegorova et al. 2022). Machine Learning as a Service (MLaaS) offers a potential solution, allowing users to leverage deep learning services hosted in the cloud. However, user data, such as images or bills, often contain personal information, which raises significant concerns about privacy exposure in MLaaS.

To address these security issues, deep private inference has garnered considerable attention in recent years. By utilizing Secure Multiparty Computation (MPC) (Patra et al. 2021) and Homomorphic Encryption (HE) (Marcolla et al. 2022), data owners and model owners can perform inference tasks without compromising each other’s privacy (Knott et al. 2021; Lee et al. 2022). MPC-based privacy-preserving inference schemes often require substantial communication overhead (Hao et al. 2022; Pang et al. 2024). In contrast, in HE-based privacy-preserving inference frameworks (Lee et al. 2022; Ran et al. 2023; Zimerman et al. 2024), there is no need for interaction between the user and the server. Specifically, the user encrypts the input and sends it to the server, which then executes the inference algorithm and returns the encrypted result to the user. In this paper, we focus on HE-based solutions.

Figure 1: Obstacles in transitioning from privacy-preserving inference of existing neural networks to privacy-preserving inference of KAN.

Although many deep private inference schemes have been proposed recently, the interpretability of deep models remains an open problem. This year, Kolmogorov-Arnold Networks (KANs) were introduced with the aim of offering greater interpretability. KANs incorporate additional non-linear elements, which increase computational demands during training. As shown in Figure 1, due to structural differences between KANs and traditional neural network architectures, existing HE-based privacy-preserving inference methods for deep learning cannot be directly applied to KANs. First, KAN inference requires the computation of activation functions such as SiLU. Additionally, KAN inference necessitates the evaluation of B-splines (Knott 1999), for which no HE-based computation currently exists. HE schemes capable of performing Boolean operations, such as TFHE (Chillotti et al. 2020), can compute arbitrary functions but are highly inefficient. As a result, most privacy-preserving inference methods (Lee et al. 2022; Ran et al. 2023) rely on arithmetic fully homomorphic encryption (FHE) schemes such as RNS-CKKS (Cheon et al. 2019); we follow this line to implement privacy-preserving KAN inference. These schemes support only homomorphic addition and multiplication, necessitating polynomial approximations of activation functions and B-splines.

While ReLU activation functions have been well-approximated in current privacy-preserving inference work (Lee et al. 2022; Ran et al. 2023) through high-precision sign function approximations (Lee et al. 2021), approximating the SiLU function is more challenging due to its lack of a direct relationship with the sign function. Although better polynomial approximations can be achieved through training (Ao and Boddeti 2023), the training cost for KAN networks is higher than for other neural network models due to the additional activation functions. Furthermore, in many cases, the server only has access to the model, not the original training data.

To address these challenges, we propose a novel task-specific activation function approximation method that dynamically adjusts the approximation range to suit different KAN networks. This method leverages Chebyshev’s inequality, focusing on regions with dense data distribution while avoiding emphasis on sparse edge regions. Comparisons with existing approximation methods, such as Remez (Zimerman et al. 2024), show that our method yields superior model performance. Additionally, we develop an approximation method for B-spline basis functions using comparison functions based on minimax composite polynomials (Lee et al. 2021). We introduce a repeat packing technique to parallelize the computation of B-spline basis functions, enhancing computational efficiency. Moreover, we employ lazy combination techniques to eliminate the extra time cost associated with rearranging encrypted data, achieving a $5.54\times$ speedup compared to the naive method under larger parameter settings. Our main contributions are summarized as follows:

  • We propose a novel method for approximating activation functions by dynamically determining the approximation range and using weighted least squares. This approach yields a polynomial approximation of the SiLU activation function that performs effectively on real datasets.

  • We introduce an efficient approach for computing B-spline functions in the HE domain. By utilizing techniques such as repeat packing, lazy combination, and comparison functions, we achieve highly accurate B-spline function computation.

  • Our experiments validate that our HE KAN maintains competitive performance in both symbolic formula evaluation and image classification. Additionally, on the CIFAR-10 dataset, our inference latency achieves over $7\times$ speedup compared to the naive method.

Related Work

KANs in the Plaintext Domain

KAN, as a promising alternative to MLP, has been integrated with popular models by various researchers. In (Zhang and Zhang 2024), Zhang and Zhang combined KAN with Graph Neural Networks, utilizing KAN for feature extraction and experimentally demonstrating the effectiveness of their proposed approach. Xu et al. (Xu et al. 2024) introduced KAN to enhance the message-passing process in Graph Collaborative Filtering, proposing a more efficient recommendation model. In (Genet and Inzirillo 2024), Genet and Inzirillo incorporated Temporal KAN into the Temporal Fusion Transformer to further improve the model’s ability to handle time series data. Li et al. (Li et al. 2024) combined KAN with U-Net for the segmentation and generation of medical images. However, these works primarily focus on the performance of KAN within neural networks and do not address the issues of security and privacy leakage in practical applications.

Privacy-Preserving Deep Inference

In 2016, Gilad-Bachrach et al. introduced CryptoNets (Gilad-Bachrach et al. 2016), achieving privacy-preserving inference based on HE, though with very low efficiency. Since then, researchers have continued to improve inference performance on convolutional networks (Chou et al. 2018; Brutzkus, Gilad-Bachrach, and Elisha 2019; Benaissa et al. 2021), though the network structures they implemented are relatively simple. Both Lee et al. (Lee et al. 2022) and Ran et al. (Ran et al. 2023) employ more efficient ciphertext packing methods to achieve privacy-preserving inference for complex convolutional networks. Zimerman et al. (Zimerman et al. 2024) achieved privacy-preserving inference for Transformers. Additionally, researchers have combined Graph Convolutional Networks (GCNs) with HE to enhance privacy protection for cloud-based graph data inference (Ran et al. 2022; Peng et al. 2024; Ran et al. 2024). There are also many privacy-preserving inference schemes based on MPC (Juvekar, Vaikuntanathan, and Chandrakasan 2018; Srinivasan, Akshayaram, and Ada 2019; Hao et al. 2022; Pang et al. 2024), but their communication overhead is enormous, making them unsuitable for resource-constrained users. KANs differ significantly from current deep models, and the aforementioned methods rely heavily on specific model structures, making them unsuitable for direct application to KAN privacy-preserving inference. Therefore, it is necessary to design a new method to achieve privacy-preserving inference for KANs.

Preliminaries

Kolmogorov-Arnold Networks

Unlike traditional neural networks that use fixed activation functions, KANs employ learnable activation functions on the network’s edges, enabling each weight parameter to be replaced by a univariate function. However, identifying suitable basis functions can be challenging, as many functions cannot be represented effectively. To address this, KANs utilize spline functions for parametric approximation, providing significant flexibility. This method allows for modeling complex functions with fewer parameters, thereby enhancing the model’s interpretability.

The flexibility of spline functions enables them to adaptively model complex relationships in data by adjusting their shape, minimizing approximation errors, and improving the network’s ability to learn subtle patterns from high-dimensional datasets.

In most cases, $\operatorname{spline}(x)$ is parametrized as a linear combination of B-splines as follows (Liu et al. 2024):

\operatorname{spline}(x)=\sum_{m}^{n}c_{m}B_{m,k}(x) \quad (1)

where $c_{m}$ are trainable parameters, and $B_{m,k}(x)$ are the $k$-order B-spline basis functions at the knots. The $k$-order B-spline basis functions are defined recursively as follows:

B_{m,0}(x)=\begin{cases}1&\text{if }t_{m}\leq x<t_{m+1}\\ 0&\text{otherwise}\end{cases} \quad (2)
B_{m,k}(x)=\frac{x-t_{m}}{t_{m+k}-t_{m}}B_{m,k-1}(x)+\frac{t_{m+k+1}-x}{t_{m+k+1}-t_{m+1}}B_{m+1,k-1}(x) \quad (3)

where $t_{1},\cdots,t_{n+k}$ form a non-decreasing sequence of knots, which determine the domain and influence of the basis functions.

The overall network in KAN can be expressed by the following equations:

\phi(x)=w_{b}b(x)+w_{s}\operatorname{spline}(x) \quad (4)
b(x)=\operatorname{silu}(x)=\frac{x}{1+e^{-x}} \quad (5)

Homomorphic Encryption

Following current work in privacy-preserving inference (Lee et al. 2022; Ran et al. 2023), we use RNS-CKKS (Cheon et al. 2019) as the underlying cryptographic scheme. Below, we briefly introduce the main operations in HE.

RNS-CKKS encrypts a vector at a time using Single Instruction, Multiple Data (SIMD). To clearly describe the operations of HE, we use bold lowercase letters to represent vectors and $\llbracket\cdot\rrbracket$ to denote encrypted ciphertexts. For example, we use $\mathbf{x}$ to represent the vector $(x_{1},x_{2},\cdots,x_{n})$ and $\llbracket\mathbf{x}\rrbracket$ to denote its ciphertext. We use $\oplus$, $\ominus$, $\otimes$, and $\operatorname{rot}(\cdot)$ to denote homomorphic ciphertext addition, subtraction, multiplication, and rotation, respectively:

\llbracket\mathbf{x}\rrbracket\oplus\llbracket\mathbf{y}\rrbracket=\llbracket(x_{1}+y_{1},x_{2}+y_{2},\cdots,x_{n}+y_{n})\rrbracket \quad (6)
\llbracket\mathbf{x}\rrbracket\ominus\llbracket\mathbf{y}\rrbracket=\llbracket(x_{1}-y_{1},x_{2}-y_{2},\cdots,x_{n}-y_{n})\rrbracket \quad (7)
\llbracket\mathbf{x}\rrbracket\otimes\llbracket\mathbf{y}\rrbracket=\llbracket(x_{1}\times y_{1},x_{2}\times y_{2},\cdots,x_{n}\times y_{n})\rrbracket \quad (8)
\llbracket\mathbf{x}\rrbracket\otimes\mathbf{y}=\llbracket(x_{1}\times y_{1},x_{2}\times y_{2},\cdots,x_{n}\times y_{n})\rrbracket \quad (9)
\operatorname{rot}(\llbracket\mathbf{x}\rrbracket,t)=\llbracket(x_{t+1},\cdots,x_{n},x_{1},\cdots,x_{t})\rrbracket,\ t\in\mathbb{Z} \quad (10)

where $t$ is the rotation step. When $t>0$, the ciphertext rotates left; when $t<0$, it rotates right. A more detailed theoretical analysis and security assessment can be found in (Cheon et al. 2017, 2019).
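To make the slot-wise semantics concrete, the following plaintext mock-up (our illustration, not part of the scheme) mirrors Equations 6-10 with NumPy vectors; it is meant only as a mental model for tracing the packing algorithms described later, not as an HE implementation.

import numpy as np

# Plaintext stand-ins for the slot-wise HE operations in Eqs. (6)-(10).
def he_add(x, y): return x + y           # analogue of ciphertext addition
def he_sub(x, y): return x - y           # analogue of ciphertext subtraction
def he_mul(x, y): return x * y           # analogue of ciphertext-ciphertext or ciphertext-plaintext multiplication
def rot(x, t):    return np.roll(x, -t)  # rot: left rotation for t > 0, right rotation for t < 0

x = np.array([1.0, 2.0, 3.0, 4.0])
print(rot(x, 1))   # [2. 3. 4. 1.], matching Eq. (10) with t = 1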

Threat Model

The threat model in this paper is similar to that of previous HE-based privacy-preserving inference schemes (Lee et al. 2022; Ran et al. 2023). We assume a cloud-based machine learning service scenario where the cloud server hosts a pre-trained KAN model with plaintext weights. Clients send their sensitive data to the cloud server to obtain inference results from the network model. We assume the cloud server is honest-but-curious, meaning it follows the protocol but may attempt to extract users’ private information through additional computations. To protect the privacy of the client’s sensitive data, the client can encrypt the data using HE before sending it to the cloud server for privacy-preserving inference. The cloud server performs computations on the ciphertext without decrypting it, and the client decrypts the inference results using their private key. This process ensures that no information about the input data is revealed to the cloud.

Method

Similar to MLPs, a KAN is composed of multiple KANLayers stacked linearly, with each KANLayer having an input dimension of $n_{i}$ and an output dimension of $n_{o}$. Therefore, by approximating a single KANLayer, we can approximate the entire network. An overview of our HE-friendly KAN is shown in Figure 2.

In our HE KAN, the input is a three-dimensional real-valued tensor $\mathbf{A}\in\mathbb{R}^{h\times w\times c}$, where $h$ and $w$ represent the height and width of the input, respectively, and $c$ represents the number of channels. We use a raster scan order (Lee et al. 2022) to convert the tensor $\mathbf{A}$ into a vector $\mathbf{x}$ of length $n_{i}=h\cdot w\cdot c$, which is encrypted using RNS-CKKS as $\llbracket\mathbf{x}\rrbracket$. Each KANLayer includes a SiLU branch, a B-spline branch, and a linear layer. Below, we describe the approximation method for each component in detail.

Figure 2: Structure of HE-friendly KAN. The specific structure of the KANLayer is shown within the dashed box.

SiLU Approximation

Previous approximations of the ReLU function utilized the sign function, which allows the input to be scaled to the $[-1,1]$ range, resulting in a good approximation. However, SiLU cannot be approximated using sign or comparison functions. The biggest challenge in approximating SiLU is determining the appropriate approximation range.

The input range of the activation function varies across different datasets (Lee et al. 2023). We estimate the mean $\mu$ and variance $\sigma^{2}$ of the activation function’s input range by feeding the dataset into a pre-trained network. Then, we use Chebyshev’s inequality to estimate the approximate distribution of the activation function’s input values.

According to Chebyshev’s inequality,

P\left\{\left|X-\mu\right|\geqslant\varepsilon\right\}\leqslant\frac{\sigma^{2}}{\varepsilon^{2}} \quad (11)

When we set $\varepsilon=5\sigma$, at least $96\%$ of the data falls within the interval $(\mu-5\sigma,\mu+5\sigma)$. Therefore, we use $[\max(\mu-5\sigma,x_{\text{min}}),\min(\mu+5\sigma,x_{\text{max}})]$ as the approximation range to ensure that most data lies within this interval.

Next, we uniformly sample points within the approximation range and use weighted least squares to fit the activation function. We focus on fitting the regions with dense data distribution and de-emphasize the sparse edge regions. This approach allows us to obtain a polynomial that better approximates the original activation function under the same polynomial degree constraints. Let the input be $x_{i}$, the output of the activation function be $y_{i}$, and the output of the polynomial be $f(x_{i})$. The optimization objective of weighted least squares is:

\min\left(\sum_{i=1}^{n}w_{i}\left(y_{i}-f(x_{i})\right)^{2}\right) \quad (12)

where $n$ is the number of sampled points and $w_{i}$ is the weight. The closer a data point is to the mean, the higher the weight we assign. For example, data points within the interval $[\mu-3\sigma,\mu+3\sigma]$ are given a weight $w_{i}$ of 10 in the sum of squared residuals, while those outside this interval are given a weight of 1. We then solve for the polynomial coefficients to obtain the approximate polynomial for the activation function.
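A minimal sketch of this fitting procedure is given below. It assumes the activation inputs have already been collected by running the pre-trained KAN on the training data; the range rule ($5\sigma$, clipped to the observed minimum and maximum), the 10-vs-1 weighting, and the SiLU formula follow the text, while the helper name and the sampling density are illustrative.

import numpy as np

def fit_silu_poly(act_inputs, degree, inner_w=10.0, outer_w=1.0, n_samples=2000):
    # act_inputs: activation-function inputs observed on the training data.
    mu, sigma = act_inputs.mean(), act_inputs.std()
    lo = max(mu - 5 * sigma, act_inputs.min())   # Chebyshev-based range, Eq. (11)
    hi = min(mu + 5 * sigma, act_inputs.max())
    xs = np.linspace(lo, hi, n_samples)          # uniform sample points
    ys = xs / (1.0 + np.exp(-xs))                # SiLU, Eq. (5)
    w = np.where(np.abs(xs - mu) <= 3 * sigma, inner_w, outer_w)
    # NumPy minimizes sum((w_i * r_i)^2), so pass sqrt(w_i) to realize Eq. (12).
    return np.polynomial.Polynomial.fit(xs, ys, deg=degree, w=np.sqrt(w))

# usage sketch: degree 10 as selected for MNIST / Fashion-MNIST in Table 1
# silu_poly = fit_silu_poly(collected_inputs, degree=10)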

B-spline Basis Approximation

We observed that the calculation of B-spline basis functions involves many parallelizable components. Therefore, we use Repeat Packing to fully leverage the SIMD capabilities of HE to accelerate B-spline calculations.

Let the input be a vector $\mathbf{x}$ of length $n_{i}$. After repeat packing, it becomes $\llbracket\mathbf{x}|\mathbf{x}|\cdots|\mathbf{x}\rrbracket$, denoted as $\llbracket\mathbf{x}\rrbracket_{\text{rep}}$, where $\mathbf{x}$ is repeated $g+2k$ times, with $g$ being the number of B-spline grids and $k$ the B-spline degree, satisfying $n_{i}\cdot(g+2k)\leq\frac{N}{2}$.

Repeat packing data in ciphertexts is not easy, as ciphertext rotations are computationally expensive. We designed a fast packing method that reduces the number of rotations needed for repeat packing from $g+2k$ to $\lceil\log_{2}(g+2k)\rceil$. We first rotate $\llbracket\mathbf{x}\rrbracket$ to the right by $n_{i}$ steps, then add it to the original $\llbracket\mathbf{x}\rrbracket$ to obtain a temporary ciphertext $\llbracket\mathbf{x}|\mathbf{x}\rrbracket$. The above steps are repeated on the temporary ciphertext, with rotation steps of $n_{i},2n_{i},4n_{i},\cdots$. The specific algorithm is shown in Algorithm 1.

Algorithm 1 Fast Repeat Packing

Input: ciphertext $\llbracket\mathbf{x}\rrbracket$, grid parameters $g$ and $k$, input size $n_{i}$
Output: packed ciphertext $\llbracket\mathbf{x}\rrbracket_{\text{rep}}$

1:  $\mathbf{m}=(1,\cdots,1,0,\cdots,0)$ {There are $n_{i}$ ones in $\mathbf{m}$.}
2:  $\llbracket\mathbf{x}\rrbracket_{\text{rep}}\leftarrow\llbracket\mathbf{x}\rrbracket\otimes\mathbf{m}$
3:  for $j\leftarrow 1$ to $\lceil\log_{2}(g+2k)\rceil$ do
4:     $\llbracket\mathbf{x}\rrbracket_{\text{rep}}\leftarrow\operatorname{rot}(\llbracket\mathbf{x}\rrbracket_{\text{rep}},-n_{i}\cdot 2^{j-1})\oplus\llbracket\mathbf{x}\rrbracket_{\text{rep}}$
5:  end for
6:  return $\llbracket\mathbf{x}\rrbracket_{\text{rep}}$
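The plaintext simulation below (our illustration, not the SEAL implementation) traces Algorithm 1 with NumPy, where a negative rotation step corresponds to a right rotation as defined in the Preliminaries; it checks that the first $n_{i}\cdot(g+2k)$ slots indeed hold $g+2k$ copies of $\mathbf{x}$.

import numpy as np

def rot(v, t):
    return np.roll(v, -t)   # left rotation for t > 0, right rotation for t < 0

def fast_repeat_pack(ct, n_i, g, k):
    # Plaintext trace of Algorithm 1: ct is a slot vector whose first n_i
    # entries hold x; the copies double on every iteration, so only
    # ceil(log2(g + 2k)) rotations are needed instead of g + 2k.
    mask = np.zeros(len(ct)); mask[:n_i] = 1.0       # m = (1, ..., 1, 0, ..., 0)
    x_rep = ct * mask
    for j in range(1, int(np.ceil(np.log2(g + 2 * k))) + 1):
        x_rep = rot(x_rep, -n_i * 2 ** (j - 1)) + x_rep
    return x_rep

# toy check: n_i = 4, g = 3, k = 2, so at least g + 2k = 7 copies are produced
n_i, g, k = 4, 3, 2
ct = np.zeros(64); ct[:n_i] = [1.0, 2.0, 3.0, 4.0]
packed = fast_repeat_pack(ct, n_i, g, k)
assert np.array_equal(packed[:n_i * (g + 2 * k)], np.tile(ct[:n_i], g + 2 * k))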
Figure 3: Toy example of B-spline Basis Approximation when $g=2$ and $k=1$.

To simplify the description, we define the following operations:

  • $\mathtt{ColTile}(\mathbf{A},l,r)$: Concatenates columns $l$ to $r-1$ of matrix $\mathbf{A}$ into a vector.

  • $\mathtt{PolyComp}(a,b)$: Returns $1$ if $a>b$, $0$ if $a<b$, and $\frac{1}{2}$ if $a=b$. We implement this using the method from (Lee et al. 2021).

Recall Equation 2 and Equation 3. We first need to determine whether the input $\mathbf{x}$ falls within the interval $[t_{m},t_{m+1}]$, and then recursively compute the basis functions.

We combine all the intervals into a weight matrix $\mathbf{G}$. Then, using $\mathtt{ColTile}$, we obtain a vector $\mathbf{g_{1}}$ containing the left endpoints of all intervals $[t_{m},t_{m+1}]$ and a vector $\mathbf{g_{2}}$ containing the right endpoints. Next, we use $\mathtt{PolyComp}$ to compare the packed input $\llbracket\mathbf{x}\rrbracket_{\text{rep}}$ with $\mathbf{g_{1}}$ and $\mathbf{g_{2}}$, and then recursively calculate the basis functions according to Equation 3.

It should be noted that $\mathtt{PolyComp}$ operates within the range $[-1,1]$. If the input exceeds this range, the approximation may become inaccurate. Therefore, we scale the input to the $[-1,1]$ range. We estimate the approximate range of the input using the training data, and represent the input within a larger range $[-R,R]$ to avoid overflow. When using $\mathtt{PolyComp}$ to compute intervals, there may be cases where $x$ is exactly equal to an interval endpoint, such as $x=t_{m}$. In this case, $\mathtt{PolyComp}$ returns $\frac{1}{2}$, which does not match Equation 2. We can achieve the correct result by comparing $2x$ with $2t_{m}+1$. However, the probability of equality is very small, so its impact on the overall approximation error is negligible. Therefore, in practice, we can ignore the equality case, similar to how existing works approximate ReLU (Lee et al. 2022). The specific algorithm is shown in Algorithm 2.

Algorithm 2 B-spline Basis Approximation

Input: packed ciphertext $\llbracket\mathbf{x}\rrbracket_{\text{rep}}$, grid parameters $g$ and $k$, input size $n_{i}$, weight matrix $\mathbf{G}$, bound $R$
Output: B-spline basis result $\llbracket\mathbf{b}\rrbracket$

1:  $r\leftarrow g+2k+1$
2:  $\mathbf{g_{1}}\leftarrow\mathtt{ColTile}(\mathbf{G},1,r)$
3:  $\mathbf{g_{2}}\leftarrow\mathtt{ColTile}(\mathbf{G},2,r+1)$
4:  $\llbracket\mathbf{x_{1}}\rrbracket\leftarrow\mathtt{PolyComp}((\llbracket\mathbf{x}\rrbracket_{\text{rep}}\ominus\mathbf{g_{1}})\otimes\frac{1}{2R},0)$
5:  $\llbracket\mathbf{x_{2}}\rrbracket\leftarrow\mathtt{PolyComp}((\mathbf{g_{2}}\ominus\llbracket\mathbf{x}\rrbracket_{\text{rep}})\otimes\frac{1}{2R},0)$
6:  $\llbracket\mathbf{b}\rrbracket\leftarrow\llbracket\mathbf{x_{1}}\rrbracket\otimes\llbracket\mathbf{x_{2}}\rrbracket$
7:  for $j\leftarrow 1$ to $k$ do
8:     $\mathbf{t_{1}}\leftarrow\mathtt{ColTile}(\mathbf{G},1,r-j)$
9:     $\mathbf{t_{2}}\leftarrow\mathtt{ColTile}(\mathbf{G},j+1,r)$
10:     $\llbracket\mathbf{b_{1}}\rrbracket\leftarrow\frac{1}{\mathbf{t_{2}}-\mathbf{t_{1}}}\otimes(\llbracket\mathbf{x}\rrbracket_{\text{rep}}\ominus\mathbf{t_{1}})\otimes\llbracket\mathbf{b}\rrbracket$
11:     $\mathbf{t_{3}}\leftarrow\mathtt{ColTile}(\mathbf{G},j+2,r+1)$
12:     $\mathbf{t_{4}}\leftarrow\mathtt{ColTile}(\mathbf{G},2,r-j+1)$
13:     $\llbracket\mathbf{b_{2}}\rrbracket\leftarrow\frac{1}{\mathbf{t_{3}}-\mathbf{t_{4}}}\otimes(\mathbf{t_{3}}\ominus\llbracket\mathbf{x}\rrbracket_{\text{rep}})\otimes\operatorname{rot}(\llbracket\mathbf{b}\rrbracket,n_{i})$
14:     $\llbracket\mathbf{b}\rrbracket\leftarrow\llbracket\mathbf{b_{1}}\rrbracket\oplus\llbracket\mathbf{b_{2}}\rrbracket$
15:  end for
16:  return $\llbracket\mathbf{b}\rrbracket$
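As a plaintext reference for what Algorithm 2 computes, the sketch below evaluates Equations 2 and 3 directly via the Cox-de Boor recursion (the function name and knot layout are our assumptions); it is useful for checking the homomorphic result, since after $k$ recursion steps the $g+2k$ order-0 bases reduce to the $g+k$ values that enter Equation 1.

import numpy as np

def bspline_basis(x, knots, k):
    # Plaintext reference for Eqs. (2)-(3): order-0 indicators followed by
    # the Cox-de Boor recursion; returns len(knots) - 1 - k basis values.
    b = np.array([1.0 if knots[m] <= x < knots[m + 1] else 0.0
                  for m in range(len(knots) - 1)])            # Eq. (2)
    for j in range(1, k + 1):                                  # Eq. (3)
        b = np.array([(x - knots[m]) / (knots[m + j] - knots[m]) * b[m]
                      + (knots[m + j + 1] - x) / (knots[m + j + 1] - knots[m + 1]) * b[m + 1]
                      for m in range(len(b) - 1)])
    return b

# toy example matching Figure 3: g = 2 grid intervals, degree k = 1,
# with an extended knot vector of g + 2k + 1 strictly increasing knots
g, k = 2, 1
knots = np.linspace(-1.5, 1.5, g + 2 * k + 1)
print(bspline_basis(0.2, knots, k))   # g + k = 3 basis values, summing to 1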

Lazy Combination and Fusion Linear Layer

In the previous section, we described how to approximate B-spline basis functions. However, according to Equation 1, we need to multiply each basis function by a coefficient $c_{m}$ and sum them to obtain the final B-spline function approximation. For a given input $x$, this is an inner product between the coefficient vector $\mathbf{c}$ and the basis function vector $\mathbf{b}$. Since we are using SIMD, computing the inner product requires rotations, which are expensive in the HE domain. Therefore, we adopt a lazy combination approach, postponing the computation of $\operatorname{spline}(x)$ until later to reduce the number of rotations. Specifically, we combine $\mathbf{c}$ with the weight coefficients $\mathbf{W_{s}}$ of the subsequent linear layer to obtain a new set of weights. This reduces the number of multiplications and rotations, improving computational efficiency.

The weight matrix $\mathbf{W_{s}}$ has dimensions $n_{o}\times n_{i}$. When the input size is $n_{i}$, the coefficients of the basis functions form a matrix $\mathbf{C}$ of size $n_{i}\times(g+k)$. After combining $\mathbf{W_{s}}$ and $\mathbf{C}$, we obtain a large matrix $\mathbf{W^{\prime}}$ of size $n_{o}\times n_{i}\cdot(g+k)$. Therefore, following the method of (Blealtan 2024), we directly use $\mathbf{W^{\prime}}$ during inference instead of the linear layer weights $\mathbf{W_{s}}$ and $\mathbf{C}$.

However, due to the repeat packing introduced during B-spline basis function computation, the result is in $\mathtt{ColTile}$ format, with an output vector of length $n_{i}\cdot(g+k)$. We need to repack the result of the basis function approximation into row-major order to take advantage of SIMD for fast matrix operations. This is essentially a ciphertext reordering problem, requiring multiplication by a permutation matrix $\mathbf{P}$ of size $n_{i}\cdot(g+k)\times n_{i}\cdot(g+k)$. The method for generating the permutation matrix is provided in Algorithm 3.

To reduce multiplicative depth and the number of rotations, lazy combination is also used as mentioned above. We multiply the permutation matrix $\mathbf{P}$ with the weight coefficients $\mathbf{W^{\prime}}$. The final fusion linear layer weight is then given by:

\mathbf{W_{f}}=\mathbf{W^{\prime}}\times\mathbf{P} \quad (13)
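The sketch below illustrates the fusion in plaintext under the shapes stated above. The exact way $\mathbf{W_{s}}$ and $\mathbf{C}$ are combined into $\mathbf{W^{\prime}}$ is one construction consistent with those shapes (our assumption), and the check confirms that applying the fused weight to the row-major basis vector reproduces the linear layer applied to the spline outputs.

import numpy as np

rng = np.random.default_rng(0)
n_o, n_i, gk = 10, 64, 5                         # gk = g + k basis functions per input
W_s = rng.standard_normal((n_o, n_i))            # linear-layer weights
C = rng.standard_normal((n_i, gk))               # B-spline coefficients c_m, Eq. (1)
B = rng.standard_normal((n_i, gk))               # stand-in for basis values B_{m,k}(x_i)

# Assumed fusion: W'[o, i*gk + m] = W_s[o, i] * C[i, m], shape (n_o, n_i*(g+k)).
W_prime = (W_s[:, :, None] * C[None, :, :]).reshape(n_o, n_i * gk)

# Reference computation: spline outputs per input, then the linear layer.
spline_out = (C * B).sum(axis=1)                 # spline(x_i) = sum_m c_m B_{m,k}(x_i)
assert np.allclose(W_prime @ B.reshape(-1), W_s @ spline_out)

# At inference time, Eq. (13) additionally folds in the ColTile-to-row-major
# permutation P (Algorithm 3), so the ciphertext never has to be reordered:
#   W_f = W_prime @ P, and W_f applied to the ColTile-packed basis vector
#   equals W_prime applied to the row-major one.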

The computation of the fusion linear layer, as well as the linear layer following SiLU, involves matrix multiplication. We optimize this using the baby-step giant-step (BSGS) strategy (Halevi and Shoup 2021), reducing the number of ciphertext rotations required for matrix multiplication and lowering the time cost of the linear layer.
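A plaintext sketch of the BSGS diagonal method (Halevi and Shoup 2021) is shown below for a square matrix whose dimension matches the slot count; the matrix and parameter names are illustrative. With $n=n_{1}n_{2}$, the $n$ rotations of the naive diagonal method drop to roughly $n_{1}+n_{2}$, since the baby-step rotations of the input are shared across all giant steps and the diagonals (plaintexts) are pre-rotated for free.

import numpy as np

def rot(v, t):
    return np.roll(v, -t)   # left rotation for t > 0

def bsgs_matvec(M, x, n1):
    # Plaintext sketch of Halevi-Shoup diagonal matrix-vector multiplication
    # with a baby-step giant-step split: n = n1 * n2.
    n = len(x)
    assert M.shape == (n, n) and n % n1 == 0
    n2 = n // n1
    # diags[d][i] = M[i, (i + d) % n], the d-th generalized diagonal
    diags = [np.array([M[i, (i + d) % n] for i in range(n)]) for d in range(n)]
    baby = [rot(x, b) for b in range(n1)]                # n1 - 1 ciphertext rotations
    y = np.zeros(n)
    for gs in range(n2):
        inner = np.zeros(n)
        for b in range(n1):
            # rotate the plaintext diagonal instead of the ciphertext
            inner += rot(diags[n1 * gs + b], -n1 * gs) * baby[b]
        y = y + rot(inner, n1 * gs)                      # one rotation per giant step
    return y

rng = np.random.default_rng(1)
M, x = rng.standard_normal((8, 8)), rng.standard_normal(8)
assert np.allclose(bsgs_matvec(M, x, n1=4), M @ x)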

Algorithm 3 Generate Permutation Matrix

Input: number of rows $n_{r}$, number of columns $n_{c}$
Output: permutation matrix $\mathbf{P}$

1:  Initialize $\mathbf{P}$ with zeros. {The shape is $n_{r}n_{c}\times n_{r}n_{c}$.}
2:  for $r\leftarrow 1$ to $n_{r}$ do
3:     for $c\leftarrow 1$ to $n_{c}$ do
4:        $j_{c}\leftarrow(c-1)\times n_{r}+r$
5:        $j_{r}\leftarrow(r-1)\times n_{c}+c$
6:        $P[j_{r}][j_{c}]\leftarrow 1$
7:     end for
8:  end for
9:  return $\mathbf{P}$
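A direct Python rendering of Algorithm 3 (0-indexed, our translation) follows, together with a check that $\mathbf{P}$ maps a $\mathtt{ColTile}$ (column-major) packing of an $n_{r}\times n_{c}$ matrix to its row-major packing, which is exactly the reordering needed before the fused linear layer.

import numpy as np

def generate_permutation(n_r, n_c):
    # 0-indexed version of Algorithm 3.
    P = np.zeros((n_r * n_c, n_r * n_c))
    for r in range(n_r):
        for c in range(n_c):
            j_c = c * n_r + r        # position in ColTile (column-major) order
            j_r = r * n_c + c        # position in row-major order
            P[j_r, j_c] = 1.0
    return P

# check: P turns a ColTile-packed vector into row-major order
A = np.arange(12).reshape(3, 4)            # e.g. n_i = 3 inputs, g + k = 4 bases
P = generate_permutation(3, 4)
assert np.array_equal(P @ A.flatten(order="F"), A.flatten())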

Experimental Evaluation

Experiment Setup

Environment.

We conducted our experiments on a machine equipped with an Intel Xeon Gold 6145 CPU @ 2.00 GHz and 512 GB of RAM. We used C++ and the Microsoft SEAL library, version 3.6.6 (SEAL 2020), to implement privacy-preserving KAN inference based on the RNS-CKKS scheme (Cheon et al. 2019).

Encryption Parameters.

We set the degree of the polynomial modulus to $2^{16}$ and used parameters recommended for a 128-bit security level according to (Albrecht et al. 2021). The bit-lengths of the base modulus and the special modulus were set to $51$, while the bit-length of the scale factor was set to $46$. The multiplicative depth was set to $20$. For the bootstrapping parameters, we used the same settings as in (Lee et al. 2022).

Datasets and Metric.

We represent the KAN network structure as an array $(n_{1},n_{2},n_{3})$, denoting the number of nodes per layer. The datasets used in the experiments were MNIST (LeCun et al. 1998), Fashion-MNIST (Xiao, Rasul, and Vollgraf 2017), and CIFAR-10 (Krizhevsky, Hinton et al. 2009), for which we used KAN models with structures $(784,64,10)$, $(784,128,10)$, and $(3072,128,10)$, respectively. Note that when evaluating the model’s inference performance, we used models trained in the plaintext domain, so we used inference accuracy to evaluate model performance. For comparing the effectiveness of the approximations and the symbolic formulas, we used RMSE as in (Liu et al. 2024). To minimize errors, our experimental metrics were obtained by averaging the results of 10 trials.

SiLU Approximation

Approximation Interval and Polynomial Degree.

Dataset Degree Acc. \uparrow RMSE \downarrow
MNIST 8 96.82 0.307
9 95.87 0.183
10 97.17 0.166
11 96.40 0.153
FMNIST 8 89.26 0.158
9 89.32 0.110
10 89.35 0.093
11 89.29 0.085
CIFAR-10 13 57.08 0.053
14 57.21 0.046
15 57.38 0.042
16 57.03 0.025
Table 1: Comparison of model performance for SiLU approximation with different polynomial degrees across various datasets using our approximation method.

Since the input range of the activation function varies across different datasets (Lee et al. 2023), different tasks require different approximation ranges. According to Equation 11, we estimated the approximation interval using the $5\sigma$ rule. We used approximation ranges of $[-12.4,14.74]$ for MNIST, $[-9.77,11.01]$ for Fashion-MNIST, and $[-11.90,10.99]$ for CIFAR-10. Choosing the appropriate polynomial degree for approximation is also crucial for different tasks. Table 1 shows the impact of different polynomial degrees on accuracy across datasets. For better model performance, we select a polynomial degree of $10$ for the MNIST and Fashion-MNIST datasets, and a degree of $15$ for the CIFAR-10 dataset.

Comparison of Different Approximation Methods.

Method Metric MNIST FMNIST CIFAR-10
Remez RMSE \downarrow 0.087 0.046 0.016
Acc. \uparrow 96.76 89.16 57.17
OLS RMSE \downarrow 0.071 0.035 0.013
Acc. \uparrow 97.01 89.27 57.25
Ours RMSE \downarrow 0.166 0.093 0.041
Acc. \uparrow 97.17 89.35 57.38
Table 2: Comparison of our method with Remez (Zimerman et al. 2024) and OLS (Zheng et al. 2022) for SiLU approximation.

To evaluate the effectiveness of our approach, we compare it with the Remez approximation method used by Zimerman et al. (2024) and the least squares method used by Zheng et al. (2022) in current privacy-preserving inference works. For a fair comparison, we restrict all methods to the same polynomial degree and approximation range for each dataset. Table 2 presents the results, which show that our method achieves the highest accuracy across all datasets. For example, on the MNIST dataset, our method achieves an accuracy of $97.17\%$, which is $0.41\%$ and $0.16\%$ higher than the Remez and least squares methods, respectively. Because our method prioritizes fitting densely distributed data regions while downplaying sparse edge regions, it achieves a better approximation where the data is more concentrated. As shown in Figure 4, our method results in a polynomial that more closely approximates SiLU on $[-4,1]$.

Figure 4: Comparison of our method with Remez (Zimerman et al. 2024) and OLS (Zheng et al. 2022) for fitting the SiLU activation function on the MNIST dataset.

B-spline Basis Approximation

$(n_{i},g,k)$ Naive (s) Ours (s) Speedup ($\times$)
(64, 3, 2) 54.009 32.691 1.65
(128, 5, 3) 71.760 33.527 2.14
(256, 5, 3) 109.712 33.788 3.25
(256, 10, 3) 187.691 33.852 5.54
(256, 10, 5) 212.572 42.241 5.03
Table 3: Comparison of the HE inference latency between our B-spline basis approximation method and the naive implementation without lazy combination optimization.
Tasks Dataset Baseline MLPs Ours
Image \uparrow MNIST 97.53 97.20 97.27
FMNIST 89.56 87.10 89.35
CIFAR-10 57.62 54.67 57.37
Formula \downarrow Toy 0.00158 0.14072 0.00157
lpmv0 0.00460 0.01404 0.00461
lpmv1 0.03478 0.07378 0.03489
Table 4: Comparison of model performance across different datasets. For image classification, we evaluate model performance using accuracy, while for symbolic functions, we use RMSE against the original formula.

Compared to the naive method without lazy combination optimization, our proposed B-spline basis approximation method offers a significant efficiency improvement. Table 3 demonstrates that our method improves computational efficiency several times over compared to the naive method. For example, when $n_{i}=256$, $g=10$, and $k=5$, our method takes only $42.241$ seconds, which is $5.03\times$ faster than the naive method. This improvement is because the naive method requires a permutation operation after B-spline basis approximation to rearrange the data. The permutation in the HE domain introduces additional costs, including $n_{i}\cdot(g+k)$ HE ciphertext multiplications and $\sqrt{n_{i}\cdot(g+k)}$ HE rotations, leading to significant latency overhead. As the input dimension $n_{i}$, B-spline grid $g$, and B-spline degree $k$ increase, the speedup of our method improves.

Additionally, the results in Table 3 show that, as the parameters $n_{i}$ and $g$ increase, the HE inference latency remains relatively stable with our method. This is because our method is designed to take full advantage of the SIMD capability of CKKS for parallel processing. Therefore, as long as the single-ciphertext packing condition $n_{i}\cdot(g+2k)\leq\frac{N}{2}$ is satisfied, increases in $n_{i}$ and $g$ do not significantly affect the HE inference latency. However, as the parameter $k$ increases, we need to iteratively compute higher-order B-spline basis functions, increasing the HE inference latency due to the added multiplicative depth. In practical applications, the performance of the B-spline basis approximation can be optimized by carefully selecting appropriate values for $n_{i}$, $g$, and $k$.

Comparison of Model Performance

To evaluate the practical performance of our proposed privacy-preserving KAN inference scheme, we conducted experiments on various datasets. We compared with plaintext KAN as a baseline, as well as with MLP (Rumelhart, Hinton, and Williams 1986), which is a fundamental component in current HE-based privacy-preserving inference methods (Benaissa et al. 2021; Lee et al. 2022; Ran et al. 2023). Apart from image classification tasks, we also conducted experiments on symbolic formulas. We selected the following symbolic functions from (Liu et al. 2024) for our experiments. The Toy Formula dataset represents a high-dimensional symbolic function $f(x_{1},\cdots,x_{100})=\exp\left(\frac{1}{100}\sum_{i=1}^{100}\sin^{2}\left(\frac{\pi x_{i}}{2}\right)\right)$, while the lpmv0 and lpmv1 datasets consist of special functions from (Liu et al. 2024).

The experimental results show that the accuracy of our privacy-preserving KAN inference is very close to that of the plaintext domain and exceeds that of MLPs, as shown in Table 4. On image benchmark datasets, the accuracy decrease of our scheme is no more than $0.25\%$, and it outperforms MLPs. On the symbolic formula datasets, the RMSE deviation from plaintext is no more than $0.00011$, with a lower RMSE than MLPs, confirming the effectiveness of our HE-based KAN inference scheme.

Figure 5: Comparison of inference latency between our method and the naive implementation across different datasets.

Inference Latency

We evaluated the inference latency for a single input across different datasets in a single-threaded environment, comparing our method with the naive implementation, as shown in Figure 5. For the small Toy Formula dataset, our method achieves an inference latency of 324 seconds, while the naive method has a latency of 468 seconds, resulting in approximately $1.44\times$ speedup. On the larger CIFAR-10 dataset, our inference latency is 380 seconds compared to 2789 seconds for the naive method, yielding over $7\times$ speedup. Additionally, since servers often handle data from multiple users, the amortized processing time for multiple inputs is also important. Due to page limitations, we provide details on the amortized inference latency in a multi-threaded environment and the time distribution of the various computational components in the Appendix.

Conclusion

We have proposed an efficient scheme for privacy-preserving KAN inference using HE. We designed a novel task-specific activation function approximation method to closely approximate the SiLU activation function. This approximation enhances the performance of HE networks on real-world tasks without necessitating retraining of the network model. Additionally, we introduced an efficient method for computing B-spline functions in the HE domain. By leveraging techniques such as repeat packing, lazy combination, and comparison functions, we achieved high-precision B-spline function computations. Building on these two approximation methods, we developed an HE-based KAN model inference scheme. Our approach delivers about a $7\times$ speedup compared to the naive implementation in the HE domain, while maintaining a minimal accuracy loss of only $0.25\%$ compared to plaintext KANs on the CIFAR-10 dataset. Given the efficiency constraints of HE, we plan in future work to explore more effective slot utilization strategies and more compact cryptographic parameters to further enhance efficiency.

References

  • Albrecht et al. (2021) Albrecht, M.; Chase, M.; Chen, H.; Ding, J.; Goldwasser, S.; Gorbunov, S.; Halevi, S.; Hoffstein, J.; Laine, K.; Lauter, K.; et al. 2021. Homomorphic encryption standard. Protecting privacy through homomorphic encryption, 31–62.
  • Ao and Boddeti (2023) Ao, W.; and Boddeti, V. N. 2023. Autofhe: Automated adaption of cnns for efficient evaluation over fhe. arXiv preprint arXiv:2310.08012.
  • Benaissa et al. (2021) Benaissa, A.; Retiat, B.; Cebere, B.; and Belfedhal, A. E. 2021. Tenseal: A library for encrypted tensor operations using homomorphic encryption. arXiv preprint arXiv:2104.03152.
  • Blealtan (2024) Blealtan. 2024. An Efficient Implementation of Kolmogorov-Arnold Network. https://github.com/Blealtan/efficient-kan.
  • Brutzkus, Gilad-Bachrach, and Elisha (2019) Brutzkus, A.; Gilad-Bachrach, R.; and Elisha, O. 2019. Low latency privacy preserving inference. In International Conference on Machine Learning, 812–821. PMLR.
  • Cheon et al. (2019) Cheon, J. H.; Han, K.; Kim, A.; Kim, M.; and Song, Y. 2019. A full RNS variant of approximate homomorphic encryption. In Selected Areas in Cryptography–SAC 2018: 25th International Conference, Calgary, AB, Canada, August 15–17, 2018, Revised Selected Papers 25, 347–368. Springer.
  • Cheon et al. (2017) Cheon, J. H.; Kim, A.; Kim, M.; and Song, Y. 2017. Homomorphic encryption for arithmetic of approximate numbers. In Advances in Cryptology–ASIACRYPT 2017: 23rd International Conference on the Theory and Applications of Cryptology and Information Security, Hong Kong, China, December 3-7, 2017, Proceedings, Part I 23, 409–437. Springer.
  • Chillotti et al. (2020) Chillotti, I.; Gama, N.; Georgieva, M.; and Izabachène, M. 2020. TFHE: fast fully homomorphic encryption over the torus. Journal of Cryptology, 33(1): 34–91.
  • Chou et al. (2018) Chou, E.; Beal, J.; Levy, D.; Yeung, S.; Haque, A.; and Fei-Fei, L. 2018. Faster cryptonets: Leveraging sparsity for real-world encrypted inference. arXiv preprint arXiv:1811.09953.
  • Genet and Inzirillo (2024) Genet, R.; and Inzirillo, H. 2024. A Temporal Kolmogorov-Arnold Transformer for Time Series Forecasting. arXiv:2406.02486.
  • Gilad-Bachrach et al. (2016) Gilad-Bachrach, R.; Dowlin, N.; Laine, K.; Lauter, K.; Naehrig, M.; and Wernsing, J. 2016. Cryptonets: Applying neural networks to encrypted data with high throughput and accuracy. In International conference on machine learning, 201–210. PMLR.
  • Halevi and Shoup (2021) Halevi, S.; and Shoup, V. 2021. Bootstrapping for HElib. Journal of Cryptology, 34(1): 7.
  • Hao et al. (2022) Hao, M.; Li, H.; Chen, H.; Xing, P.; Xu, G.; and Zhang, T. 2022. Iron: Private inference on transformers. Advances in neural information processing systems, 35: 15718–15731.
  • Jegorova et al. (2022) Jegorova, M.; Kaul, C.; Mayor, C.; O’Neil, A. Q.; Weir, A.; Murray-Smith, R.; and Tsaftaris, S. A. 2022. Survey: Leakage and privacy at inference time. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(7): 9090–9108.
  • Juvekar, Vaikuntanathan, and Chandrakasan (2018) Juvekar, C.; Vaikuntanathan, V.; and Chandrakasan, A. 2018. GAZELLE: A low latency framework for secure neural network inference. In 27th USENIX security symposium (USENIX security 18), 1651–1669.
  • Knott et al. (2021) Knott, B.; Venkataraman, S.; Hannun, A.; Sengupta, S.; Ibrahim, M.; and van der Maaten, L. 2021. Crypten: Secure multi-party computation meets machine learning. Advances in Neural Information Processing Systems, 34: 4961–4973.
  • Knott (1999) Knott, G. D. 1999. Interpolating cubic splines, volume 18. Springer Science & Business Media.
  • Krizhevsky, Hinton et al. (2009) Krizhevsky, A.; Hinton, G.; et al. 2009. Learning multiple layers of features from tiny images.
  • LeCun et al. (1998) LeCun, Y.; Bottou, L.; Bengio, Y.; and Haffner, P. 1998. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11): 2278–2324.
  • Lee et al. (2022) Lee, E.; Lee, J.-W.; Lee, J.; Kim, Y.-S.; Kim, Y.; No, J.-S.; and Choi, W. 2022. Low-complexity deep convolutional neural networks on fully homomorphic encryption using multiplexed parallel convolutions. In International Conference on Machine Learning, 12403–12422. PMLR.
  • Lee et al. (2021) Lee, E.; Lee, J.-W.; No, J.-S.; and Kim, Y.-S. 2021. Minimax approximation of sign function by composite polynomial for homomorphic comparison. IEEE Transactions on Dependable and Secure Computing, 19(6): 3711–3727.
  • Lee et al. (2023) Lee, S.; Lee, G.; Kim, J. W.; Shin, J.; and Lee, M.-K. 2023. HETAL: efficient privacy-preserving transfer learning with homomorphic encryption. In International Conference on Machine Learning, 19010–19035. PMLR.
  • Li et al. (2024) Li, C.; Liu, X.; Li, W.; Wang, C.; Liu, H.; and Yuan, Y. 2024. U-KAN Makes Strong Backbone for Medical Image Segmentation and Generation. arXiv:2406.02918.
  • Liu et al. (2024) Liu, Z.; Wang, Y.; Vaidya, S.; Ruehle, F.; Halverson, J.; Soljačić, M.; Hou, T. Y.; and Tegmark, M. 2024. Kan: Kolmogorov-arnold networks. arXiv preprint arXiv:2404.19756.
  • Marcolla et al. (2022) Marcolla, C.; Sucasas, V.; Manzano, M.; Bassoli, R.; Fitzek, F. H.; and Aaraj, N. 2022. Survey on fully homomorphic encryption, theory, and applications. Proceedings of the IEEE, 110(10): 1572–1609.
  • Pang et al. (2024) Pang, Q.; Zhu, J.; Möllering, H.; Zheng, W.; and Schneider, T. 2024. BOLT: Privacy-Preserving, Accurate and Efficient Inference for Transformers. In 2024 IEEE Symposium on Security and Privacy (SP), 130–130. IEEE Computer Society.
  • Patra et al. (2021) Patra, A.; Schneider, T.; Suresh, A.; and Yalame, H. 2021. ABY2.0: Improved Mixed-Protocol secure Two-Party computation. In 30th USENIX Security Symposium (USENIX Security 21), 2165–2182.
  • Peng et al. (2024) Peng, H.; Ran, R.; Luo, Y.; Zhao, J.; Huang, S.; Thorat, K.; Geng, T.; Wang, C.; Xu, X.; Wen, W.; et al. 2024. Lingcn: Structural linearized graph convolutional network for homomorphically encrypted inference. Advances in Neural Information Processing Systems, 36.
  • Ran et al. (2023) Ran, R.; Luo, X.; Wang, W.; Liu, T.; Quan, G.; Xu, X.; Ding, C.; and Wen, W. 2023. SpENCNN: orchestrating encoding and sparsity for fast homomorphically encrypted neural network inference. In International Conference on Machine Learning, 28718–28728. PMLR.
  • Ran et al. (2022) Ran, R.; Wang, W.; Gang, Q.; Yin, J.; Xu, N.; and Wen, W. 2022. CryptoGCN: Fast and scalable homomorphically encrypted graph convolutional network inference. Advances in Neural information processing systems, 35: 37676–37689.
  • Ran et al. (2024) Ran, R.; Xu, N.; Liu, T.; Wang, W.; Quan, G.; and Wen, W. 2024. Penguin: parallel-packed homomorphic encryption for fast graph convolutional network inference. Advances in Neural Information Processing Systems, 36.
  • Rumelhart, Hinton, and Williams (1986) Rumelhart, D. E.; Hinton, G. E.; and Williams, R. J. 1986. Learning representations by back-propagating errors. nature, 323(6088): 533–536.
  • SEAL (2020) SEAL. 2020. Microsoft SEAL (release 3.6). https://github.com/Microsoft/SEAL. Microsoft Research, Redmond, WA.
  • Srinivasan, Akshayaram, and Ada (2019) Srinivasan, W. Z.; Akshayaram, P.; and Ada, P. R. 2019. DELPHI: A cryptographic inference service for neural networks. In Proc. 29th USENIX secur. symp, volume 3.
  • Xiao, Rasul, and Vollgraf (2017) Xiao, H.; Rasul, K.; and Vollgraf, R. 2017. Fashion-mnist: a novel image dataset for benchmarking machine learning algorithms. arXiv preprint arXiv:1708.07747.
  • Xu et al. (2024) Xu, J.; Chen, Z.; Li, J.; Yang, S.; Wang, W.; Hu, X.; and Ngai, E. C. H. 2024. FourierKAN-GCF: Fourier Kolmogorov-Arnold Network – An Effective and Efficient Feature Transformation for Graph Collaborative Filtering. arXiv:2406.01034.
  • Zhang and Zhang (2024) Zhang, F.; and Zhang, X. 2024. GraphKAN: Enhancing Feature Extraction with Graph Kolmogorov Arnold Networks. arXiv:2406.13597.
  • Zheng et al. (2022) Zheng, P.; Cai, Z.; Zeng, H.; and Huang, J. 2022. Keyword spotting in the homomorphic encrypted domain using deep complex-valued CNN. In Proceedings of the 30th ACM international conference on multimedia, 1474–1483.
  • Zimerman et al. (2024) Zimerman, I.; Baruch, M.; Drucker, N.; Ezov, G.; Soceanu, O.; and Wolf, L. 2024. Converting Transformers to Polynomial Form for Secure Inference Over Homomorphic Encryption. In Forty-first International Conference on Machine Learning.