
FedKit: Enabling Cross-Platform Federated Learning for Android and iOS

Sichang He1, Beilong Tang1, Boyan Zhang1, Jiaqi Shao1,2, Xiaomin Ouyang3, Daniel Nata Nugraha4, Bing Luo1

The work of Sichang He, Beilong Tang, and Boyan Zhang was supported by the DKU Undergraduate Studies Office through the Summer Research Scholars program. The work of Jiaqi Shao and Bing Luo was supported by the Suzhou Frontier Science and Technology Program (SYG202310). (Corresponding author: Bing Luo.)

1Duke Kunshan University, Jiangsu, China; 2The Hong Kong University of Science and Technology, China; 3University of California, Los Angeles, USA; 4Flower Labs GmbH, Winterhuder Weg 29, 22085 Hamburg, Germany
Abstract

We present FedKit, a federated learning (FL) system tailored for cross-platform FL research on Android and iOS devices. FedKit pipelines cross-platform FL development by enabling model conversion, hardware-accelerated training, and cross-platform model aggregation. Our FL workflow supports flexible machine learning operations (MLOps) in production, facilitating continuous model delivery and training. We have deployed FedKit in a real-world use case for health data analysis on university campuses, demonstrating its effectiveness. FedKit is open-source at https://github.com/FedCampus/FedKit.

I Motivation and Analysis

FL is promising for training shared ML models collaboratively on end devices while preserving data privacy [1]. Yet, most FL research relies on simulations on desktop computers, which may overlook constraints in realistic FL applications. To enhance FL algorithm design with real-world data, we aim to build a practical mobile FL system harnessing real user data.

However, existing accessible mobile FL systems exhibit important limitations, as outlined in Table I. Specifically, we identify three key challenges: (C0) To collaboratively train the same models across our users’ diverse smartphones, we require cross-platform on-device training and model aggregation. (C1) To customize FL algorithms and update models in production, we need maximal control over the FL process on users’ devices from our end. (C2) The unfamiliar mobile development environment hinders data scientists when developing models for mobile FL.

II Proposed Solution

FedKit is designed to enable practical cross-platform FL research on Android and iOS. In Sec. II-A, we present an FL pipeline that converts Python-based models, and trains and aggregates them across platforms, addressing (C0). For (C1), our FL workflow enables flexible MLOps from the backend in production, as we detail in Sec. II-B. The entire procedure tackles (C2) by ensuring a user-friendly development environment in Python. Overall, FedKit facilitates FL across Android and iOS client devices, coordinated by a single Backend server. Each client trains a local model with private data, and the Backend performs cross-platform aggregation of these local models to update the global model.

II-A Cross-Platform FL Model Pipeline

To enable cross-platform FL, especially cross-platform aggregation, we propose a pipeline comprising model conversion and unified training APIs, as shown in Fig. 1.

TABLE I: Functionality Comparison among On-Smartphone FL Systems. The table compares [2], [3], [4], [5], and FedKit across: Android-only support, iOS-only support, cross-platform aggregation, training acceleration, MLOps, and an open-source backend.
Figure 1: FedKit Model Pipeline for Cross-Platform FL.

II-A1 Model Conversion

To enable model creation in Python, we begin by converting models into formats compatible with Android (TensorFlow Lite or TFLite) and iOS (Core ML). Core ML defines a fixed model structure and provides the official converter CoreMLTools. For TFLite, we standardized a model format, and then developed a compliant TensorFlow converter. This standardized format includes four essential FL methods (train, infer, parameters, and restore).
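For illustration, below is a minimal Python sketch of such a standardized model and its conversion, following TensorFlow’s on-device training APIs; the layer sizes, index-based tensor names (“a0”, “a1”), and converter flags are illustrative assumptions, not FedKit’s exact format.

```python
import tensorflow as tf

class FLModel(tf.Module):
    """Sketch of a TFLite-convertible model exposing the four FL methods."""

    def __init__(self):
        self.dense = tf.keras.layers.Dense(10)
        self.dense.build((None, 784))  # MNIST-sized input, as in our demo
        self.loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
        self.optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)

    @tf.function(input_signature=[
        tf.TensorSpec([None, 784], tf.float32),
        tf.TensorSpec([None], tf.int64),
    ])
    def train(self, x, y):
        with tf.GradientTape() as tape:
            loss = self.loss_fn(y, self.dense(x))
        grads = tape.gradient(loss, self.dense.trainable_variables)
        self.optimizer.apply_gradients(zip(grads, self.dense.trainable_variables))
        return {"loss": loss}

    @tf.function(input_signature=[tf.TensorSpec([None, 784], tf.float32)])
    def infer(self, x):
        return {"logits": self.dense(x)}

    @tf.function(input_signature=[])
    def parameters(self):
        # Index-based names so the interpreter can address tensors (Sec. II-A3).
        return {"a0": self.dense.kernel, "a1": self.dense.bias}

    @tf.function(input_signature=[
        tf.TensorSpec([784, 10], tf.float32),
        tf.TensorSpec([10], tf.float32),
    ])
    def restore(self, a0, a1):
        self.dense.kernel.assign(a0)
        self.dense.bias.assign(a1)
        return {"a0": self.dense.kernel}  # TFLite signatures must return outputs

model = FLModel()
tf.saved_model.save(model, "saved_model", signatures={
    "train": model.train.get_concrete_function(),
    "infer": model.infer.get_concrete_function(),
    "parameters": model.parameters.get_concrete_function(),
    "restore": model.restore.get_concrete_function(),
})
converter = tf.lite.TFLiteConverter.from_saved_model("saved_model")
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS, tf.lite.OpsSet.SELECT_TF_OPS]
converter.experimental_enable_resource_variables = True  # mutable weights
tflite_model = converter.convert()
```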

II-A2 Unified Training APIs

FedKit provides TFLite Trainer and Core ML Trainer to train the converted models on Android and iOS devices, utilizing GPU and NPU acceleration. Moreover, both trainers expose unified APIs for retrieving and setting model parameters, model fitting, and evaluation. On Android, these APIs invoke the TFLite interpreter to call our standardized methods defined in Model Conversion. However, on iOS, our experimentation revealed that Core ML forbids directly setting parameters, which could impede FL. To overcome this constraint, we apply a mostly undocumented method that modifies the underlying ProtoBuf representations of Core ML models. Specifically, we employ Swift code generated from the relevant ProtoBuf definition files, and navigate nested model definitions to access parameters on iOS. Consequently, our unified training APIs exhibit comparable functionality on both iOS and Android platforms.
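FedKit performs this manipulation in Swift on-device; for illustration, the same ProtoBuf traversal can be sketched in Python with coremltools, which exposes the identical spec. Restricting attention to inner-product (fully connected) layers is a simplifying assumption.

```python
import numpy as np
import coremltools as ct

def get_parameters(mlmodel: ct.models.MLModel) -> list[np.ndarray]:
    # Navigate the nested ProtoBuf model definition to read each
    # inner-product layer's weights and bias.
    spec = mlmodel.get_spec()
    params = []
    for layer in spec.neuralNetwork.layers:
        if layer.WhichOneof("layer") == "innerProduct":
            ip = layer.innerProduct
            weights = np.array(ip.weights.floatValue, dtype=np.float32)
            params.append(weights.reshape(ip.outputChannels, ip.inputChannels))
            if ip.hasBias:
                params.append(np.array(ip.bias.floatValue, dtype=np.float32))
    return params

def set_parameters(mlmodel: ct.models.MLModel,
                   params: list[np.ndarray]) -> ct.models.MLModel:
    # Overwrite weights inside the spec (Core ML's public API forbids
    # setting parameters directly), then rebuild the model from the spec.
    spec = mlmodel.get_spec()
    it = iter(params)
    for layer in spec.neuralNetwork.layers:
        if layer.WhichOneof("layer") == "innerProduct":
            ip = layer.innerProduct
            ip.weights.floatValue[:] = next(it).flatten().tolist()
            if ip.hasBias:
                ip.bias.floatValue[:] = next(it).tolist()
    return ct.models.MLModel(spec)
```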

II-A3 Cross-Platform Aggregation

Aggregation necessitates uniform parameter representations; the primary challenges lie in retrieving and setting parameters for Core ML and TFLite. 1) Core ML permits only specific layers to be updatable and only provides their parameters post-training. To obtain the other layers’ parameters, we implemented a solution using ProtoBuf manipulation, recording layer information during Model Conversion and utilizing it during training. 2) The TFLite interpreter only accepts inputs/outputs as maps from names to tensors. Therefore, during Model Conversion, we assign index-based names to each parameter layer and dynamically generate the concrete methods that accept these arguments; during training, we call these methods with the index-based names to access parameters. Finally, these unified parameters enable seamless cross-platform aggregation.
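To make the index-based naming concrete, the sketch below exercises the parameters and restore signatures with Python’s TFLite interpreter standing in for the on-device one; the file name and the unweighted averaging of two identical updates are placeholder assumptions.

```python
import numpy as np
import tensorflow as tf

# Load a converted model, e.g., the output of the Sec. II-A1 sketch.
interpreter = tf.lite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()

# `parameters` maps index-based names ("a0", "a1", ...) to tensors.
params = interpreter.get_signature_runner("parameters")()
weights = [params[f"a{i}"] for i in range(len(params))]

# On the Backend, updates from many clients are averaged; two copies of
# one update stand in for two clients here (FedAvg-style mean).
client_updates = [weights, weights]
aggregated = [np.mean(layers, axis=0) for layers in zip(*client_updates)]

# `restore` accepts the aggregated tensors under the same names.
restore = interpreter.get_signature_runner("restore")
restore(**{f"a{i}": w for i, w in enumerate(aggregated)})
```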

II-B Flexible MLOps in Production

In production, FL development faces challenges from the lack of direct control over end devices. FedKit empowers researchers to deploy models and algorithms continuously (MLOps). Leveraging our complete control of the self-hosted Backend, FedKit’s three-step FL workflow facilitates continuous delivery and training, as illustrated in Fig. 2.

Figure 2: FedKit FL Workflow.

II-B1 Continuous Cross-Platform Model Delivery

Traditionally, models for on-device training are embedded into client apps. However, this approach couples model delivery with app updates, complicating both app store submission and user adoption of new models.

To circumvent this complexity, FedKit enables continuous model delivery without app updates by decoupling models from clients through Model Request: researchers deploy a new model simply by uploading it to the Backend. Specifically, clients request from the Backend a model (TFLite/Core ML) aligned with their platform (Android/iOS) and training data type. The Backend selects the appropriate model M from the database and responds with detailed model information. Consequently, Model Request delivers the TFLite or Core ML model to clients for FL training.
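A hypothetical client-side Model Request might look as follows; the endpoint path, JSON fields, and response format are illustrative assumptions, not FedKit’s actual REST API.

```python
import requests

BACKEND = "http://localhost:8000"  # self-hosted FedKit Backend

# Request a model matching this client's platform and data type.
response = requests.post(
    f"{BACKEND}/models/request",
    json={"platform": "ios", "data_type": "sleep_duration"},
)
model_info = response.json()
# e.g., {"id": 7, "name": "sleep_mlp", "file": "/static/sleep_mlp.mlmodel"}

# Download the delivered Core ML (or TFLite) model for FL training.
model_file = requests.get(f"{BACKEND}{model_info['file']}").content
```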

II-B2 Customizable Continuous FL Training

FedKit manages continuous FL training by allowing multiple parallel FL training sessions through FL Server Setup. When clients request an FL Server to train their chosen model M, the Backend either reuses a suitable existing FL Server S_F or spawns a new one. Each FL Server operates as an independent Python subprocess of the Django Backend, occupying its own port that clients connect to for FL Training. This dynamic approach ensures that newly delivered models can be trained immediately on new FL Servers without affecting ongoing ones. Furthermore, these FL Servers employ the Flower FL Framework [4] for scheduling training and evaluation, which lets them leverage Flower’s flexibility and allows FL algorithm customization in Python.
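A minimal sketch of such an FL Server, assuming a hypothetical fl_server.py entry point with a command-line port flag:

```python
# fl_server.py: one independent Flower server per FL training session.
import argparse

import flwr as fl

parser = argparse.ArgumentParser()
parser.add_argument("--port", type=int, required=True)
args = parser.parse_args()

fl.server.start_server(
    server_address=f"0.0.0.0:{args.port}",  # clients connect to this port
    config=fl.server.ServerConfig(num_rounds=10),
    strategy=fl.server.strategy.FedAvg(min_available_clients=2),
)
```

The Backend can then spawn one such server per session with, e.g., subprocess.Popen(["python", "fl_server.py", "--port", str(port)]), and reuse it for subsequent clients requesting the same model.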

III Live Demonstrations

We demonstrate FedKit’s effectiveness in two settings. A demo video is available at https://www.youtube.com/watch?v=TONTBkp_l6M.

III-1 Model Deployment on Demo Android/iOS App

We demonstrate FL among devices running a Flutter client app and a laptop running a FedKit Backend. First, we demonstrate our seamless FL model pipeline: we convert a TensorFlow MNIST model and conduct normal FL across an Android and an iOS device, despite their heterogeneity. Note that TFLite, Core ML, and the aggregation strategy determine the model performance. Second, to showcase MLOps, we modify the model and deploy its new version. As outlined in Table II, our telemetry shows that the iOS device is over 5× faster in local training despite having half the RAM, illustrating how FedKit will provide real-world statistics to enhance FL algorithm design.

Device     | System on a Chip        | Accel.  | RAM          | Time
Nova 9 Pro | Snapdragon 778G         | OpenCL  | 8 GB LPDDR5  | 3.583 s
iPhone 13  | A15 Bionic (4-core GPU) | Core ML | 4 GB LPDDR4X | 0.656 s
TABLE II: Configurations of Devices and Average Local Training Time per Round (Two Local Epochs) in a Previous Demo Run.

III-2 FedCampus

To harness real users’ health data on university campuses, we developed the FedCampus Android/iOS app, which leverages data from participants’ smartwatches to perform FL on a sleep-duration prediction model. As illustrated in Fig. 3, we showcase our self-hosted Backend and display the real-time logs and losses on a connected laptop. Our cross-platform continuous training substantially reduced the model’s training loss, demonstrating FedKit’s effectiveness in real-world scenarios.

Figure 3: FedCampus Experiment Setup on Duke Kunshan University Campus.

References

  • [1] A. Farcas et al., “Demo abstract: A hardware prototype targeting federated learning with user mobility and device heterogeneity,” in IoTDI, 2023.
  • [2] C. He et al., “FedML: A research library and benchmark for federated machine learning,” 2020.
  • [3] D. Madrigal et al., “Project florida: Federated learning made easy,” Microsoft, Tech. Rep., Jul. 2023.
  • [4] A. Mathur et al., “On-device federated learning with flower,” 2021.
  • [5] A. J. Hall et al., “Syft 0.5: A platform for universally deployable structured transparency,” 2021.