A curated list of advancements in Vertical Federated Learning, frameworks and libraries.

ngc436/awesome-vertical-federated-learning

awesome-vertical-federated-learning

A curated list of advancements in Vertical Federated Learning (VFL), frameworks and libraries.

Publications in Top-tier or Otherwise Influential Venues

Surveys on VFL

| Type | Title | Year | Conference / Journal | Description |
|---|---|---|---|---|
| VFL | Vertical Federated Learning: Concepts, Advances and Challenges | 2023 | arXiv | |
| General | Towards Open Federated Learning Platforms: Survey and Vision from Technical and Legal Perspectives | 2024 | arXiv | |

VFL benchmarks (benchmarks with VFL tasks)

| Bench Type | Title | Year | Conference / Journal | Code | Algorithms |
|---|---|---|---|---|---|
| VFL | Stalactite: Toolbox for Fast Prototyping of Vertical Federated Learning Systems | 2024 | RecSys | Code | --- |
| VFL | VertiBench: Advancing Feature Distribution Diversity in Vertical Federated Learning Benchmarks | 2024 | ICLR | Code, Website | GAL, C-VFL, SecureBoost, Pivot, FedTree, FedOnce |
| VFL | VFLAIR: A Research Library and Benchmark for Vertical Federated Learning | 2024 | ICLR | Code | --- |
| VFL | FedAds: A Benchmark for Privacy-Preserving CVR Estimation with Vertical Federated Learning | 2023 | SIGIR | Code | --- |
| General | The OARF Benchmark Suite: Characterization and Implications for Federated Learning Systems | 2022 | ACM Transactions on Intelligent Systems and Technology | Code | --- |
| General | FedML: A Research Library and Benchmark for Federated Machine Learning | 2020 | arXiv | Code | --- |

VFL algorithms

| Algorithm | Model | Category | Title | Code | Year | Conference / Journal |
|---|---|---|---|---|---|---|
| AL | Any | Ensemble-based | Assisted learning: A framework for multiorganization learning | - | 2020 | NeurIPS |
| GAL | Any | Ensemble-based | GAL: Gradient assisted learning for decentralized multi-organization collaborations | Code | 2022 | NeurIPS |
| SplitNN | NN | Split-based | Split learning for health: Distributed deep learning without sharing raw patient data | - | 2018 | arXiv |
| C-VFL | NN | Split-based | Compressed-VFL: Communication-efficient learning with vertically partitioned data | Code | 2022 | ICML |
| BlindFL | NN | Split-based | Vertical federated machine learning without peeking into your data | - | 2022 | SIGMOD |
| FedOnce | NN | Split-based | Practical vertical federated learning with unsupervised representation learning | Code | 2022 | IEEE Transactions on Big Data |
| SecureBoost | GBDT | Split-based | SecureBoost: A lossless federated learning framework | - | 2021 | IEEE Intelligent Systems |
| Pivot | GBDT | Split-based | Privacy preserving vertical federated learning for tree-based models | Code | 2020 | VLDB |
| FedTree | GBDT | Split-based | FedTree: A federated learning system for trees | Code | 2023 | MLSys |
| VF2Boost | GBDT | Split-based | VF2Boost: Very fast vertical federated gradient boosting for cross-enterprise learning | - | 2021 | SIGMOD |
| OpBoost | GBDT | Split-based | OpBoost: A Vertical Federated Tree Boosting Framework Based on Order-Preserving Desensitization | | 2023 | VLDB |
| Fed-Forest | RF | Split-based | Federated forest | - | 2020 | IEEE Transactions on Big Data |

VFL privacy

| Title | Year | Conference / Journal | Description |
|---|---|---|---|
| Privacy Matters: Vertical Federated Linear Contextual Bandits for Privacy Protected Recommendation | 2023 | KDD | --- |
| A Unified Solution for Privacy and Communication Efficiency in Vertical Federated Learning | 2023 | NeurIPS | --- |
| Differentially Private Vertical Federated Clustering | 2023 | VLDB | --- |

VFL metrics / feature importance estimation

| Title | Year | Conference / Journal | Description |
|---|---|---|---|
| Fair and Efficient Contribution Valuation for Vertical Federated Learning | 2024 | ICLR | Clients' contribution valuation metric: vertical federated Shapley value (VerFedSV) |
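
To make the notion of a Shapley-based contribution score concrete, here is a minimal sketch of the classical (exact) Shapley value over a small set of parties. It is not the VerFedSV algorithm from the paper above; the party names and the utility table (`toy_scores`) are made-up illustrations, and exact enumeration is only feasible for a handful of parties.

```python
from itertools import combinations
from math import factorial
from typing import Callable, Dict, FrozenSet, List


def shapley_values(
    parties: List[str],
    utility: Callable[[FrozenSet[str]], float],
) -> Dict[str, float]:
    """Exact Shapley value of each party for a given coalition utility."""
    n = len(parties)
    values: Dict[str, float] = {}
    for p in parties:
        others = [q for q in parties if q != p]
        total = 0.0
        for k in range(n):  # k = size of the coalition that p joins
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal gain from adding p to coalition s.
                total += weight * (utility(s | {p}) - utility(s))
        values[p] = total
    return values


# Hypothetical utilities, e.g. validation accuracy of a model trained
# on the features of each coalition (numbers invented for illustration).
toy_scores = {
    frozenset(): 0.5,
    frozenset({"A"}): 0.7,
    frozenset({"B"}): 0.6,
    frozenset({"A", "B"}): 0.9,
}

contributions = shapley_values(["A", "B"], lambda s: toy_scores[s])
```

By construction the scores sum to the gain of the full coalition over the empty one, which is one reason Shapley-style metrics are considered fair for splitting credit among clients.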

VFL Datasets (or datasets that are used in benchmarks)

| Type | Dataset | Modality | Link | Benchmark | # parties | # samples | # features | # classes |
|---|---|---|---|---|---|---|---|---|
| VFL-native | NUS-WIDE | Image | Link | VertiBench, VFLAIR | 5 | 269,648 | 64 / 144 / 73 / 128 / 225 | 2 |
| VFL-native | Satellite | Image | Link | VertiBench | 16 | 3,927 | 13-channel 158x158 | 4 |
| VFL-native | Vehicle | Acoustic, Seismic | Link | VertiBench | 2 | 78,823 | 50 / 50 | 3 |
| VFL-native | FedAds | Table | Link | FedAds | 2 | 11,300,000 | 16 / 7 | - |
| Centralized | covtype | Table | Link | VertiBench | - | 581,012 | 54 | 7 |
| Centralized | msd | Table | Link | VertiBench | - | 463,715 | 90 | - |
| Centralized | realsim | Table | Link | VertiBench | - | 72,309 | 20,958 | 2 |
| Centralized | gisette | Table | Link | VertiBench | - | 60,000 | 5,000 | 2 |
| Centralized | epsilon | Table | Link | VertiBench, FedAds | - | 400,000 | 2,000 | 2 |
| Centralized | letter | Table | Link | VertiBench | - | 15,000 | 16 | 26 |
| Centralized | radar | Table | Link | VertiBench | - | 325,834 | 174 | 7 |
| Centralized | MNIST | Image | Link | VertiBench, VFLAIR | - | 70,000 | 784 | 10 |
| Centralized | CIFAR10 | Image | Link | VertiBench, VFLAIR | - | 60,000 | 1,024 | 10 |
| Centralized | CIFAR100 | Image | Link | VFLAIR | - | 60,000 | 1,024 | 100 |
| Centralized | Breast Cancer | Table | Link | VFLAIR | - | 569 | 32 | 2 |
| Centralized | Pima Indians Diabetes | Table | Link | VFLAIR | - | 768 | 9 | 2 |
| Centralized | Breast histopathology images | Image | Link | FedAds | | | | |
| Centralized | Yahoo answers dataset | Text | Link | FedAds | | | | |
| Centralized | Give Me Some Credit | Tabular | Link | FedAds | | | | |
| Centralized | Avazu | Tabular | Link | FedAds | - | 45,006,432 | 23 | 2 |
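
Datasets marked "Centralized" have no native party structure, so their feature columns must be divided among simulated parties before they can be used for a VFL task. The sketch below shows one simple way to do this, assuming an even, contiguous split of columns. The function names are hypothetical, and benchmarks may use other splitting schemes (for example, splits driven by feature importance or correlation).

```python
from typing import List, Sequence


def split_features(n_features: int, n_parties: int) -> List[range]:
    """Assign contiguous, near-equal blocks of feature indices to parties."""
    if not 1 <= n_parties <= n_features:
        raise ValueError("need 1 <= n_parties <= n_features")
    base, extra = divmod(n_features, n_parties)
    blocks, start = [], 0
    for party in range(n_parties):
        # The first `extra` parties receive one additional feature.
        width = base + (1 if party < extra else 0)
        blocks.append(range(start, start + width))
        start += width
    return blocks


def vertical_partition(
    rows: Sequence[Sequence[float]], n_parties: int
) -> List[List[List[float]]]:
    """Return one local table per party; row order (sample IDs) is shared."""
    blocks = split_features(len(rows[0]), n_parties)
    return [[[row[j] for j in block] for row in rows] for block in blocks]


# covtype has 54 features; splitting it across 4 simulated parties:
covtype_widths = [len(b) for b in split_features(54, 4)]

# A tiny 2-sample, 5-feature table split across 2 parties:
toy_rows = [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]
toy_parts = vertical_partition(toy_rows, 2)
```

Every party keeps all samples but only its own columns, which is the defining property of the vertical setting. Labels would typically stay with a single party and are left out of this sketch.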

Frameworks and Libraries with VFL support

FATE

github paper

VFL-related (hetero in FATE terminology) features:

  • privacy-preserving strategies: SSHE and FedPass

Stalactite

paper

FedML

Implements a range of practical algorithms for both horizontal and vertical FL settings.

Falcon

https://github.com/nusdbsystem/falcon