Split Learning HE

Source code for the paper "Love or Hate? Share or Split? Privacy-Preserving Training Using Split Learning and Homomorphic Encryption", accepted at the 20th Annual International Conference on Privacy, Security & Trust (PST'23).

Split learning involves two parties, a client and a server, that collaboratively train a model. The client keeps the data on its side, runs its part of the model to produce activation maps, and sends those activation maps to the server. The server then continues the training process, so the client never has to send its raw data to the server. However, the activation maps can still leak information about the client's data. In this project, we train a split learning 1D CNN model on homomorphically encrypted activation maps to address this privacy leakage.
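
To make the message flow concrete, here is a minimal plaintext sketch of one forward pass in a U-shaped split. It is not the repository's model: the layer sizes, the helper names (`conv1d`, `linear`, `forward_pass`) and the assumption that the server holds a single linear layer while the client keeps the softmax and the labels are illustrative only. Encryption is deliberately left out; in the HE variant the client would encrypt the activation map before sending it and decrypt the server's output.

```python
import math
from typing import List, Tuple

Vector = List[float]


def conv1d(signal: Vector, kernel: Vector) -> Vector:
    """Client side: valid 1D cross-correlation followed by ReLU."""
    k = len(kernel)
    out = []
    for i in range(len(signal) - k + 1):
        s = sum(signal[i + j] * kernel[j] for j in range(k))
        out.append(max(0.0, s))
    return out


def linear(x: Vector, weights: List[Vector], bias: Vector) -> Vector:
    """Server side (assumed): one fully connected layer, one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def softmax(logits: Vector) -> Vector:
    """Client side: numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def forward_pass(signal: Vector, kernel: Vector, weights: List[Vector],
                 bias: Vector, label: int) -> Tuple[Vector, Vector, float]:
    # Client: the raw signal stays local; only the activation map is sent.
    activation_map = conv1d(signal, kernel)
    # Server: sees the activation map (a ciphertext in the HE variant), never the signal.
    logits = linear(activation_map, weights, bias)
    # Client: the label also stays local, so the loss is computed on the client.
    probs = softmax(logits)
    loss = -math.log(probs[label])
    return activation_map, probs, loss


activation_map, probs, loss = forward_pass(
    signal=[1.0, 2.0, 3.0, 4.0],
    kernel=[0.5, 0.5],
    weights=[[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]],
    bias=[0.0, 0.0],
    label=1,
)
```

The point of the sketch is the boundary: `activation_map` is the only value that crosses from client to server, which is why it is the value that needs protecting.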

Requirements

The code mainly depends on two libraries:
torch==1.10.0+cu102
tenseal==0.3.10
The full list of requirements is in requirements.txt.
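
Before running anything, it can help to confirm that the installed versions match the pins above. The following is a small sketch that uses only the standard library (`importlib.metadata`, so it assumes Python 3.8 or newer); the `PINNED` dictionary and the `check_versions` helper are hypothetical names, not part of the repository.

```python
from importlib import metadata
from typing import Dict

# Pins copied from the list above.
PINNED = {"torch": "1.10.0+cu102", "tenseal": "0.3.10"}


def check_versions(pinned: Dict[str, str]) -> Dict[str, str]:
    """Compare installed package versions against the pinned ones."""
    report = {}
    for name, wanted in pinned.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing (want {})".format(wanted)
            continue
        if found == wanted:
            report[name] = "ok"
        else:
            report[name] = "found {}, want {}".format(found, wanted)
    return report


for package, status in check_versions(PINNED).items():
    print("{}: {}".format(package, status))
```

A mismatch is reported rather than raised, since a different CUDA build of torch may still work; that is a judgment call left to the reader.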

Repository Structure

  • data/
    • train_ecg.hdf5 - the processed training split from the MIT-DB dataset
    • test_ecg.hdf5 - the processed testing split from the MIT-DB dataset
    • ptbxl_processing.ipynb - code needed to process the PTB-XL dataset. Running the code will output train_ptbxl.hdf5 and test_ptbxl.hdf5
  • local_plaintext/
    • train.ipynb - code to train the 1D CNN locally on the MIT-DB dataset
    • visual_invertibility.ipynb - code to demonstrate the privacy leakage of the activation maps produced by the convolutional layers on the MIT-DB dataset
    • train_ptbxl.ipynb - code to train the 1D CNN locally on the PTB-XL dataset
    • visual_invertibility_ptbxl.ipynb - similar to visual_invertibility.ipynb but for the PTB-XL dataset
  • local_plaintext_big/
    • train.ipynb - code to train the 1D CNN locally on the MIT-DB dataset with bigger activation maps
    • visual_invertibility.ipynb - code to demonstrate the privacy leakage of these bigger activation maps
  • u_shaped_split_he/
    • client.py and server.py - client-side and server-side code to train with the split learning protocol on homomorphically encrypted activation maps, for the MIT-DB dataset
    • client_ptbxl.py and server_ptbxl.py - the same, for the PTB-XL dataset
  • u_shaped_split_he_big/
    • client.py and server.py - code to train the split 1D CNN using HE with bigger activation maps, for the MIT-DB dataset only
  • u_shaped_split_plaintext/
    • client.py and server.py - client-side and server-side code to train with the split learning protocol on plaintext activation maps, for the MIT-DB dataset
    • client_ptbxl.py and server_ptbxl.py - the same, for the PTB-XL dataset
  • u_shaped_split_plaintext_big/
    • client.py and server.py - client-side and server-side code to train with the split learning protocol on bigger plaintext activation maps, for the MIT-DB dataset only
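
The visual_invertibility notebooks listed above demonstrate the leakage visually. As a rough numeric illustration of the same idea (not the notebooks' method), the sketch below passes a synthetic sine wave through a hypothetical smoothing kernel and measures how closely the resulting activation map tracks the input. The signal, the kernel, and the `smooth` and `pearson` helpers are all assumptions made for the example.

```python
import math
from typing import List


def smooth(signal: List[float], kernel: List[float]) -> List[float]:
    """Valid 1D cross-correlation without a nonlinearity."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Synthetic stand-in for a signal: 64 samples of a sine wave.
signal = [math.sin(2 * math.pi * i / 16) for i in range(64)]
activation = smooth(signal, [0.25, 0.5, 0.25])

# Align the input with the centre tap of the 3-sample kernel before comparing.
leak = pearson(signal[1:-1], activation)
print("correlation between input and activation map:", leak)
```

For this kernel the correlation is close to 1, meaning the activation map is almost a scaled copy of the input. Real convolutional layers are less transparent than this toy case, but the example shows why plaintext activation maps cannot be assumed private.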

Running the code

Make sure you have the data files needed in the data/ directory (train_ecg.hdf5 and test_ecg.hdf5 for the MIT-DB dataset, and train_ptbxl.hdf5 and test_ptbxl.hdf5 for the PTB-XL dataset).
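
A quick way to verify this is a small check such as the sketch below. The file names come from this README; the `DATA_FILES` mapping, its `"mitdb"` and `"ptbxl"` keys, and the `missing_data_files` helper are hypothetical names introduced for the example.

```python
from pathlib import Path
from typing import List

# File names as listed in this README, grouped by dataset.
DATA_FILES = {
    "mitdb": ["train_ecg.hdf5", "test_ecg.hdf5"],
    "ptbxl": ["train_ptbxl.hdf5", "test_ptbxl.hdf5"],
}


def missing_data_files(data_dir: str, dataset: str) -> List[str]:
    """Return the expected files for `dataset` that are not present in `data_dir`."""
    return [name for name in DATA_FILES[dataset]
            if not (Path(data_dir) / name).is_file()]


for dataset in DATA_FILES:
    missing = missing_data_files("data", dataset)
    print(dataset, "ok" if not missing else "missing: {}".format(", ".join(missing)))
```

Run it from the repository root so that the relative path `data` resolves correctly.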
To run the code, cd into the relevant directory and start the server script first, then the client script. For example, to run U-shaped split learning with HE on the PTB-XL dataset:

cd u_shaped_split_he
python server_ptbxl.py

Then open a new terminal tab, cd into the same directory, and run:

python client_ptbxl.py
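
If you prefer a single command over two terminal tabs, the server-first ordering can be scripted. The sketch below is a hypothetical launcher, not part of the repository: `build_commands`, `run_pair`, and the 5-second `startup_delay` are assumptions, and the delay may need tuning depending on how long the server takes to start listening.

```python
import subprocess
import sys
import time
from typing import List, Tuple


def build_commands(dataset: str = "mitdb") -> Tuple[List[str], List[str]]:
    """Return (server_cmd, client_cmd) for the chosen dataset."""
    suffix = "_ptbxl" if dataset == "ptbxl" else ""
    return ([sys.executable, "server{}.py".format(suffix)],
            [sys.executable, "client{}.py".format(suffix)])


def run_pair(workdir: str, dataset: str = "mitdb", startup_delay: float = 5.0) -> int:
    """Start the server, wait briefly, run the client, and return the client's exit code."""
    server_cmd, client_cmd = build_commands(dataset)
    server = subprocess.Popen(server_cmd, cwd=workdir)
    try:
        # Assumed grace period so the server is listening before the client connects.
        time.sleep(startup_delay)
        return subprocess.run(client_cmd, cwd=workdir).returncode
    finally:
        try:
            # Give the server a chance to finish on its own.
            server.wait(timeout=30)
        except subprocess.TimeoutExpired:
            server.terminate()
            server.wait()
```

For the example above, the call would be `run_pair("u_shaped_split_he", dataset="ptbxl")`. The `"ptbxl"` option only applies to the directories that contain `client_ptbxl.py` and `server_ptbxl.py`.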