Update README, Dockerfile, and add docker-compose
jimouris committed Nov 2, 2023
1 parent 762d3eb commit 15bdd94
Showing 3 changed files with 267 additions and 16 deletions.
10 changes: 8 additions & 2 deletions Dockerfile
@@ -27,8 +27,14 @@ cp bin/release/pjc-client exec && \
cp bin/release/pjc-server exec && \
cp bin/release/datagen exec && \
cp bin/release/private-id-multi-key-server exec && \
cp bin/release/private-id-multi-key-client exec

cp bin/release/private-id-multi-key-client exec && \
cp bin/release/dpmc-company-server exec && \
cp bin/release/dpmc-helper exec && \
cp bin/release/dpmc-partner-server exec && \
cp bin/release/dspmc-company-server exec && \
cp bin/release/dspmc-helper-server exec && \
cp bin/release/dspmc-partner-server exec && \
cp bin/release/dspmc-shuffler exec

# thin container with binaries
# base image is taken from here https://hub.docker.com/_/debian/
42 changes: 28 additions & 14 deletions README.md
@@ -1,18 +1,29 @@
# Private-ID

Private-ID is a collection of algorithms to match records between two parties, while preserving the privacy of these records. We present two algorithms to do this---one of which does an outer join between parties and another does a inner join and then generates additive shares that can then be input to a Multi Party Compute system like [CrypTen](https://github.com/facebookresearch/CrypTen). Please refer to our [paper](https://eprint.iacr.org/2020/599.pdf) for more details. The MultiKey Private-ID [paper](https://eprint.iacr.org/2021/770.pdf) and the Delegated Private-ID [paper](https://eprint.iacr.org/2023/012.pdf) extend Private-ID.
Private-ID is a collection of algorithms to match records between two or more parties, while preserving the privacy of these records. We present multiple algorithms to do this: one does an outer join between parties, and others do an inner or left join and then generate additive shares that can then be input to a Multi Party Compute system like [CrypTen](https://github.com/facebookresearch/CrypTen). Please refer to our [paper](https://eprint.iacr.org/2020/599.pdf) for more details. The MultiKey Private-ID [paper](https://eprint.iacr.org/2021/770.pdf) and the Delegated Private-ID [paper](https://eprint.iacr.org/2023/012.pdf) extend Private-ID.

## Build

Private-ID is implemented in Rust to take advantage of the languages security features and to leverage the encryption libraries that we depend on. It should compile with the nightly Rust toolchain.
Private-ID is implemented in Rust to take advantage of the language's security features and to leverage the encryption libraries that we depend on. It should compile with the nightly Rust toolchain.

The following should build and run the unit tests for the building blocks used by the protocols:

- `cargo build --release`, `cargo test`
```bash
cargo build --release
cargo test --release
```

Each protocol involves two (or more) parties and they have to be run in their own shell environment. We call one party Company and another party Partner. Some protocols also involve additional parties such as the Helper and the Shuffler.

Run the script at `etc/example/generate_cert.sh` to generate the `dummy_certs` directory if you want to test a protocol with TLS locally.

Each protocol involves two parties and they have to be run in its own shell environment. We call one party Company and another party Partner.
### Build & Run With Docker Compose
The following commands run each party in a different container:
* Private-ID: `docker compose --profile private-id up`
* Delegated Private Matching for Compute (DPMC): `docker compose --profile dpmc up`
* Delegated Private Matching for Compute with Secure Shuffling (DSPMC): `docker compose --profile dspmc up`

Run the script at etc/example/generate_cert.sh to generate dummy_certs directroy if you want to test protocol with tls on local.
By default, this will create datasets of 10 items each. To run with bigger datasets, set the `ENV_VARIABLE_FOR_SIZE` environment variable. For example, `ENV_VARIABLE_FOR_SIZE=100 docker compose --profile dpmc up` will run DPMC with datasets of 100 items each.
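
As an illustration of how the size setting flows through the compose file, here is a minimal Rust sketch that mirrors the `${ENV_VARIABLE_FOR_SIZE:-10}` default and builds the two input paths that `docker-compose.yaml` mounts for Company and Partner. The function names are hypothetical and are not part of the Private-ID code base; only the variable name, the default of 10, and the file name pattern come from the compose file.

```rust
use std::env;

/// Mirrors the shell-style `${ENV_VARIABLE_FOR_SIZE:-10}` default:
/// fall back to "10" when the variable is unset or empty.
/// (Hypothetical helper, for illustration only.)
pub fn dataset_size(raw: Option<String>) -> String {
    raw.filter(|s| !s.is_empty())
        .unwrap_or_else(|| "10".to_string())
}

/// Host paths of the generated Company (a) and Partner (b) inputs,
/// following the file name pattern used in docker-compose.yaml.
pub fn input_paths(size: &str) -> (String, String) {
    (
        format!("./common/datagen/input_a_size_{}_cols_1.csv", size),
        format!("./common/datagen/input_b_size_{}_cols_1.csv", size),
    )
}

/// Reads the variable from the process environment and resolves both paths.
pub fn resolve_from_env() -> (String, String) {
    let size = dataset_size(env::var("ENV_VARIABLE_FOR_SIZE").ok());
    input_paths(&size)
}
```

Because the size is part of the file name, changing the variable makes each service mount a different generated file rather than overwriting the previous dataset.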

## Private-ID

@@ -60,7 +71,7 @@ env RUST_LOG=info cargo run --release --bin private-id-multi-key-client -- \

## PS3I

This protocol does an inner join based on email addresses as keys and then generates additive share of a feature associated with that email address. The shares are generated in the designated output files as 64 bit numbers
This protocol does an inner join based on email addresses as keys and then generates additive shares of a feature associated with that email address. The shares are written to the designated output files as 64-bit numbers.
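
To make the idea of additive shares concrete, the following Rust sketch splits a 64-bit feature value into two shares and recombines them. It assumes the shares add up modulo 2^64 (wrapping arithmetic); the README only states that shares are 64-bit numbers, so the modulus, the function names, and the caller-supplied mask are assumptions for illustration, not the protocol's actual implementation.

```rust
/// Splits `value` into two additive shares modulo 2^64.
/// `mask` stands in for a uniformly random 64-bit number; in a real
/// protocol it must come from a cryptographically secure source.
pub fn additive_share(value: u64, mask: u64) -> (u64, u64) {
    // share_a + share_b == value (mod 2^64)
    (mask, value.wrapping_sub(mask))
}

/// Recombines two additive shares into the original value.
pub fn additive_reconstruct(share_a: u64, share_b: u64) -> u64 {
    share_a.wrapping_add(share_b)
}
```

Each share on its own reveals nothing about the value when the mask is uniformly random; only the sum of the two parties' shares recovers the feature.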

To run Company:
```bash
@@ -82,7 +93,7 @@ env RUST_LOG=info cargo run --release --bin cross-psi-client -- \

## PS3I XOR

This protocol does an inner join based on email addresses as keys and then generates XOR share of a feature associated with that email address. The shares are generated in the designated output files as 64 bit numbers
This protocol does an inner join based on email addresses as keys and then generates XOR shares of a feature associated with that email address. The shares are written to the designated output files as 64-bit numbers.

To run Company:
```bash
@@ -104,7 +115,7 @@ env RUST_LOG=info cargo run --release --bin cross-psi-xor-client -- \

The `--output` option provides the prefix for the output files that contain the shares. In this case, Company generates two files: `output_company_company_feature.csv` and `output_company_partner_feature.csv`. They contain Company's share of company and partner features respectively. Similarly, Partner generates two files: `output_partner_company_feature.csv` and `output_partner_partner_feature.csv`. They contain Partner's share of company and partner features respectively.

Thus `output_company_company_feature.csv` and `output_partner_company_feature.csv` are XOR shares of Company's features. Similarly `output_partner_company_feature.csv` and `output_partner_partner_feature.csv` are XOR shares of Partner's features.
Thus `output_company_company_feature.csv` and `output_partner_company_feature.csv` are XOR shares of Company's features. Similarly, `output_company_partner_feature.csv` and `output_partner_partner_feature.csv` are XOR shares of Partner's features.
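
A short Rust sketch of what "XOR shares" means for the two files that hold shares of the same feature: XOR-ing the corresponding 64-bit entries from each party recovers the original values. The function names are hypothetical, and the sketch assumes both files have already been parsed into equally ordered vectors of `u64`; the actual file layout is not specified here.

```rust
/// Splits `value` into two XOR shares using a 64-bit mask.
/// `mask` stands in for a uniformly random number from a secure source.
pub fn xor_share(value: u64, mask: u64) -> (u64, u64) {
    (mask, value ^ mask)
}

/// Recombines two columns of XOR shares, entry by entry.
/// Returns `None` if the columns differ in length.
pub fn xor_reconstruct(company: &[u64], partner: &[u64]) -> Option<Vec<u64>> {
    if company.len() != partner.len() {
        return None;
    }
    Some(company.iter().zip(partner).map(|(a, b)| a ^ b).collect())
}
```

Unlike additive shares, XOR shares need no modular arithmetic, which suits downstream systems that operate on bits rather than integers.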

### Private Join and Compute
This is an implementation of Google's [Private Join and Compute](https://github.com/google/private-join-and-compute) protocol that does an inner join based on email addresses and computes a sum of the corresponding feature for the Partner.
@@ -153,7 +164,7 @@ The output will be ElGamal encrypted Universal IDs assigned to each entry in the

## Delegated Private Matching for Compute (DPMC)

We extend the Multi-key Private-ID protocol to multiple partners. Please refer to our [paper](TODO) for more details.
We extend the Multi-key Private-ID protocol to multiple partners. Please refer to our [paper](https://eprint.iacr.org/2023/012) for more details.

To run Company:
```bash
@@ -310,17 +321,20 @@ To cite Private-ID in academic papers, please use the following BibTeX entries.

## Delegated Private-ID
```
@Misc{EPRINT:MMTSBC23,
@Article{PoPETS:MMTSBC23,
author = "Dimitris Mouris and
Daniel Masny and
Ni Trieu and
Shubho Sengupta and
Prasad Buddhavarapu and
Benjamin M Case",
title = "Delegated Private Matching for Compute",
year = 2023,
howpublished = "Cryptology ePrint Archive, Report 2023/012",
note = "\url{https://eprint.iacr.org/2023/012}",
title = "{Delegated Private Matching for Compute}",
volume = 2024,
month = Jul,
year = 2024,
journal = "{Proceedings on Privacy Enhancing Technologies}",
number = 2,
pages = "1--24",
}
```

231 changes: 231 additions & 0 deletions docker-compose.yaml
@@ -0,0 +1,231 @@
version: '3.0'

services:

  # Datagen

  datagen:
    container_name: 'datagen'
    profiles: ['private-id', 'dpmc', 'dspmc']
    build:
      context: .
    entrypoint:
      - '/opt/private-id/bin/datagen'
    command: '--size ${ENV_VARIABLE_FOR_SIZE:-10} --cols 1 --features -d /etc/example/'
    volumes:
      - './common/datagen:/etc/example/'

  # Private-ID

  private-id-server:
    container_name: 'private-id-server'
    profiles: ['private-id']
    depends_on:
      datagen:
        condition: service_completed_successfully
    build:
      context: .
    entrypoint: '/opt/private-id/bin/private-id-server'
    command: >-
      --host 0.0.0.0:10009
      --input /etc/example/private-id/company.csv
      --stdout
      --no-tls
    environment:
      - 'RUST_LOG=info'
    volumes:
      - './common/datagen/input_a_size_${ENV_VARIABLE_FOR_SIZE:-10}_cols_1.csv:/etc/example/private-id/company.csv'

  private-id-client:
    container_name: 'private-id-client'
    profiles: ['private-id']
    depends_on:
      datagen:
        condition: service_completed_successfully
      private-id-server:
        condition: service_started
    build:
      context: .
    entrypoint: '/opt/private-id/bin/private-id-client'
    command: >-
      --company company-host:10009
      --input /etc/example/private-id/partner.csv
      --stdout
      --no-tls
    environment:
      - 'RUST_LOG=info'
    links:
      - 'private-id-server:company-host'
    volumes:
      - './common/datagen/input_b_size_${ENV_VARIABLE_FOR_SIZE:-10}_cols_1.csv:/etc/example/private-id/partner.csv'

  # DPMC

  dpmc-company-server:
    container_name: 'dpmc-company-server'
    profiles: ['dpmc']
    depends_on:
      datagen:
        condition: service_completed_successfully
    build:
      context: .
    entrypoint: '/opt/private-id/bin/dpmc-company-server'
    command: >-
      --host 0.0.0.0:10010
      --input /etc/example/dpmc/company.csv
      --stdout
      --output-shares-path /etc/example/dpmc/output_company
      --no-tls
    environment:
      - 'RUST_LOG=info'
    volumes:
      - './common/datagen/input_a_size_${ENV_VARIABLE_FOR_SIZE:-10}_cols_1.csv:/etc/example/dpmc/company.csv'

  dpmc-partner-server:
    container_name: 'dpmc-partner-server'
    profiles: ['dpmc']
    depends_on:
      datagen:
        condition: service_completed_successfully
      dpmc-company-server:
        condition: service_started
    build:
      context: .
    entrypoint: '/opt/private-id/bin/dpmc-partner-server'
    command: >-
      --host 0.0.0.0:10020
      --company company-host:10010
      --input-keys /etc/example/dpmc/partner_1.csv
      --input-features /etc/example/dpmc/partner_1_features.csv
      --no-tls
    environment:
      - 'RUST_LOG=info'
    links:
      - 'dpmc-company-server:company-host'
    volumes:
      - './common/datagen/input_b_size_${ENV_VARIABLE_FOR_SIZE:-10}_cols_1.csv:/etc/example/dpmc/partner_1.csv'
      - './common/datagen/input_b_size_${ENV_VARIABLE_FOR_SIZE:-10}_cols_1_features.csv:/etc/example/dpmc/partner_1_features.csv'

  dpmc-helper:
    container_name: 'dpmc-helper'
    profiles: ['dpmc']
    depends_on:
      datagen:
        condition: service_completed_successfully
      dpmc-company-server:
        condition: service_started
      dpmc-partner-server:
        condition: service_started
    build:
      context: .
    entrypoint: '/opt/private-id/bin/dpmc-helper'
    command: >-
      --company company-host:10010
      --partners partner-host:10020
      --stdout --output-shares-path /etc/example/dpmc/output_partner
      --no-tls
    environment:
      - 'RUST_LOG=info'
    links:
      - 'dpmc-company-server:company-host'
      - 'dpmc-partner-server:partner-host'
    volumes:
      - './etc/example/dpmc/:/etc/example/dpmc/'

  # DsPMC

  dspmc-helper-server:
    container_name: 'dspmc-helper-server'
    profiles: ['dspmc']
    depends_on:
      datagen:
        condition: service_completed_successfully
    build:
      context: .
    entrypoint: '/opt/private-id/bin/dspmc-helper-server'
    command: >-
      --host 0.0.0.0:10030
      --stdout
      --output-shares-path /etc/example/dspmc/output_helper
      --no-tls
    environment:
      - 'RUST_LOG=info'
    volumes:
      - './etc/example/dspmc/:/etc/example/dspmc/'

  dspmc-company-server:
    container_name: 'dspmc-company-server'
    profiles: ['dspmc']
    depends_on:
      datagen:
        condition: service_completed_successfully
      dspmc-helper-server:
        condition: service_started
    build:
      context: .
    entrypoint: '/opt/private-id/bin/dspmc-company-server'
    command: >-
      --host 0.0.0.0:10010
      --helper helper-host:10030
      --input /etc/example/dspmc/company.csv
      --stdout
      --output-shares-path /etc/example/dspmc/output_company --no-tls
    environment:
      - 'RUST_LOG=info'
    links:
      - 'dspmc-helper-server:helper-host'
    volumes:
      - './common/datagen/input_a_size_${ENV_VARIABLE_FOR_SIZE:-10}_cols_1.csv:/etc/example/dspmc/company.csv'

  dspmc-partner-server:
    container_name: 'dspmc-partner-server'
    profiles: ['dspmc']
    depends_on:
      datagen:
        condition: service_completed_successfully
      dspmc-company-server:
        condition: service_started
    build:
      context: .
    entrypoint: '/opt/private-id/bin/dspmc-partner-server'
    command: >-
      --host 0.0.0.0:10020
      --company company-host:10010
      --input-keys /etc/example/dspmc/partner_1.csv
      --input-features /etc/example/dspmc/partner_1_features.csv
      --no-tls
    environment:
      - 'RUST_LOG=info'
    links:
      - 'dspmc-company-server:company-host'
    volumes:
      - './common/datagen/input_b_size_${ENV_VARIABLE_FOR_SIZE:-10}_cols_1.csv:/etc/example/dspmc/partner_1.csv'
      - './common/datagen/input_b_size_${ENV_VARIABLE_FOR_SIZE:-10}_cols_1_features.csv:/etc/example/dspmc/partner_1_features.csv'

  dspmc-shuffler:
    container_name: 'dspmc-shuffler'
    profiles: ['dspmc']
    depends_on:
      datagen:
        condition: service_completed_successfully
      dspmc-company-server:
        condition: service_started
      dspmc-helper-server:
        condition: service_started
      dspmc-partner-server:
        condition: service_started
    build:
      context: .
    entrypoint: '/opt/private-id/bin/dspmc-shuffler'
    command: >-
      --company company-host:10010
      --helper helper-host:10030
      --partners partner-host:10020
      --stdout
      --no-tls
    environment:
      - 'RUST_LOG=info'
    links:
      - 'dspmc-helper-server:helper-host'
      - 'dspmc-company-server:company-host'
      - 'dspmc-partner-server:partner-host'
