The Private Computation Framework (PCF) library provides a scalable, secure, and distributed platform for running private computations at production scale. The PCF library supports running computations on AWS and can integrate with various private computation technologies. PCF is a set of backend components that run the core encryption protocols.
Reducing the cost of running secure multi-party computation (MPC) is our primary goal: the cost should be low enough to be practical for a variety of ads products.
PCF should become a scalable framework/library for private computation that can potentially support all major ads use cases, from measurement and delivery health to ranking/optimization. It also needs to work with a wide range of infrastructure setups, such as two-party, consortium, and/or partner integrations.
Building any new MPC framework/toolkit is very difficult. While PCF has to rely on cryptography experts to devise low-level improvements, we should aim for a layered development approach that allows general engineers to understand, iterate on, and, most importantly, monitor and debug the system.
PCF is provably private and secure under acceptable assumptions. Its technical designs and improvements, while often very complicated, need to be explainable to stakeholders.
PCF v1.0 has a hard dependency on EMP-Toolkit. This design makes it extremely hard to:
- improve the performance/security of the underlying MPC protocol, because it is third-party code;
- adopt new, cost-efficient MPC protocols that are not supported by EMP;
- enable a plaintext protocol for debugging and testing.
To achieve these goals, we developed PCF v2.0 with the following key features:
- Protocol-independent Frontend APIs. These APIs are exposed to application developers, i.e., general engineers writing MPC-based ads use cases. This feature allows application development without much understanding of the backend protocols.
- Support for different backend MPC protocols. We currently support the more efficient XOR-SS protocol, instead of the Garbled-Circuit based EMP-Toolkit in PCF v1.0. Meanwhile, it is possible to choose and configure the underlying MPC protocol at run-time.
- Debugging tools. One special tool is the "plaintext MPC protocol" that enables developers to differentiate issues between application level and MPC engine level.
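The three features above can be illustrated with a minimal C++ sketch. Everything named here (`IBooleanEngine`, `PlaintextEngine`, `CountingEngine`, `makeEngine`, `majority`) is hypothetical and is not PCF's actual API. The sketch only assumes that application code is written against an abstract gate interface, that the backend is picked by a run-time string, and that a plaintext backend exists for differential testing.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical backend interface; the names are illustrative, not PCF's API.
class IBooleanEngine {
 public:
  virtual ~IBooleanEngine() = default;
  virtual bool andGate(bool x, bool y) = 0;
  virtual bool xorGate(bool x, bool y) = 0;
};

// "Plaintext protocol": evaluates gates directly, with no cryptography,
// so application logic can be tested in isolation.
class PlaintextEngine : public IBooleanEngine {
 public:
  bool andGate(bool x, bool y) override { return x && y; }
  bool xorGate(bool x, bool y) override { return x != y; }
};

// Stand-in for a second backend. It returns the same results but counts
// AND gates, which are the costly gates in secret-sharing protocols.
class CountingEngine : public IBooleanEngine {
 public:
  bool andGate(bool x, bool y) override {
    ++andCount_;
    return x && y;
  }
  bool xorGate(bool x, bool y) override { return x != y; }
  std::size_t andCount() const { return andCount_; }

 private:
  std::size_t andCount_ = 0;
};

// Run-time selection: the backend is chosen by a configuration string.
std::unique_ptr<IBooleanEngine> makeEngine(const std::string& name) {
  if (name == "plaintext") return std::make_unique<PlaintextEngine>();
  if (name == "counting") return std::make_unique<CountingEngine>();
  throw std::invalid_argument("unknown engine: " + name);
}

// Application code, written only against the interface:
// majority(a, b, c) = ((a ^ b) & (a ^ c)) ^ a
bool majority(IBooleanEngine& e, bool a, bool b, bool c) {
  return e.xorGate(e.andGate(e.xorGate(a, b), e.xorGate(a, c)), a);
}

// Differential check: if two backends disagree on the same application
// code, the problem is in an engine rather than in the application.
void checkBackendsAgree() {
  auto reference = makeEngine("plaintext");
  auto candidate = makeEngine("counting");
  for (int m = 0; m < 8; ++m) {
    const bool x = (m & 1) != 0, y = (m & 2) != 0, z = (m & 4) != 0;
    assert(majority(*reference, x, y, z) == majority(*candidate, x, y, z));
  }
}
```

In this sketch, switching protocols means changing only the string passed to `makeEngine`; `majority` itself is untouched.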
Putting the key features together, we envision the following picture for PCF v2.0. As shown below, MPC application developers don't need to know the details of the underlying MPC framework and don't need to handle specific public clouds. Changing the MPC framework or switching from AWS to Azure/GCP should be as simple as a configuration change.
Unless you're a developer working on the MPC engine, this is the only component you need to care about when using the system. It can be broken into sub-components.
We'll expand on this below, but this is the interface for reading/writing files without caring whether it's a local file, an S3 object, or something else.
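As a rough illustration of such a storage-agnostic interface, here is a minimal sketch. The names `IFileReader`, `LocalFileReader`, `S3FileReader`, `makeReader`, and `readFile`, as well as the `s3://` path prefix used for dispatch, are assumptions made for this example and do not describe PCF's real IO API. The S3 reader is a placeholder that only throws.

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical reader interface; PCF's real IO API may look different.
class IFileReader {
 public:
  virtual ~IFileReader() = default;
  virtual std::string readAll(const std::string& path) = 0;
};

// Reads a whole file from the local filesystem.
class LocalFileReader : public IFileReader {
 public:
  std::string readAll(const std::string& path) override {
    std::ifstream in(path, std::ios::binary);
    if (!in) throw std::runtime_error("cannot open " + path);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
  }
};

// Placeholder: a real implementation would call a cloud SDK here.
class S3FileReader : public IFileReader {
 public:
  std::string readAll(const std::string& path) override {
    throw std::runtime_error("S3 access is not implemented in this sketch: " +
                             path);
  }
};

bool startsWith(const std::string& s, const std::string& prefix) {
  return s.compare(0, prefix.size(), prefix) == 0;
}

// Picks a reader from the path alone (assumed convention: "s3://" prefix).
std::unique_ptr<IFileReader> makeReader(const std::string& path) {
  assert(!path.empty());
  if (startsWith(path, "s3://")) return std::make_unique<S3FileReader>();
  return std::make_unique<LocalFileReader>();
}

// Callers use this one function regardless of where the data lives.
std::string readFile(const std::string& path) {
  return makeReader(path)->readAll(path);
}
```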
This interface is how a developer can run secure computations without caring about the details of the underlying cryptography. This is where we expose such types as SecInt or SecBool. In addition to "private type in clear container" (for example, std::vector<SecInt>, which requires a public index to access its elements), we also provide "private type in oblivious container", a data type that allows accessing its elements with private indexes (read about oblivious RAM for more details).
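To make the "private type in clear container" idea concrete, here is a toy sketch. `ToySecInt` and `ToySecBool` are hypothetical plaintext stand-ins, not PCF's SecInt and SecBool; a real backend would hold secret shares and run a protocol for every operator. The point is only that application code uses ordinary-looking operators, indexes the vector with public indices, and selects with `mux` instead of branching on a private value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy stand-in for a private boolean (plaintext inside, for illustration).
class ToySecBool {
 public:
  explicit ToySecBool(bool b) : b_(b) {}
  bool reveal() const { return b_; }

 private:
  bool b_;
};

// Toy stand-in for a private integer (plaintext inside, for illustration).
class ToySecInt {
 public:
  explicit ToySecInt(std::int64_t v) : v_(v) {}

  ToySecInt operator+(const ToySecInt& o) const { return ToySecInt(v_ + o.v_); }
  ToySecBool operator<(const ToySecInt& o) const { return ToySecBool(v_ < o.v_); }

  // Returns a if cond is true, otherwise b. In MPC this is a secure
  // multiplexer; application code never branches on the private condition.
  static ToySecInt mux(const ToySecBool& cond, const ToySecInt& a,
                       const ToySecInt& b) {
    return ToySecInt(cond.reveal() ? a.v_ : b.v_);
  }

  std::int64_t reveal() const { return v_; }

 private:
  std::int64_t v_;
};

// Maximum over a clear container of private values: the indices are public,
// the values and the comparison results stay private until revealed.
ToySecInt maxOf(const std::vector<ToySecInt>& xs) {
  assert(!xs.empty());
  ToySecInt best = xs[0];
  for (std::size_t i = 1; i < xs.size(); ++i) {
    best = ToySecInt::mux(best < xs[i], xs[i], best);
  }
  return best;
}
```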
The scheduler is the component which consumes a stream of gates (our MPC instruction set) to drive the MPC engine. This object is conceptually similar to an interpreter. The frontend generates a gate stream ("Compute AND between wire_x and wire_y and store result in wire_z") which needs to be executed by the MPC backend. For now, our only backend is the XOR Secret Sharing engine, but in the future there may be contexts in which we can detect that a subroutine would be more efficient to implement in a garbled circuit engine.
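The interpreter analogy can be sketched as follows. The `Gate` layout, the `GateType` values, and `runGates` are hypothetical and chosen for this example; the gates are evaluated in plaintext, where a real scheduler would forward each one to the MPC engine.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class GateType { Input, And, Xor, Not };

// One instruction in the gate stream (hypothetical layout).
struct Gate {
  GateType type;
  std::size_t out;    // wire that receives the result
  std::size_t left;   // first input wire (ignored for Input)
  std::size_t right;  // second input wire (ignored for Input and Not)
  bool value;         // plaintext value (used by Input gates only)
};

// Executes the gates in stream order, like an interpreter loop.
// Here each gate is evaluated in plaintext; a real scheduler would
// dispatch it to the MPC backend instead.
std::vector<bool> runGates(const std::vector<Gate>& gates,
                           std::size_t wireCount) {
  std::vector<bool> wires(wireCount, false);
  for (const Gate& g : gates) {
    assert(g.out < wireCount && g.left < wireCount && g.right < wireCount);
    const bool l = wires[g.left];
    const bool r = wires[g.right];
    switch (g.type) {
      case GateType::Input: wires[g.out] = g.value; break;
      case GateType::And:   wires[g.out] = l && r;  break;
      case GateType::Xor:   wires[g.out] = (l != r); break;
      case GateType::Not:   wires[g.out] = !l;      break;
    }
  }
  return wires;
}
```

For example, the stream "Input w0, Input w1, AND w0 w1 into w2" is three `Gate` entries, and the result is read from `wires[2]` after the loop.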
The scheduler is also responsible for wire bookkeeping: knowing which wires are no longer necessary and can be cleaned up (conceptually a garbage collector). Overall, this component is responsible for optimization of the circuits generated by the frontend.
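One simple way to do this kind of wire bookkeeping is a last-use analysis over the gate stream, sketched below. This is an assumption about how it could be done, not a description of PCF's scheduler (which might, for instance, use reference counting instead). `GateIO` and `wiresToFree` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Minimal view of a gate for liveness purposes (hypothetical layout).
struct GateIO {
  std::size_t out;                  // wire written by the gate
  std::vector<std::size_t> inputs;  // wires read by the gate
};

// For each gate index, lists the wires whose final read happens at that
// gate, so a scheduler could release them right after executing it.
// Wires that are never read (for example, circuit outputs) are never freed.
std::vector<std::vector<std::size_t>> wiresToFree(
    const std::vector<GateIO>& gates, std::size_t wireCount) {
  const std::size_t kNever = gates.size();
  std::vector<std::size_t> lastUse(wireCount, kNever);
  for (std::size_t i = 0; i < gates.size(); ++i) {
    for (std::size_t w : gates[i].inputs) {
      assert(w < wireCount);
      lastUse[w] = i;  // later reads overwrite earlier ones
    }
  }
  std::vector<std::vector<std::size_t>> freeAfter(gates.size());
  for (std::size_t w = 0; w < wireCount; ++w) {
    if (lastUse[w] != kNever) freeAfter[lastUse[w]].push_back(w);
  }
  return freeAfter;
}
```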
Above, when we introduced the ASCI, we touched on private types in clear and oblivious containers. Here, we'll expand briefly on how those work.
Our interface for private types in oblivious containers is backed by write-only Oblivious RAM (ORAM). It allows a developer to write to a bucket (that is, add a value to it) specified by a secret index. A linear scan, like the one we currently use to implement pseudo-groupby operations in MPC, is a trivial ORAM.
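The linear-scan approach mentioned above can be sketched in plaintext as follows. `obliviousAdd` and `groupBySum` are hypothetical names. In a real MPC setting the index and value would be secret-shared, and the equality test and multiplication would be secure operations; the structure of the loop is what matters here, because it touches every bucket in the same way no matter which index is being written.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Adds `value` to buckets[index] by scanning all buckets. Every bucket is
// updated with the same arithmetic, so the memory access pattern does not
// depend on `index`. This is a plaintext simulation: the bounds check below
// is for the sketch only and would not be done on a secret index.
void obliviousAdd(std::vector<std::int64_t>& buckets, std::size_t index,
                  std::int64_t value) {
  assert(index < buckets.size());
  for (std::size_t j = 0; j < buckets.size(); ++j) {
    const std::int64_t isTarget = static_cast<std::int64_t>(j == index);
    buckets[j] += isTarget * value;  // adds value or 0, never branches
  }
}

// Pseudo-groupby: sums values per key, one full scan per input row.
// The cost is O(rows * buckets), which is why a linear scan is only a
// "trivial" ORAM.
std::vector<std::int64_t> groupBySum(const std::vector<std::size_t>& keys,
                                     const std::vector<std::int64_t>& values,
                                     std::size_t bucketCount) {
  assert(keys.size() == values.size());
  std::vector<std::int64_t> buckets(bucketCount, 0);
  for (std::size_t i = 0; i < keys.size(); ++i) {
    obliviousAdd(buckets, keys[i], values[i]);
  }
  return buckets;
}
```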
A write-only ORAM doesn't hide the access pattern of reads, in exchange for better performance than a full ORAM. In practice, this is fine for many of our use cases today, but eventually (beyond H2 2022) we may want to support read-only ORAM or even a full ORAM system. In the ASCI, this would be surfaced through our "private type in oblivious container" objects.
It's hard to really call any specific component (except basic ciphers and PRGs) a "cryptographic primitive" given the depth and complexity of many of these building blocks. Looking beyond the MPC backends described above, though, there are a number of cryptographic components which are considered generic enough to not be tied to any specific solution. We call these the "core backend components": tuple generators, circuit garblers, pseudorandom correlation generators, and more.
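As an illustration of why a tuple generator is a reusable building block, the sketch below evaluates an AND gate on XOR-shared bits using a precomputed multiplication triple (the standard Beaver-triple technique). It assumes two parties simulated in one process and a trusted dealer built on `std::mt19937`, which is not cryptographically secure; a real system would generate triples with a cryptographic protocol. All names (`Shared`, `AndTriple`, `makeTriple`, `andGate`) are hypothetical and not taken from PCF.

```cpp
#include <cassert>
#include <random>

// A bit split into XOR shares held by party 0 and party 1.
struct Shared {
  bool p0;
  bool p1;
};

bool reveal(const Shared& s) { return s.p0 != s.p1; }

Shared share(bool secret, std::mt19937& rng) {
  const bool mask = (rng() & 1u) != 0;
  return Shared{mask, mask != secret};
}

// A shared triple (a, b, c) with c = a AND b.
struct AndTriple {
  Shared a;
  Shared b;
  Shared c;
};

// Toy trusted-dealer "tuple generator" (assumption: NOT secure randomness).
AndTriple makeTriple(std::mt19937& rng) {
  const bool a = (rng() & 1u) != 0;
  const bool b = (rng() & 1u) != 0;
  return AndTriple{share(a, rng), share(b, rng), share(a && b, rng)};
}

// XOR is "free": each party XORs its own shares, with no communication.
Shared xorGate(const Shared& x, const Shared& y) {
  return Shared{x.p0 != y.p0, x.p1 != y.p1};
}

// AND consumes one triple. The parties open d = x ^ a and e = y ^ b, which
// reveal nothing about x and y because a and b are uniformly random.
// Then z = c ^ (d & b) ^ (e & a) ^ (d & e) equals x & y.
Shared andGate(const Shared& x, const Shared& y, const AndTriple& t) {
  const bool d = reveal(xorGate(x, t.a));
  const bool e = reveal(xorGate(y, t.b));
  // Only party 0 adds the public term (d & e), so it is counted once.
  const bool z0 = t.c.p0 ^ (d & t.b.p0) ^ (e & t.a.p0) ^ (d & e);
  const bool z1 = t.c.p1 ^ (d & t.b.p1) ^ (e & t.a.p1);
  return Shared{z0, z1};
}
```

The triple is independent of the inputs, so triples can be produced ahead of time and in bulk, separately from the circuit being evaluated.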