FlowFrame is a Scala compiler plugin that performs static, fine-grained information flow analysis on code to detect privacy violations, in the spirit of the Jif compiler (https://www.cs.cornell.edu/jif/).
The project is organized as a hierarchy of top-level traits, each of which extend the functionality of the previous trait: CheckLabels extends SparkSignatures extends DefaultPolicies extends AbstractPolicies extends Solvers extends Constraints extends Policies.
Here is a quick rundown of what’s in each file:
purpose/PurposePolicies.scala
- implementation of Purpose Policiespurpose/PurposePolicyParser.scala
- parser for Purpose Policy annotationspurpose/predicates/Predicates.scala
- incomplete implementation of PurposePolicy predicates (i.e. the stuff that comes after the with operator in Purpose Policies)CheckLabels.scala
- implements constraint generation passConstraints.scala
- data structures for policy constraintsDefaultPolicies.scala
- code for calculating default policies in the absence of policy annotationsFlowFramePlugin.scala
- implements Scala’s plugin interfacePolicies.scala
- basic data structure for policiesAbstractPolicies.scala
- data structures for abstract policies, policy references, policy signatures and policy structuresPolicyViolationDebugInfo.scala
-helpful information for debugging policy violationsSolvers.scala
- policy constraint solverSparkSignatures.scala
- policy signatures for Spark operations
FlowFrame currently supports Scala 2.11 and 2.12.
Run sbt package
to create jars for the compiler plugin and the runtime. You can
find these jars in flowframe-plugin/target
and flowframe-runtime/target
.
To use FlowFrame, add the following commandline options to scalac
:
-Xplugin:</path/to/plugin/jar>
-P:flowframe:lang:<policy language>
Currently, FlowFrame only supports the purpose
policy language.
FlowFrame uses @policy annotations on case classes representing database table rows accessed through the Spark Dataset API. Any named entity can be given explicit policy annotations, but FlowFrame attempts to infer annotations whenever possible. Ideally, only input and output Datasets will require annotations.
See the CONTRIBUTING file for how to help out.
FlowFrame is Apache 2.0 licensed, as found in the LICENSE file.