Skip to content

cipherstash/eqlpy

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

81 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

EQLPY

GitHub Actions Workflow Status Built by CipherStash

Website | GitHub | Reference | Discussions

Searchable encryption for Django and SQLAlchemy.

Based on the CipherStash Encrypt Query Language (EQL),eqlpy provides a simple, type-driven approach to enabling encryption in your application. By using EQL's built-in searchable encryption schemes, your model queries retain full search capabilities, including exact lookups, range queries, ordering and free text search.

Table of contents

Features

  • Data type handling: Supports various data types including integers, booleans, dates, floats, text, and JSONB.
  • Encrypted Search operations: Search over encrypted data with no noticeable overhead.
  • Standard API: Includes adapters for Django and SQLAlchemy.
  • Fast!: Typical queries only incur a few milliseconds of overhead.

Supported database packages

Currently, eqlpy supports either of the following database packages:

  • psycopg 3 or psycopg 2
  • sqlalchemy + psycopg 2
  • Django + psycopg 2

For code examples of storing and querying encrypted data with CipherStash Proxy using those packages, refer to examples directory and integration tests.

Installation and Setup

Pre-requistites

To use eqlpy in your application, you will first need the following:

  • CipherStash Proxy connected to the target database
  • A ZeroKMS instance - which you can get via CipherStash Cloud
  • EQL installed in your database

Installation

To install eqlpy, use the following command:

pip install eqlpy

You can find the latest version on the Python Package Index (PyPI).

Usage with Django

Defining an encrypted field

Let's say you have a customer model with an encrypted name field. You can use the EncryptedText type when defining the field in the model:

from django.db import models
from eqlpy.eqldjango import *

class Customer(models.Model):
    name = EncryptedText()

The above customer model would create a database table like this:

CREATE TABLE myapp_customer (
    "id" bigint NOT NULL PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY,
    "name" cs_encrypted_v1 NOT NULL,
);

[!NOTE] The column type in the database is cs_encrypted_v1 which is encrypted type defined in EQL. Internally EQL uses JSON.

eqlpy supports many different data types, including numbers, dates and JSON. See the Reference docs for more information.

TODO: Link to the reference docs.

EQL Configuration

EQL has a built in configuration system that manages encrypted state as well as any Searchable Encrypted Metadata used to enable queries over encrypted records. EQL configurations are always pending until they are explicitly activated so its possible to batch several changes together.

To specify that the name column of our customer model should be encrypted we can call:

SELECT cs_add_column_v1('myapp_customer', 'name');

To activate the pending configuration call:

SELECT cs_activate_v1();

In Django, these changes can be managed using Data Migrations.

For example:

# Generated by Django A.B on YYYY-MM-DD HH:MM
from django.db import migrations

class Migration(migrations.Migration):
  dependencies = [
    ("myapp", "0001_initial"),
  ]

  operations = [
    migrations.RunSQL("SELECT cs_add_column_v1('myapp_customer', 'name')"),
    migrations.RunSQL("SELECT cs_activate_v1()")
  ]

See the EQL Docs for more information.

Inserting data

eqlpy automatically encrypts values when saving to the database. You can interact with the Django model in the same way as you normally would.

>>> customer = Customer(name="Fred Flintstone")
>>> customer.save()

When retrieving a record, eqlpy automatically decrypts the value using the credentials defined in the setup step.

>>> customer = Customer.objects.get(customer.id)
>>> customer.name
'Fred Flintstone'

View data in SQL

Viewing data in the database is possible via the proxy. Query results will be automatically decrypted and returned in EQL form (JSON).

select id, name from customers;
  id  |                                        name                                      |
 -----+----------------------------------------------------------------------------------+
  446 | {"k":"pt","p":"Fred Flintstone","i":{"t":"customers","c":"name"},"v":1,"q":null} |

However, directly querying data via PostgreSQL will return the fully encrypted values.

 id   | name |
 +---+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
 446 | {"c": "mBbL^r9sYr{**CKA_6C{IDN*%A*Hh3Y^S6yIkF*Ez{m&ib*gwg#*A)&{F6mjUwBj_aVx|i6EsZUQXbrN)(~iOu-@!cWr+e^P~2NpGTB?trC@Gwq;6qtWxx", "i": {"c": "name", "t": "customers"}, "k": "ct", "m": [], "o": null, "u": null, "v": 1} |

Queries

To support queries on an encrypted column, an index must be added to the EQL configuration.

TODO: This is super confusing. For example, to allow simple exact lookups of a given name an index with type unique must be added:

SELECT cs_add_index_v1('customers', 'name', 'unique');

As before, this configuration is in a pending state until its activated. Because our table already contains data, we can use the cs_encrypt_v1 function to encrypt any existing records with the new config when activating. In this case, we want to EQL to encrypt searchable encrypted metadata for the index we've just created.

SELECT cs_encrypt_v1();
SELECT cs_activate_v1();

Now queries work much the same way in eqlpy as for Django - even though the data is fully encrypted!

For example, to find a record with a given name:

Customer.objects.get(name__eq="Alice Developer")

Text search works slightly differently. EQL doesn't define the LIKE operator, but instead uses an operator called match which works in a similar way but over encrypted values.

We can add a match index to the name field with EQL as follows:

SELECT cs_add_index_v1('customers', 'name', 'match');
SELECT cs_encrypt_v1();
SELECT cs_activate_v1();

Now the following query will work over encrypted data:

found = Customer.objects.get(name__match="caro")

See Supported Queries for a full list of supported queries.

Migrating to EQLPY

TODO

SELECT cs_encrypt_v1();
SELECT cs_activate_v1();

Release

This project has been set up to use Trusted Publisher in PyPI from GitHub Actions.

To make a GitHub release and publish to PyPI do the following steps:

  • Update the version in pyproject.toml in main branch (eg. "1.2.3")
  • Make a tag of the same version prefixed with "v" (eg. "v1.2.3"), and push
    • git tag v1.2.3
    • git push v1.2.3
  • release-and-publish.yml workflow will create a GitHub release, and publish the package to PyPI

Contributing

Contributions are welcome! Please open an issue or submit a pull request on the GitHub repository.

License

This project is licensed under the MIT License.