[EPIC] S3 multi-tenancy #5263

Open
2 tasks
chibenwa opened this issue Aug 30, 2024 · 5 comments

Comments

@chibenwa
Member

Why?

Multi-tenancy is currently hard-coded in the PG implementation as distinct buckets.

Two concerns here:

That way the blobstore could implement different isolation strategies for tenants (configurable):

Note that the AES SSE-C isolation strategy cannot be applied with deduplication, as several tenants might store the same blob and overwrite each other's keys.

How?

Refactor existing API

Refactor API of the blobstore:

Create a new pojo record Tenant(String name)
Create a new pojo record Bucket(BucketName name, Optional<Tenant> tenant)
Add methods to BlobStore and BlobStoreDAO taking a Bucket and a BlobId, and provide default methods for BucketName that supply a Bucket with no tenant.

Then each blobStore can implement the isolation it wishes - or not!
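
To make the refactoring above concrete, here is a minimal sketch of what the two records and the default methods could look like. `BucketName` and `BlobId` are simplified stand-ins for the existing types, and the synchronous `readBytes`/`save` signatures are assumptions for illustration only (the real DAO exposes more methods):

```java
import java.util.Optional;

// Simplified stand-ins for the existing types (assumption, for illustration only).
record BucketName(String value) {}
record BlobId(String value) {}

// Records proposed in this ticket.
record Tenant(String name) {}

record Bucket(BucketName name, Optional<Tenant> tenant) {
    // Bucket without a tenant, used by the legacy BucketName-based methods.
    static Bucket of(BucketName name) {
        return new Bucket(name, Optional.empty());
    }
}

// Hypothetical, trimmed-down DAO contract.
interface BlobStoreDAO {
    byte[] readBytes(Bucket bucket, BlobId blobId);

    void save(Bucket bucket, BlobId blobId, byte[] data);

    // Default methods keep existing callers working:
    // a bare BucketName maps to a Bucket with no tenant.
    default byte[] readBytes(BucketName bucketName, BlobId blobId) {
        return readBytes(Bucket.of(bucketName), blobId);
    }

    default void save(BucketName bucketName, BlobId blobId, byte[] data) {
        save(Bucket.of(bucketName), blobId, data);
    }
}
```

With this shape, an implementation only has to override the `Bucket`-based methods and is free to ignore the tenant entirely.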

Memory blobStore DAO multitenancy

Derive a bucketname per tenant within internal storage.

S3

Configuration:

multi-tenancy.mode=none|bucket|ssec|prefix
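
A possible way to read this property, as a sketch only: the enum name, the `parse` helper and the choice of defaulting to `none` when the property is absent are all assumptions, not something decided in this ticket.

```java
import java.util.Locale;

// Hypothetical enum mirroring multi-tenancy.mode=none|bucket|ssec|prefix.
enum MultiTenancyMode {
    NONE, BUCKET, SSEC, PREFIX;

    // Assumption: a missing or blank property means no tenant isolation.
    static MultiTenancyMode parse(String raw) {
        if (raw == null || raw.isBlank()) {
            return NONE;
        }
        try {
            return valueOf(raw.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            throw new IllegalArgumentException(
                "Unsupported multi-tenancy.mode '" + raw + "', expected none|bucket|ssec|prefix", e);
        }
    }
}
```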

Definition of done:

  • Documentation
  • Basic unit tests

bucket

Derive a bucket name per tenant within internal storage (i.e. what PG does, but done within S3BlobStoreDAO).

GC is likely broken and shall be tested with this mode...
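
A rough sketch of the bucket derivation, assuming a hypothetical `<tenant>-<bucket>` naming scheme (the actual scheme is not decided here). Since S3 bucket names are limited to 3 to 63 characters made of lowercase letters, digits, dots and hyphens, the derived name gets a basic check:

```java
import java.util.Locale;
import java.util.Optional;
import java.util.regex.Pattern;

// Hypothetical helper, not an existing class.
final class TenantBuckets {
    // Basic S3 naming check: 3 to 63 chars, lowercase letters, digits, dots,
    // hyphens, starting and ending with a letter or digit.
    private static final Pattern S3_BUCKET =
        Pattern.compile("[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]");

    // Assumed scheme: "<tenant>-<bucket>", or the plain bucket when there is no tenant.
    static String resolve(String bucketName, Optional<String> tenant) {
        String resolved = tenant
            .map(t -> t.toLowerCase(Locale.ROOT) + "-" + bucketName)
            .orElse(bucketName);
        if (!S3_BUCKET.matcher(resolved).matches()) {
            throw new IllegalArgumentException("Not a valid S3 bucket name: " + resolved);
        }
        return resolved;
    }
}
```

Whatever scheme is retained, the GC needs to be able to enumerate the derived buckets, which is why this mode has to be tested against it.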

ssec

Feed the SSE-C salt with the tenant.

Should fail with deduplicating blobStore.
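
One way "feeding the salt with the tenant" could look, as a sketch: the key derivation function (PBKDF2WithHmacSHA256), the iteration count and the `salt:tenant` concatenation are assumptions for illustration. The only hard constraint is that SSE-C expects a 256-bit AES key.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Optional;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

// Hypothetical helper, not an existing class.
final class TenantSsecKeys {
    private static final int ITERATIONS = 65_536; // assumed value
    private static final int KEY_BITS = 256;      // SSE-C uses AES-256 keys

    // Mixes the tenant name into the salt so that each tenant gets its own key.
    static byte[] deriveKey(char[] masterPassword, String baseSalt, Optional<String> tenant) {
        String salt = tenant.map(t -> baseSalt + ":" + t).orElse(baseSalt);
        PBEKeySpec spec = new PBEKeySpec(
            masterPassword, salt.getBytes(StandardCharsets.UTF_8), ITERATIONS, KEY_BITS);
        try {
            return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                .generateSecret(spec)
                .getEncoded();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Could not derive SSE-C key", e);
        }
    }
}
```

Two tenants storing the same blob under the same object key would then write it with different keys, which is exactly why this mode conflicts with cross-tenant deduplication.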

prefix

Derive the object key within S3 adding the prefix as needed

This interacts with the GC! We must make sure that the GC, when listing, only takes the last part of the S3 key, i.e. given prefix/ABC the GC only uses ABC as the blobId.
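
The key mapping and its inverse for the GC could be sketched as follows. The helper names are hypothetical, and the sketch assumes blob ids never contain a `/`:

```java
import java.util.Optional;

// Hypothetical helper, not an existing class.
final class TenantObjectKeys {
    // Object key written to S3: "<tenant>/<blobId>", or the bare blobId without tenant.
    static String toObjectKey(Optional<String> tenant, String blobId) {
        return tenant.map(t -> t + "/" + blobId).orElse(blobId);
    }

    // What the GC should use when listing: only the last segment of the S3 key.
    // Assumption: blob ids do not contain '/'.
    static String toBlobId(String objectKey) {
        return objectKey.substring(objectKey.lastIndexOf('/') + 1);
    }
}
```

For example, `toObjectKey(Optional.of("prefix"), "ABC")` gives `prefix/ABC`, and `toBlobId("prefix/ABC")` gives back `ABC`; a key without any prefix is returned unchanged.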

file

Derive a folder per tenant.

Test GC with this too.
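
A sketch of the folder derivation, assuming a hypothetical `<root>/<tenant>/<blobId>` layout. Since the tenant name ends up in a file system path, the sketch also rejects names that would escape the tenant folder:

```java
import java.nio.file.Path;

// Hypothetical helper, not an existing class.
final class TenantFolders {
    // Assumed layout: <root>/<tenant>/<blobId>.
    static Path resolve(Path root, String tenant, String blobId) {
        Path base = root.toAbsolutePath().normalize();
        Path resolved = base.resolve(tenant).resolve(blobId).normalize();
        // Exactly two extra segments below the root are expected; anything else
        // means the tenant or blob id contained "..", "/" or similar.
        boolean insideTenantFolder = resolved.startsWith(base)
            && resolved.getNameCount() == base.getNameCount() + 2;
        if (!insideTenantFolder) {
            throw new IllegalArgumentException("Illegal tenant or blob id: " + tenant + ", " + blobId);
        }
        return resolved;
    }
}
```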

PGSQL

Derive a bucket name per tenant within internal storage (i.e. what PG does, but done within PostgresBlobStoreDAO).

Test GC with this too.

Cassandra

Tenant isolation strategies do not make sense here...

@quantranhong1999
Member

Reminder: Quan will write the ticket for OpenSearch multi-tenancy
The idea: optional conf to inject the domain into documents + inject a filter on each search.
CF #5263

@chibenwa
Member Author

chibenwa commented Sep 4, 2024

After a discussion with Patrick,

ssec Should fail with deduplicating blobStore.

Is only true when deduplication is performed across tenants.

However once deduplication is limited in scope (to one tenant), enforcement of multitenancy isolation through the use of SSE-C can be achieved.

So multi-tenancy enforcement through the combined use of PREFIX and SSE-C makes sense.

This is also likely very desirable as encryption with tenant specific keys brings more trust.
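
To illustrate why scoping deduplication to one tenant removes the conflict, here is a sketch assuming content-hash blob ids (SHA-256 here, as an assumption) and a `<tenant>/<hash>` object key. Names are hypothetical:

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;

// Hypothetical helper, not an existing class.
final class TenantScopedDedup {
    // Deduplicating blob id: derived from the content only (SHA-256 is an assumption).
    static String contentHash(byte[] content) {
        try {
            return HexFormat.of().formatHex(MessageDigest.getInstance("SHA-256").digest(content));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Same content maps to the same object within a tenant,
    // but to distinct objects across tenants.
    static String objectKey(String tenant, byte[] content) {
        return tenant + "/" + contentHash(content);
    }
}
```

Because each tenant writes to its own key space, an object is only ever encrypted with that tenant's SSE-C key, and two tenants storing the same blob no longer overwrite each other.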

@Arsnael
Member

Arsnael commented Sep 12, 2024

Team: prefix and file are technically the same no?

prefix/abc technically prefix/ would be like a folder?

@chibenwa
Member Author

prefix/abc technically prefix/ would be like a folder?

That's what it looks like, but folders do not exist in S3: S3 only supports arbitrary prefixes. A prefix can be used to kind of emulate folders.

@Arsnael
Member

Arsnael commented Sep 17, 2024
