dynamo backend #26

Open
LorenzoBoccaccia opened this issue Apr 25, 2024 · 5 comments
Comments

@LorenzoBoccaccia

LorenzoBoccaccia commented Apr 25, 2024

I've built a version that uses DynamoDB as the backend, using this project as the foundation, with (experimental) locking support. Would you be interested in merging the work back? I can create a pull request and iterate on it as needed.

https://github.com/LorenzoBoccaccia/sqlite-s3vfs

Alternatively, the locking there could be used to make the S3 backend optionally write-consistent by pairing it with a DynamoDB table.
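A minimal sketch of the kind of lock such a table could provide, for readers following along. It is not taken from the linked fork: `ConditionalStore`, `try_write_lock` and `release_write_lock` are hypothetical names, and the store is an in-memory stand-in so the example runs without AWS. With DynamoDB, the put-if-absent step would presumably be a conditional write (a `put_item` call with a `ConditionExpression`).

```python
import threading


class ConditionalStore:
    """In-memory stand-in for a table that supports conditional writes.

    Hypothetical helper for illustration only.
    """

    def __init__(self):
        self._items = {}
        self._mutex = threading.Lock()

    def put_if_absent(self, key, value):
        # Atomically store value only if key is not already present.
        with self._mutex:
            if key in self._items:
                return False
            self._items[key] = value
            return True

    def delete_if_equals(self, key, value):
        # Atomically delete key only if it still holds the expected value.
        with self._mutex:
            if self._items.get(key) != value:
                return False
            del self._items[key]
            return True


def try_write_lock(store, db_name, owner_id):
    # One lock item per database; whoever creates it first is the writer.
    return store.put_if_absent(f"lock/{db_name}", owner_id)


def release_write_lock(store, db_name, owner_id):
    # Only the current owner may release, so a stale client cannot
    # remove a lock that now belongs to someone else.
    return store.delete_if_equals(f"lock/{db_name}", owner_id)
```

A writer would call `try_write_lock` before touching the S3 objects and `release_write_lock` afterwards. This covers the happy path only: nothing here handles an owner that disappears without releasing.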

@michalc
Member

michalc commented Apr 27, 2024

Hi @LorenzoBoccaccia,

> I've built a version that uses DynamoDB as the backend, using this project as the foundation

I think using DynamoDB as a backend for the data itself is pretty cool. But having had a think, I'm going to say this is beyond the scope of this project - the chance of us using this in the short to medium term is pretty slim. Not that us using it is the only criterion for having changes merged in, but I think it's just too "far away" in some sense from something we will use, and so maintain. My suggestion is to keep it as a separate project that, for example, you maintain.

> Alternatively, the locking there could be used to make the S3 backend optionally write-consistent by pairing it with a DynamoDB table [...] I can create a pull request and iterate on it as needed

But the locking, I think I am quite interested in. Are you able to raise a PR with just that?

But...

I do expect quite a lot of discussion, and so (as you suggest) iteration, on this before it gets merged. Essentially, we'll have to make sure it covers various cases - the worst of this would be clients going away while still having things locked. I'll probably have to read up a bit on locks, and especially distributed locking, and also remind myself how SQLite locking works. I have written https://github.com/michalc/sqlite-memory-vfs/blob/main/sqlite_memory_vfs.py#L137 that handles the (much?) simpler case of locking a file in memory. Just for background, I settled on a Python mutex to wrap all access to the "global" (in the sense of the VFS) locks for a particular file. And I do now realise it probably doesn't handle the case of a client going away while it holds a SQLite EXCLUSIVE lock on the file...
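One common way to cope with clients that go away is to give every lock a lease that expires unless renewed. The sketch below illustrates that idea with a mutex-guarded lock table; it is not the code in sqlite-memory-vfs or in the fork. `LeasedFileLocks` and its methods are hypothetical, only SHARED and EXCLUSIVE are modelled (SQLite's RESERVED and PENDING levels are left out), and the injectable `clock` is an assumption made so the behaviour can be tested without sleeping.

```python
import threading
import time

SHARED, EXCLUSIVE = "shared", "exclusive"


class LeasedFileLocks:
    """Locks for one file, guarded by a mutex, where each holder has a lease."""

    def __init__(self, lease_seconds=10.0, clock=time.monotonic):
        self._mutex = threading.Lock()
        self._holders = {}  # client_id -> (level, expires_at)
        self._lease = lease_seconds
        self._clock = clock

    def _drop_expired(self, now):
        # Holders that stopped renewing are treated as gone.
        for client, (_, expires_at) in list(self._holders.items()):
            if expires_at <= now:
                del self._holders[client]

    def acquire(self, client_id, level):
        with self._mutex:
            now = self._clock()
            self._drop_expired(now)
            others = [lvl for c, (lvl, _) in self._holders.items()
                      if c != client_id]
            if level == EXCLUSIVE and others:
                return False  # any other holder blocks an exclusive lock
            if level == SHARED and EXCLUSIVE in others:
                return False  # an exclusive holder blocks readers
            self._holders[client_id] = (level, now + self._lease)
            return True

    def renew(self, client_id):
        # A live client calls this periodically to keep its lock.
        with self._mutex:
            now = self._clock()
            self._drop_expired(now)
            if client_id not in self._holders:
                return False
            level, _ = self._holders[client_id]
            self._holders[client_id] = (level, now + self._lease)
            return True

    def release(self, client_id):
        with self._mutex:
            self._holders.pop(client_id, None)
```

The trade-off is that a slow but still-alive client can lose its lease, so a writer would need to check that `renew` succeeded before committing. In a distributed setting the expiry would also have to be judged against a clock that all clients agree on, which this single-process sketch sidesteps.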

And then the PR would somehow need tests, especially ones covering the non-happy-path cases.

Thanks,

Michal

@LorenzoBoccaccia
Author

> Are you able to raise a PR with just that?

Sure, I'll split that part out.

> the worst of this would be clients going away while still having things locked.

Yeah, it's currently happy path only, but I've been testing with a bunch of writers ingesting Wikipedia into FTS5, and as long as it's the happy path it works. I'm fine putting some work into handling recovery.

One thing I've found is that SQLite absolutely doesn't respect the page size, which is fine on S3 but gets expensive on DynamoDB, as you end up doing unaligned reads and writes.
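To make the cost concrete, here is a small sketch of the arithmetic, assuming the backend stores the file as fixed-size blocks keyed by block index (the thread does not say how the fork lays out its data, so that layout and both function names are assumptions). A request that is not aligned to the block size touches more than one block, and a partially covered block has to be read before it can be written back.

```python
def blocks_touched(offset, length, block_size):
    """Return (block_index, start_in_block, end_in_block) for each block hit."""
    if length <= 0:
        return []
    first = offset // block_size
    last = (offset + length - 1) // block_size
    spans = []
    for index in range(first, last + 1):
        block_start = index * block_size
        # Clip the requested byte range to this block's boundaries.
        start = max(offset, block_start) - block_start
        end = min(offset + length, block_start + block_size) - block_start
        spans.append((index, start, end))
    return spans


def needs_read_modify_write(offset, length, block_size):
    # A write that covers a block only partially forces the backend to
    # fetch the existing block first, then write it back merged.
    return any(end - start != block_size
               for _, start, end in blocks_touched(offset, length, block_size))
```

For example, `blocks_touched(4000, 200, 4096)` spans two blocks, both only partially covered, whereas `blocks_touched(8192, 4096, 4096)` hits exactly one whole block.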

@refacktor

I am ready to help test this PR as soon as it is available

@michalc
Member

michalc commented Apr 28, 2024

> One thing I've found is that SQLite absolutely doesn't respect the page size

Oh! I don't think I've ever witnessed this: can you give more detail?

(Maybe I've seen it just on the first page with the first 100 bytes? Not sure... maybe I'm just thinking of the initial read...)

@LorenzoBoccaccia
Author

LorenzoBoccaccia commented Apr 29, 2024

> > One thing I've found is that SQLite absolutely doesn't respect the page size
>
> Oh! I don't think I've ever witnessed this: can you give more detail?
>
> (Maybe I've seen it just on the first page with the first 100 bytes? Not sure... maybe I'm just thinking of the initial read...)

I put some detail in PR #27. Basically, the pages are rounded to powers of 2, and they are meant for alignment and as invariants more than as block writes; that's why an arbitrary xSectorSize wasn't working. The relevant doc is in the PR.
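For anyone wanting to check the page-size side of this locally, the sketch below reads the page size straight from the 100-byte database header mentioned earlier in the thread and confirms it is a power of two. The helper names are made up for this example, and it only uses the standard-library `sqlite3` module, not the VFS from this project.

```python
import os
import sqlite3
import tempfile


def header_page_size(path):
    # Bytes 16-17 of the 100-byte database header hold the page size as a
    # big-endian integer; the stored value 1 stands for 65536.
    with open(path, "rb") as f:
        header = f.read(100)
    raw = int.from_bytes(header[16:18], "big")
    return 65536 if raw == 1 else raw


def is_power_of_two(n):
    return n > 0 and n & (n - 1) == 0


def make_db(page_size):
    # Create a small database file with the requested page size.
    fd, path = tempfile.mkstemp(suffix=".sqlite")
    os.close(fd)
    conn = sqlite3.connect(path)
    try:
        # page_size must be set before the first table is created.
        conn.execute(f"PRAGMA page_size = {page_size}")
        conn.execute("CREATE TABLE t (x)")
        conn.commit()
    finally:
        conn.close()
    return path
```

Calling `header_page_size(make_db(8192))` should report the page size that was requested, and the file length should be a whole number of pages. This says nothing about the offsets and lengths SQLite passes to the VFS for individual reads and writes, which is the part described in the PR.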


3 participants