VFSFile iVersion 3 methods, version 2 passthrough #418

rogerbinns · 2023-03-24T20:28:53Z

Context: https://groups.google.com/g/python-sqlite/c/IIpnmLGyhrE

Items to fix:

The doc is weak on details
You can't control whether the version 2 methods pass through or not
APSW shell should have .vfslist / .vfsinfo commands

Items to not do:

You can't implement the version 2 methods
The version 3 methods are not exposed or implementable

Nikratio · 2023-03-25T11:12:49Z

My use-case for this is that I want to track all writes to the main database file so that I can keep it in sync with a remote copy. I think this means that I can't allow any shared memory use. At the same time, I would otherwise like to forward all operations to the default VFSFile implementation.

I think Shm is used only for the WAL file (not the main database file) so this should work fine in practice. However, I would feel a lot safer if attempts to call xShm* would fail loudly. It would be great if there was a way to do that.

Would it be feasible to e.g. make a xShmMap = None definition in the VFSFile class translate to the corresponding sqlite3_io_methods element to be set to null?

rogerbinns · 2023-03-26T13:53:45Z

How strictly do you want to track writes? For example do you want to block local writes until remote is in sync, or do you just need to know something changed so you can eventually get around to it? If your use case is on the strict side, then there are already a variety of solutions out there like SQLiteCloud.

It is correct that you can't detect writes when shared memory is in use. (Technically you could, by using mprotect on the area with a signal handler to detect writes, or with similarly expensive schemes.)

I will add an iVersion flag or similar to control which version of the interface is presented to SQLite. Something similar was done for virtual tables.

Nikratio · 2023-03-26T16:25:54Z

I want to replicate writes asynchronously, so no need to block.

Would setting an iVersion=1 flag be the recommended way to set the xShm* pointers to null? In that case, would it ever be possible to use xFetch (iVersion=3) while still not implementing xShm*?

(Not sure what xFetch is actually used for, just wondering)

rogerbinns · 2023-03-26T16:46:25Z

There is a small combinatorial problem due to there being 3 sets of methods and wanting some of them NULL, so that will determine exactly how many parameters there are.

xFetch looks like a way of you owning the in-memory storage. Regular xRead requires you to copy the data into a buffer SQLite provides. xFetch lets you return a pointer, avoiding that copy. The current SQLite VFS implementations only implement xFetch if mmap is enabled. But even that has issues: if the file size has changed then mremap can change the address of the mapping. The VFS keeps a reference count of Fetch/Unfetch calls and only does the mremap if the outstanding count is zero.
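The reference counting described above can be modelled in a few lines. This is a toy sketch only, not SQLite's or APSW's code: `MappedRegion` and its methods are hypothetical names, and "remapping" is simulated by updating a size rather than calling mmap or mremap.

```python
class MappedRegion:
    """Toy model of a mapping that may only be remapped when unreferenced.

    Hypothetical illustration; no real memory mapping takes place.
    """

    def __init__(self, size):
        self.size = size
        self.outstanding = 0      # fetch() calls not yet matched by unfetch()
        self.pending_size = None  # resize requested while pages were held
        self.remaps = 0           # how many simulated remaps have happened

    def fetch(self, offset, amount):
        # Outside the mapping: return None so the caller falls back to a read.
        if offset + amount > self.size:
            return None
        self.outstanding += 1
        return (offset, amount)

    def unfetch(self):
        assert self.outstanding > 0, "unfetch without a matching fetch"
        self.outstanding -= 1
        self._maybe_remap()

    def resize(self, new_size):
        # Remember the request; it is applied once nobody holds a pointer.
        self.pending_size = new_size
        self._maybe_remap()

    def _maybe_remap(self):
        if self.pending_size is not None and self.outstanding == 0:
            self.size = self.pending_size
            self.pending_size = None
            self.remaps += 1
```

While a fetched range is outstanding, `resize()` is deferred, because moving the mapping would invalidate the pointer already handed out.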

It does look like there is no sense in making it possible to implement the iVersion 2 & 3 calls in Python.

If you only need loose tracking, it would seem that a VFS approach is way overkill. Couldn't you just periodically poll the last modify timestamp on the database files and sync on those changing?
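A minimal sketch of that polling idea, using only the standard library. The function names are made up for illustration, and comparing file size alongside the modification time is an added assumption to cope with coarse timestamp resolution.

```python
import os


def snapshot(paths):
    """Return {path: (mtime_ns, size)} for each path that currently exists."""
    state = {}
    for path in paths:
        try:
            st = os.stat(path)
        except FileNotFoundError:
            continue  # e.g. no -wal file at the moment
        state[path] = (st.st_mtime_ns, st.st_size)
    return state


def changed_paths(before, after):
    """Paths whose stamp differs, including files that appeared or vanished."""
    return sorted(
        p for p in before.keys() | after.keys() if before.get(p) != after.get(p)
    )
```

A sync loop would take a snapshot of the database and its `-wal` file, sleep, take another, and start a sync whenever `changed_paths` is non-empty. It reports that a file changed, not which part of it.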

There is also a tracing vfs.

Nikratio · 2023-03-26T19:41:21Z

I need to know which specific parts of the database are changing, so that I don't needlessly upload the entire file. So mtime doesn't quite do it.
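To make "which specific parts" concrete, here is a sketch of the bookkeeping such tracking needs: a set of byte ranges that coalesces overlapping and adjacent writes. `DirtyRanges` is a hypothetical helper, not part of APSW. The assumption is that whatever intercepts the writes (for example a VFSFile write method receiving the data and offset) would call `add(offset, len(data))`.

```python
class DirtyRanges:
    """Coalesced, sorted list of half-open [start, end) byte ranges."""

    def __init__(self):
        self.ranges = []

    def add(self, offset, length):
        """Record a write of `length` bytes at `offset`."""
        if length <= 0:
            return
        start, end = offset, offset + length
        merged = []
        for s, e in self.ranges:
            if e < start or s > end:
                merged.append((s, e))  # disjoint and not touching: keep as is
            else:
                # Overlapping or adjacent: absorb into the new range.
                start, end = min(start, s), max(end, e)
        merged.append((start, end))
        merged.sort()
        self.ranges = merged

    def drain(self):
        """Return the recorded ranges and reset, e.g. when an upload starts."""
        out, self.ranges = self.ranges, []
        return out
```

An asynchronous uploader could call `drain()` periodically and send only those ranges to the remote copy.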

I could use tracing_vfs, but to me that seems like overkill (why parse text messages for all VFS operations?)

rogerbinns · 2023-03-26T22:39:30Z

I don't know the specifics of your requirements, but I'd tend to go for a simpler, more robust solution using rsync to transfer the files on change. rsync does per-block checksums and then transfers changed blocks only. It would also handle the case of data moving from the wal to the main file, since the checksum would remain the same, so there is no need to transfer copies of that block. An inotify-style hook would then invoke it as needed.
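The per-block checksum idea can be sketched as follows. This is a simplified illustration rather than rsync's algorithm: it compares blocks at fixed offsets only, whereas rsync also uses a rolling checksum to find data that has moved. The 4096-byte block size and the function names are assumptions.

```python
import hashlib


def block_digests(data, block_size=4096):
    """SHA-256 digest of each fixed-size block of `data` (last may be short)."""
    return [
        hashlib.sha256(data[i:i + block_size]).digest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old_digests, new_digests):
    """Indexes of blocks in the new data that are new or differ from the old."""
    return [
        i
        for i, digest in enumerate(new_digests)
        if i >= len(old_digests) or old_digests[i] != digest
    ]
```

Only the blocks listed by `changed_blocks` need transferring, which requires the receiving side to supply (or the sender to have cached) the old digests.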

I wouldn't expect you to use the tracing vfs as is, but rather to hack it down to exactly what you need in the most convenient way.

No matter what, the issue description currently has my thinking on what will and won't be implemented, and I believe it will also work for your needs.

Nikratio · 2023-03-27T18:50:52Z

I'm confused about the tracing VFS idea. You're saying I should write a custom C VFS (based on tracing VFS) instead of doing this in Python through APSW?

(I can't use rsync because I'm dealing with a dumb REST backend, not a server under my control)

rogerbinns · 2023-03-27T22:54:15Z

APSW has to receive parameters in C, convert them to Python, convert back to C again to call the VFS being inherited from, then go through the conversions again with the result values and out parameters. It is doing a lot of work when all you wanted to know was the modified ranges! Hacking down the tracing vfs would leave you with a module that has little footprint or code and gets only the information you need.

It is a shame you can't update the REST backend. If it at least gave checksums for blocks then you could do the syncing without having to mess with VFS, and it would be far more resilient against transient network issues etc.
