Change API to use move semantics or custom buffer type #41

Open
elcritch opened this issue Aug 11, 2023 · 2 comments

Comments

@elcritch
Contributor

Making datastore multi-threaded would pose a problem with transferring data ownership between threads. It seems we could either use move, or switch from seq[byte] to a custom buffer. I'm not sure which is better.
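For reference, the `move` option on a plain `seq[byte]` might look like the minimal sketch below. `handOff` is a hypothetical helper, not part of the datastore API, and whether the handoff avoids a copy depends on the memory manager in use (it should under ARC/ORC).

```nim
proc handOff(src: var seq[byte]): seq[byte] =
  ## Hypothetical helper: transfers the contents of `src` to the caller
  ## and leaves `src` empty.
  result = move(src)

var a = @[1'u8, 2, 3]
let b = handOff(a)
echo a.len   # `a` no longer holds the data
echo b.len
```

This only shows ownership moving between two variables on one thread; it says nothing about whether the receiving thread can safely own the memory.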

This looks related to:

@elcritch
Contributor Author

elcritch commented Aug 14, 2023

My concern here is the refc GC and how to avoid copying large blocks. If my understanding is correct, refc has no shared heap, so just using move to transfer the data to another thread wouldn't work; it would crash or corrupt memory. I'm going to do some quick tests and research, though, as I'm not too familiar with refc in these matters.

We can keep the seq[byte] alive on the async thread. That'd be a bit more work for operations like put, where the data is sent to the data-thread: we'd need to hold onto the buffer and only free it once the data-thread has sent back an ok.
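A rough sketch of that lending scheme for put, with the thread and channel plumbing left out. All names here (`BorrowedBuf`, `LentPut`, `view`, `onAck`, `checksum`) are hypothetical and only illustrate the idea: the async thread owns the payload, the data-thread sees a raw pointer plus size, and the payload is released only after the ack.

```nim
type
  BorrowedBuf = object              # hypothetical raw view given to the data-thread
    data: ptr UncheckedArray[byte]
    size: int

  LentPut = ref object              # hypothetical owner, stays on the async thread
    payload: seq[byte]
    acked: bool

proc newLentPut(payload: sink seq[byte]): LentPut =
  ## Stores the payload so it stays alive while the data-thread works on it.
  LentPut(payload: payload, acked: false)

proc view(lp: LentPut): BorrowedBuf =
  ## Returns a non-owning view; valid only while `lp.payload` is left untouched.
  if lp.payload.len > 0:
    result.data = cast[ptr UncheckedArray[byte]](addr lp.payload[0])
  result.size = lp.payload.len

proc checksum(v: BorrowedBuf): int =
  ## Stand-in for the work the data-thread would do with the borrowed bytes.
  for i in 0 ..< v.size:
    result += int(v.data[i])

proc onAck(lp: LentPut) =
  ## Called on the async thread once the data-thread reports it is done.
  lp.acked = true
  lp.payload = @[]                  # only now is the payload released
```

The view must not outlive the ack, and the async thread must not resize or reassign the payload while it is lent out; nothing in this sketch enforces either rule.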

Actually, writing this out, I realize that the get operation might be harder. Since we don't seem to know the size of the seq[byte] upfront, we'd need to allocate dynamically on the data-thread. Moving the data over to the main thread would then require a copy into the target seq[byte]. Maybe shallowCopy could be used somehow?
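To make the get case concrete, here is a minimal sketch of the copying variant, assuming the data-thread allocates with `allocShared` and the main thread copies into a fresh seq[byte] and frees the shared block. `SharedBuf`, `produce` and `consume` are hypothetical names, and the channel that would carry the `SharedBuf` between threads is omitted.

```nim
type
  SharedBuf = object                # hypothetical handle sent between threads
    data: ptr UncheckedArray[byte]
    size: int

proc produce(src: openArray[byte]): SharedBuf =
  ## Data-thread side: the size is only known here, so allocate on demand.
  result.size = src.len
  if src.len > 0:
    result.data = cast[ptr UncheckedArray[byte]](allocShared(src.len))
    copyMem(result.data, unsafeAddr src[0], src.len)

proc consume(buf: var SharedBuf): seq[byte] =
  ## Main-thread side: copy into a GC-managed seq, then free the shared block.
  result = newSeq[byte](buf.size)
  if buf.size > 0:
    copyMem(addr result[0], buf.data, buf.size)
    deallocShared(buf.data)
  buf.data = nil
  buf.size = 0
```

This costs one copy per get on the receiving side, which is exactly the overhead the comment above hopes to avoid for large blocks.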

@elcritch
Contributor Author

@dryajov and I discussed this issue some more. The get and query operations do pose more of a problem for the simple model of lending data to a thread to do work. That model may work well for computations, but it is a poor fit for returning data without copying.

We tried a few variants of GC_ref and GC_unref, which do fail when transferring GC types, as expected. There is an option to use protect / dispose, but it appears to be undocumented and is only mentioned briefly on the Nim forums:

  • protect/dispose - which are the multithreading counterparts of GC_ref/GC_unref. The idea is that one can allocate an object in one thread, protect the pointer, send the pointer over to another thread and have that thread dispose it when done. protect returns a ForeignCell object, which has a pointer to the owning GC, dispose expects the cell to free it/decrease its RC.

This approach, while potentially usable, strikes me as finicky and requires getting a lot of GC details correct. It's also not forward compatible with ARC/ORC.
