Consolidate Record Persistence Logic #867
Comments
Great writeup @dennis-tra! Another question is: do we need to keep storing …
Overall this seems reasonable to me. You could also try to decouple the storage from the "how do I merge records" logic instead of asking the caller to create this. That is likely going to be an implementation detail for many consumers of this library, so either way you'd end up creating helper functions here to do the combining work. It's mostly a matter of whether you expose that via the constructor or not, which is largely a matter of taste 🤷.
Using the mount datastore won't let you do this, although you should probably ask callers to pass datastores for each record type rather than having them predefine a mount with all of them. Also, unless there's going to be a bunch of breakages + utilities for creating an …
You'd need to double-check; my guess is that it's unused by …
Thanks @aschmahmann! I experimented with an abstraction that I called "Backend" for now. This is a bit different from what I laid out in the first post. The code is all in PR #864.
It's a small interface that can be implemented to support additional record types for `PUT_VALUE`/`GET_VALUE` RPCs. There are three "reference" implementations … I was also able to enforce support for the above three record types if the DHT uses the …
Right now, the `IpfsDHT` takes two datastores: one that handles `PUT_VALUE` and `GET_VALUE` requests and one just for provider records, wrapped in a `ProviderManager`.

In an effort to "Remove special cases for Peer and Provider records", I thought that there must be a way to unify things and find better abstractions.

One thing I need to mention first: it is not entirely clear to me why the `ProviderManager` is built around a single loop that selects on 1) additions of provider records, 2) asks for provider records, 3) garbage-collection query results, and 4) the garbage-collection trigger. Tracing the git history shows that this design dates back to 2014. Given a thread-safe datastore, I think it would be more performant and easier to reason about if `AddProvider` and `GetProviders` just wrote to and read from the underlying datastore directly. Garbage collection would happen on a schedule in parallel, as sketched below.

However, a garbage-collection run would race new provider record writes. If I read the current code correctly, this isn't a regression. I think this could be discussed in a separate issue.
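As a minimal sketch of that shape (all names here are invented; this is not the current implementation):

```go
import (
	"context"
	"time"
)

// runGC (invented name): garbage collection runs on its own ticker, in
// parallel with direct AddProvider/GetProviders reads and writes against a
// thread-safe datastore. "sweep" would scan for and delete expired records.
func runGC(ctx context.Context, interval time.Duration, sweep func(context.Context) error) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			// A sweep can race a concurrent provider write; as noted
			// above, that wouldn't be a regression over today's behavior.
			_ = sweep(ctx)
		}
	}
}
```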
Proposal
We could let the `IpfsDHT` expose a method to register record datastores:
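For illustration, a hedged sketch of such a method, where `RegisterDatastore` is an invented name (not an existing API) and `ds` is `github.com/ipfs/go-datastore`:

```go
// Hypothetical sketch, inside package dht: associate a datastore with a
// key-namespace prefix, so that records whose keys live under prefix are
// persisted through dstore.
func (dht *IpfsDHT) RegisterDatastore(prefix ds.Key, dstore ds.Datastore) { /* ... */ }
```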
In the case of the IPFS DHT we would register datastores for the `pk`, `ipns`, and `providers` prefixes. The underlying datastores for all of them could be identical but don't have to be.
In `NewIpfsDHT`, we would construct a single `mount.Datastore` that all writes and reads go through. `GET_VALUE`/`PUT_VALUE` requests will read/write from the common mount datastore and, based on the key's namespace prefix, use the correct underlying one. The currently supported namespaces are `pk` and `ipns`. With the "mounted" approach, both RPCs would also automatically start supporting provider records if the key in the request had the `/providers/` prefix and the binary record field had the correct format. In my head, this is one step toward #584.
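For concreteness, a sketch of that construction; `newRecordMount` is an invented helper, while `mount.New` and `mount.Mount` come from the existing `go-datastore/mount` package:

```go
import (
	ds "github.com/ipfs/go-datastore"
	"github.com/ipfs/go-datastore/mount"
)

// newRecordMount builds the single datastore that all record reads and
// writes would go through; each key is routed to the child datastore whose
// prefix matches its namespace.
func newRecordMount(pk, ipns, providers ds.Datastore) ds.Datastore {
	return mount.New([]mount.Mount{
		{Prefix: ds.NewKey("/pk"), Datastore: pk},
		{Prefix: ds.NewKey("/ipns"), Datastore: ipns},
		{Prefix: ds.NewKey("/providers"), Datastore: providers},
	})
}
```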
`ADD_PROVIDER`/`GET_PROVIDERS` requests would prefix the key with `/providers/` and read/write from/to the common mount datastore as well.
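A hedged sketch of that key translation; `providerDsKey` is an invented helper, and the base32 encoding mirrors what the `ProviderManager` uses today:

```go
import (
	ds "github.com/ipfs/go-datastore"
	base32 "github.com/multiformats/go-base32"
	"github.com/multiformats/go-multihash"
)

// providerDsKey turns the multihash from an ADD_PROVIDER/GET_PROVIDERS
// request into a key under the /providers/ mount.
func providerDsKey(mh multihash.Multihash) ds.Key {
	return ds.NewKey("/providers").ChildString(base32.RawStdEncoding.EncodeToString(mh))
}
```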
This would allow us to remove the `enableProviders` and `enableValues` flags. If there's no datastore for the `providers` prefix, we wouldn't handle `ADD_PROVIDER`/`GET_PROVIDERS` requests. If there's no datastore at all, we wouldn't handle `*_VALUE`/`*_PROVIDERS` requests.

However, we're handling PK/IPNS and provider records differently:
- When adding PK/IPNS records, we fetch any existing record from disk, validate that existing record (e.g., it might be expired), select the incoming or the existing one, and potentially store the incoming one (if we deem it "better").
- When adding provider records, we write them to disk and add peer addresses to our Peerstore, with no validation of existing data.
My idea would be to provide `ValueStore` and `ProviderStore` implementations that implement the `ds.Datastore` interface, which can be used as datastores for the prefix mounts.
The `ProviderStore` would be similar to the `ProviderManager` (but would directly call the underlying datastore). E.g., it could also use the LRU cache, would implement garbage-collection logic, and would have a reference to the peerstore to store addresses.
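A rough sketch of its shape; all names are assumed, and address handling and garbage collection are elided:

```go
import (
	"context"

	lru "github.com/hashicorp/golang-lru/v2"
	ds "github.com/ipfs/go-datastore"
	"github.com/libp2p/go-libp2p/core/peerstore"
)

// ProviderStore: a datastore for the /providers/ mount that writes records
// straight through and remembers provider addresses in the peerstore.
type ProviderStore struct {
	ds.Datastore                      // underlying child datastore
	pstore peerstore.Peerstore        // where provider addresses would go
	cache  *lru.Cache[string, []byte] // optional read cache, like today's
}

func (p *ProviderStore) Put(ctx context.Context, key ds.Key, value []byte) error {
	// No validate/select step for provider records; extracting addresses
	// from the record and adding them to p.pstore is elided here.
	return p.Datastore.Put(ctx, key, value)
}
```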
The `ValueStore` would receive, among other things, a `record.Validator` and encapsulate the logic from above (fetching the existing record, selecting the "better" one). This means we wouldn't need the namespaced validator on `IpfsDHT`.
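A minimal sketch of that wrapper, with invented names and the translation between datastore keys and validator keys elided:

```go
import (
	"context"
	"errors"

	ds "github.com/ipfs/go-datastore"
	record "github.com/libp2p/go-libp2p-record"
)

// ValueStore validates existing records and only overwrites a still-valid
// record when the validator selects the incoming one.
type ValueStore struct {
	ds.Datastore
	validator record.Validator
}

func (v *ValueStore) Put(ctx context.Context, key ds.Key, value []byte) error {
	existing, err := v.Datastore.Get(ctx, key)
	switch {
	case errors.Is(err, ds.ErrNotFound):
		// No existing record: store the incoming one.
	case err != nil:
		return err
	case v.validator.Validate(key.String(), existing) == nil:
		// Existing record is still valid (e.g. not expired): keep it
		// unless the validator selects the incoming record (index 0).
		if i, serr := v.validator.Select(key.String(), [][]byte{value, existing}); serr != nil || i != 0 {
			return serr
		}
	}
	return v.Datastore.Put(ctx, key, value)
}
```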
The above structure would allow users of `go-libp2p-kad-dht` to inject capabilities to support new record types by registering their own datastore for a certain prefix. The `GET_VALUE`/`PUT_VALUE` RPCs would then support it.

Concerns
I'm not sure how I feel about moving the logic of how to handle a certain record type into the persistence layer.
cc @aschmahmann