kad: improve FIND_NODE response definition #609

guillaumemichel · 2024-03-27T14:31:55Z

Context

The Kademlia spec doesn't explicitly define how a node N should answer to the request FIND_NODE(N). The FIND_NODE RPC is expected to return the (k) closest nodes to the requested key.

The libp2p Kademlia DHT offers the following types of operations:

Peer routing

Finding the closest nodes to a given key via FIND_NODE.

Upon a response:

If successful the response will contain the k closest nodes the peer knows to the key Key. Add them to the candidate list Pn, except for those that have already been queried.

Implementations have diverse behaviors and expectations, which can lead to poor interoperability (see libp2p/go-libp2p-kad-dht#966).

go-libp2p-kad-dht only returns its own peer record.
rust-libp2p sends an empty response.
js-libp2p only returns its own peer record.

None of these implementations actually follows the spec, so they should be adapted to return the k closest peers.

Proposal

Nodes should not return their own peer record if they are among the k known closest peers to the requested key.
- Peers sending the request FIND_NODE(N) to N already have N's peer record. Even though the peer record sent by N could include additional multiaddresses that the requestor isn't aware of yet, this is the role of Identify, not the DHT.
FIND_NODE response should never include the requester among the closer peers.
- This information is useless.

SgtPooki · 2024-03-27T14:56:36Z

Probably some dumb questions. Wouldnt it make sense for the requestor to filter responses rather than peers not providing accurate responses?

If im asking for the closest N nodes to a key, a node adding itself to the response (if it is one of the closest), or adding the requestor, is more truthful than not.

Is find-node only used when looking for peers to add provider records to or are there other uses? how are the responses currently processed for all find-node calls? Why wouldnt the requestor filter responses as necessary? They are already keeping a set of queried nodes.

Can we guarantee that a prior find-node call that returned the current requestee was already processed?

guillaumemichel · 2024-03-27T15:14:02Z

If im asking for the closest N nodes to a key, a node adding itself to the response (if it is one of the closest), or adding the requestor, is more truthful than not.

Is find-node only used when looking for peers to add provider records to or are there other uses? how are the responses currently processed for all find-node calls? Why wouldnt the requestor filter responses as necessary? They are already keeping a set of queried nodes.

Can we guarantee that a prior find-node call that returned the current requestee was already processed?

I think that in most implementations the requestor is already filtering the response, which is good.

Not sending these two peers is about saving space on the wire. A response including either one of these peers will only contain useful information about k-1 peers.

FIND_NODE is used to find peers that are close to the requested key, this has many applications such as finding a specific node, looking for the closest peers to store a provider record/IPNS record, etc.

dennis-tra · 2024-03-28T10:48:09Z

First of all, I think it's indeed not spec-compliant to only return a single record when the spec says to always return the k closest peers to a key and I'm favour of calling out these special cases here.

In my mental model, a peer is always in its own 255th bucket so it must be part of the response set. This means I'm siding with @SgtPooki here when he says:

If im asking for the closest N nodes to a key, a node adding itself to the response (if it is one of the closest), or adding the requestor, is more truthful than not.

However, I think the most important consideration is if any of the options (including the peers or not) has an impact on the correct operation of the protocol. I would argue, in terms of routing soundness it doesn't matter, right? If it does then we would have a clear winner.

Some references I have found from git blame:

DHTClient mode breaks peer routing go-libp2p-kad-dht#161
the "Never tell a peer about itself." part was added here libp2p/go-libp2p-kad-dht@2c8f387

The special case

targetPid := peer.ID(pmes.GetKey())
  if targetPid == dht.self {
  	closest = []peer.ID{dht.self}
  }

dates back to 2014 with no specific justification in the git commit messages

sukunrt · 2024-04-02T13:46:23Z

Apologies if this is off topic or my understanding is completely incorrect.

I think the do not tell the peer about itself logic is broken.

closest = dht.betterPeersToQuery(pmes, from, dht.bucketSize)

// Never tell a peer about itself.
if targetPid != from {
	// Add the target peer to the set of closest peers if
	// not already present in our routing table.
	//
	// Later, when we lookup known addresses for all peers
	// in this set, we'll prune this peer if we don't
	// _actually_ know where it is.
	found := false
	for _, p := range closest {
		if targetPid == p {
			found = true
			break
		}
	}
	if !found {
		closest = append(closest, targetPid)
	}
}

if the requester is in the server's routing table, we will tell the peer about itself because it is in the closest slice.

Why do we return the targetPid if it wasn't found in the routing table?

guillaumemichel · 2024-04-02T15:23:30Z

@sukunrt these issues are specific to go-libp2p-kad-dht.

I think the do not tell the peer about itself logic is broken.

dht.betterPeersToQuery never returns itself, nor from. Because in the current implementation, a node is expected to return its own peerid if it receives a FIND_NODE request for itself, the node will add its own peerid to the list.

Why do we return the targetPid if it wasn't found in the routing table?

It is possible that we are connected to a node, but it isn't in the routing table (e.g because there are better candidates). Adding targetPid to the list allows us to return the target peer record, in this special situation.

achingbrain · 2024-04-29T18:17:12Z

This suggestion in combination with libp2p/go-libp2p-kad-dht#820 makes it really hard to bootstrap a DHT from scratch.

Since initially no nodes have any peers in their routing tables, the PR above means go-libp2p-kad-dht will only add a peer to it's routing table that responds with 1+ peers to a query for it's own PeerId, and since no nodes will respond with their own Peer Info to a query for their own PeerID, then no peer will ever be added to the routing table of a go-libp2p-kad-dht node unless that peer already has DHT peers.

This is fine if you're joining an already established DHT but makes creating one almost impossible, unless I'm missing something.

I think maybe you include your own PeerInfo in the response to a request for your own PeerId. Granted it's not the most efficient thing to do, but arguably querying a peer for it's own PeerId doesn't make much sense, unless you're doing it as an effective no-op as go-libp2p-kad-dht does.

guillaumemichel · 2024-04-30T09:27:14Z

Good point @achingbrain!

I think that the lookupCheck introduced in libp2p/go-libp2p-kad-dht#820 is responsible of the problem you are describing. A node shouldn't be picky about peers it adds to its routing table, when its own routing table isn't well populated.

I just opened a PR to address this libp2p/go-libp2p-kad-dht#970

arguably querying a peer for it's own PeerId doesn't make much sense

An example where this query makes sense could be a reputation system where peers gets a grade from other network participants. The grade would be stored on the n closest peers to the target node. We don't want to trust the node itself to store its own grades. In order to give or retrieve the grade associated with a peer, FIND_NODE(peer) should return the n closest peers.

achingbrain · 2024-04-30T09:30:55Z

Yes, my point was from the point of view of a "find closer peers"-type query it doesn't make sense, one where you want to find out more information about the network, find out who to dial next, etc, not that there was no reason to do it. There are plenty of reasons to do it as we both point out. 😄

updated peer routing description

8196b2d

This was referenced Mar 27, 2024

kad: FIND_NODE(self) should return multiple peers libp2p/js-libp2p#2450

Closed

findnode(self) should return multiple peers libp2p/go-libp2p-kad-dht#968

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

kad: improve FIND_NODE response definition #609

kad: improve FIND_NODE response definition #609

guillaumemichel commented Mar 27, 2024

SgtPooki commented Mar 27, 2024

guillaumemichel commented Mar 27, 2024

dennis-tra commented Mar 28, 2024

sukunrt commented Apr 2, 2024

guillaumemichel commented Apr 2, 2024

achingbrain commented Apr 29, 2024 •

edited

Loading

guillaumemichel commented Apr 30, 2024

achingbrain commented Apr 30, 2024

kad: improve FIND_NODE response definition #609

Are you sure you want to change the base?

kad: improve FIND_NODE response definition #609

Conversation

guillaumemichel commented Mar 27, 2024

Context

Proposal

SgtPooki commented Mar 27, 2024

guillaumemichel commented Mar 27, 2024

dennis-tra commented Mar 28, 2024

sukunrt commented Apr 2, 2024

guillaumemichel commented Apr 2, 2024

achingbrain commented Apr 29, 2024 • edited Loading

guillaumemichel commented Apr 30, 2024

achingbrain commented Apr 30, 2024

achingbrain commented Apr 29, 2024 •

edited

Loading