-
Notifications
You must be signed in to change notification settings - Fork 164
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Obtaining ids for relationships traversed in path finding algorithms #105
Comments
Hello @hbldh and thanks for reaching out to us, and also thank you for the kind words. We are happy to be able to provide the library. It seems to me that what you are asking for is relationship identities. Neo4j (the database) obviously has this notion, as the property graph data model it supports mandates it. However, the in-memory GDS graphs does not maintain a mapping between the source relationships in the database and the projected relationships of the in-memory graph. For nodes, we do maintain such a mapping. This enables GDS to find what node in Neo4j was projected to GDS and write back a property to that very node (containing e.g. a page rank score or a community id). GDS does not do anything similar for relationships; whenever we write relationships to the database, they are always completely new. In our view, the GDS in-memory graph is always a projection or abstraction of the Neo4j database graph. For example, we support deduplication of parallel relationships into an aggregated relationship that explicitly does not exist in the Neo4j database. Similarly, the Now, it wouldn't be impossible to add a mapping layer to map also Neo4j relationship ids, but with the current design of the in-memory graph it would be very difficult to achieve, and we're not sure that's a direction we would want to go in. It would probably mean that the in-memory graph would grow in size substantially. One idea could be to project the original relationship id as a property to the relationship, which could then be used when reaching back to Neo4j in the cases where an abstract relationship has not been created (due to deduplication or collapsePath or Cypher projection or otherwise). As a separate observation, I'm wondering why you need the exactness of the relationship id? The shortest path problem will always find the shortest (or cheapest, if weighted) path, so you can be sure that if a path passed from node In short, the answers to your questions would be:
When you say "creating |
Thanks for the verbose answer; this was more than I had expected and hoped for! I realize I did not emphasize one important aspect of this: I was planning on using the Yen's K Shortest Paths for the application I was writing, and as such I need to be able to discern between different paths returned. It is not merely the optimal path that I am interested in, and given that requirement I have a need to be able to match the returned I can understand your reasoning with If you say that it might be done, but not easily, I will take that as a discouragment to try. It will most probably mean that my changes will make the K-shortest path problem into a special case, and had I been reviewing that PR I would not have accepted it due to the added maintenance weight.
I merely referred to the I have nothing else to add. I have found a way to handle this and I am content. If you do not see any problem in the way it is implemented now, then we are both content. Thank you once again. |
@hbldh That sounds all good. I should add that relationship ids will very likely make it into a future version of GDS. Tangentially related to your issue here, but also to validate another idea, we made a POC that would enable you to map relationship ids as relationship properties. However, we realised that even with the ids attached to the relationships in the in-memory graph, we would need to design a model for the pathfinding algorithms to use them in a good way, with fallback semantics when there are no relationship ids. We deem this a bigger project which we hope to tackle in the future. I think you are aptly highlighting a limitation of the library as it stands, but perhaps as an answer to the question of how k-shortest paths could be useful without identifying individual relationships, I could highlight that knowing the top-k total costs of going from a to b can also be used for valuable insight. For example, if going from data center A to B with cost 6 is minimal, but the second-cheapest has cost 60, we can know something about the risk with the line being broken. Of course, this doesn't tell you exactly where to look to make some action, but it does provide you with some answer. Anyway, I'm glad you found a workaround, and we'd be happy to receive any further issues that you may encounter in the future. All the best |
Hi, sorry to resuscitate this post but I ran in the same issue. But in my case for example I need to count the number of hops, but only if they are of relationship R1. I can use a workaround because I know that R2 is present only if the cost is less than X, but there might be corner cases where this is not true. Adding custom properties to the virtual relationship would already work (like, custom properties, not really projected) but also limiting since strings are not suppoerted for vrelationships (yes, we could use numbers and map them somehow, but again it is a workaround). I understand this is not a top issue, but it would be nice if there would be a way to activate the possibility to map the relationships, maybe through configuration (by default no mapping). What's the status with this? |
The status at the moment is that we are not looking into these algorithms for the upcoming release. I will note your request and re-open this issue until we have a more conclusive answer. The number of requests for relationship ids is increasing, which does increase the likelihood that it out-competes other feature work. |
Hello,
First of all, thank you for GDS, it is an amazing set of tools and I am delighted by the 1.5.0 release, which I am using with the 4.2.3 release of the Neo4j server.
I have an issue where I cannot obtain enough data from the relationships traversed in the path finding algorithms though. There is a community discussion around this as well, and it seems to not have arrived at a solution.
Say that you have a graph similar to the road network examples in the GDS documentation:
I have added a second relationship with a new label between
B
andD
compared to your example.Now, you want to apply e.g. Yen's K Shortest Paths algorithm and find paths between
A
andF
:If I look at the
path
results of the first record I seeI can get the nodes, with ids and properties, but I have no identification to match which of the available relationships that was traversed in the
segments[i].relationship
specification. It creates a newPATH_0
type and does not relay any properties except the one specified in therelationshipWeightProperty
. All I have to match with is thecost
property (which in most cases in my usecase will be enough, even though I guess that matching needs to be done in the application by repeatedly querying the database for relationships between these nodes with the correspondingcost
).I have tried with native projections, anonymous projections and cypher projections with the same results.
Questions:
PATH_X
type relationships and returning those instead that I am unaware of?The text was updated successfully, but these errors were encountered: