How to query with code? #25

Open
marimeireles opened this issue Aug 20, 2024 · 4 comments


@marimeireles

Hey @migalkin, thank you for making this project open source! It's really awesome :)

I'm trying to use UltraQuery on my ontologies, but I'm new to the ecosystem and I'm struggling to understand how to write queries in Python. I understand the overall idea of performing search with FOL and how it translates to this:

struct2type = {
    ("e", ("r",)): "1p",
    ("e", ("r", "r")): "2p",
    ("e", ("r", "r", "r")): "3p",
    (("e", ("r",)), ("e", ("r",))): "2i",
    (("e", ("r",)), ("e", ("r",)), ("e", ("r",))): "3i",
    ((("e", ("r",)), ("e", ("r",))), ("r",)): "ip",
    (("e", ("r", "r")), ("e", ("r",))): "pi",
    (("e", ("r",)), ("e", ("r", "n"))): "2in",
    (("e", ("r",)), ("e", ("r",)), ("e", ("r", "n"))): "3in",
    ((("e", ("r",)), ("e", ("r", "n"))), ("r",)): "inp",
    (("e", ("r", "r")), ("e", ("r", "n"))): "pin",
    (("e", ("r", "r", "n")), ("e", ("r",))): "pni",
    (("e", ("r",)), ("e", ("r",)), ("u",)): "2u-DNF",
    ((("e", ("r",)), ("e", ("r",)), ("u",)), ("r",)): "up-DNF",
    ((("e", ("r", "n")), ("e", ("r", "n"))), ("n",)): "2u-DM",
    ((("e", ("r", "n")), ("e", ("r", "n"))), ("n", "r")): "up-DM",
}

But I haven't found a piece of code that actually queries a database.
I've unpickled the query files that were used to train Ultra, but that didn't prove very enlightening either.

I think I'm going in the completely wrong direction. That's why I'm asking for help!

Once I've figured this stuff out I'm happy to write a little tutorial or docs for newbies like me to get started with the project, if you think it'd be beneficial.

Best and thanks again,

@migalkin
Collaborator

Hey hey, thanks for looking into this.

Indeed, the current query notation is not really SPARQL- or database-ready - it is inherited from how Query2Box defined the queries in this parenthesis notation.
There is, however, a direct mapping between this notation and SPARQL, for example,

  • A 1p (single-hop) query ("e", ("r",)) is equivalent to
SELECT ?t WHERE { e r ?t }
  • A 2p (two-hop) query ("e", ("r1", "r2")) can be written as
SELECT ?y WHERE {
    e r1 ?x .
    ?x r2 ?y
}
  • and so on for other query types
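
To make the mapping above concrete, here is a minimal Python sketch that renders a projection-chain query (1p, 2p, 3p) as a SPARQL string. The function name `path_query_to_sparql` and the variable names `?v1`, `?v2`, ... are made up for illustration and are not part of UltraQuery; intersection, negation, and union types are not covered.

```python
def path_query_to_sparql(query):
    """Render a chain query ("e", ("r1", ..., "rk")) as a SPARQL SELECT string.

    Hypothetical helper for illustration only; it handles projection chains
    (1p/2p/3p) and nothing else.
    """
    anchor, relations = query
    if not relations:
        raise ValueError("expected at least one relation")
    patterns = []
    subject = anchor
    for hop, relation in enumerate(relations, start=1):
        obj = f"?v{hop}"  # fresh variable for the node reached after this hop
        patterns.append(f"{subject} {relation} {obj}")
        subject = obj  # the next hop starts where this one ended
    body = " . ".join(patterns)
    # After the loop, `subject` is the variable bound by the last hop.
    return f"SELECT {subject} WHERE {{ {body} }}"


print(path_query_to_sparql(("e", ("r",))))
print(path_query_to_sparql(("e", ("r1", "r2"))))
```

Each hop introduces one intermediate variable, and the variable of the last hop is the one that gets selected.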

The "query engine" in UltraQuery expects the parenthesis notation in order to derive the execution order, so if you want to answer SPARQL queries with the model, there should be some external module (smth rule-based or even LLM, hehe) that translates SPARQL queries into this notation.
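
As a rough idea of what the rule-based variant of such a module could look like, here is a small sketch that goes in the opposite direction, from a very restricted SPARQL subset to the parenthesis notation. Everything in it is an assumption for illustration: the name `sparql_path_to_query` is hypothetical, terms must be whitespace-free tokens, and only a single chain of triple patterns ending in the selected variable is supported. A real translator would need a proper SPARQL parser.

```python
import re


def sparql_path_to_query(sparql):
    """Translate a chain-shaped SELECT query into ("e", ("r1", ..., "rk")).

    Hypothetical helper: supports one selected variable and a body made of
    `subject relation object` patterns separated by " . ".
    """
    match = re.search(r"SELECT\s+(\?\w+)\s+WHERE\s*\{(.*)\}", sparql, re.S | re.I)
    if match is None:
        raise ValueError("unsupported query shape")
    target, body = match.group(1), match.group(2)

    # Tokenize on whitespace and drop the "." separators between patterns.
    tokens = [tok for tok in body.split() if tok != "."]
    if not tokens or len(tokens) % 3 != 0:
        raise ValueError("body is not a list of triple patterns")

    # Remember which (subject, relation) pair binds each object variable.
    produced_by = {}
    for i in range(0, len(tokens), 3):
        subject, relation, obj = tokens[i:i + 3]
        produced_by[obj] = (subject, relation)

    # Walk backwards from the selected variable until a constant anchor is hit.
    relations = []
    node = target
    while node.startswith("?"):
        if node not in produced_by:
            raise ValueError(f"variable {node} is not bound by a chain pattern")
        # pop() ensures each pattern is used once, so cycles cannot loop forever.
        node, relation = produced_by.pop(node)
        relations.append(relation)
    return (node, tuple(reversed(relations)))


print(sparql_path_to_query("SELECT ?t WHERE { e r ?t}"))
print(sparql_path_to_query("SELECT ?y WHERE {\n  e r1 ?x .\n  ?x r2 ?y\n}"))
```

The result still contains names rather than integer ids, so a lookup from names to the ids used in the graph would be needed before handing the query to the model.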

The graphs are expected to be PyG Data objects residing in memory (RAM or GPU memory). PyG itself does have some bindings to call external graph databases but there is no specific piece of code to query external databases. Might be a useful PR though!
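
For getting your own triples into an in-memory form, the first step is usually to map entity and relation names to integer ids. Below is a minimal pure-Python sketch of that step; the helper name `index_triples` and the output field names (`edge_index`, `edge_type`) are assumptions for illustration, so please check the dataset loaders in the repo for the exact attributes the model expects. The resulting lists would still need to be converted to tensors and wrapped in a PyG Data object.

```python
def index_triples(triples):
    """Map (head, relation, tail) string triples to integer-indexed edge lists.

    Hypothetical helper: returns a 2 x num_edges edge list, a per-edge relation
    id list, and the two name-to-id vocabularies.
    """
    entity_ids, relation_ids = {}, {}
    heads, tails, edge_type = [], [], []
    for head, relation, tail in triples:
        # setdefault assigns the next free id the first time a name is seen.
        h = entity_ids.setdefault(head, len(entity_ids))
        t = entity_ids.setdefault(tail, len(entity_ids))
        r = relation_ids.setdefault(relation, len(relation_ids))
        heads.append(h)
        tails.append(t)
        edge_type.append(r)
    edge_index = [heads, tails]  # row 0: source nodes, row 1: target nodes
    return edge_index, edge_type, entity_ids, relation_ids


triples = [("a", "likes", "b"), ("b", "likes", "c"), ("a", "knows", "c")]
edge_index, edge_type, entity_ids, relation_ids = index_triples(triples)
print(edge_index, edge_type)

# The same vocabularies turn a named 1p query into its id form.
named_query = ("a", ("likes",))
id_query = (entity_ids[named_query[0]], tuple(relation_ids[r] for r in named_query[1]))
print(id_query)
```

Keeping the vocabularies around matters because the predicted answers come back as entity ids and have to be mapped back to names.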

@marimeireles
Author

Very cool, that was a great explanation @migalkin.

smth rule-based or even LLM, hehe

That's exactly my end-goal! 😅 I'm struggling so others don't have to.

@marimeireles
Author

Sorry, so you're saying that the right way, or the best way, of going about this is creating a torch_geometric.data.GraphStore with your dataset and then model(graph, query)?

@migalkin
Collaborator

Something like this, yes.
Generally, it depends on the graph size - you probably don't need all that hassle for graphs with < 100K nodes or < 5M edges: those can easily fit into most modern GPUs.
