Support multiple types of DBs at once #3278

Closed
wants to merge 10 commits into from

@Snapstromegon Snapstromegon commented Jun 10, 2024

This PR is meant as a starting point and baseline for discussions.

It implements support for multiple types of DBs by using multiple "DATABASE_URL_*" environment variables, falling back to the current "DATABASE_URL" variable.

This doesn't allow using multiple DBs of the same type, but it does allow using two different DBs as long as they don't share a driver (e.g. Postgres and SQLite).
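
As a rough illustration of the fallback described above, here is a minimal sketch that is not taken from the PR's code. The function name `resolve_database_url` and the `vars` map (standing in for the process environment) are assumptions made for this example; only the variable names `DATABASE_URL` and `DATABASE_URL_<DRIVER>` come from the description.

```rust
use std::collections::HashMap;

/// Hypothetical helper: look up the URL for one driver.
/// `vars` stands in for the process environment so the lookup is easy to test.
fn resolve_database_url(vars: &HashMap<String, String>, driver_suffix: &str) -> Option<String> {
    // Prefer the driver-specific variable, e.g. DATABASE_URL_SQLITE.
    let specific = format!("DATABASE_URL_{}", driver_suffix.to_uppercase());
    vars.get(&specific)
        // Otherwise fall back to the plain DATABASE_URL variable.
        .or_else(|| vars.get("DATABASE_URL"))
        .cloned()
}
```

In a real program the map could be built with `std::env::vars().collect()`. With only `DATABASE_URL` set, every driver resolves to the same URL, which matches today's single-database behaviour.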

fixes #121

@Snapstromegon
Author

This PR right now only holds a rough draft.
I don't have much experience working with Rust macros, so I chose the easy way for some first discussions.

Right now I have only implemented the query! macro, but the others should work the same way.
It also uses stringly typed selection of the DB driver. I don't like that, but I wasn't able to get the procedural macros to work with actual types.

This is what it currently looks like:

query!(
    "SQLite",
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)"
).execute(&sqlite).await.unwrap();
let row = query!("PostgreSQL", "SELECT 1 as ID").fetch_one(&pg).await.unwrap();

If you now set the environment variables DATABASE_URL_SQLITE and DATABASE_URL_POSTGRES, both queries will get checked at compile time and work at once.

If you have ideas for improving this, please let me know.

@esmevane

This is cool to see!

I'm wondering two things about the DATABASE_URL_DRIVER format. I don't want to make it sound like I'm suggesting you do anything differently, I'm mostly just curious and wondered if maybe you've considered these approaches to the design already:

  • Is it possible to get any DATABASE_URL_* string from the environment? Say by pulling up the environment as a whole and just plucking matches?
  • Is it possible to infer the driver via the protocol? mysql:// or postgres:// or sqlite:// etc. In other words, let the interior of the env var describe the kind of DB, vs. the var name? (I think this might already be something sqlx does?)

The reason I ask these is because I bet you if these things are possible, it would then be possible to allow the caller to define any variables they like, even having multiple databases with the same driver. I.E., DATABASE_URL_MEMORY and DATABASE_URL_CACHE, both as sqlite.

@Snapstromegon
Author

  • Is it possible to get any DATABASE_URL_* string from the environment? Say by pulling up the environment as a whole and just plucking matches?

Yes, this is totally possible and I want to implement this in the future. It should even be possible to infer the expected variable at compile time from the type of the database driver (via Database::NAME).
Finding some wildcard solution is also important for supporting third-party DB drivers (e.g. for DuckDB or something similar).

  • Is it possible to infer the driver via the protocol? mysql:// or postgres:// or sqlite:// etc. In other words, let the interior of the env var describe the kind of DB, vs. the var name? (I think this might already be something sqlx does?)

This is also possible, and I thought about it too. Right now this PR already kind of does this, because the names are only for differentiating and the code doesn't check that a DATABASE_URL_POSTGRES actually holds a Postgres URL. You could just as well (using the pattern matching from point 1) use DATABASE_URL_FILESYSTEM and DATABASE_URL_SERVER for e.g. SQLite and Postgres.
When resolving the type to a URL, it does exactly this check: it tests whether any of the vars contains a URL that matches one of the schemes the DB supports.

The reason I ask these is because I bet you if these things are possible, it would then be possible to allow the caller to define any variables they like, even having multiple databases with the same driver. I.E., DATABASE_URL_MEMORY and DATABASE_URL_CACHE, both as sqlite.

Sadly, IMO this will not be possible (at least as I've implemented it right now), because resolving the URL always uses the first URL that matches the scheme for the DB, so you can't have two SQLite DBs.

IMO the most urgent use case is supporting multiple types of DBs (as this is my biggest pain point). Maybe we can find a way to support multiple DBs of the same type at a later point in time.
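
The first-match behaviour described in this comment can be sketched as follows. This is an illustration under assumptions, not the PR's actual implementation: the function name `first_url_for_schemes` and the ordered `vars` slice (standing in for the environment) are hypothetical, and the scheme lists are passed in by the caller rather than read from a driver.

```rust
/// Hypothetical helper: return the first `DATABASE_URL_*` value whose URL
/// scheme is one of the schemes the driver accepts.
/// `vars` is an ordered list of (name, value) pairs standing in for the environment.
fn first_url_for_schemes<'a>(vars: &'a [(String, String)], schemes: &[&str]) -> Option<&'a str> {
    vars.iter()
        // Only consider variables following the DATABASE_URL_* naming pattern.
        .filter(|(name, _)| name.starts_with("DATABASE_URL_"))
        .map(|(_, url)| url.as_str())
        // The text before the first ':' is the URL scheme.
        .find(|url| {
            url.split_once(':')
                .map_or(false, |(scheme, _)| schemes.contains(&scheme))
        })
}
```

Because `find` stops at the first match, a second variable with the same scheme (say, `DATABASE_URL_MEMORY` followed by `DATABASE_URL_CACHE`, both SQLite) can never be selected. That is the limitation described above: the variable names only differentiate entries, and the scheme alone decides which URL a driver gets.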

@Snapstromegon
Author

I'm working on a version that uses types instead of strings for selecting the DB backend here: Snapstromegon#1

I'm still running into an issue when multiple backends are compatible with a query and any help is welcome.

@Snapstromegon
Author

Hi all, I just merged my implementation of type-driven query provider selection.

With this change you can use the query! macro like this:

query!(
  Sqlite,
  "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)"
).execute(&sqlite).await.unwrap();
query!(
  Postgres,
  "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)"
).execute(&pg).await.unwrap();

I only implemented the declarative macro rule changes for query!, but it should work basically the same for all other macros.

I'd love to have some feedback from someone related to the project about what would need to be added to this PR so a feature like this can be merged.

@Snapstromegon
Author

@abonander I saw that you were commenting in the similar PR #3397. What do you think about the approach presented here?
IMO this is one of the easiest ways to support the (from my experience) most common use case: using multiple different DBs as backends.

Maybe we can introduce a solution based on this for the "simple" cases and one based on the toml files for the more complex ones?

@abonander
Collaborator

Sorry, I just don't think this is the right approach.

Having multiple ways to do something is just more to teach and more possible confusion for the user.

This would also be rather annoying to use because you'd have to remember to specify the driver every time.

I'm not even sure how usage of two different drivers in the same context is supposed to work; does one of the queries just become a no-op or what? That would be a nightmare to teach.

With the sqlx.toml approach, you would have a separate sub-crate per driver, which could be separately compiled in online or offline mode, and you could rename the DATABASE_URL variable whatever you want to suit the application. You wouldn't have to remember to specify the driver every time. And you would have the opportunity to configure many more things besides.

Later on, we can support the ability to create shadowed versions of the macros with a prefixed name, using a specific config. This would let you mix and match drivers and databases within the same crate to your heart's content, in a way that IDE autocompletion could actually possibly assist with.

I'm still running into an issue when multiple backends are compatible with a query and any help is welcome.

That's covered by configuring the macros to emit code using the Any database so the driver can be chosen at runtime. That's been a long-requested feature and would be the approach that I'd recommend. It would also be enabled by the sqlx.toml work.

The sqlx.toml solution is what I've arrived at after a long time of thinking about it. It covers so many different use-cases in one feature, and it's highly extensible. I'm focusing what time I have to spend on SQLx on getting it implemented so people can start playing with it.

I appreciate the effort, but unfortunately because I fundamentally disagree with the direction here, I'm going to close.

@abonander abonander closed this Sep 18, 2024
@Snapstromegon
Author

Sorry, I just don't think this is the right approach.

Having multiple ways to do something is just more to teach and more possible confusion for the user.

I fully agree here.

This would also be rather annoying to use because you'd have to remember to specify the driver every time.

With my implementation, you only need to specify the driver if you actually want to use multiple drivers. It's an extra step that only applies when you use this feature, and IMO it needs to be done anyway.

I'm not even sure how usage of two different drivers in the same context is supposed to work; does one of the queries just become a no-op or what? That would be a nightmare to teach.

No, if the query gets called, it gets executed. This could be useful e.g. for tools that migrate data from a SQLite DB to a Postgres one.
Aside from that, if you want to support multiple DBs in your program, you could select a backend variant at runtime.
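
To make the "select a backend variant at runtime" idea concrete, here is a minimal standard-library sketch. The `Backend` enum and both of its methods are hypothetical names invented for this example; with the macro form proposed in this PR, each match arm would presumably hold its own `query!(Sqlite, ...)` or `query!(Postgres, ...)` call instead of returning SQL text.

```rust
/// Hypothetical set of backends a program is willing to run against.
#[derive(Debug, PartialEq)]
enum Backend {
    Sqlite,
    Postgres,
}

impl Backend {
    /// Pick the variant at runtime from the scheme of the configured URL.
    fn from_url(url: &str) -> Option<Backend> {
        match url.split_once(':')?.0 {
            "sqlite" => Some(Backend::Sqlite),
            "postgres" | "postgresql" => Some(Backend::Postgres),
            _ => None,
        }
    }

    /// Each variant carries its own dialect-specific statement.
    fn create_users_table(&self) -> &'static str {
        match self {
            Backend::Sqlite => "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)",
            Backend::Postgres => "CREATE TABLE users (id SERIAL PRIMARY KEY, name TEXT)",
        }
    }
}
```

The point of the sketch is that both dialects live in the same crate and every arm is real code: whichever variant is selected, its statement is the one that runs, and nothing becomes a no-op.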

With the sqlx.toml approach, you would have a separate sub-crate per driver, which could be separately compiled in online or offline mode, and you could rename the DATABASE_URL variable whatever you want to suit the application. You wouldn't have to remember to specify the driver every time. And you would have the opportunity to configure many more things besides.

Later on, we can support the ability to create shadowed versions of the macros with a prefixed name, using a specific config. This would let you mix and match drivers and databases within the same crate to your heart's content, in a way that IDE autocompletion could actually possibly assist with.

I'm still running into an issue when multiple backends are compatible with a query and any help is welcome.

That's covered by configuring the macros to emit code using the Any database so the driver can be chosen at runtime. That's been a long-requested feature and would be the approach that I'd recommend. It would also be enabled by the sqlx.toml work.

The sqlx.toml solution is what I've arrived at after a long time of thinking about it. It covers so many different use-cases in one feature, and it's highly extensible. I'm focusing what time I have to spend on SQLx on getting it implemented so people can start playing with it.

I appreciate the effort, but unfortunately because I fundamentally disagree with the direction here, I'm going to close.

If the sqlx.toml gets introduced, why not completely remove the env var entirely and move everything into the toml? To me it feels again like two ways of doing things if the env var persists.

Aside from that I'll need to take a look at how exactly it will work with the sub-crates and your current solution around sqlx.toml.

Personally I love that there's some movement around this, as this is right now one of my main blockers for an app I'm building.

@abonander
Collaborator

If the sqlx.toml gets introduced, why not completely remove the env var entirely and move everything into the toml? To me it feels again like two ways of doing things if the env var persists.

No, because the sqlx.toml is for permanent, global configuration changes that are meant to be checked into version control, while DATABASE_URL is environment-specific.

Successfully merging this pull request may close these issues.

Question: best way to use sqlx with connections to two different databases?
3 participants