-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Multiple backends error when using join #7
Comments
thanks for reporting that @safrazRampersaud I will take a look into that today. |
Thank you @xmnlab. Happy to help any way I can ... |
@xmnlab Debugging a bit more, for N number of tables the ibis-mssql connection is supplying N number of clients (maybe?). Looking again at the SQLiteClient example with two tables, there is only one object ibis.backends.sqlite.client.SQLiteClient which is referenced twice but since the object has been seen before in the search, the find_backends method from the ibis client implementation does not return two of them to the compile method. This differs when I trace a simple example with ibis-mssql and I find that for 2 tables, there are two different ibis_mssql.client.MSSQLClient objects. Is ibis-mssql creating a different client for every table that gets referenced on the same connection? Hopefully this helps ... |
@safrazRampersaud I think you should execute the materialized expression (ref http://ibis-project.org/docs/user_guide/sql.html#join-with-select), for example: result = result.materialize()
result.execute() Could you check if it fixes the problem? |
Hi @xmnlab Thanks for the response! Unfortunately, I get the same error even with your suggestion. |
If you are trying ibis-mssql from source, could you try some changes in the connect function in api.py? First, you will need to import some modules, add it in the beginning of the file: import ibis.config as cf
from ibis.config import options these are the changes you can try: diff --git a/ibis_mssql/api.py b/ibis_mssql/api.py
index 9f4fea3..c81d405 100644
--- a/ibis_mssql/api.py
+++ b/ibis_mssql/api.py
@@ -113,7 +113,7 @@ def connect(
year : int32
month : int32
"""
- return MSSQLClient(
+ client = MSSQLClient(
host=host,
user=user,
password=password,
@@ -123,3 +123,12 @@ def connect(
driver=driver,
odbc_driver=odbc_driver,
)
+
+ if options.default_backend is None:
+ options.default_backend = client
+
+ with cf.config_prefix('sql'):
+ k = 'default_limit'
+ cf.set_option(k, None)
+
+ return client so basically, instead of return directly MSSQLClient, you will store that to a variable called client ... you will add some extra configuration, and it will return the client variable. this is the same configuration we are using for ibis-omniscidb let me know if it works for you. |
@xmnlab Awesome!! Thank you!! I just looked up and saw your message after I found where my issue lies. When calling the connect method in client.py, if the database parameter is not explicitly set in the signature, then it defaults to master. I used a connection string on my end at first, but then I switched to accessing the connection via other parameters in the signature. Concerning the default, its not a problem in my case until Line 121 in ibis-mssql/client.py, where the client_class instance will get its own ibis_mssql.client.MSSQLClient object. Then, for N tables referenced, N backends will be produced since there is no anchor to an explicit database (master by default does not exist in my setup). Does my interpretation sound in the neighborhood of being correct? Looking at your suggestion, I agree with where you are going setting a default_backend mapped to the already discovered client. I'd like to implement it for coverage and let you know how it holds. Thanks again ... wish I would have came up with my work around sooner so that you didn't have to waste time on it but maybe the default_backend solution will prove useful later. |
@safrazRampersaud I am glad that you found a workaround for this issue. We also need to add more tests for the project, so maybe we can find this kind of issue early in the test stage PS: as ibis-framework doesn't have a way to test "external" backends, we need to figure out a way to create some tests that could be reusable. |
@xmnlab I added the default_backend as directed. Works well! Time for a PR? Yes, I hear you on the tests. I think this is something I'll have to think about as well. Happy to have this conversation if it will provide more coverage to the project and then some. |
that sounds great! for this PR, just a simple test for the join expression should be enough for now. I am going to create an issue for tests, so we can discuss some ideas to increase the coverage for tests there. |
Great! ... I'll look at putting something together this weekend. Thanks! |
That sounds great! thank you so much. |
Created a PR for the default_backend change and for getting the ball rolling on tests. Could use another set of eyes on it to see if this is the direction the project wants to go. Moreover, I left comments about what I believe was supposed to be accomplished, but again ... could use some more direction here. |
Hello,
I am encountering an error that is related to multiple backends when attempting to perform a join. I can't provide the exact code but I can provide the skeleton syntax of my steps:
Here is what I have observed:
I printed out the backends from the ibis-framework client (ibis/client.py) in method validate_backends before the stack trace error. I saw [<ibis_mssql.client.MSSQLClient object at 0x000001A3EED58588>, <ibis_mssql.client.MSSQLClient object at 0x000001A3EED58448>]. My first thought was that these backends correspond to the number of tables I am joining. To determine if that was true, I joined 3 tables and the output from the backends had 3 ibis_mssql.client.MSSQLClient objects.
For comparison, I ran the example in https://labs.quansight.org/blog/2020/06/ibis-an-idiomatic-flavor-of-sql-for-python-programmers/ under the Materializing the join section to see if the join would give the same issue about the number of backends. It did not. Moreover, I investigated the results of the materialize() command in:
from the website and it was interesting that there were 2 SQLiteTables and a MaterializedJoin[table].
There was no MaterializedJoin[table] from [5] in my steps when I took a closer look.Note: I've resolved all the duplicate columns from the join and each of the tables referenced above can successfully have execute() invoked on it. So I do not think there are any issues with my setup.
Have any thoughts about what might be causing this? Happy to provide more details about this error. Thanks!
The text was updated successfully, but these errors were encountered: