Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Only install PyHive for default HiveServer2Client driver #65

Merged
merged 2 commits into from
Sep 11, 2018

Conversation

naoyak
Copy link
Collaborator

@naoyak naoyak commented Sep 8, 2018

Since HiveServer2Client defaults to using PyHive for its driver, it seems like we don't really need to install impyla and pyhive. I think we should drop impyla from the default install reqs and raise if the user tries to use it for the Hive backend.

I came across this recently when a few new employees installing Python on OSX via brew install python (which defaults to 3.7 now, problematic given all the packages lagging in support...) and omniduct[hive] failed due to thriftpy via impyla. (Thriftpy/thriftpy#333)
Impyla crashes on 3.7 on its own (cloudera/impyla#312) as well.

The 3.7 issue aside (PyHive is fixed: dropbox/PyHive#232), wdyt of installing only PyHive by default?

@matthewwardrop
Copy link
Collaborator

Hi @naoyak ! pyhive seems to be winding down a bit, and there's still some hope that impyla will become a first-class citizen of the Apache Impala project, which would then make it a more future-proof dependency. For the time being, however, pyhive is (in my experience) more stable, and so I'm happy to accept this patch.

As I'm imagining users who didn't actually write the wrapper library will be the ones seeing these errors more often, can we change the error message slightly to: " is attempting to use the 'impyla' driver, but it is not installed. Please either install the impyla package, or reconfigure this Duct to use the 'pyhive' driver."

Thanks for the patch!

@naoyak
Copy link
Collaborator Author

naoyak commented Sep 11, 2018

@matthewwardrop awesome! added a more informative error message.

Agreed that impyla might be the future path provided it is well supported by Cloudera/integrated into Apache Impala. It would be nice if you (or someone else building on top of PyHive) could be added as a maintainer of PyHive.

@naoyak naoyak merged commit c4f42f2 into airbnb:master Sep 11, 2018
@naoyak naoyak deleted the drop-default-impyla branch September 11, 2018 06:43
@matthewwardrop
Copy link
Collaborator

@naoyak Just noticed that this was merged in with all commits, rather than being squashed first. Would you mind using the "Squash and Merge" functionality in the future? That way the commit log is less cluttered, and it's easier to return to the PR for rationale, etc :). Thanks again!

@danfrankj
Copy link
Collaborator

FYI @matthewwardrop I've updated the settings so squash merging should now be the only option
image

@matthewwardrop
Copy link
Collaborator

@danfrankj Heh... I should have thought to do this. Thanks Dan!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants