added netloc support for hdfs URIs #168

Open
wants to merge 1 commit into
base: develop
from

vvaten

@vvaten vvaten commented Jan 2, 2018

Removed the special handling of HDFS URIs, which was against the URI specification.

…DFS URIs which was against the URI specification
@mpenkov
Collaborator

mpenkov commented Jan 2, 2018

@vvaten Thank you for your pull request. It looks good to me.

@menshikh-iv Might be a good idea to merge this after our HDFS integration tests are up. What do you think?

@menshikh-iv
Contributor

@mpenkov I agree.
@vvaten thanks for the PR! Sorry for the wait, but first we need to finish #151.

@@ -355,6 +356,9 @@ class ParseUri(object):
* file:///home/user/file
* file:///home/user/file.bz2

NOTE: hdfs://path/file no longer works as it is against the URI
Contributor

Can you add more information to the comment (when this happens, which HDFS versions are affected, etc.)?

Author

@vvaten vvaten Feb 15, 2018

This is a generic change that affects all HDFS versions; they all support the hdfs://host/path/file URI format. Accepting hdfs://path/file in smart_open violates this format and also violates the URI specification (RFC 3986), where the authority (host) component always comes directly after the '://'. The correct way to refer to local content is hdfs:///path/file, not hdfs://path/file.
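To illustrate the point about where the host goes, here is a minimal sketch using only Python's standard `urllib.parse`. It is not smart_open's actual parsing code; `split_hdfs_uri` is a hypothetical helper written for this example. It shows how an RFC 3986 style parser splits the three URI forms discussed above.

```python
from urllib.parse import urlsplit


def split_hdfs_uri(uri):
    """Return (netloc, path) for an hdfs:// URI.

    Hypothetical helper for illustration only, not part of smart_open.
    """
    parts = urlsplit(uri)
    if parts.scheme != "hdfs":
        raise ValueError("not an hdfs URI: %r" % uri)
    # Whatever sits between '//' and the next '/' is the authority (netloc).
    return parts.netloc, parts.path


# Host given explicitly: netloc is the host, the rest is the path.
print(split_hdfs_uri("hdfs://host/path/file"))  # ('host', '/path/file')

# Empty authority: three slashes, the full path is preserved.
print(split_hdfs_uri("hdfs:///path/file"))      # ('', '/path/file')

# Two slashes: 'path' is read as the host, not as a directory.
print(split_hdfs_uri("hdfs://path/file"))       # ('path', '/file')
```

The last case is the one at issue here: with two slashes, the first segment is interpreted as the authority, so it cannot also be the first directory of the path.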

@mpenkov mpenkov added the stale No recent activity from author label Sep 28, 2019
@mpenkov mpenkov self-assigned this Sep 28, 2019
@mpenkov mpenkov changed the base branch from master to develop April 25, 2020 09:04
@govindmurthi21

Hello, is there any way this can be merged? As part of my work I was trying to read from and write to HDFS and traced the bug to this.
HDFS commands require a fully qualified URI, such as 'hdfs dfs -ls hdfs://path'.

Labels
enhancement stale No recent activity from author
4 participants