Does the connector support scan with filtering on some column values? #34

Open
trungtv opened this issue Jul 6, 2016 · 5 comments
trungtv commented Jul 6, 2016

Hello,
I want to retrieve records using scan filters on column values, for example:
column:something = "somevalue"
https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/Filter.html
Does the connector support this kind of operation? If so, please give an example.

Thank you very much.

@trungtv trungtv changed the title Does the connector support scan with filtering on some column value? Does the connector support scan with filtering on some column values? Jul 6, 2016
@nicolaferraro
Contributor

Hi, filters are not supported. Only scan parameters (i.e. restrictions on the row id) can be set in the query.
You can filter at the Spark level (using filter), although that is obviously less efficient.

It should not be too difficult to add them; this could be a good idea for a contribution...
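
The difference between the two stages can be sketched in plain Java. This is only an illustration under stated assumptions: the in-memory `TreeMap` stands in for an HBase table, and `ScanThenFilterSketch`, `scanByRowKey`, `rowKeysWhereColumnEquals`, and the `cf:something` column are hypothetical names, not connector or HBase API.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical stand-in for an HBase table: row key -> (column -> value).
// Nothing here is connector or HBase API; it only models the two filtering stages.
class ScanThenFilterSketch {

    // Stage 1: restrict by row-key range, the kind of condition a scan can carry.
    // Start row is inclusive and stop row is exclusive, as in an HBase scan.
    static NavigableMap<String, Map<String, String>> scanByRowKey(
            NavigableMap<String, Map<String, String>> table, String startRow, String stopRow) {
        return table.subMap(startRow, true, stopRow, false);
    }

    // Stage 2: filter on a column value after the rows have already been read,
    // analogous to calling filter on the resulting RDD in Spark.
    static List<String> rowKeysWhereColumnEquals(
            Map<String, Map<String, String>> rows, String column, String expected) {
        return rows.entrySet().stream()
                .filter(e -> expected.equals(e.getValue().get(column)))
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    // Small sample data set used by main.
    static NavigableMap<String, Map<String, String>> sampleTable() {
        NavigableMap<String, Map<String, String>> table = new TreeMap<>();
        table.put("row1", Collections.singletonMap("cf:something", "somevalue"));
        table.put("row2", Collections.singletonMap("cf:something", "other"));
        table.put("row3", Collections.singletonMap("cf:something", "somevalue"));
        table.put("row4", Collections.singletonMap("cf:something", "somevalue"));
        return table;
    }

    public static void main(String[] args) {
        NavigableMap<String, Map<String, String>> scanned =
                scanByRowKey(sampleTable(), "row1", "row4");
        System.out.println(rowKeysWhereColumnEquals(scanned, "cf:something", "somevalue"));
    }
}
```

With the sample data, `main` prints `[row1, row3]`. Note that `row2` is still read in stage 1 and only discarded in stage 2, which is why filtering after the scan costs more than a filter applied on the HBase side would.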

@sachinjain024

@nicolaferraro This is an important feature for a spark-hbase connector. I can help contribute it, but I may need your help. AFAIK we need to create a DefaultSource that extends SchemaRelationProvider and creates an instance of HbaseRelation. Then we may need to implement the buildScan method.

Since your existing implementation is not based on this approach, I don't get how to add support for pushdown filters. It would be good if you could give me some pointers.

@nicolaferraro
Contributor

@sachinjain024 This connector still does not support Spark SQL; adding it would be a major improvement. We were talking about extending the HBaseReaderBuilder to support filtering rows based on column values.

This could be done by adding utility methods to the builder that ultimately produce Filters for a Scan, which would then be passed to the runtime in some way.
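
As a rough illustration of that idea, the sketch below shows a builder whose utility methods accumulate column-value conditions and combine them into a single predicate. Everything in it is an assumption made for the example: `FilteringReaderBuilderSketch`, `whereColumnEquals`, and `whereColumnExists` are hypothetical names, not methods of HBaseReaderBuilder, and a `java.util.function.Predicate` stands in for the HBase Filter that a real implementation would attach to the Scan.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical builder; the class and method names are illustrative and are not
// part of the connector's HBaseReaderBuilder or of the HBase client API.
class FilteringReaderBuilderSketch {
    // Each condition tests one row, modelled here as a column -> value map.
    private final List<Predicate<Map<String, String>>> columnFilters = new ArrayList<>();

    // Utility method: keep only rows whose column holds exactly the given value.
    FilteringReaderBuilderSketch whereColumnEquals(String column, String value) {
        columnFilters.add(row -> value.equals(row.get(column)));
        return this;
    }

    // Utility method: keep only rows that contain the column at all.
    FilteringReaderBuilderSketch whereColumnExists(String column) {
        columnFilters.add(row -> row.containsKey(column));
        return this;
    }

    // Combine all accumulated conditions with AND into one row predicate.
    // A real implementation would emit HBase Filter objects for the Scan here,
    // so that the filtering happens on the HBase side rather than in Spark.
    Predicate<Map<String, String>> build() {
        Predicate<Map<String, String>> combined = row -> true;
        for (Predicate<Map<String, String>> filter : columnFilters) {
            combined = combined.and(filter);
        }
        return combined;
    }
}
```

A call such as `new FilteringReaderBuilderSketch().whereColumnEquals("cf:something", "somevalue").build()` yields a predicate that accepts only rows with that exact column value. The chained style is meant to match the fluent builder the comment refers to.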

@sachinjain024

@nicolaferraro Thanks for the information. This is far from what I intend to use. Apologies for the confusion.

@liuluheng

@nicolaferraro EXCITED
