
PHOENIX-5238 Provide an option to pass hints with PhoenixRDD and Datasource v2 #4

Open · yanxinyi wants to merge 1 commit into master

Conversation

yanxinyi (Author) commented Apr 30, 2019:

Used setCacheBlocks to disable the block cache:
https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#setCacheBlocks-boolean-
The current implementation deprecates the PhoenixRDD.scala file, so I modified PhoenixDataSource.java to accept an additional option.
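The option lookup in the patch can be sketched in plain Java. SimpleOptions below is a hypothetical stand-in for Spark's DataSourceOptions, modeling only the getBoolean lookup the patch relies on:

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch, assuming a simplified options map in place of Spark's
// DataSourceOptions. Only getBoolean is modeled.
public class SimpleOptions {
    private final Map<String, String> values = new HashMap<>();

    public SimpleOptions set(String key, String value) {
        values.put(key, value);
        return this;
    }

    // Mirrors the getBoolean(key, default) pattern: fall back when absent.
    public boolean getBoolean(String key, boolean defaultValue) {
        String v = values.get(key);
        return v == null ? defaultValue : Boolean.parseBoolean(v);
    }

    public static void main(String[] args) {
        SimpleOptions options = new SimpleOptions().set("NO_CACHE", "true");
        // The reader would now call scan.setCacheBlocks(false) when this is true.
        boolean disableBlockCache = options.getBoolean("NO_CACHE", false);
        System.out.println(disableBlockCache);
    }
}
```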

@@ -87,6 +88,7 @@ public PhoenixDataSourceReader(DataSourceOptions options) {
         this.tableName = options.tableName().get();
         this.zkUrl = options.get(PhoenixDataSource.ZOOKEEPER_URL).get();
         this.dateAsTimestamp = options.getBoolean("dateAsTimestamp", false);
+        this.disableBlockCache = options.getBoolean("NO_CACHE", false);
Contributor commented:
Can you use Hint.NO_CACHE instead of the String for consistency?
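The reviewer's suggestion is to key the option off Phoenix's existing hint enum rather than a string literal. A sketch of the idea, using a local Hint enum as a stand-in for Phoenix's HintNode.Hint (which includes a NO_CACHE constant):

```java
// Sketch: derive the option key from an enum constant so there is one
// definition instead of scattered "NO_CACHE" string literals.
// "Hint" here is a hypothetical stand-in for org.apache.phoenix.parse.HintNode.Hint.
public class HintKeyExample {
    enum Hint { NO_CACHE }

    // A single source of truth for the option key.
    public static String optionKey(Hint hint) {
        return hint.name();
    }

    public static void main(String[] args) {
        System.out.println(optionKey(Hint.NO_CACHE));
    }
}
```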

@@ -148,6 +150,9 @@ public StructType readSchema() {
         // Optimize the query plan so that we potentially use secondary indexes
         final QueryPlan queryPlan = pstmt.optimizeQuery(selectStatement);
         final Scan scan = queryPlan.getContext().getScan();
+        if (this.disableBlockCache) {
+            scan.setCacheBlocks(false);
ChinmaySKulkarni (Contributor) commented Apr 30, 2019:
The scan variable is unused, so you can actually remove it. You should be setting this on each scan in the queryPlan; otherwise the Spark executor scans will not have this hint set. Instead of iterating over each scan here, it may be easier to set this in PhoenixDataSourceReadOptions. We create an instance of that when we call PhoenixDataSourceReader#planInputPartitions() from the driver, and since these are embedded in each of our InputPartitions, the read options are available to us on the Spark executors (see PhoenixInputPartitionReader#initialize()). There we already iterate over the scans, and you can use the value carried in the read options to call setCacheBlocks(false).

Also, in case this hint is provided, you should make sure any other scan objects used on the driver also have this property set: for example, the scan that we use on the driver side to get the region locations.
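The propagation the reviewer describes can be sketched with stubs. ScanStub and ReadOptionsStub are hypothetical stand-ins for HBase's Scan and Phoenix's PhoenixDataSourceReadOptions; the point is that the flag travels in the serialized read options and is applied to every scan an executor builds:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch, assuming stub types: set the flag once on the driver (in the read
// options), then apply it to every per-partition scan on the executors.
public class BlockCachePropagation {
    static class ScanStub {
        boolean cacheBlocks = true;                 // HBase scans cache blocks by default
        void setCacheBlocks(boolean b) { cacheBlocks = b; }
    }

    static class ReadOptionsStub {
        final boolean disableBlockCache;            // serialized into each InputPartition
        ReadOptionsStub(boolean disableBlockCache) { this.disableBlockCache = disableBlockCache; }
    }

    // Roughly what an executor-side initialize() would do while iterating scans.
    static List<ScanStub> applyReadOptions(List<ScanStub> scans, ReadOptionsStub options) {
        for (ScanStub scan : scans) {
            if (options.disableBlockCache) {
                scan.setCacheBlocks(false);
            }
        }
        return scans;
    }

    public static void main(String[] args) {
        List<ScanStub> scans = new ArrayList<>();
        scans.add(new ScanStub());
        scans.add(new ScanStub());
        applyReadOptions(scans, new ReadOptionsStub(true));
        for (ScanStub s : scans) {
            System.out.println(s.cacheBlocks);      // every executor-side scan is affected
        }
    }
}
```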

@@ -38,11 +38,14 @@
     private static final String V1 = "v1";
     private static final String V2 = "v2";
     private static final String V3 = "v3";
+    private static final String NO_CACHE = "NO_CACHE";
Contributor commented:
Same comment as above about using the already-defined enum value Hint.NO_CACHE.

@@ -64,6 +67,8 @@ public void testPhoenixConfigsExtractedProperly() {
         assertEquals(V1, p.getProperty(P1));
         assertEquals(V2, p.getProperty(P2));
         assertEquals(V3, p.getProperty(P3));
+        assertEquals(true, Boolean.valueOf(p.getProperty(NO_CACHE)));
ChinmaySKulkarni (Contributor) commented Apr 30, 2019:
This test isn't testing your change at all. Here you are using the extraOptions to set the property and just checking that the property is set. Ideally, we want to use extraOptions to set HBase/Phoenix properties that are valid configs we would set in, say, hbase-site.xml. In this case, NO_CACHE is not such a config, so we are using a different way to set it.
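The distinction the reviewer draws can be illustrated with a small sketch of the extraOptions path. The parse method below is a simplified, hypothetical stand-in for Phoenix's option extraction; it shows the intended use (forwarding real hbase-site.xml-style config keys as comma-separated key=value pairs), which is why an ad-hoc flag like NO_CACHE does not belong there:

```java
import java.util.Properties;

// Sketch, assuming a simplified parser: extraOptions forwards valid
// HBase/Phoenix configuration properties, not arbitrary flags.
public class ExtraOptionsExample {
    static Properties parse(String extraOptions) {
        Properties props = new Properties();
        for (String pair : extraOptions.split(",")) {
            String[] kv = pair.split("=", 2);   // split key=value, value may contain '='
            props.setProperty(kv[0], kv[1]);
        }
        return props;
    }

    public static void main(String[] args) {
        // Legitimate use: real config keys, as one would set in hbase-site.xml.
        Properties p = parse("hbase.client.scanner.caching=1000,phoenix.query.timeoutMs=60000");
        System.out.println(p.getProperty("hbase.client.scanner.caching"));
    }
}
```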
