
"hbaseTable" cannot read from HBase #33

Open
kenouyang opened this issue Jul 5, 2016 · 2 comments

Comments

@kenouyang

I am trying to pull data from an HBase table using spark-hbase-connector.

Here are the steps that I have done:

  1. Download the spark-hbase-connector_2.10-1.0.3.jar file.
  2. Start spark-shell:
    spark-shell --deploy-mode client --master yarn --jars spark-hbase-connector_2.10-1.0.3.jar --conf spark.hbase.host=server1.dev.hbaseserver.com
  3. Run the imports:
    import it.nerdammer.spark.hbase._
    import org.apache.spark.SparkContext
    import org.apache.spark.SparkConf
  4. Read the HBase table:
    val hBaseRDD = sc.hbaseTable[(String, String, String)]("WEATHER_INFO").select("tem-0301", "tem-0401").inColumnFamily("r")
  5. Try to check the first 10 rows:
    hBaseRDD.take(10)

Here is the error log:

Caused by: java.net.SocketTimeoutException: callTimeout=60000, callDuration=68347: row 'WEATHER_INFO,,00000000000000' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=server1.dev.hbaseserver.com,60020,1465837116220, seqNum=0
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:159)
at org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture.run(ResultBoundedCompletionService.java:64)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hbase.exceptions.ConnectionClosingException: Call to server1.dev.hbaseserver.com/10.14.116.11:60020 failed on local exception: org.apache.hadoop.hbase.exceptions.ConnectionClosingException: Connection to server1.dev.hbaseserver.com/10.14.116.11:60020 is closing. Call id=9, waitTime=2

It seems like the program is pointing to table 'hbase:meta' instead of table 'WEATHER_INFO'.
I am new to Scala; I use PySpark most of the time. I want to try this spark-hbase-connector because it has a much better and more powerful API.
Could you point out where I made a mistake? Thanks.

@nicolaferraro
Contributor

It looks from your logs like you cannot connect to the machine 10.14.116.11:60020, or the DNS server returns the wrong address for server1.dev.hbaseserver.com. I've seen some cluster configurations that create a lot of confusion between public and LAN IPs. Make sure 10.14.116.11 is the internal IP (usually not subject to firewall rules) and not the external one.
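For reference, you can check what the driver actually resolves for that hostname straight from the spark-shell (a quick sketch using the hostname from the logs):

    // Print the IP the JVM resolves for the region server hostname.
    // If this is not 10.14.116.11, DNS is likely the problem.
    println(java.net.InetAddress.getByName("server1.dev.hbaseserver.com").getHostAddress)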

You should also look at the configuration to check that a region server is running on that machine, that it listens on port 60020, and that every machine in the cluster (master included) can telnet to that host:port pair.
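If telnet is not available on some machines, an equivalent check can be run from the spark-shell itself (a sketch, assuming the host and port from the log above):

    import java.net.{InetSocketAddress, Socket}

    // Try to open a TCP connection to the region server, with a 5-second timeout.
    // A ConnectException or SocketTimeoutException means host:port is unreachable.
    val socket = new Socket()
    socket.connect(new InetSocketAddress("server1.dev.hbaseserver.com", 60020), 5000)
    println("port 60020 is reachable")
    socket.close()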

Note that the port mapping has changed in HBase 1.1+, so if the region servers are listening on a different port there could be something wrong with the connector.
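One way to tell connector problems apart from cluster problems is to read the same table with the plain HBase client API from the same spark-shell session (a minimal sketch, assuming the HBase 1.0 client jars are on the classpath; the row key "some-row" is a hypothetical placeholder):

    import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
    import org.apache.hadoop.hbase.client.{ConnectionFactory, Get}
    import org.apache.hadoop.hbase.util.Bytes

    // Point the client at the same ZooKeeper quorum the connector
    // receives through spark.hbase.host.
    val conf = HBaseConfiguration.create()
    conf.set("hbase.zookeeper.quorum", "server1.dev.hbaseserver.com")

    // Fetch a single row from WEATHER_INFO. If this call also times out,
    // the problem is in the cluster setup rather than in the connector.
    val connection = ConnectionFactory.createConnection(conf)
    val table = connection.getTable(TableName.valueOf("WEATHER_INFO"))
    val result = table.get(new Get(Bytes.toBytes("some-row")))
    println(result)
    table.close()
    connection.close()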

@kenouyang
Author

@nicolaferraro Thank you for your reply.
10.14.116.11 is the internal IP, and the firewall has been disabled.
The HBase version is 1.0.0, and we use all default settings.
I am checking your suggestions with our cluster admin; I will update this post if there is any progress.
One thing I forgot to mention is that we have Kerberos security enabled. Could that have caused the issue?
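With Kerberos enabled, the HBase client has to authenticate before any RPC succeeds, and a connection silently closed by the server (as in the log above) can be a symptom of a missing login. A rough sketch of the usual client-side setup; the principal and keytab path are hypothetical placeholders:

    import org.apache.hadoop.hbase.HBaseConfiguration
    import org.apache.hadoop.security.UserGroupInformation

    // Tell the Hadoop/HBase client stack to use Kerberos authentication.
    val conf = HBaseConfiguration.create()
    conf.set("hadoop.security.authentication", "kerberos")
    conf.set("hbase.security.authentication", "kerberos")
    UserGroupInformation.setConfiguration(conf)

    // Log in from a keytab before creating any HBase connection.
    // Principal and keytab path are placeholders for illustration.
    UserGroupInformation.loginUserFromKeytab(
      "hbaseuser@DEV.HBASESERVER.COM",
      "/etc/security/keytabs/hbaseuser.keytab")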
