Talend tHBaseConnection and tHBaseInput for MapR

I have access to an edge node of a MapR Hadoop cluster. I have an HBase table named /app/SubscriptionBillingPlatform/Matthew with some fake data. Scanning it in the hbase shell results in the following:

[screenshot: hbase shell scan output]

I have a very simple Talend job that needs to scan the table and write out each row:

[screenshot: Talend job design]

Here is the configuration for tHBaseConnection. I took the zookeeper quorum and client port from the /opt/mapr/hbase/hbase-0.94.13/conf/hbase-site.xml file:

[screenshot: tHBaseConnection configuration]

And here is the config for tHBaseInput:

[screenshot: tHBaseInput configuration]
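
In plain HBase client terms, I believe the two components amount to roughly the following (a minimal sketch against the HBase 0.94 client API that the job uses; the quorum hostnames are placeholders standing in for the values from hbase-site.xml):

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;

public class ScanMatthewTable {
    public static void main(String[] args) throws IOException {
        // Values lifted from /opt/mapr/hbase/hbase-0.94.13/conf/hbase-site.xml;
        // the hostnames below are placeholders, and 5181 is MapR's default
        // ZooKeeper client port (stock HBase defaults to 2181).
        Configuration conf = HBaseConfiguration.create();
        conf.set("hbase.zookeeper.quorum", "zk-host-1,zk-host-2,zk-host-3");
        conf.set("hbase.zookeeper.property.clientPort", "5181");

        // MapR-DB tables are addressed by filesystem path, not a bare name.
        HTable table = new HTable(conf, "/app/SubscriptionBillingPlatform/Matthew");
        try {
            ResultScanner scanner = table.getScanner(new Scan());
            try {
                for (Result row : scanner) {
                    System.out.println(row); // print each row
                }
            } finally {
                scanner.close();
            }
        } finally {
            table.close();
        }
    }
}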

However, when I build and export the job, SCP the jar file to the edge node, and run it there, I get the following error:

14/08/06 15:51:26 INFO mapr.TableMappingRulesFactory: Could not find MapRTableMappingRules class, assuming HBase only cluster.
14/08/06 15:51:26 INFO mapr.TableMappingRulesFactory: If you are trying to access M7 tables, add mapr-hbase jar to your classpath.
14/08/06 15:51:26 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/08/06 15:51:26 INFO security.JniBasedUnixGroupsMappingWithFallback: Falling back to shell based
...
Exception in component tHBaseInput_1
org.apache.hadoop.hbase.client.NoServerForRegionException: Unable to find region for /app/SubscriptionBillingPlatform/Matthew,,99999999999999 after 10 tries.
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:991)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:896)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:998)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:900)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:857)
        at org.apache.hadoop.hbase.client.HTable.finishSetup(HTable.java:257)
        at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:187)
        at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:142)
        at poc2.testhbaseoperations_0_1.TestHBaseOperations.tHBaseInput_1Process(TestHBaseOperations.java:752)
        at poc2.testhbaseoperations_0_1.TestHBaseOperations.tHBaseConnection_1Process(TestHBaseOperations.java:375)
        at poc2.testhbaseoperations_0_1.TestHBaseOperations.runJobInTOS(TestHBaseOperations.java:1104)
        at poc2.testhbaseoperations_0_1.TestHBaseOperations.main(TestHBaseOperations.java:993)

      

When I told the sys admins about this (they don't know what Talend is), they said that MapR does not use HRegionServers the way Cloudera does, and believed that my Talend configuration was wrong.

Any ideas?



2 answers


The kicker was these two lines:

INFO mapr.TableMappingRulesFactory: Could not find MapRTableMappingRules class, assuming HBase only cluster.
INFO mapr.TableMappingRulesFactory: If you are trying to access M7 tables, add mapr-hbase jar to your classpath.

      

If the job does not have the mapr-hbase jar on its Java classpath, the HBase client assumes a plain Apache HBase cluster and never routes the request to MapR-DB. That is why the region lookup retries forever and then fails.



You can add the mapr-hbase jar from /opt/mapr/lib to the classpath in the job's launcher shell script, or you can simply add all the jars from that directory to the classpath:

#!/bin/sh
cd `dirname $0`
ROOT_PATH=`pwd`
java -Xms256M -Xmx1024M -cp /opt/mapr/lib/*:$ROOT_PATH/..
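
If you want to verify the fix before rerunning the whole job, a quick probe like the sketch below works. Note that the fully qualified class name is an assumption inferred from the log message, so check it against the contents of your mapr-hbase jar:

public class MapRClasspathCheck {
    public static void main(String[] args) {
        // ASSUMPTION: package name inferred from the "Could not find
        // MapRTableMappingRules class" log line; verify it against your
        // MapR release (e.g. jar tf /opt/mapr/lib/mapr-hbase-*.jar).
        String cls = "com.mapr.fs.hbase.MapRTableMappingRules";
        try {
            Class.forName(cls);
            System.out.println(cls + " found; MapR-DB table paths should resolve.");
        } catch (ClassNotFoundException e) {
            System.out.println(cls + " not found; add /opt/mapr/lib/* to the classpath.");
        }
    }
}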

      



I quickly tried to reproduce this in the Talend Big Data Sandbox, but couldn't seem to get your error, I'm afraid.

Googling the error message (with some variations) suggests this is a common error outside of Talend, so my guess is that as long as you are loading the required libraries and drivers correctly, and they are included in your exported job, this is a configuration problem somewhere on your Hadoop cluster. It seems to happen on non-MapR distributions as well.

This issue in the Cloudera community forums seems to have a satisfactory resolution, where a misconfigured Oozie returned the same error as yours. It might be worth adding:

<property>
  <name>oozie.credentials.credentialclasses</name>
  <value>hcat=org.apache.oozie.action.hadoop.HCatCredentials</value>
</property>

      



under Oozie service -> Configuration -> Oozie Server (default) -> Advanced -> Oozie Server Configuration Safety Valve for oozie-site.xml,

and restart the Hive and Oozie services.

Of course, this may be complicated by how your Hadoop cluster is managed, and by whether you have a development cluster or local instance to test against that suffers from the same issue.

I would highly recommend installing the aforementioned Talend Big Data Sandbox, or at least the MapR Sandbox, if you only have a production (or production-like) Hadoop cluster to deploy to.







