Getting the Client Side to Pick Up ThreadPoolSize Changes - Apache Phoenix JDBC Driver
I recently installed a JDBC driver to connect to a Hadoop database using Apache Phoenix. Basic queries in Squirrel worked well (e.g. "select * from datafile"), but as soon as I ran a slightly more complex query (e.g. "select column1 from datafile where column2 = 'filter1'"), I ran into this error:
org.apache.phoenix.exception.PhoenixIOException: Task
org.apache.phoenix.job.JobManager$InstrumentedJobFutureTask rejected from
org.apache.phoenix.job.JobManager[Running, pool size = 128, active threads =
128, queued tasks = 5000, completed tasks = 5132]
From some searching, it seems that I should increase the ThreadPoolSize in the Apache Phoenix hbase.xml config file to avoid this error, which I did by raising it from 128 to 512. However, this change does not seem to have been noticed. The error persists and the "pool size" is still reported as 128 within the error.
In the Phoenix driver settings in Squirrel, I specified the location of the hbase and hdfs directories containing the .xml configuration files in the Custom Class Instance section of the configuration.
Is there a way to get the driver to "notice" the changed ThreadPoolSize?
Thanks!
A few things to check
- Make sure your Phoenix client version is compatible with your Phoenix server version.
- Get the hbase-site.xml file from your master HBase node (make sure the Phoenix thread pool size in it is set appropriately and in sync with the master), add it to the Phoenix jar file (using 7zip), and try running the Squirrel client again.
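To confirm which hbase-site.xml the client JVM actually sees, one option is to ask the class loader for it and print the configured pool size. This is only a rough sketch: the class name `HBaseSiteCheck` and the helper `readProperty` are made up for illustration, it uses nothing beyond the JDK, and it assumes the file follows the usual Hadoop `<property><name>…</name><value>…</value></property>` layout. It is only meaningful when run with the same classpath as the SQL client.

```java
import java.io.InputStream;
import java.net.URL;
import java.util.Optional;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class HBaseSiteCheck {

    /**
     * Hypothetical helper: returns the value of the first property with the
     * given name in a Hadoop-style configuration XML stream, if present.
     */
    public static Optional<String> readProperty(InputStream xml, String name) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().parse(xml);
            NodeList props = doc.getElementsByTagName("property");
            for (int i = 0; i < props.getLength(); i++) {
                Element p = (Element) props.item(i);
                NodeList names = p.getElementsByTagName("name");
                NodeList values = p.getElementsByTagName("value");
                if (names.getLength() > 0 && values.getLength() > 0
                        && name.equals(names.item(0).getTextContent().trim())) {
                    return Optional.of(values.item(0).getTextContent().trim());
                }
            }
            return Optional.empty();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse configuration XML", e);
        }
    }

    public static void main(String[] args) throws Exception {
        // Look up the file the same way a library on this classpath would.
        URL url = Thread.currentThread().getContextClassLoader()
                .getResource("hbase-site.xml");
        if (url == null) {
            System.out.println("hbase-site.xml is not on the classpath");
            return;
        }
        System.out.println("Loaded from: " + url);
        try (InputStream in = url.openStream()) {
            System.out.println("phoenix.query.threadPoolSize = "
                    + readProperty(in, "phoenix.query.threadPoolSize")
                            .orElse("(not set, so the default applies)"));
        }
    }
}
```

If the printed location is not the file you edited, or the file is not found at all, the client is most likely still running with the default of 128 reported in the error.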
I spent a lot of time on this problem ...
The first step is to run explain on the query and find the number of chunks (e.g. CLIENT 4819-CHUNK):
explain select the_date, sum(row2) from my_table where the_date=to_date('2018-01-01') group by the_date;
+------------------------------------------------------------------------------+
| PLAN |
+------------------------------------------------------------------------------+
| CLIENT 4819-CHUNK 2339029958 ROWS 1707237752908 BYTES PARALLEL 4819-WAY FULL |
| SERVER FILTER BY "THE_DATE" = DATE '2018-01-01 01:00:00.000' |
| SERVER AGGREGATE INTO DISTINCT ROWS BY ["THE_DATE"] |
| CLIENT MERGE SORT |
+------------------------------------------------------------------------------+
4 rows selected (0.247 seconds)
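If you want to pull the chunk count out of the plan text programmatically rather than reading it by eye, a regular expression on the first plan line is enough. A minimal sketch, assuming the plan line has the `CLIENT <n>-CHUNK` shape shown above; the class and method names (`ExplainChunks`, `chunkCount`) are hypothetical:

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ExplainChunks {

    // Matches the "CLIENT 4819-CHUNK" part of an explain plan line.
    private static final Pattern CHUNKS = Pattern.compile("CLIENT\\s+(\\d+)-CHUNK");

    /** Returns the chunk count found in a plan line, or empty if there is none. */
    public static OptionalInt chunkCount(String planLine) {
        Matcher m = CHUNKS.matcher(planLine);
        return m.find()
                ? OptionalInt.of(Integer.parseInt(m.group(1)))
                : OptionalInt.empty();
    }

    public static void main(String[] args) {
        String line = "| CLIENT 4819-CHUNK 2339029958 ROWS 1707237752908 BYTES "
                + "PARALLEL 4819-WAY FULL |";
        System.out.println(chunkCount(line).getAsInt());
    }
}
```

For the plan above this yields 4819, which is the number to compare against the pool and queue limits.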
- Check the number of regions and/or guideposts in the table.
- Set the phoenix.stats.guidepost.width property to a value greater than its default of 100MB, so that the table is split into fewer chunks, and restart the HBase region servers to apply the change.
- Update the table statistics by running the following command:
jdbc:phoenix...> UPDATE STATISTICS my_table
Set these values in Ambari / HBase config:
phoenix.query.threadPoolSize:
The number of threads that run in parallel for each query. It should be set to the number of vcores on the client side / region servers in the cluster.
phoenix.query.queueSize:
The maximum depth of the queue of tasks waiting to run, beyond which any attempt to queue additional work is rejected. Set this property to the number of chunks for the table, as shown in the output of the explain command.
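To see why these two numbers matter together: the status line in the error from the question (pool size, active threads, queued tasks, completed tasks) is the format printed by a standard `java.util.concurrent.ThreadPoolExecutor`. The sketch below is not Phoenix code. It is a plain JDK demo, under the assumption that the client pool behaves like a fixed-size executor with a bounded queue, and the class and method names (`PoolSaturationDemo`, `countRejected`) are made up for illustration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolSaturationDemo {

    /**
     * Submits `tasks` jobs that all block until released, to a pool with the
     * given thread and queue limits, and returns how many were rejected.
     */
    public static int countRejected(int poolSize, int queueSize, int tasks) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                poolSize, poolSize, 0L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(queueSize));
        CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        try {
            for (int i = 0; i < tasks; i++) {
                try {
                    pool.execute(() -> {
                        try {
                            release.await(); // keep the thread busy, like a long scan
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    rejected++; // all threads busy and the queue is full
                }
            }
        } finally {
            release.countDown(); // let the blocked tasks finish
            pool.shutdown();
        }
        return rejected;
    }

    public static void main(String[] args) {
        // 2 threads + 3 queue slots can hold 5 pending tasks; the 6th is rejected.
        System.out.println(countRejected(2, 3, 6));
    }
}
```

With the numbers from the question, 128 threads plus 5000 queue slots hold at most 5128 outstanding tasks, so under this model anything submitted beyond that while the pool is saturated is rejected, which is consistent with the advice to size the queue from the chunk count.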