Hive ParseException on drop table statement

Question

I am using Python, specifically the pyodbc module, to run Hive queries on Hadoop. The part of the code that causes the problem looks like this:

import pyodbc
import pandas

oConnexionString = 'Driver={ClouderaHive};[...]'
oConnexion = pyodbc.connect(oConnexionString, autocommit=True)
oConnexion.setencoding(encoding='utf-8')
oQueryParameter = "select * from my_db.my_table;"
oParameterData = pandas.read_sql(oQueryParameter, oConnexion)
oCursor = oConnexion.cursor()

for oRow in oParameterData.index:
    sTableName = oParameterData.loc[oRow,'TableName']
    oQueryDeleteTable = 'drop table if exists my_db.' + sTableName + ';'
    print(oQueryDeleteTable)
    oCursor.execute(oQueryDeleteTable)

The print gives the following: drop table if exists dl_audit_data_quality.hero_context_start_gamemode;

But cursor.execute raises the following error message:

pyodbc.Error: ('HY000', "[HY000] [Cloudera] [HiveODBC] (80) Syntax or semantic parsing error caused by the server when executing an execurint request. error message from the server: Error compiling statement: FAILED: ParseException line 1 : 44 character '(80) (SQLExecDirectW) ")

Note that when I copy the printed statement and run it manually in Hue, it works fine. I assume it has something to do with the encoding of the variable sTableName, but I can't figure out how to fix it.

Thanks.

python encoding hadoop hive pyodbc

Alexis.Rolland 06 May '17 at 13:37

1 answer

Alexis.Rolland · Accepted Answer · 2017-05-07T08:40:42+0000

The query failed because the variable sTableName was decoded incorrectly. Printing the variable on its own displays the text correctly, which hides the problem. Example with the print above:

>>> print(oQueryDeleteTable)
drop table if exists dl_audit_data_quality.hero_context_start_gamemode;

But inspecting the raw value in the original data frame showed that it contains null characters between the letters:

>>> print(repr(oParameterData.loc[oRow,'TableName']))
'h\x00e\x00r\x00o\x00_c\x00o\x00n\x00t\x00e\x00x\x00t\x00'
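To illustrate why the print looks fine while the query fails, here is a minimal sketch. It assumes (this is my guess, not something the driver documentation confirmed for me) that the driver returned UTF-16LE bytes which were then decoded as a single-byte-compatible encoding. The table name and statement prefix are taken from the print above; the variable names are only for illustration.

```python
# Assumption: the driver returns UTF-16LE bytes, but they get decoded as UTF-8.
raw_bytes = "hero_context_start_gamemode".encode("utf-16-le")
garbled = raw_bytes.decode("utf-8")  # every letter is now followed by '\x00'

# print() hides the null characters on most terminals; repr() exposes them.
print(garbled)
print(repr(garbled[:8]))

# Build the statement the same way as in the question and locate the
# first hidden character (zero-based index).
query = "drop table if exists dl_audit_data_quality." + garbled + ";"
first_null = query.index("\x00")
print(first_null)
```

In this sketch the first null character sits at zero-based index 44, which seems to line up with the "line 1 : 44" position in the ParseException, assuming Hive reports a zero-based column.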

The issue was solved by reworking the encoding settings as described here: Python dictionary contains encoded values

import pyodbc
import pandas

oConnexionString = 'Driver={ClouderaHive};[...]'
oConnexion = pyodbc.connect(oConnexionString, autocommit=True)
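# Decode SQL_CHAR and SQL_WCHAR data returned by the driver as UTF-8
oConnexion.setdecoding(pyodbc.SQL_CHAR, encoding='utf-8')
oConnexion.setdecoding(pyodbc.SQL_WCHAR, encoding='utf-8')
# Encode outgoing text (SQL statements and parameters) as UTF-8
oConnexion.setencoding(encoding='utf-8')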
oQueryParameter = "select * from my_db.my_table;"
oParameterData = pandas.read_sql(oQueryParameter, oConnexion)
oCursor = oConnexion.cursor()

for oRow in oParameterData.index:
    sTableName = oParameterData.loc[oRow,'TableName']
    oQueryDeleteTable = 'drop table if exists my_db.' + sTableName + ';'
    print(oQueryDeleteTable)
    oCursor.execute(oQueryDeleteTable)
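As an extra safeguard, a small check before concatenation could make hidden characters visible instead of letting them reach Hive. This is only a sketch: `check_table_name` and `build_drop_statement` are hypothetical helpers, not part of pyodbc or pandas, and the allowed character set (letters, digits, underscore) is an assumption about how the tables are named.

```python
import re

# Assumption: table names consist only of letters, digits and underscores.
_VALID_TABLE_NAME = re.compile(r"[A-Za-z0-9_]+")


def check_table_name(raw_name):
    """Hypothetical helper: return the name unchanged or raise with its repr."""
    name = str(raw_name)
    if not _VALID_TABLE_NAME.fullmatch(name):
        # repr() shows characters such as '\x00' that print() would hide.
        raise ValueError("Unexpected characters in table name: " + repr(name))
    return name


def build_drop_statement(database, raw_table_name):
    """Hypothetical helper: build the same statement as in the loop above."""
    return "drop table if exists " + database + "." + check_table_name(raw_table_name) + ";"


print(build_drop_statement("my_db", "hero_context_start_gamemode"))
```

Inside the loop, `oQueryDeleteTable = build_drop_statement('my_db', sTableName)` would then fail early with a readable message if the decoding were still wrong.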
