Hive XML SerDe - NullPointerException
I am new to Hadoop and Hive. I want to write a Hive script that reads XML input, so I downloaded an XML SerDe, which ships as a jar, and added that jar in my Hive script.
This is how I run the Hive statements:
[biadmin@bng xml]$ hive
Logging initialized using configuration in jar:file:/home/biadmin/hive/lib/hive-common-0.13.1.jar!/hive-log4j.properties
hive> add jar /home/biadmin/scripts/hivexmlserde-1.0.0.0.jar;
Added /home/biadmin/scripts/hivexmlserde-1.0.0.0.jar to class path
Added resource: /home/biadmin/scripts/hivexmlserde-1.0.0.0.jar
hive> create external table if not exists xmltest (id varchar(50), name varchar(50), type varchar(50), dependency varchar(50), values varchar(50))
> ROW FORMAT SERDE 'com.ibm.spss.hive.serde2.xml.XmlSerDe'
> WITH SERDEPROPERTIES (
> "column.xpath.id"="/recs/rec/id/text()",
> "column.xpath.name"="/recs/rec/name/text()",
> "column.xpath.type"="/recs/rec/type/text()",
> "column.xpath.dependency"="/recs/rec/dependency/text()",
> "column.xpath.values"="/recs/rec/values/text()"
> )
> STORED AS
> INPUTFORMAT 'com.ibm.spss.hive.serde2.xml.XmlInputFormat'
> OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat'
> TBLPROPERTIES (
> "xmlinput.start"="<Par as",
> "xmlinput.end"="</Par>"
> );
OK
Time taken: 0.126 seconds
hive> load data local inpath '/home/biadmin/scripts/xml/playcontent.xml' into table xmltest;
Copying data from file:/home/biadmin/scripts/xml/playcontent.xml
Copying file: file:/home/biadmin/scripts/xml/playcontent.xml
Loading data to table default.xmltest
Table default.xmltest stats: [numFiles=3, numRows=0, totalSize=1842, rawDataSize=0]
OK
Time taken: 0.807 seconds
hive> select * from xmltest;
OK
Failed with exception java.io.IOException:java.lang.NullPointerException
Time taken: 0.305 seconds
hive>
The XML I'm trying to use looks like this:
<Par as="val">
<recs>
<rec>
<id>servicedescription</id>
<name>Description</name>
<type>textbox</type>
<dependency>1</dependency>
<values>1</values>
</rec>
<rec>
<id>contentlist</id>
<name>Content File/s</name>
<type>selectmul</type>
<dependency>1</dependency>
<values>1</values>
</rec>
<rec>
<id>seek</id>
<name>Seek</name>
<type>checkbox</type>
<dependency>1</dependency>
<values>1</values>
</rec>
</recs>
</Par>
Can someone tell me where I am going wrong? Any help would be appreciated.
1 answer
Try using these as the values of the column.xpath.<your attribute-name> properties in the CREATE statement:
"column.xpath.id"="Par/recs/rec/id/text()",
"column.xpath.name"="Par/recs/rec/name/text()",
"column.xpath.type"="Par/recs/rec/type/text()",
"column.xpath.dependency"="Par/recs/rec/dependency/text()",
"column.xpath.values"="Par/recs/rec/values/text()"
I believe this modification will work for you.
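To see why the leading Par step matters, here is a minimal sketch of the path logic only. It is not the SerDe itself, and it does not confirm that this is what raises the NullPointerException. It assumes each record handed to the XPath expressions is one whole <Par ...>...</Par> element, as the xmlinput.start and xmlinput.end settings suggest. Python's ElementTree has no document node, so the helper select_text and its "document" wrapper element are hypothetical stand-ins, and the sample is a shortened copy of the XML from the question.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the record from the question (one <Par> element).
XML_RECORD = """<Par as="val">
<recs>
<rec><id>servicedescription</id><name>Description</name></rec>
<rec><id>contentlist</id><name>Content File/s</name></rec>
<rec><id>seek</id><name>Seek</name></rec>
</recs>
</Par>"""


def select_text(record_xml, path):
    """Return the text of every element matched by `path`.

    The path is resolved from a synthetic wrapper that plays the role of
    the document node sitting above the record's root element.
    """
    document = ET.Element("document")  # hypothetical wrapper, not in the data
    document.append(ET.fromstring(record_xml))
    return [element.text for element in document.findall(path)]


# Path from the question, without the root element name: nothing matches,
# because the only child of the document node is <Par>, not <recs>.
original_ids = select_text(XML_RECORD, "recs/rec/id")

# Path from the answer, starting at the root element <Par>.
corrected_ids = select_text(XML_RECORD, "Par/recs/rec/id")

print(original_ids)
print(corrected_ids)
```

In this sketch the corrected path matches all three id elements inside the single <Par> record, so one record yields several values per column.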