Hive XML SerDe - NullPointerException

I am new to Hadoop and Hive. I want to write a Hive script that reads XML input. I found a link to an XML SerDe, which is distributed as a jar, and I used that jar in my Hive script.

This is how I execute the Hive statements.

[biadmin@bng xml]$ hive

Logging initialized using configuration in jar:file:/home/biadmin/hive/lib/hive-common-0.13.1.jar!/hive-log4j.properties

hive> add jar /home/biadmin/scripts/hivexmlserde-1.0.0.0.jar;                               
Added /home/biadmin/scripts/hivexmlserde-1.0.0.0.jar to class path
Added resource: /home/biadmin/scripts/hivexmlserde-1.0.0.0.jar

hive> create external table if not exists xmltest (id varchar(50), name varchar(50), type varchar(50), dependency varchar(50), values varchar(50))
    > ROW FORMAT SERDE 'com.ibm.spss.hive.serde2.xml.XmlSerDe'
    > WITH SERDEPROPERTIES (
    > "column.xpath.id"="/recs/rec/id/text()",
    > "column.xpath.name"="/recs/rec/name/text()",
    > "column.xpath.type"="/recs/rec/type/text()",
    > "column.xpath.dependency"="/recs/rec/dependency/text()",
    > "column.xpath.values"="/recs/rec/values/text()"
    > )
    > STORED AS
    > INPUTFORMAT 'com.ibm.spss.hive.serde2.xml.XmlInputFormat'
    > OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat'
    > TBLPROPERTIES (
    > "xmlinput.start"="<Par as",
    > "xmlinput.end"="</Par>"
    > );
OK
Time taken: 0.126 seconds

hive> load data local inpath '/home/biadmin/scripts/xml/playcontent.xml' into table xmltest;
Copying data from file:/home/biadmin/scripts/xml/playcontent.xml
Copying file: file:/home/biadmin/scripts/xml/playcontent.xml
Loading data to table default.xmltest
Table default.xmltest stats: [numFiles=3, numRows=0, totalSize=1842, rawDataSize=0]
OK
Time taken: 0.807 seconds

hive> select * from xmltest;
OK
Failed with exception java.io.IOException:java.lang.NullPointerException
Time taken: 0.305 seconds

hive> 

      

The XML I'm trying to use looks like this:

<Par as="val">
    <recs>
        <rec>
            <id>servicedescription</id>
            <name>Description</name>
            <type>textbox</type>
            <dependency>1</dependency>
            <values>1</values>
        </rec>
        <rec>
            <id>contentlist</id>
            <name>Content File/s</name>
            <type>selectmul</type>
            <dependency>1</dependency>
            <values>1</values>
        </rec>
        <rec>
            <id>seek</id>
            <name>Seek</name>
            <type>checkbox</type>
            <dependency>1</dependency>
            <values>1</values>
        </rec>
    </recs>
</Par>

      

Can someone tell me where I am going wrong? Any help would be appreciated.

1 answer


Try using the following as the values of the column.xpath.<column name> properties in the CREATE statement:

"column.xpath.id"="Par/recs/rec/id/text()",
"column.xpath.name"="Par/recs/rec/name/text()",
"column.xpath.type"="Par/recs/rec/type/text()",
"column.xpath.dependency"="Par/recs/rec/dependency/text()", 
"column.xpath.values"="Par/recs/rec/values/text()"

      



I believe this modification will work for you.
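
To check the paths outside Hive, here is a minimal sketch that uses only the JDK's javax.xml.xpath. It is not the SerDe's own code. It assumes each record handed to the SerDe is the whole <Par ...>...</Par> fragment selected by xmlinput.start and xmlinput.end, and the names XPathCheck and countMatches are made up for this illustration. It counts how many nodes each expression selects in a trimmed copy of the sample record.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of Hive or the XML SerDe.
class XPathCheck {
    // Trimmed copy of the sample record from the question (ids only).
    static final String RECORD =
        "<Par as=\"val\"><recs>"
        + "<rec><id>servicedescription</id></rec>"
        + "<rec><id>contentlist</id></rec>"
        + "<rec><id>seek</id></rec>"
        + "</recs></Par>";

    // Parses the XML and returns how many nodes the expression selects,
    // evaluated with the document node as the context.
    static int countMatches(String xml, String expression) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        NodeList nodes = (NodeList) XPathFactory.newInstance()
            .newXPath()
            .evaluate(expression, doc, XPathConstants.NODESET);
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        // Path from the question: starts below the real root element.
        System.out.println(countMatches(RECORD, "/recs/rec/id/text()"));
        // Path from this answer: starts at the Par element.
        System.out.println(countMatches(RECORD, "Par/recs/rec/id/text()"));
        // Absolute form of the same path.
        System.out.println(countMatches(RECORD, "/Par/recs/rec/id/text()"));
    }
}
```

Against this sample, the original /recs/... path selects nothing because the root element is Par, while both paths that start at Par select one id per rec. How the SerDe maps several matches onto a single varchar column is not something this sketch checks.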
