Using lxml to parse xml with multiple namespaces

I am pulling XML from a SOAP API, and it looks like this:

<SOAP-ENV:Envelope xmlns:SOAP-ENC="" xmlns:SOAP-ENV="" xmlns:ae="urn:sbmappservices72" xmlns:c14n="" xmlns:diag="urn:SerenaDiagnostics" xmlns:ds="" xmlns:wsse="" xmlns:wsu="" xmlns:xenc="" xmlns:xsd="" xmlns:xsi="">
          <ae:id xsi:type="ae:ItemIdentifier">


I can't for the life of me use findall to pull out something like tableId. Most lxml parsing tutorials don't cover namespaces, but I found one that does and I am trying to follow it.

According to that tutorial, you should create a namespace dictionary, which I did like this:

r = tree.xpath('/e:SOAP-ENV/s:ae', 
        namespaces={'e': '',
                    's': 'urn:sbmappservices72'})


But that doesn't seem to work, since len(r) comes back as 0:

print('length: ' + str(len(r)))  # <---- always equals 0


Since the URI for the second namespace starts with "urn:", I tried using the real URL of the WSDL instead, but that gives me the same result.

Is there something obvious that I am missing? I just need to be able to pull values like those for tableIdItemId.

Any help would be greatly appreciated.



1 answer

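Your XPath doesn't match the XML structure. Each step has to be a prefix from your namespace dictionary followed by an element name (for example e:Envelope), whereas /e:SOAP-ENV/s:ae uses the document's own prefixes as if they were element names. Try this instead: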

r = tree.xpath('/e:Envelope/e:Body/s:GetItemsByQueryResponse/s:return/s:item/s:id/s:tableId', 
        namespaces={'e': '',
                    's': 'urn:sbmappservices72'})
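
To see why the first attempt returns nothing, here is a minimal sketch on a made-up response. The SOAP 1.1 envelope URI, the sample document, and its values are assumptions for illustration; use whatever URI your xmlns:SOAP-ENV attribute actually declares. The sketch uses findall, which takes the same namespaces mapping as xpath, and falls back to the standard library if lxml is not installed.

```python
try:
    from lxml import etree  # lxml also offers .xpath() on top of findall()
except ImportError:
    import xml.etree.ElementTree as etree  # stdlib fallback, same findall() API

# Assumed SOAP 1.1 envelope URI; it must equal the URI bound to SOAP-ENV in your response.
NS = {
    "e": "http://schemas.xmlsoap.org/soap/envelope/",
    "s": "urn:sbmappservices72",
}

# Hypothetical, trimmed-down response; the values are made up.
SAMPLE = """\
<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
                   xmlns:ae="urn:sbmappservices72">
  <SOAP-ENV:Body>
    <ae:GetItemsByQueryResponse>
      <ae:return>
        <ae:item>
          <ae:id>
            <ae:tableId>1000</ae:tableId>
            <ae:tableIdItemId>1000:1</ae:tableIdItemId>
          </ae:id>
        </ae:item>
      </ae:return>
    </ae:GetItemsByQueryResponse>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
"""

root = etree.fromstring(SAMPLE)  # root is the Envelope element

# Prefixes in the path are looked up in NS by URI; they do not have to
# match the prefixes used in the document (here "s" stands in for "ae").
path = "e:Body/s:GetItemsByQueryResponse/s:return/s:item/s:id/s:tableId"
table_ids = [el.text for el in root.findall(path, namespaces=NS)]

# The original attempt treats the prefixes SOAP-ENV and ae as element
# names, and no such elements exist, so nothing matches.
wrong = root.findall("e:SOAP-ENV/s:ae", namespaces=NS)

print(table_ids, len(wrong))  # ['1000'] 0
```

With lxml, the same steps prefixed with /e:Envelope/ should work in tree.xpath() as well, since both calls resolve prefixes through the namespaces mapping.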


For small XML, you can use // instead of / to simplify the expression, for example:

r = tree.xpath('/e:Envelope/e:Body//s:tableId', 
        namespaces={'e': '',
                    's': 'urn:sbmappservices72'})



This will find tableId no matter how deeply it is nested in Body. Note, however, that // is generally slower than /, especially on huge XML, because it has to search the whole subtree below Body.
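
As a rough check that the short form selects the same elements as the full path, here is a small sketch on a made-up response with two items. The envelope URI and the sample values are assumptions, and the xpath call only runs when lxml is installed, since the standard library fallback has no xpath method.

```python
try:
    from lxml import etree
except ImportError:
    import xml.etree.ElementTree as etree  # stdlib fallback; no .xpath() there

NS = {
    "e": "http://schemas.xmlsoap.org/soap/envelope/",  # assumed SOAP 1.1 URI
    "s": "urn:sbmappservices72",
}

# Hypothetical response with two items; the values are made up.
SAMPLE = (
    '<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"'
    ' xmlns:ae="urn:sbmappservices72"><SOAP-ENV:Body>'
    '<ae:GetItemsByQueryResponse><ae:return>'
    '<ae:item><ae:id><ae:tableId>1000</ae:tableId></ae:id></ae:item>'
    '<ae:item><ae:id><ae:tableId>1001</ae:tableId></ae:id></ae:item>'
    '</ae:return></ae:GetItemsByQueryResponse>'
    '</SOAP-ENV:Body></SOAP-ENV:Envelope>'
)
root = etree.fromstring(SAMPLE)

# Full path: every intermediate element is spelled out.
full = "e:Body/s:GetItemsByQueryResponse/s:return/s:item/s:id/s:tableId"
# Short path: // matches tableId at any depth below Body.
short = "e:Body//s:tableId"

full_ids = [el.text for el in root.findall(full, namespaces=NS)]
short_ids = [el.text for el in root.findall(short, namespaces=NS)]

# With lxml, the absolute XPath from the answer gives the same elements.
xpath_ids = None
if hasattr(root, "xpath"):
    hits = root.xpath("/e:Envelope/e:Body//s:tableId", namespaces=NS)
    xpath_ids = [el.text for el in hits]

print(full_ids, short_ids, xpath_ids)
```

The trade-off is robustness against speed: the short form keeps working if the service adds or renames a wrapper element, while the full path lets the parser skip unrelated branches.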


