ElementTree - findall to recursively select all child elements
Python code:
import xml.etree.ElementTree as ET
root = ET.parse("h.xml")
print(root.findall('saybye'))
h.xml code:
<hello>
    <saybye>
        <saybye>
        </saybye>
    </saybye>
    <saybye>
    </saybye>
</hello>
The code outputs:
[<Element 'saybye' at 0x7fdbcbbec690>, <Element 'saybye' at 0x7fdbcbbec790>]
The saybye that is a child of another saybye is not selected here. So how do I get findall to recursively walk down the DOM tree and collect all three saybye elements?
Quoting the findall documentation:
Element.findall() finds only elements with a tag which are direct children of the current element.
Since findall only looks at direct children, we need to recurse into each match ourselves to reach the nested ones, for example:
>>> import xml.etree.ElementTree as ET
>>>
>>> def find_rec(node, element, result):
...     for item in node.findall(element):
...         result.append(item)
...         find_rec(item, element, result)
...     return result
...
>>> find_rec(ET.parse("h.xml"), 'saybye', [])
[<Element 'saybye' at 0x7f4fce206710>, <Element 'saybye' at 0x7f4fce206750>, <Element 'saybye' at 0x7f4fce2067d0>]
Better yet, make it a generator function like this:
>>> def find_rec(node, element):
...     for item in node.findall(element):
...         yield item
...         for child in find_rec(item, element):
...             yield child
...
>>> list(find_rec(ET.parse("h.xml"), 'saybye'))
[<Element 'saybye' at 0x7f4fce206a50>, <Element 'saybye' at 0x7f4fce206ad0>, <Element 'saybye' at 0x7f4fce206b10>]
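As a minimal self-contained sketch of the generator approach, assuming Python 3 (for `yield from`) and with the contents of h.xml inlined as a string so it runs without the file. The second document, with its `wrap` tag, is a hypothetical example added only to show a limitation: because each step uses findall on direct children, a saybye nested under a different tag is never reached.

```python
import xml.etree.ElementTree as ET

# Same structure as h.xml from the question, inlined for the sketch.
XML = "<hello><saybye><saybye></saybye></saybye><saybye></saybye></hello>"


def find_rec(node, tag):
    """Yield each `tag` element reachable through a chain of `tag` parents."""
    for item in node.findall(tag):   # direct children only
        yield item
        yield from find_rec(item, tag)  # descend into each match


root = ET.fromstring(XML)
matches = list(find_rec(root, "saybye"))
print(len(matches))  # 3

# Hypothetical document: the match sits under <wrap>, so it is missed.
wrapped = ET.fromstring("<hello><wrap><saybye/></wrap></hello>")
missed = list(find_rec(wrapped, "saybye"))
print(len(missed))  # 0
```

If matches can appear under arbitrary tags, the traversal has to visit every child rather than only the matching ones.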
Element.findall() finds only elements with a tag which are direct children of the current element. So we need to recursively traverse all children, whatever their tag, and collect the elements that match the tag you are looking for.
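def find_rec(node, element):
    # node must be an Element, e.g. ET.parse("h.xml").getroot()
    def _find_rec(node, element, result):
        # Iterating an Element yields its direct children
        # (getchildren() was removed in Python 3.9).
        for el in node:
            _find_rec(el, element, result)
        # Children are visited first, so nested matches come before their parent.
        if node.tag == element:
            result.append(node)
    res = list()
    _find_rec(node, element, res)
    return res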
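For comparison, the standard library can already do this full traversal without a hand-written helper. The sketch below assumes Python 3's xml.etree.ElementTree and inlines the contents of h.xml as a string; it shows two built-in options, the `.//` path prefix for findall and Element.iter().

```python
import xml.etree.ElementTree as ET

# Same structure as h.xml from the question, inlined for the sketch.
XML = "<hello><saybye><saybye></saybye></saybye><saybye></saybye></hello>"
root = ET.fromstring(XML)

# ".//" selects matching elements at any depth below the current element.
by_path = root.findall(".//saybye")

# iter(tag) walks the whole subtree in document order; it would also
# include the starting element itself if its tag matched.
by_iter = list(root.iter("saybye"))

print(len(by_path), len(by_iter))  # 3 3
```

Both calls find the nested saybye as well as the two top-level ones, and both also find matches nested under other tags.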