BeautifulSoup: parse children by tag name
I have XML with coordinates like this:
<geo>
<lat>52.5025100</lat>
<lng>13.3086000</lng>
</geo>
I can get the text of the first and second child elements by indexing into the .contents list, like this:
child_1=soup.find('geo').contents[1].get_text(strip=True)
child_2=soup.find('geo').contents[3].get_text(strip=True)
I need to process multiple files, and I'm not sure that lat and lng always appear in the above order, so indexing is not reliable. Instead, I would like to select lat and lng by their tag names, as children of geo.
This does not work:
child_1=soup.find('geo').contents('lat').get_text(strip=True)
So how could I achieve this?
Note: lat and lng appear several times in the document, so I cannot search the whole document for lat and lng directly.
You can access the children of a node directly by tag name (geo.lat is shorthand for geo.find('lat')):
geo = soup.find('geo')
# print() with parentheses runs on both Python 2 and Python 3
print(geo.lat.get_text(strip=True))
print(geo.lng.get_text(strip=True))
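Since the question mentions that lat and lng occur several times in the document, here is a minimal sketch that applies the same lookup to every geo element. The sample XML, the places wrapper tag and the read_coords helper are made up for illustration, and "html.parser" is used only because it needs no extra install. Note that find() searches all descendants; pass recursive=False if only direct children should match.

```python
from bs4 import BeautifulSoup

# Hypothetical sample: two <geo> blocks with the children in different orders.
xml = """
<places>
  <geo><lat>52.5025100</lat><lng>13.3086000</lng></geo>
  <geo><lng>2.3522000</lng><lat>48.8566000</lat></geo>
</places>
"""

# "html.parser" ships with Python; the "xml" parser also works if lxml is installed.
soup = BeautifulSoup(xml, "html.parser")


def read_coords(geo):
    """Return (lat, lng) as strings, with None for a child that is missing."""
    lat = geo.find("lat")
    lng = geo.find("lng")
    return (
        lat.get_text(strip=True) if lat is not None else None,
        lng.get_text(strip=True) if lng is not None else None,
    )


# Look up lat/lng by tag name inside each <geo>, regardless of their order.
coords = [read_coords(geo) for geo in soup.find_all("geo")]
print(coords)
# [('52.5025100', '13.3086000'), ('48.8566000', '2.3522000')]
```

The explicit None check matters because geo.lat returns None when the tag is absent, and calling get_text() on None raises an AttributeError.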