
BeautifulSoup: splitting a tag's text at <br/> tags

Is it possible to split a tag's text at its <br/> tags?

I have this tag content: [u'+420 777 593 531', <br/>, u'+420 776 593 531', <br/>, u'+420 775 593 531']

I only want to get the numbers. Any advice?

EDIT:

[x for x in dt.find_next_sibling('dd').contents if x!=' <br/>']

      

Doesn't work at all.



1 answer


You need to test for tags, which are modeled as Tag instances. Tag objects have a name attribute holding the tag name, whereas text nodes (which are NavigableString instances) carry no tag name:

[x for x in dt.find_next_sibling('dd').contents if getattr(x, 'name', None) != 'br']
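
As a minimal sketch of why the attempt in the question filters nothing, and of an equivalent type-based test: a Tag never compares equal to a plain string such as ' <br/>', so the comparison keeps every node. The markup, the "html.parser" choice, and the variable names below are assumptions for illustration, written for Python 3.

```python
from bs4 import BeautifulSoup
from bs4.element import NavigableString

# Hypothetical markup mirroring the question's <dd> content.
html = (
    "<dl><dt>Phone</dt>"
    "<dd>+420 777 593 531<br/>+420 776 593 531<br/>+420 775 593 531</dd></dl>"
)
soup = BeautifulSoup(html, "html.parser")
dd = soup.dt.find_next_sibling("dd")

# Comparing nodes against the string ' <br/>' removes nothing:
# the <br/> nodes are Tag objects, not strings.
unfiltered = [x for x in dd.contents if x != " <br/>"]

# Keeping only the text nodes drops the <br/> tags.
numbers = [str(x) for x in dd.contents if isinstance(x, NavigableString)]

print(len(unfiltered))
print(numbers)
```

The isinstance check and the getattr(x, 'name', None) test select the same nodes here; which one reads better is largely a matter of taste.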

      

Since that <dd> element contains only text and <br /> elements, you can simply get all the contained strings instead:



list(dt.find_next_sibling('dd').stripped_strings)

      

Demo:

>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup('''\
... <dt>Term</dt>
... <dd>
...     +420 777 593 531<br/>
...     +420 776 593 531<br/>
...     +420 775 593 531<br/>
... </dd>
... ''')
>>> dt = soup.dt
>>> [x for x in dt.find_next_sibling('dd').contents if getattr(x, 'name', None) != 'br']
[u'\n    +420 777 593 531', u'\n    +420 776 593 531', u'\n    +420 775 593 531', u'\n']
>>> list(dt.find_next_sibling('dd').stripped_strings)
[u'+420 777 593 531', u'+420 776 593 531', u'+420 775 593 531']
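
The two results in the demo differ only in whitespace handling: the filtered .contents list keeps the leading newlines and the trailing '\n' node, while stripped_strings strips each string and skips those that end up empty. A rough Python 3 sketch of that behaviour, assuming the same markup as the demo and the "html.parser" backend (the names manual and builtin are illustrative):

```python
from bs4 import BeautifulSoup

html = """<dt>Term</dt>
<dd>
    +420 777 593 531<br/>
    +420 776 593 531<br/>
    +420 775 593 531<br/>
</dd>
"""
soup = BeautifulSoup(html, "html.parser")
dd = soup.dt.find_next_sibling("dd")

# Strip every text node by hand and drop the ones that are only whitespace.
manual = [s.strip() for s in dd.strings if s.strip()]

# stripped_strings does the same stripping and skipping for you.
builtin = list(dd.stripped_strings)

print(manual)
print(manual == builtin)
```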

      
