Beautiful Soup - Class contains 'a' and does not contain 'b'

Question

Using bs4, I need to find elements with class_=re.compile("viewLicense") but not class_="viewLicenseDetails".

Here is a snippet,

<tr class="viewLicense inactive"></tr>
<tr class="viewLicense"></tr>
<tr id="licenseDetails_552738" class="viewLicenseDetails"></tr>

I want the first two tr elements and not the last one.

Could anyone help? Thanks.

+3

python-2.7 beautifulsoup

Md. Mohsin 12 oct. '14 at 3:26

2 answers

Use CSS selectors?

results = soup.select('tr.viewLicense')
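
As a fuller sketch of the selector approach: a class selector such as .viewLicense matches whole class names, so it already skips viewLicenseDetails. If a row could carry both classes, :not() can exclude it. This assumes Beautiful Soup 4.7.0 or later (where select() is backed by the soupsieve package); the sample markup, including the row with both classes, is made up for illustration.

```python
from bs4 import BeautifulSoup

# Sample markup based on the question, plus one hypothetical row with both classes
html = """
<table>
<tr class="viewLicense inactive"></tr>
<tr class="viewLicense"></tr>
<tr class="viewLicense viewLicenseDetails"></tr>
<tr id="licenseDetails_552738" class="viewLicenseDetails"></tr>
</table>
"""
soup = BeautifulSoup(html, "html.parser")

# Rows that have the viewLicense class but not the viewLicenseDetails class
results = soup.select("tr.viewLicense:not(.viewLicenseDetails)")

for row in results:
    print(row)
```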

+3

DivinusVox 12 oct. '14 at 3:35

avi · Accepted Answer · 2014-10-12T03:34:56+0000

The following finds every tr tag that has the class viewLicense:

soup.find_all("tr", class_="viewLicense")

This works for the markup given in the question:

>>> soup.find_all("tr", class_="viewLicense")
[<tr class="viewLicense inactive"></tr>, <tr class="viewLicense"></tr>]
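
To see why the plain string behaves differently from the regular expression in the question, here is a minimal self-contained sketch. It assumes bs4 is installed and uses the built-in html.parser; the variable names are illustrative. A string passed to class_ is compared against each whole class name, while a compiled pattern is searched inside each class name, so the pattern also picks up viewLicenseDetails.

```python
import re
from bs4 import BeautifulSoup

# Markup from the question (with the last tr tag closed properly)
html = """
<table>
<tr class="viewLicense inactive"></tr>
<tr class="viewLicense"></tr>
<tr id="licenseDetails_552738" class="viewLicenseDetails"></tr>
</table>
"""
soup = BeautifulSoup(html, "html.parser")

# String filter: must equal one of the tag's class names
exact = soup.find_all("tr", class_="viewLicense")

# Regex filter: searched within each class name, so "viewLicenseDetails" matches too
by_regex = soup.find_all("tr", class_=re.compile("viewLicense"))

print(len(exact))
print(len(by_regex))
```

If a regular expression is really needed, anchoring it, for example re.compile("^viewLicense$"), should restrict it to the exact class name.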

However, if a tr tag can have both classes, viewLicense and viewLicenseDetails, then the following finds all tr tags with viewLicense and skips those that also have viewLicenseDetails:

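>>> both_tags = soup.find_all("tr", class_="viewLicense")
>>> for tag in both_tags:
...     # skip rows that also carry the viewLicenseDetails class
...     if 'viewLicenseDetails' not in tag.attrs['class']:
...         print(tag)  # parenthesized form also works on Python 2.7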
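
The same filtering can also be done in one pass by giving find_all a function. This is only a sketch: the helper name has_a_not_b and the extra row with both classes are made up for illustration, and it assumes bs4 with the built-in html.parser.

```python
from bs4 import BeautifulSoup

# Markup from the question, plus one hypothetical row carrying both classes
html = """
<table>
<tr class="viewLicense inactive"></tr>
<tr class="viewLicense"></tr>
<tr class="viewLicense viewLicenseDetails"></tr>
<tr id="licenseDetails_552738" class="viewLicenseDetails"></tr>
</table>
"""
soup = BeautifulSoup(html, "html.parser")

def has_a_not_b(tag):
    """Return True for tr tags with class viewLicense but without viewLicenseDetails."""
    classes = tag.get("class") or []  # tags without a class attribute yield an empty list
    return (
        tag.name == "tr"
        and "viewLicense" in classes
        and "viewLicenseDetails" not in classes
    )

rows = soup.find_all(has_a_not_b)

for row in rows:
    print(row)
```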
