Beautiful Soup - Class contains 'a' and does not contain 'b'

Using bs4.

I need to find elements that match class_=re.compile("viewLicense") but do not have class_="viewLicenseDetails".

Here is a snippet:

<tr class="viewLicense inactive"></tr>
<tr class="viewLicense"></tr>
<tr id="licenseDetails_552738" class="viewLicenseDetails"></tr>

      

I want the first two tr elements and not the last one.

Could anyone help? Thanks.

+3




2 answers


The following finds every tr tag that has the viewLicense class:

soup.find_all("tr", class_="viewLicense")

      

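This works for the markup given in the question, because a plain string is compared against each whole class name, so viewLicenseDetails does not match: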



>>> soup.find_all("tr", class_="viewLicense")
[<tr class="viewLicense inactive"></tr>, <tr class="viewLicense"></tr>]
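
For anyone who wants to reproduce this, here is a minimal sketch using the rows from the question. The wrapping table element and the variable names (exact, loose, anchored) are my own additions, and I am assuming the built-in html.parser. It also shows why the regex in the question picks up the third row: the pattern is searched within each class name, so it matches inside viewLicenseDetails unless it is anchored.

```python
import re
from bs4 import BeautifulSoup

# Markup from the question, wrapped in a <table> (an assumption for validity)
html = """
<table>
<tr class="viewLicense inactive"></tr>
<tr class="viewLicense"></tr>
<tr id="licenseDetails_552738" class="viewLicenseDetails"></tr>
</table>
"""
soup = BeautifulSoup(html, "html.parser")

# A plain string must equal one of the tag's class names
exact = soup.find_all("tr", class_="viewLicense")

# An unanchored regex is searched within each class name,
# so it also matches "viewLicenseDetails"
loose = soup.find_all("tr", class_=re.compile("viewLicense"))

# Anchoring the pattern makes the regex behave like the plain string
anchored = soup.find_all("tr", class_=re.compile(r"^viewLicense$"))

print(len(exact), len(loose), len(anchored))
```

So if a regex is really needed, anchoring it is one option; otherwise the plain string is simpler.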

      

However, if you have a tr tag that has both the viewLicense and viewLicenseDetails classes, the following finds all tr tags with viewLicense and then skips those that also have viewLicenseDetails:

>>> both_tags = soup.find_all("tr", class_="viewLicense")
>>> for tag in both_tags:
...     # tag.attrs['class'] is a list of class names
...     if 'viewLicenseDetails' not in tag.attrs['class']:
...         print(tag)

      

+6




Use CSS selectors?



results = soup.select('tr.viewLicense')
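
If some rows might carry both classes, the exclusion can be written into the selector itself with :not(). A small sketch, assuming a Beautiful Soup version recent enough to delegate select() to the soupsieve package (4.7 or later, as far as I know); the sample markup is made up for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the second row carries both classes
html = """
<table>
<tr class="viewLicense inactive"></tr>
<tr class="viewLicense viewLicenseDetails"></tr>
<tr class="viewLicenseDetails"></tr>
</table>
"""
soup = BeautifulSoup(html, "html.parser")

# tr elements with class viewLicense, excluding any that also have viewLicenseDetails
results = soup.select("tr.viewLicense:not(.viewLicenseDetails)")
print(results)
```

CSS class selectors match whole class names, so tr.viewLicense on its own never matches a row whose only class is viewLicenseDetails.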

      

+3








