Python, Beautiful Soup: get all class names

Given HTML code such as:

<div class="class1">
    <span class="class2">some text</span>
    <span class="class3">some text</span>
    <span class="class4">some text</span>
</div>


How do I get all the class names, i.e. ['class1', 'class2', 'class3', 'class4']?

I tried:



But it fetches the whole tag, and then I would need to run a regex on the string.



1 answer

You can treat every Tag instance found as a dictionary when it comes to retrieving attributes. Note that the value of the class attribute will be a list, since class is a special "multi-valued" attribute:

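classes = []
for element in soup.find_all(class_=True):
    # element["class"] is a list, so extend rather than append
    classes.extend(element["class"])

Or, as a list comprehension:
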
classes = [value 
           for element in soup.find_all(class_=True) 
           for value in element["class"]]

Demo:

In [1]: from bs4 import BeautifulSoup

In [2]: data = """
   ...: <div class="class1">
   ...:     <span class="class2">some text</span>
   ...:     <span class="class3">some text</span>
   ...:     <span class="class4">some text</span>
   ...: </div>"""

In [3]: soup = BeautifulSoup(data, "html.parser")

In [4]: classes = [value
   ...:            for element in soup.find_all(class_=True)
   ...:            for value in element["class"]]

In [5]: print(classes)
['class1', 'class2', 'class3', 'class4']
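
If a tag carries more than one class, or the same class appears on several tags, the flat list will contain every occurrence. Here is a minimal sketch of how that might look, and one way to drop the duplicates while keeping the order of first appearance. The markup and the "highlight" class are made up for illustration and are not from the question:

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the first span has two classes, and "class2" repeats.
html = """
<div class="class1">
    <span class="class2 highlight">some text</span>
    <span class="class2">some text</span>
    <span>no class here</span>
</div>"""

soup = BeautifulSoup(html, "html.parser")

# element["class"] is a list, so a tag with two classes yields two names.
first_span = soup.find("span")
print(first_span["class"])  # ['class2', 'highlight']

# Collect every class name; tags without a class attribute are skipped.
all_classes = [value
               for element in soup.find_all(class_=True)
               for value in element["class"]]

# dict.fromkeys keeps the first occurrence of each name, in order.
unique_classes = list(dict.fromkeys(all_classes))
print(unique_classes)  # ['class1', 'class2', 'highlight']
```

Using a set would also remove the duplicates, but it would not preserve document order.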



