Get all content in a tag using BeautifulSoup

I am trying to get all the content inside an article tag, for example on http://magazine.magix.com/de/5-tipps-fuer-die-fotobearbeitung/

However, when I use

print(soup.article)

the output stops at "... Foto auf verschiedene Art und Weise und für verschiedene Zwecke bearbeiten".

Whole code:

from bs4 import BeautifulSoup
import requests

# Fetch the page; requests.get does not take a parser name
request_page = requests.get('http://magazine.magix.com/de/5-tipps-fuer-die-fotobearbeitung/')
source = request_page.text
soup = BeautifulSoup(source, "html.parser")
print(soup.article.text)

How can I get everything?



1 answer


Ok, finally found it. Welcome to the wonderful world of scraping.

Inside the <article> tag there are </br> tags; the author of the page surely meant <br/>.

Either way, they break the HTML flow when BS tries to parse it, which is why the article gets cut off.

This is how I solved it:



from bs4 import BeautifulSoup
import requests

request_page = requests.get('http://magazine.magix.com/de/5-tipps-fuer-die-fotobearbeitung/')
source = request_page.text
# Turn the invalid closing tag </br> into a valid <br/> before parsing
source = source.replace('</br>', '<br/>')
soup = BeautifulSoup(source, "html.parser")
print(soup.article)

(I replaced </br> with <br/> before parsing.)
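If the page also contains variants such as </BR> or </br >, a plain str.replace would miss them. Here is a minimal sketch of a more tolerant cleanup, assuming those variants can occur (I have not checked whether this page has them); normalize_br is just a hypothetical helper name, not part of BeautifulSoup or requests.

```python
import re

# Matches malformed closing tags such as </br>, </BR> or </br >.
# Valid <br> and <br/> tags are left untouched.
_BAD_BR = re.compile(r'</\s*br\s*>', re.IGNORECASE)


def normalize_br(html):
    """Return html with every malformed </br> replaced by <br/>."""
    return _BAD_BR.sub('<br/>', html)


# Made-up sample markup, only to illustrate the substitution
sample = '<article>first line</br>second line</BR >third<br/></article>'
print(normalize_br(sample))
```

The cleaned string can then be handed to BeautifulSoup(..., "html.parser") in place of the source.replace(...) line.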

It's a good catch; this kind of broken markup is legion, count on it :)
