Get all content in a tag using BeautifulSoup

Question

I am trying to get all the content inside the <article> tag of a page, for example http://magazine.magix.com/de/5-tipps-fuer-die-fotobearbeitung/

However, when I use

print(soup.article)

the output only goes up to "... Foto auf verschiedene Art und Weise und für verschiedene Zwecke bearbeiten".

Whole code:

from bs4 import BeautifulSoup
import requests

request_page = requests.get('http://magazine.magix.com/de/5-tipps-fuer-die-fotobearbeitung/')
source = request_page.text
soup = BeautifulSoup(source, "html.parser")
print(soup.article.text)

How can I get everything?

+3

python web-scraping beautifulsoup

eLudium 13 jul. 17 at 10:17

1 answer

Arount · Accepted Answer · 2017-07-13T13:17:27+0000

Ok, finally found it. Welcome to the wonderful world of scraping.

There are </br> tags inside the <article> tag; the page author almost certainly meant <br/>.

Either way, it breaks the flow of the HTML, so BeautifulSoup does what it can to parse it, which would explain why the article appears cut off.
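To check whether a page has this kind of problem before parsing it, one option is to scan the raw source for closing tags that belong to void elements such as br. The following is only a minimal sketch and not part of the original answer: the helper name find_stray_void_closers and the sample string are hypothetical, and the list of void elements is the one given in the HTML standard.

```python
import re
from collections import Counter

# Void elements per the HTML standard: they never take a closing tag.
VOID_ELEMENTS = ("area", "base", "br", "col", "embed", "hr", "img",
                 "input", "link", "meta", "source", "track", "wbr")

# Matches closing tags such as </br>, </BR> or </hr >.
_STRAY_CLOSER = re.compile(r"</(%s)\s*>" % "|".join(VOID_ELEMENTS), re.IGNORECASE)


def find_stray_void_closers(html):
    """Count closing tags that belong to void elements (hypothetical helper)."""
    return Counter(m.group(1).lower() for m in _STRAY_CLOSER.finditer(html))


# Hypothetical sample, not taken from the page in the question.
sample = "<article><p>first</br>second</BR>third</p><hr></article>"
print(find_stray_void_closers(sample))
```

For the sample above the helper reports two stray br closers. On a real page you would pass request_page.text instead of the sample string.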

This is how I solved it:

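from bs4 import BeautifulSoup
import requests

# requests.get() takes no parser argument; the parser is chosen in BeautifulSoup().
request_page = requests.get('http://magazine.magix.com/de/5-tipps-fuer-die-fotobearbeitung/')
source = request_page.text
# The page contains the invalid closing tag </br>; turn it into a proper <br/>
# before parsing so the <article> element is not cut short.
source = source.replace('</br>', '<br/>')
soup = BeautifulSoup(source, "html.parser")
print(soup.article)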

(I replaced </br> with <br/> before parsing.)
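The plain str.replace above only catches the exact spelling </br>. If a page might also contain variants such as </BR> or </br >, a case-insensitive regular expression is one way to cover them. This is a sketch under that assumption, not something the original answer used: normalize_br and article_text are hypothetical names, and the sample HTML is made up.

```python
import re

# Matches </br>, </BR>, </br > and similar spellings of the invalid closing tag.
_BR_CLOSER = re.compile(r"</br\s*>", re.IGNORECASE)


def normalize_br(html):
    """Replace invalid </br> closing tags with <br/> (hypothetical helper)."""
    return _BR_CLOSER.sub("<br/>", html)


def article_text(html):
    """Parse the cleaned HTML and return the text of the first <article>."""
    # Imported here so normalize_br() can be used without bs4 installed.
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(normalize_br(html), "html.parser")
    article = soup.article
    # soup.article is None when the page has no <article> element.
    return article.get_text() if article is not None else ""


# Made-up sample, not the page from the question.
sample = "<article><p>first</br>second</BR >third</p></article>"
print(normalize_br(sample))
```

On the real page, article_text(request_page.text) would take the place of the replace-then-parse steps above.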

It's a good example to learn from: this kind of broken markup is extremely common when scraping, so expect more of it :)
