Accessing only some elements of BeautifulSoup result with negative indices
I am using BeautifulSoup4 to parse a document and I am getting some strange behavior. The relevant code snippet looks like this:
    for sale_table in sales_soup.find_all('table'):
        rows = sale_table.find_all('tr')
        grantor = rows[3]
However, this gives me an out of range exception. So I did some basic checks, and len(rows) == 4 just before and after grantor is assigned (using an index that doesn't throw an exception). I can also access the first and second elements of rows with rows[0] and rows[1]. However, I can only access items 3 and 4 with rows[-1] and rows[-2]; trying to use indices 2 or 3, or -3 or -4, throws an out of range exception. Also, when I file.write(str(rows)), the HTML I get matches the HTML of the test document exactly.
In short, I can reach every element of the list one way or another, but I would like to understand why I am getting this strange exception.
Sorry guys, the answer is I'm an idiot. There is an inconsistent table in the markup which is shorter, and that is the one throwing the exception. Stepping through the loop one iteration at a time shows len(rows) != 4 on one of the iterations. Sorry for the misinformation. Is it bad form to edit this question, given that it is incorrect?
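For anyone hitting the same symptom, a quick way to spot the odd table is to record the row count of every table before indexing into any of them. This is only a minimal sketch: the markup below is made up for illustration, and only the name sales_soup is taken from the question.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the second table has fewer rows than the first.
html = """
<table>
  <tr><td>a1</td></tr><tr><td>a2</td></tr>
  <tr><td>a3</td></tr><tr><td>a4</td></tr>
</table>
<table>
  <tr><td>b1</td></tr><tr><td>b2</td></tr>
</table>
"""

sales_soup = BeautifulSoup(html, "html.parser")

# Pair each table's position with its number of rows.
row_counts = [
    (i, len(table.find_all("tr")))
    for i, table in enumerate(sales_soup.find_all("table"))
]

# Tables with fewer than 4 rows are the ones where rows[3] would fail.
short_tables = [i for i, n in row_counts if n < 4]

print(row_counts)    # [(0, 4), (1, 2)]
print(short_tables)  # [1]
```

Checking len(rows) inside a debugger on a single iteration can be misleading, because the table that breaks may not be the one being inspected.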
You should never index a list of unknown size. Never trust the markup to be right all the time.
In my experience with BeautifulSoup, you have to write a lot of if statements to cover for yourself. Change the above code to something like this:
    for sale_table in sales_soup.find_all('table'):
        rows = sale_table.find_all('tr')
        if len(rows) > 3:
            grantor = rows[3]
        else:
            grantor = None
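If the same check is needed in many places, it can be wrapped in a small helper. The sketch below is one possible way to do that; item_or_default is a hypothetical name, not part of BeautifulSoup, and plain lists stand in for the result of find_all, which also behaves like a list.

```python
def item_or_default(items, index, default=None):
    """Return items[index], or default when the index is out of range."""
    try:
        return items[index]
    except IndexError:
        return default


# Plain lists standing in for the rows of a full table and a short table.
rows = ["tr0", "tr1", "tr2", "tr3"]
short_rows = ["tr0", "tr1"]

print(item_or_default(rows, 3))        # tr3
print(item_or_default(short_rows, 3))  # None
```

Inside the loop this would read grantor = item_or_default(rows, 3). The caller still has to handle the None case before using grantor.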
Also, have a look at the BS4 documentation for more options that might be helpful for your use case. If you are only using the 4th element, pass limit=4 as a keyword argument to find_all.
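One keyword argument that find_all accepts is limit, which stops the search after a given number of matches. Whether that is the option this answer had in mind is an assumption; the sketch below only shows what limit does, using made-up markup.

```python
from bs4 import BeautifulSoup

# Hypothetical table with six rows, built for this example only.
html = "<table>" + "".join(
    f"<tr><td>row {i}</td></tr>" for i in range(6)
) + "</table>"
table = BeautifulSoup(html, "html.parser").find("table")

# limit=4 stops searching once four <tr> tags have been found.
rows = table.find_all("tr", limit=4)

# The length check is still needed: a short table returns fewer rows.
grantor = rows[3] if len(rows) > 3 else None

print(len(rows))           # 4
print(grantor.get_text())  # row 3
```

limit only caps the number of results, so it saves work on long tables but does not guarantee that a fourth row exists.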