Python list in recursion

Question

I want to find all links in a div, for example:

<div>
  <a href="#0"></a>
  <a href="#1"></a>
  <a href="#2"></a>
</div>

So I wrote a function like this:

def get_links(div):
    links = []
    if div.tag == 'a':
        links.append(div)
        return links   
    else:
        for a in div:
            links + get_links(a)
        return links

Why is the result [] and not [a, a, a]?

I know this has to do with how list references work. Could you explain the details?

This is the complete module:

import lxml.html


def get_links(div):
    links = []
    if div.tag == 'a':
        links.append(div)
        return links   
    else:
        for a in div:
            links + get_links(a)
        return links


if __name__ == '__main__':

    fragment = '''
        <div>
          <a href="#0">1</a>
          <a href="#1">2</a>
          <a href="#2">3</a>
        </div>'''
    fragment = lxml.html.fromstring(fragment)
    links = get_links(fragment)    # <---------------

python

zwidny 05 jan. 15 at 8:03

3 answers

If the tag is not "a", your code effectively does this:

# You create an empty list
links = []
for a in div:
    # links + get_links(a) builds a new list, but the result is never assigned
    links + get_links(a)
# So you return the list that is still empty
return links

You must change + to +=:

links += get_links(a)

Or use extend():

links.extend(get_links(a))
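
Putting the fix together, a minimal sketch of the whole function might look like the following. It assumes the standard library's xml.etree.ElementTree as a stand-in for lxml.html (both expose .tag and child iteration on elements), so it runs without extra packages. The parameter names node and child are illustrative.

```python
import xml.etree.ElementTree as ET


def get_links(node):
    """Return every <a> element at or below node, in document order."""
    if node.tag == 'a':
        return [node]
    links = []
    for child in node:
        # extend() mutates links in place, so the recursive results are kept
        links.extend(get_links(child))
    return links


fragment = ET.fromstring(
    '<div><a href="#0">1</a><a href="#1">2</a><a href="#2">3</a></div>'
)
links = get_links(fragment)
print([a.get('href') for a in links])  # ['#0', '#1', '#2']
```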

Mariusz Jamro 05 jan. At 8:08 am

Another option is to use the xpath method to get all a tags inside a div at any level.

Code:

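from lxml import etree

# Sample markup standing in for the page content
content = '<div><a href="#0"/><a href="#1"/><a href="#2"/></div>'
root = etree.fromstring(content)
print(root.xpath('//div//a'))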

Output:

[<Element a at 0xb6cef0cc>, <Element a at 0xb6cef0f4>, <Element a at 0xb6cef11c>]
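
If lxml is not installed, a similar lookup can be sketched with the standard library's xml.etree.ElementTree, whose elements provide iter(tag) to walk matching descendants at any depth. The sample markup in content is an assumption made for illustration, with one link nested deeper to show that depth does not matter.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup; the first link is nested one level deeper
content = '<div><p><a href="#0"/></p><a href="#1"/><a href="#2"/></div>'
root = ET.fromstring(content)

# iter('a') yields matching elements at any depth, in document order
hrefs = [a.get('href') for a in root.iter('a')]
print(hrefs)  # ['#0', '#1', '#2']
```

This avoids writing the recursion by hand, at the cost of not controlling how the tree is traversed.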

Vivek sable 05 jan. 15 at 8:10

6502 · Accepted Answer · 2015-01-05T08:07:57+0000

List concatenation in Python returns a new list built from its operands; it does not change them:

x = [1, 2, 3, 4]
print(x + [5, 6])  # displays [1, 2, 3, 4, 5, 6]
print(x)           # here x is still [1, 2, 3, 4]

To change the list in place, you can use the extend method:

x.extend([5, 6])

or +=:

x += [5, 6]

The latter is IMO a little "weird" because it is a case where x = x + y doesn't behave the same as x += y, so I prefer to avoid it and make the in-place extension more explicit.

For your code

links = links + get_links(a)

would also be acceptable, but remember that it does something different: it allocates a new list holding the concatenation and then binds the name links to it. It does not change the original object that links referred to:

x = [1, 2, 3, 4]
y = x
x = x + [5, 6]
print(x)   # displays [1, 2, 3, 4, 5, 6]
print(y)   # displays [1, 2, 3, 4]

but

x = [1, 2, 3, 4]
y = x
x += [5, 6]
print(x)   # displays [1, 2, 3, 4, 5, 6]
print(y)   # displays [1, 2, 3, 4, 5, 6]
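
The same difference can be checked directly with the is operator, which tests whether two names refer to the same object. The sketch below also shows a second way the two forms differ: += accepts any iterable, as extend does, while + requires another list. The variable names are illustrative.

```python
x = [1, 2, 3, 4]
y = x

x += [5, 6]              # in place: x and y still name the same list
same_after_iadd = x is y

x = x + [7, 8]           # rebinding: x names a new list, y is untouched
same_after_add = x is y

print(same_after_iadd, same_after_add)  # True False
print(y)                                # [1, 2, 3, 4, 5, 6]

z = [1, 2]
z += (3, 4)              # works: += takes any iterable, like extend()
try:
    z = z + (5, 6)       # fails: + only concatenates list with list
except TypeError:
    plus_tuple_failed = True
else:
    plus_tuple_failed = False
print(z, plus_tuple_failed)             # [1, 2, 3, 4] True
```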
