lxml AssertionError: invalid Element proxy
I have a session on an instance started by ZODB that parses a page and then stores an lxml object. Later it throws:
AssertionError: invalid Element proxy at 4495778632
It's not easy to reproduce in my specific case, but this code does it too:
from lxml import etree
tree = etree.fromstring("<html><body>test</body></html>", etree.HTMLParser())
c = [x for x in tree.iter()][0]
# Calling the element's class directly, then printing the result, triggers the error
print(c.__class__())
What's happening?
I got this AssertionError when I tried to perform operations on a node that I had passed as an argument to a celery @shared_task via .delay. To fix the error, I stopped passing the element itself: I passed xml_string instead and built a new tree with ET.fromstring(xml_string) inside the @shared_task. All etree operations worked fine with the new document. The problem presumably had something to do with the serialization of the element when it went into the celery queue.
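A minimal sketch of that workaround, assuming lxml.etree is the parser in use. The function count_paragraphs and the sample document are hypothetical, and the celery decorator is left out so the snippet runs on its own; in a real task the function would carry @shared_task and be called with .delay(xml_string).

```python
from lxml import etree


def count_paragraphs(xml_string):
    """Hypothetical task body: re-parse the string, then work on the new tree."""
    root = etree.fromstring(xml_string)  # fresh document, valid element proxies
    return len(root.findall(".//p"))


# Caller side: serialize the element to a plain string before handing it over.
doc = etree.fromstring("<doc><p>one</p><p>two</p></doc>")
xml_string = etree.tostring(doc, encoding="unicode")

# With celery this would be count_paragraphs.delay(xml_string).
result = count_paragraphs(xml_string)
print(result)
```

A plain string survives any task serializer unchanged, whereas an element is only a thin Python wrapper around a C node that lives in the sending process.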
The error message says that the Element proxy is invalid. A proxy is the Python object that wraps the corresponding C node; here that underlying C node is missing.
With c.__class__() you are calling the constructor of the class _Element directly. The lxml documentation says:
It's important to know that every proxy in lxml has a factory that sets the C level members correctly. Proxy objects should never be created outside of this factory. For example, to create an _Element or its subclasses, you must always call its factory:
cdef xmlNode* c_node
cdef _Document doc
cdef _Element element
...
element = _elementFactory(doc, c_node)
If an element is created without the factory, and therefore without a c_node, operations on it fail this assertion:
lxml/src/lxml/apihelpers.pxi:
cdef inline int _assertValidNode(_Element element) except -1:
    assert element._c_node is not NULL, u"invalid Element proxy at %s" % id(element)
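From Python code the usual way to stay on the factory path is to go through lxml's public constructors instead of calling the element's class. The sketch below shows three options that I believe cover the common cases; the variable names are only illustrative, and which option fits depends on whether you want an empty element or a copy.

```python
import copy

from lxml import etree

tree = etree.fromstring("<html><body>test</body></html>", etree.HTMLParser())
c = next(tree.iter())  # first element in document order, the <html> root

# New empty element with the same tag, created by the public factory function.
fresh = etree.Element(c.tag)

# New empty element created via an existing element.
sibling = c.makeelement("div")

# Independent copy of the element together with its subtree.
clone = copy.deepcopy(c)

print(fresh.tag, sibling.tag, clone.tag)
```

Each of these returns a proxy that is backed by a real C node, so later operations on it do not hit the assertion.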