Resizing a Python sequence during iteration
The C implementation of the bytes.join method includes code that protects against size changes during iteration:
    if (seqlen != PySequence_Fast_GET_SIZE(seq)) {
        PyErr_SetString(PyExc_RuntimeError,
                        "sequence changed size during iteration");
        goto error;
    }
How can the sequence be changed during the call to bytes.join, and why is the above check needed? Or is it unnecessary and redundant?
If you pass a list object to bytes.join(), another thread can add items to that list while the bytes.join() call is iterating over it.
The bytes.join() method has to make two passes over the sequence: once to compute the total length of the contained bytes objects, and a second time to build the actual output bytes object. Changing the number of items in the sequence while it is being iterated over would throw a wrench into this calculation.
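The two passes and the size check can be modelled in pure Python. This is only a rough sketch of the idea, not CPython's actual code: join_two_pass and its on_item callback are hypothetical names, and the callback merely stands in for any code that gets a chance to run in the middle of the first pass.

```python
def join_two_pass(sep, items, on_item=None):
    """Pure-Python model of a two-pass join (illustrative, not CPython's code)."""
    expected_len = len(items)
    total = 0
    views = []
    # Pass 1: measure every item, re-checking the list size after each one.
    for i in range(expected_len):
        view = memoryview(items[i])  # buffer protocol: bytes, bytearray, ...
        if on_item is not None:
            on_item(i)  # hypothetical hook: other code runs here
        views.append(view)
        total += view.nbytes
        if i:
            total += len(sep)
        if len(items) != expected_len:
            raise RuntimeError("sequence changed size during iteration")
    # Pass 2: copy everything into a buffer of the precomputed size.
    out = bytearray(total)
    pos = 0
    for i, view in enumerate(views):
        if i:
            out[pos:pos + len(sep)] = sep
            pos += len(sep)
        out[pos:pos + view.nbytes] = view
        pos += view.nbytes
    return bytes(out)
```

Without the length check, a list that grew or shrank during the first pass would leave the precomputed total out of step with what the second pass expects to copy.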
Normally this cannot happen to a list, because the GIL is not released during the call. But if any of the objects in the list are not bytes objects, the buffer protocol is used to read them instead. As a comment on the original patch states:
The problem with your approach is that the sequence might be mutated while another thread is running (_getbuffer() might free the GIL). Then the previously calculated size becomes incorrect.
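As a small illustration of the buffer-protocol path mentioned above, bytes.join() accepts any bytes-like object, not only bytes. The variable names here are just for the example:

```python
sep = b"-"
# Only the first item is an exact bytes object; the other two are
# read through the buffer protocol.
parts = [b"a", bytearray(b"b"), memoryview(b"c")]
joined = sep.join(parts)
print(joined)  # b'a-b-c'
```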