Segmentation fault from a threaded Boost.Python extension

Consider the following simple Python extension. When start()-ed, Foo will simply append the next sequential integer to a py::list, once per second:

#include <boost/python.hpp>
#include <thread>
#include <atomic>
#include <iostream>

namespace py = boost::python;

struct Foo {
    Foo() : running(false) { } 
    ~Foo() { stop(); }   

    void start() {
        running = true;
        thread = std::thread([this]{
            while(running) {
                std::cout << py::len(messages) << std::endl;
                messages.append(py::len(messages));
                std::this_thread::sleep_for(std::chrono::seconds(1));
            }
        });
    }   

    void stop() {
        if (running) {
            running = false;
            thread.join();
        }
    }   

    std::thread thread;
    py::list messages;
    std::atomic<bool> running;
};

BOOST_PYTHON_MODULE(Foo)
{
    PyEval_InitThreads();

    py::class_<Foo, boost::noncopyable>("Foo",
        py::init<>())
        .def("start", &Foo::start)
        .def("stop", &Foo::stop)
    ;   
}

      

Considering the above, the following simple Python session segfaults every time, without printing anything:

>>> import Foo
>>> f = Foo.Foo()
>>> f.start()
Segmentation fault (core dumped)

      

With the core dump pointing to:

namespace boost { namespace python {

    inline ssize_t len(object const& obj)
    {   
        ssize_t result = PyObject_Length(obj.ptr());
        if (PyErr_Occurred()) throw_error_already_set(); // <==
        return result;
    }   

}} // namespace boost::python

      

Where:

(gdb) inspect obj
$1 = (const boost::python::api::object &) @0x62d368: {<boost::python::api::object_base> = {<boost::python::api::object_operators<boost::python::api::object>> = {<boost::python::def_visitor<boost::python::api::object>> = {<No data fields>}, <No data fields>}, m_ptr = []}, <No data fields>}
(gdb) inspect obj.ptr()
$2 = []
(gdb) inspect result
$3 = 0

      

Why does this happen when run on a thread? obj looks fine, and result is set correctly. Why is PyErr_Occurred() set? Who sets it?

1 answer


In short, there is a mutex around the CPython interpreter known as the Global Interpreter Lock (GIL). This mutex prevents concurrent operations on Python objects: at any given time, at most one thread, the one that has acquired the GIL, is allowed to perform operations on Python objects. When multiple threads are present, invoking Python code without holding the GIL results in undefined behavior.

C or C++ threads are sometimes referred to as foreign threads in the Python documentation. The Python interpreter has no way to control a foreign thread, so foreign threads are responsible for managing the GIL themselves, allowing them to run concurrently or in parallel with Python threads. With that in mind, let's look at the original thread body:

while (running) {
  std::cout << py::len(messages) << std::endl;           // Python
  messages.append(py::len(messages));                    // Python
  std::this_thread::sleep_for(std::chrono::seconds(1));  // No Python
}

      

As noted above, only two of the three lines in the thread body need to execute while the thread owns the GIL. One common way to manage this is with an RAII class. For example, with the following gil_lock class, the calling thread acquires the GIL when a gil_lock object is created, and releases it when the gil_lock object is destroyed.

/// @brief RAII class used to lock and unlock the GIL.
class gil_lock
{
public:
  gil_lock()  { state_ = PyGILState_Ensure(); }
  ~gil_lock() { PyGILState_Release(state_);   }
private:
  PyGILState_STATE state_;
};

      

The thread body can then use explicit scoping to control the lifetime of the lock.



while (running) {
  // Acquire GIL while invoking Python code.
  {
    gil_lock lock;
    std::cout << py::len(messages) << std::endl;
    messages.append(py::len(messages));
  }
  // Release GIL, allowing other threads to run Python code while
  // this thread sleeps.
  std::this_thread::sleep_for(std::chrono::seconds(1));
}

      


Here is a complete example based on the original code, demonstrating that the program works as expected once the GIL is explicitly managed:

#include <thread>
#include <atomic>
#include <iostream>
#include <boost/python.hpp>

/// @brief RAII class used to lock and unlock the GIL.
class gil_lock
{
public:
  gil_lock()  { state_ = PyGILState_Ensure(); }
  ~gil_lock() { PyGILState_Release(state_);   }
private:
  PyGILState_STATE state_;
};

struct foo
{
  foo() : running(false) {}
  ~foo() { stop(); }

  void start()
  {
    namespace python = boost::python;
    running = true;
    thread = std::thread([this]
      {
        while (running)
        {
          {
            gil_lock lock; // Acquire GIL.
            std::cout << python::len(messages) << std::endl;
            messages.append(python::len(messages));
          } // Release GIL.
          std::this_thread::sleep_for(std::chrono::seconds(1));
        }
      });
  }

  void stop()
  {
    if (running)
    {
      running = false;
      thread.join();
    }
  }

  std::thread thread;
  boost::python::list messages;
  std::atomic<bool> running;
};

BOOST_PYTHON_MODULE(example)
{
  // Force the GIL to be created and initialized.  The current caller will
  // own the GIL.
  PyEval_InitThreads();

  namespace python = boost::python;
  python::class_<foo, boost::noncopyable>("Foo", python::init<>())
    .def("start", &foo::start)
    .def("stop", &foo::stop)
    ;
}

      

Interactive use:

>>> import example
>>> import time
>>> foo = example.Foo()
>>> foo.start()
>>> time.sleep(3)
0
1
2
>>> foo.stop()
>>>

      
