authored by Wayne Witzel III

Boost Python, Threads, and Releasing the GIL

On February 26, 2010 in python, c++

After Beazley's PyCon talk "Understanding the Python GIL", I realized I had never done any work that released the GIL, spawned threads, did some work, and then restored the GIL. So I wanted to see if I could do something like that with Boost::Python and Boost::Thread, and what kind of performance I'd get from it with an empty while loop as the baseline. I hacked up some quick and dirty C++ code and a quick bit of runnable Python to test the resulting module, and away I went. Below are the code snippets, a link to Bitbucket, and the results of the Python script.

#include <boost/python.hpp>
#include <boost/thread.hpp>
#include <boost/bind.hpp>
#include <boost/shared_ptr.hpp>
#include <vector>

class ScopedGILRelease {
public:
    inline ScopedGILRelease() { m_thread_state = PyEval_SaveThread(); }
    inline ~ScopedGILRelease() {
        PyEval_RestoreThread(m_thread_state);
        m_thread_state = NULL;
    }
private:
    PyThreadState* m_thread_state;
};

void loop(long count)
{
    while (count != 0) {
        count -= 1;
    }
    return;
}

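void nogil(int threads, long count)
{
    if (threads < 1)
        return;

    // Release the GIL for the rest of this scope; the destructor restores it.
    ScopedGILRelease release_gil;

    // Each thread gets an equal share of the total count.
    long thread_count = count / threads;

    std::vector<boost::shared_ptr<boost::thread> > v_threads;
    for (int i=0; i != threads; i++) {
        boost::shared_ptr<boost::thread>
        m_thread = boost::shared_ptr<boost::thread>(
            new boost::thread(
                boost::bind(loop,thread_count)
            )
        );
        v_threads.push_back(m_thread);
    }

    // Wait for every worker to finish before returning to Python.
    for (int i=0; i != v_threads.size(); i++)
        v_threads[i]->join();

    return;
}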

BOOST_PYTHON_MODULE(nogil)
{
    using namespace boost::python;
    def("nogil", nogil);
}
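
For comparison, the split-and-join structure of nogil() can be mirrored in pure Python. This is only a sketch: run_split is a hypothetical helper that is not part of the nogil module, and the even count // threads split is an assumption. On a standard CPython build with the GIL, these threads take turns running bytecode, so don't expect the kind of scaling the C++ version can get once the GIL is released.

```python
import threading


def loop(count):
    """Pure-Python countdown, mirroring the C++ loop()."""
    while count != 0:
        count -= 1


def run_split(threads, count):
    """Split `count` iterations across `threads` workers and wait for them.

    Hypothetical helper for illustration; returns the per-thread count.
    """
    if threads <= 0 or count <= 0:
        return 0

    # Assumption: an even split, with any remainder simply dropped.
    per_thread = count // threads

    workers = [
        threading.Thread(target=loop, args=(per_thread,))
        for _ in range(threads)
    ]
    for worker in workers:
        worker.start()
    # Block until every worker has finished its share.
    for worker in workers:
        worker.join()
    return per_thread
```

Calling run_split(2, 5000000) would line up with the loopthree case in the test script, minus the GIL release.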

Then I used the following Python script to run some quick tests.

import time
import nogil

def timer(func):
    def wrapper(*arg):
        t1 = time.time()
        func(*arg)
        t2 = time.time()
        print("%s took %0.3f ms" % (func.__name__, (t2-t1)*1000.0))
    return wrapper

@timer
def loopone():
    count = 5000000
    while count != 0:
        count -= 1

@timer
def looptwo():
    count = 5000000
    nogil.nogil(1,count)

@timer
def loopthree():
    count = 5000000
    nogil.nogil(2,count)

@timer
def loopfour():
    count = 5000000
    nogil.nogil(4,count)

@timer
def loopfive():
    count = 5000000
    nogil.nogil(6,count)

def main():
    loopone()
    looptwo()
    loopthree()
    loopfour()
    loopfive()

if __name__ == '__main__':
    main()

The results I got were quite interesting and very consistent on my MacBook Pro. I ran the script about 1,000 times and got roughly the same results every time.

loopone took 364.159 ms (pure python)
looptwo took 15.295 ms (c++, no GIL, single thread)
loopthree took 7.763 ms (c++, no GIL, two threads)
loopfour took 8.119 ms (c++, no GIL, four threads)
loopfive took 11.102 ms (c++, no GIL, six threads)
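
To put those numbers in perspective, here is a quick bit of arithmetic on the timings above. Only the five timings come from my run; the speedups helper is just a name I made up for this sketch.

```python
# Timings reported above, in milliseconds.
timings_ms = {
    "loopone": 364.159,    # pure Python
    "looptwo": 15.295,     # C++, no GIL, one thread
    "loopthree": 7.763,    # C++, no GIL, two threads
    "loopfour": 8.119,     # C++, no GIL, four threads
    "loopfive": 11.102,    # C++, no GIL, six threads
}


def speedups(timings, reference):
    """Return how many times faster each run is than the reference run."""
    ref_ms = timings[reference]
    return dict((name, ref_ms / ms) for name, ms in timings.items())


vs_python = speedups(timings_ms, "loopone")
vs_single = speedups(timings_ms, "looptwo")

for name in ["looptwo", "loopthree", "loopfour", "loopfive"]:
    print("%s: %.1fx vs pure Python, %.2fx vs one C++ thread"
          % (name, vs_python[name], vs_single[name]))
```

By that math, the single-threaded C++ loop is roughly 24x faster than the pure Python loop, two threads come in just under 2x faster than one, and four and six threads do no better than two on this run.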

Anyway, that's all really. Nothing profound here, no super insightful ending. Just: hey look, stuff is faster and I might use this. All the code for this is available in my Bitbucket repo: http://bitbucket.org/wwitzel3/code/src/tip/nogil/

You will need the Boost libraries, including Boost Python and Boost Thread, as well as the Python libraries and includes to build this. For Boost, bjam --with-python --with-thread variant=release toolset=gcc is all I did on my Mac. Then I added the resulting libs as Framework dependencies in Xcode along with the Python.framework.


Dynamic and static typing with unit tests.

On April 20, 2009 in python, brainstorm, c++

There is an ongoing discussion at the office with a team member who refuses to use dynamic languages. He claims that most of his errors are typographical and that the compiler catches them. Since a dynamic language would not catch these errors until runtime, he throws an entire group of languages out the window. He also claims that to get the same level of checking with a dynamic language, you would have to write more unit tests than normal to avoid introducing unhandled runtime exceptions.

So I decided to do a little test over the weekend. I created a very simple Number class in Python and in C++. Using the exact same TDD process, I implemented some very basic operations, including division, addition, and subtraction. I ended up with 12 tests, identical for the C++ and Python implementations, covering 100% of the execution paths. I decided that compilation (in the case of C++) and passing tests defined success. Then I went back and inserted common typographical errors: mistypes, extra = signs, not enough = signs, miseplled_varaibles, etc.

The end result: I could not get my unit tests to pass while introducing an error that would cause an unhandled runtime exception in Python. Granted, in C++ the compiler did catch a lot of things for me, but the point here is that I didn't have to create any extra tests to get that same level of confidence in my Python code.
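
The original Number class and its 12 tests are not reproduced here, so the following is only a sketch of the idea. Number, TypoNumber, make_tests, and suite_passes are all hypothetical names, and the three tests stand in for the full set. It shows one kind of typo (a misspelled attribute) turning into a failing test run instead of slipping through.

```python
import unittest


class Number:
    """Hypothetical stand-in for the Number class described in the post."""

    def __init__(self, value):
        self.value = value

    def add(self, other):
        return Number(self.value + other.value)

    def divide(self, other):
        return Number(self.value / other.value)


class TypoNumber(Number):
    """Same class with a deliberately misspelled attribute in add()."""

    def add(self, other):
        return Number(self.value + other.vlaue)  # typo: 'vlaue'


def make_tests(number_cls):
    """Build the same test case for any Number-like class."""

    class NumberTests(unittest.TestCase):
        def test_add(self):
            self.assertEqual(number_cls(2).add(number_cls(3)).value, 5)

        def test_divide(self):
            self.assertEqual(number_cls(6).divide(number_cls(3)).value, 2)

        def test_divide_by_zero(self):
            with self.assertRaises(ZeroDivisionError):
                number_cls(1).divide(number_cls(0))

    return NumberTests


def suite_passes(number_cls):
    """Run the tests quietly and report whether every one passed."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(make_tests(number_cls))
    result = unittest.TestResult()
    suite.run(result)
    return result.wasSuccessful()
```

Here suite_passes(Number) comes back True, while suite_passes(TypoNumber) comes back False because test_add hits an AttributeError. No extra test was needed to catch the typo; the test that already covered add() did the job.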