boost::python and boost::variant

Posted on February 8, 2014
Tags: C++, Python, boost::python, boost::variant<>, heterogeneous container, discriminated union

I have unsuccessfully tried to find a solution for the following problem on the internet several times. Now that I have come at least closer to a usable approach, I thought I’d document what I have found so that others trying to achieve a similar thing can use this as a starting point.

Boost.Python offers a very nice and flexible way to interface C++ data types with Python. With just a few lines of code and the proper linker flags, you get a Python-importable shared object from your C++ compiler. This can be very productive.

However, there is one aspect of C++ data types that I couldn’t figure out how to interface with Python: C++ discriminated unions, or more specifically, heterogeneous containers. While Python has no problem with containers holding objects of different types, C++ does not make this easy by default. Usually the problem is solved with a container of pointers to a base class and various subclasses with virtual functions. However, this approach is not always practical, especially if the different types of objects in a heterogeneous container don’t have much in common. This is where discriminated unions come to the rescue. They basically behave like a normal union in C, but carry an additional field that indicates the type of the object currently stored in the union. Boost.Variant does exactly that, with a nice visitor interface added on top.
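To make the idea of "a union plus a type field" concrete, here is a minimal hand-rolled sketch in plain C++11. All names in it (kind, tagged_number, make_integer, make_real, describe) are made up for illustration, and it is not how Boost.Variant is implemented; it only shows the tag-then-dispatch pattern that Boost.Variant automates and makes type-safe.

```cpp
#include <cassert>
#include <string>

// Hypothetical names throughout; an illustration only, not Boost.Variant.
enum class kind { integer, real };

struct tagged_number {
  kind tag;  // records which union member is currently valid
  union {
    int i;
    double d;
  };
};

tagged_number make_integer(int value) {
  tagged_number n;
  n.tag = kind::integer;
  n.i = value;
  return n;
}

tagged_number make_real(double value) {
  tagged_number n;
  n.tag = kind::real;
  n.d = value;
  return n;
}

// A hand-written "visitor": dispatch on the tag, then read the matching member.
std::string describe(tagged_number const &n) {
  switch (n.tag) {
    case kind::integer: return "integer " + std::to_string(n.i);
    case kind::real:    return "real " + std::to_string(n.d);
  }
  assert(false && "unknown tag");
  return std::string();
}
```

Every function that reads the union has to check the tag by hand here, and nothing stops you from reading the wrong member. That bookkeeping is what a variant type takes off your hands.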

Heterogeneous containers in C++

If we put the boost::variant<> template inside an STL container like std::vector<>, the result is a heterogeneous container. For the purpose of illustration, let’s implement such a container. The example below is deliberately simple. In reality, the various types allowed in your variant will probably have more than just one field.

#include <boost/variant.hpp>
#include <string>
#include <vector>

struct a { int x; };
struct b { std::string y; };

typedef boost::variant<a, b> variant;
typedef std::vector<variant> vector;

To ease creation of these two types of objects, we are going to write a few factory functions. We are going to wrap them in Python later on.

variant make_variant() { return variant(); }
vector make_vector() { return vector{a(), b(), a()}; }

Boost.Python

Now let’s create a Python module which exports the above functionality to Python.

#include <boost/python/class.hpp>
#include <boost/python/def.hpp>
#include <boost/python/implicit.hpp>
#include <boost/python/init.hpp>
#include <boost/python/module.hpp>
#include <boost/python/object.hpp>
#include <boost/python/suite/indexing/vector_indexing_suite.hpp>
#include <boost/python/to_python_converter.hpp>

vector_indexing_suite apparently needs operator== defined on the value_type of the container. In our case, this is our boost::variant<a, b> type. Luckily, boost::variant<> already provides operator==. However, that operator== relies on operator== being defined for the underlying types. Since equality comparison is probably useful for other things as well, let’s just define operator== for our two classes a and b.

bool operator==(a const &lhs, a const &rhs) { return lhs.x == rhs.x; }
bool operator==(b const &lhs, b const &rhs) { return lhs.y == rhs.y; }

Convert a boost::variant<> to PyObject *

Boost.Python needs a way to convert our discriminated union to a Python object. This code relies on Python class definitions being present for all underlying variant types. We will define them later.

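// Visitor that turns whichever type the variant currently holds into a new
// Python object. Boost.Python calls the static convert() member.
struct variant_to_object : boost::static_visitor<PyObject *> {
  static result_type convert(variant const &v) {
    return boost::apply_visitor(variant_to_object(), v);
  }

  // Invoked with the currently held a or b. object(t) copies t into a Python
  // object; incref() keeps it alive after the temporary object is destroyed.
  template<typename T>
  result_type operator()(T const &t) const {
    return boost::python::incref(boost::python::object(t).ptr());
  }
};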

And finally, let’s create our Python module.

BOOST_PYTHON_MODULE(bpv) {
  using namespace boost::python;

  class_<a>("a", init<a>()).def(init<>()).def_readwrite("x", &a::x);
  class_<b>("b", init<b>()).def(init<>()).def_readwrite("y", &b::y);
  to_python_converter<variant, variant_to_object>();
  implicitly_convertible<a, variant>();
  implicitly_convertible<b, variant>();

  def("make_variant", make_variant);

  class_<vector>("vector").def(vector_indexing_suite<vector, true>());
  def("make_vector", make_vector);
}

Compiling

Let’s create a shared object for Python.

$ g++ -std=c++11 -fPIC -shared $(python-config --includes) -o bpv.so file.cpp -lboost_python

Running

We can load the module into Python and see what it does.

>>> import bpv
>>> variant=bpv.make_variant()
>>> variant
<bpv.a object at 0x7f06bb2130c0>
>>> variant.x
0
>>> variant.x=2
>>> variant.x
2

Nice. We can access the underlying type, and even modify it.

Let’s see how our heterogeneous container wrapping code behaves.

>>> vector=bpv.make_vector()
>>> vector
<bpv.vector object at 0x7f20693289d0>
>>> len(vector)
3
>>> list(vector)
[<bpv.a object at 0x7f20693190c0>, <bpv.b object at 0x7f20693193d0>, <bpv.a object at 0x7f2069319440>]

So far, so good. This will at least make it possible to convert heterogeneous containers from C++ to Python, which was my initial goal.

Unfortunately, contained objects are not treated as references. Every time we retrieve an element, we get a copy, so in-place modification does not work.

>>> vector[0].x
0
>>> vector[0].x=2
>>> vector[0].x
0
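My reading of why this happens: variant_to_object builds the Python object with boost::python::object(t), which copies the C++ value, so the Python side never holds a handle to the element stored inside the vector. The sketch below reproduces the same effect in plain C++, without Boost. The names item, get_copy, get_ref, write_through_copy and write_through_ref are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct item { int x; };

// Returns a copy of the element, analogous to what the Python side receives.
item get_copy(std::vector<item> const &v, std::size_t i) {
  assert(i < v.size());
  return v[i];
}

// Returns a reference to the stored element itself.
item &get_ref(std::vector<item> &v, std::size_t i) {
  assert(i < v.size());
  return v[i];
}

// Modifies a copy of v[0], then reports what the container still holds.
int write_through_copy(std::vector<item> &v, int value) {
  item tmp = get_copy(v, 0);
  tmp.x = value;   // changes only the temporary copy
  return v[0].x;   // the container is unchanged
}

// Modifies v[0] through a reference, then reports what the container holds.
int write_through_ref(std::vector<item> &v, int value) {
  get_ref(v, 0).x = value;  // writes into the container
  return v[0].x;
}
```

Writing through the copy leaves the stored element untouched, which matches the Python session above. Writing through the reference is the behaviour I would like to have from Python.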

However, we can overwrite an existing element with a modified copy.

>>> e0=vector[0]
>>> type(e0)
<class 'bpv.a'>
>>> e0.x = 2
>>> vector[0] = e0
>>> vector[0].x
2

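And we can also use the append and extend methods of Python containers. (This session is Python 2, where filter and map return lists. In Python 3 they return iterators, so you would have to wrap them in list() before calling len() or printing them.)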

>>> len(vector)
3
>>> vector.extend(vector)
>>> vector.append(bpv.a())
>>> len(vector)
7
>>> len(filter(lambda x: type(x)==bpv.b, vector))
2
>>> len(filter(lambda x: type(x)==bpv.a, vector))
5
>>> map(lambda x: x.x, filter(lambda x: type(x)==bpv.a, vector))
[2, 0, 2, 0, 0]

All that is missing for a perfect world is reference semantics for container elements. If anyone has a hint on how to achieve this, please let me know.