Expression templates

Expression templates are a C++ template metaprogramming technique that builds structures representing a computation at compile time, where expressions are evaluated only as needed to produce efficient code for the entire computation. Expression templates thus allow programmers to bypass the normal order of evaluation of the C++ language and achieve optimizations such as loop fusion.

Expression templates were invented independently by Todd Veldhuizen and David Vandevoorde; it was Veldhuizen who gave them their name. They are a popular technique for the implementation of linear algebra software.

Motivation and example

Consider a library representing vectors and operations on them. One common mathematical operation is to add two vectors element-wise to produce a new vector. The obvious C++ implementation of this operation would be an overloaded <code>operator+</code> that returns a new vector object.
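A minimal sketch of such an implementation might look like the following. The heap-backed <code>std::vector</code> storage and the three-argument constructor are assumptions made for illustration, not details given in the text.

```cpp
#include <cstddef>
#include <vector>

// Naive 3-element vector. Heap storage via std::vector is an assumption,
// chosen so that every temporary costs one allocation.
class Vec3 {
    std::vector<double> elems;

  public:
    Vec3() : elems(3, 0.0) {}
    Vec3(double x, double y, double z) : elems{x, y, z} {}

    double operator[](std::size_t i) const { return elems[i]; }
    double& operator[](std::size_t i) { return elems[i]; }
    std::size_t size() const { return elems.size(); }
};

// Eager element-wise sum: allocates and fills a new Vec3 on every call.
Vec3 operator+(Vec3 const& u, Vec3 const& v) {
    Vec3 sum;
    for (std::size_t i = 0; i != u.size(); ++i) {
        sum[i] = u[i] + v[i];
    }
    return sum;
}
```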

Users of such a class can write <code>Vec3 x = a + b;</code> where <code>a</code> and <code>b</code> are both instances of <code>Vec3</code>.

A problem with this approach is that more complicated expressions such as <code>Vec3 x = a + b + c</code> are implemented inefficiently. The implementation first produces a temporary <code>Vec3</code> to hold <code>a + b</code>, then produces another <code>Vec3</code> with the elements of <code>c</code> added in. Even with return value optimization this will allocate memory at least twice and require two loops.

Delayed evaluation solves this problem, and can be implemented in C++ by letting <code>operator+</code> return an object of an auxiliary type, say <code>Vec3Sum</code>, that represents the unevaluated sum of two <code>Vec3</code>s, of a <code>Vec3</code> and a <code>Vec3Sum</code>, and so on. Larger expressions then effectively build expression trees that are evaluated only when assigned to an actual <code>Vec3</code> variable. But this requires traversing such trees at run time to do the evaluation, which is itself costly.
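To make the cost of run-time trees concrete, here is a hedged sketch of delayed evaluation with heap-allocated nodes and virtual dispatch. The names <code>Node</code>, <code>Leaf</code>, <code>Sum</code>, <code>make_sum</code>, and <code>evaluate</code> are hypothetical, and a named function stands in for <code>operator+</code> to keep the sketch short.

```cpp
#include <array>
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical run-time expression node: each element access is a virtual call.
struct Node {
    virtual ~Node() = default;
    virtual double at(std::size_t i) const = 0;
};

using NodePtr = std::shared_ptr<const Node>;

// Leaf node that owns actual vector data.
struct Leaf : Node {
    std::array<double, 3> elems;
    explicit Leaf(std::array<double, 3> e) : elems(e) {}
    double at(std::size_t i) const override { return elems[i]; }
};

// Interior node representing an unevaluated sum of two subtrees.
struct Sum : Node {
    NodePtr lhs, rhs;
    Sum(NodePtr l, NodePtr r) : lhs(std::move(l)), rhs(std::move(r)) {}
    double at(std::size_t i) const override { return lhs->at(i) + rhs->at(i); }
};

// Building a sum performs no arithmetic; it only links two subtrees.
NodePtr make_sum(NodePtr const& l, NodePtr const& r) {
    return std::make_shared<Sum>(l, r);
}

// Evaluation walks the whole tree once per element.
std::array<double, 3> evaluate(Node const& root) {
    std::array<double, 3> out{};
    for (std::size_t i = 0; i != out.size(); ++i) {
        out[i] = root.at(i);
    }
    return out;
}
```

Every element read in <code>evaluate</code> goes through one virtual call per node, which is the traversal overhead that expression templates avoid.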

Expression templates implement delayed evaluation using expression trees that only exist at compile time. Each initialization of a <code>Vec3</code> from an expression, such as <code>Vec3 x = a + b + c</code>, instantiates a new <code>Vec3</code> constructor from a template if one is needed. This constructor operates on three <code>Vec3</code>s; it allocates the necessary memory and then performs the computation. Thus only one memory allocation is performed.

Example implementation of expression templates

An example implementation of expression templates can be structured as follows. A base class <code>VecExpression</code> represents any vector-valued expression. It is templated on the actual expression type <code>E</code> to be implemented, per the curiously recurring template pattern. The existence of a base class like <code>VecExpression</code> is not strictly necessary for expression templates to work. It merely serves as a function argument type to distinguish the expressions from other types (see the <code>Vec3</code> constructor and <code>operator+</code> described below).

The Boolean <code>is_leaf</code> tags <code>VecExpression</code>s that are leaves, i.e. that actually contain data. The <code>Vec3</code> class is a leaf that stores the coordinates of a fully evaluated vector expression, and is a subclass of <code>VecExpression</code>.

The sum of two vector expressions is represented by a new type, <code>VecSum</code>, that is templated on the types of the left- and right-hand sides of the sum so that it can be applied to arbitrary pairs of vector expressions. An overloaded <code>operator+</code> serves as syntactic sugar for the <code>VecSum</code> constructor. There is one subtlety: to refer to the original data, <code>VecSum</code> stores a const reference to each operand that is a leaf. An operand that is not a leaf is a temporary object, so <code>VecSum</code> must store a copy of it to keep it alive.
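The three pieces described above can be sketched together as follows. This is one possible arrangement under stated assumptions, not a definitive implementation: the heap-backed <code>std::vector</code> storage, the initializer-list constructor, and the member names <code>u_</code> and <code>v_</code> are illustrative choices.

```cpp
#include <cstddef>
#include <initializer_list>
#include <type_traits>
#include <vector>

// CRTP base: E is the concrete expression type deriving from VecExpression<E>.
template <typename E>
class VecExpression {
  public:
    static constexpr bool is_leaf = false;

    // Static dispatch to the derived type; no virtual functions involved.
    double operator[](std::size_t i) const {
        return static_cast<E const&>(*this)[i];
    }
    std::size_t size() const { return static_cast<E const&>(*this).size(); }
};

// Leaf expression that owns the coordinates of an evaluated vector.
class Vec3 : public VecExpression<Vec3> {
    std::vector<double> elems;  // heap storage is an assumption of this sketch

  public:
    static constexpr bool is_leaf = true;

    Vec3(std::initializer_list<double> init) : elems(init) {}

    // Constructing from any expression allocates once, then evaluates
    // the whole expression in a single loop.
    template <typename E>
    Vec3(VecExpression<E> const& expr) : elems(expr.size()) {
        for (std::size_t i = 0; i != expr.size(); ++i) {
            elems[i] = expr[i];
        }
    }

    double operator[](std::size_t i) const { return elems[i]; }
    double& operator[](std::size_t i) { return elems[i]; }
    std::size_t size() const { return elems.size(); }
};

// Unevaluated sum of two expressions of types E1 and E2.
template <typename E1, typename E2>
class VecSum : public VecExpression<VecSum<E1, E2>> {
    // Leaves are held by const reference; non-leaf operands are temporaries
    // and are copied so that they stay alive inside this node.
    typename std::conditional<E1::is_leaf, E1 const&, E1 const>::type u_;
    typename std::conditional<E2::is_leaf, E2 const&, E2 const>::type v_;

  public:
    static constexpr bool is_leaf = false;

    VecSum(E1 const& u, E2 const& v) : u_(u), v_(v) {}

    // Elements are computed on demand; nothing is stored.
    double operator[](std::size_t i) const { return u_[i] + v_[i]; }
    std::size_t size() const { return v_.size(); }
};

// Syntactic sugar: a + b builds a VecSum instead of computing anything.
template <typename E1, typename E2>
VecSum<E1, E2> operator+(VecExpression<E1> const& u, VecExpression<E2> const& v) {
    return VecSum<E1, E2>(static_cast<E1 const&>(u), static_cast<E2 const&>(v));
}
```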

With the above definitions, the expression <code>a + b + c</code> is of type <code>VecSum<VecSum<Vec3, Vec3>, Vec3></code>, so <code>Vec3 x = a + b + c</code> invokes the templated <code>Vec3</code> constructor <code>Vec3(VecExpression<E> const& expr)</code> with its template argument <code>E</code> being this type. Inside this constructor, the loop body that copies element <code>i</code> of <code>expr</code> into the new vector is effectively expanded (following the recursive definitions of <code>operator+</code> and <code>operator[]</code> on this type) to a single expression that adds element <code>i</code> of <code>a</code>, <code>b</code>, and <code>c</code>, with no temporary <code>Vec3</code> objects needed and only one pass through each memory block.
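Assuming the compiler inlines the nested <code>operator[]</code> calls, the instantiated constructor behaves like a hand-fused loop. The sketch below writes that loop out by hand on plain arrays; <code>Elems</code> and <code>fused_sum</code> are hypothetical names used only for this illustration.

```cpp
#include <array>
#include <cstddef>

// Stand-in for a vector's element storage (an assumption of this sketch).
using Elems = std::array<double, 3>;

// One loop, one read of each operand per element, no intermediate vector.
Elems fused_sum(Elems const& a, Elems const& b, Elems const& c) {
    Elems out{};
    for (std::size_t i = 0; i != out.size(); ++i) {
        out[i] = a[i] + b[i] + c[i];
    }
    return out;
}
```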

Basic usage

The following demonstrates basic usage of the above:

Applications

Expression templates have been found especially useful by the authors of libraries for linear algebra, that is, for dealing with vectors and matrices of numbers. Libraries employing expression templates include Dlib, Armadillo, Blaze, Blitz++, Boost uBLAS, Eigen, POOMA, Stan Math Library, and xtensor. Expression templates can also accelerate C++ automatic differentiation implementations, as demonstrated in the Adept library.

Outside of vector math, the Spirit parser framework uses expression templates to represent formal grammars and compile these into parsers.
