C++ lambda preprocessor
The C++ lambda preprocessor (clamp) converts C++ code containing
lambda expressions into ordinary C++ code. Here's a simple example:
vector<int> v;

This example uses the standard algorithm for_each to apply an anonymous function to each element of a vector. The anonymous function accepts an integer parameter by reference, and resets the value to zero if it is currently five (a simple, but not very useful example). The preprocessor replaces the entire lambda expression in its output, so that the C++ compiler ends up seeing something like the following:
std::for_each (v.begin(), v.end()

The exact nature of the template lambda_generator_1 is beyond the scope of this introduction, except to say that its generate() member function returns a function object by value. The function object has, in this case, a member function void operator()(int &) which for_each applies to each element of the vector. Some people would probably prefer to use the standard transform algorithm for this example, as in:
std::transform (v.begin(), v.end(), v.begin()

This example shows an anonymous function that returns a value, in this case int. Rather than hard-wiring a value into the function body, it is also possible to include contextual information in the function object. For instance:
void reset (std::vector<int> &v, int val) {

The __ctx expression is an example of context information bound by value. The clamp preprocessor also supports reference semantics for contextual information via __ref expressions. For example:
int sum = 0;

This, of course, calculates the sum of elements in the vector.

Getting into some more complicated examples, it is possible to name the type of the function object generated by a lambda expression by simply omitting the function body. You have to do this, for instance, if you want to use an anonymous function generated by a lambda expression as a function parameter or return value. For example, the type of the expression from the previous example: lambda (int p) { __ref(sum) += p; } can be referred to in the code as "lambda (int &) (int)". The first pair of brackets contains the context binding (or closure) parameters, and the second pair contains the function parameters. The closure parameter list is optional for context-less functions, as is the return type for functions returning void, such as this one.

Putting all of that together, here's a templated function that returns a function object:
template<typename T>
// Use a generated comparison object

This find_if example returns an iterator to the first 7 in the vector (or v.end(), if none) using an instantiation of the match template with an int parameter. For a vector of strings, you could do the following:
std::vector<std::string>::iterator

Why a preprocessor?

I wrote the preprocessor just for fun. There doesn't seem to be any way to achieve real lambda expressions in pure C++, since it won't let you insert a function definition in the middle of an expression. The limits of what pure C++ allows are pretty well exhausted by the Boost Lambda library.

Lambda expressions simplify some coding tasks, so it would be nice to have them in C++. In the time it takes you to extract that one-liner into a named function, I bet you could write two lambda expressions, not to mention the cases that require a named class to hold context information.

What it does

clamp scans its input for lambda expressions, passing any plain C++ through unchanged. When it encounters a lambda expression, it extracts the function body into a separate file. It also generates a class template with a suitable operator() and (where necessary) member variables to store any context binding. This class template also goes into a separate file. The whole lambda expression is then replaced in the output by a single constructor call, which creates an object of the templated class.

The first line of the output is always a #include directive, which drags in the generated templates and (indirectly) the function bodies. The generated templates do not refer explicitly to any types used in the original lambda expressions, which is why they can be included before any user code. The actual types are only bound at the point of use. Because of this, the clamp parser doesn't have to know what scope a lambda expression appears in, or where the required types are defined. This also makes including lambda expressions in templated code a breeze, since the type binding is done within the template scope where the expression was originally used.

How it works
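Before the internals, a rough plain-C++ sketch of the output shape described in the previous section may help. It is a hand-written approximation, not clamp's actual output: the names sketch_lambda_1 and overwrite_all are invented for illustration, and the body is written inline rather than pulled in from a separate file. The point is only that the class template names no concrete types, so it could be included ahead of user code, with the types supplied at the constructor call.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for a generated class template (names are invented
// for illustration, not clamp's real output). No concrete type is named
// here, so this could sit in a header included before any user code.
template <typename Ctx, typename Param>
class sketch_lambda_1 {
public:
    // The single constructor call in user code stores the context by value.
    explicit sketch_lambda_1(const Ctx &ctx) : ctx_(ctx) {}

    // Stand-in for the extracted function body: overwrite the argument
    // with the stored context value.
    void operator()(Param p) const { p = ctx_; }

private:
    Ctx ctx_;
};

// User code: Ctx = int and Param = int & are bound only here, at the
// point where the lambda expression would originally have appeared.
void overwrite_all(std::vector<int> &v, int val) {
    std::for_each(v.begin(), v.end(), sketch_lambda_1<int, int &>(val));
}
```

Calling overwrite_all(v, 9) sets every element of v to 9. Because the template parameters are filled in at the call site, the same template would work unchanged inside another template, which is the property the text above relies on.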
The clamp preprocessor consists of a lexical analyser (lexer) written in flex, a parser written in bison and a code generator in plain C++. The clamp parser mostly tries to ignore everything in the input file, letting the lexer copy input to output. When the lexer encounters the lambda keyword, it enters a different mode ("start condition" in flex terminology) in which it behaves like a normal lexer and supplies tokens to the parser. The parser does some messy stuff redirecting output and resetting the lexer mode as necessary.

Note: clamp is actually pretty dumb. It performs purely syntactic transformations on the input, without really understanding scope, types or variables. This will no doubt result in some incomprehensible errors from the C++ compiler if something goes wrong. This is also the reason that clamp requires the __ctx and __ref keywords, since it wouldn't otherwise be able to tell that an expression relies on surrounding context information.

Grammar

clamp introduces three keywords: lambda, __ctx and __ref. The parser recognises more or less the following grammar:
lambda-expression:
lambda-decl:
return-type:
param-list:
parameter:
initialiser:
lambda-body:
extended-expression:
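
The grammar only fixes syntax, so it may help to see what the two binding keywords mean in plain C++. The sketch below is hand-written and assumes nothing about clamp's generated code; the names add_offset, accumulate_into, sum_of and shifted are invented for illustration. A __ctx binding corresponds to a member holding a copy of the context value, a __ref binding to a reference member. In terms of a type such as "lambda (int &) (int)", the closure parameter list maps to what the constructor stores and the function parameter list maps to operator().

```cpp
#include <algorithm>
#include <vector>

// By-value binding, the semantics __ctx expresses: the function object
// keeps its own copy of the context value.
struct add_offset {
    int offset;  // copied when the object is constructed
    explicit add_offset(int o) : offset(o) {}
    int operator()(int p) const { return p + offset; }
};

// By-reference binding, the semantics __ref expresses: the function object
// refers to a variable in the surrounding scope and can modify it.
struct accumulate_into {
    int &sum;  // closure parameter of type int &
    explicit accumulate_into(int &s) : sum(s) {}
    void operator()(int p) const { sum += p; }  // function parameter of type int
};

// Sums the elements; for_each copies the functor, but every copy refers
// to the same local variable.
int sum_of(const std::vector<int> &v) {
    int sum = 0;
    std::for_each(v.begin(), v.end(), accumulate_into(sum));
    return sum;
}

// Returns a copy of v with offset added to every element.
std::vector<int> shifted(std::vector<int> v, int offset) {
    std::transform(v.begin(), v.end(), v.begin(), add_offset(offset));
    return v;
}
```

The reference version only works while the referenced variable is still alive, which is one reason to prefer the by-value form when a function object is returned from a function.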
Portability

I wrote clamp using the following tools: g++ 2.95.3-5 with boost 1.25.1, flex 2.5.4, bison 1.28 and GNU make 3.79.1, all under Cygwin on Windows 2000. The preprocessor builds successfully with g++ 3.1, but the code that it generates causes an internal compiler error when taking the address of a member function. This is probably fixed in later versions of g++.
The preprocessor itself might build with yacc and/or traditional
Unix make, and with any reasonable C++ compiler. The lexer probably
won't compile with plain lex because, according to the flex manual,
lex doesn't support exclusive start conditions.