BrainFlood: Runtime code generation via reflection in .NET

Hi everyone. I wrote a brainfuck interpreter. It implements a few simple optimizations, but it's really nothing to write home about. I have benchmarked it against Nayuki's optimizing compiler using Erik Bosman's Mandelbrot program. Here are the results:
Next, I wrote a compiler that can generate code at runtime. Here's the benchmark chart with it added:
The technique used to build the compiler is completely absurd, but it is (hopefully) safe, and it even works in S&box's sandboxed environment! No native modules or System.Reflection.Emit needed! It can't make my simple brainfuck optimizations or .NET's JIT magically competitive with Nayuki and LLVM, but I'd like to think it's an impressive result.
First, it may be useful to understand a few things about compilers. Beware that each of these topics could be its own blog post. If something is confusing, it might be a good idea to find a more in-depth resource to explain it. Also beware that I'm a fairly knowledgeable hobbyist, but not an expert, especially on .NET. If you see something that's wrong, feel free to yell at me.

Just-in-Time compilers, or JIT compilers, are compilers which generate machine code while the program is running. The .NET runtime is an example of this. When C# is initially compiled, it is only compiled to bytecode. When the program is executed, the bytecode may be interpreted directly, or it may be compiled to machine code. A JIT may even involve several tiers, where more frequently used code is re-compiled with more optimizations. Other examples of JIT compilers are the Java Virtual Machine, LuaJIT, and most JavaScript runtimes. Language implementations which compile straight to machine code at build time are called Ahead-of-Time, or AOT.

Constant Evaluation, or Constant Folding, is an optimization which simplifies constant expressions. Any decent compiler should be able to do the following:
int x = 3 * 3 * 3 * 3 * 5 * 5;
// simplifies to:
int x = 2025;
When the code is executed, it won't have to perform the multiplications. It will simply load 2025 into x.

Inlining is another very useful optimization, which replaces a call to a function with the body of the function. For example:
int Add(int x, int y) {
    return x + y;
}

int Add3(int x, int y, int z) {
    return Add(Add(x,y),z);
}

// the second function can be optimized to:
int Add3(int x, int y, int z) {
    return x + y + z;
}
The real power of inlining isn't just that it saves the overhead of a call; it's that it can enable other optimizations. A function can be inlined, then constants in the resulting code can be folded:
int Five() {
    return 5;
}

int Fifteen() {
    return Five()+Five()+Five();
}

// the second function can be optimized to:
int Fifteen() {
    return 15;
}

Monomorphization is a slightly more niche topic, but it is also very important. Monomorphization is the process of generating separate code for generic functions, depending on the type parameters. Consider the following function:
static T Min<T>(T a, T b) where T: IComparable {
    if (a.CompareTo(b)<0) {
        return a;
    } else {
        return b;
    }
}
If monomorphization is used, then calling this function with ints, floats, and strings would all generate separate code. In the separate code, the call to CompareTo can be devirtualized. Much like inlining, this can be used to enable other optimizations: Instead of just devirtualizing the call, it could be inlined.

Monomorphization is a more popular topic in AOT-compiled languages like C++ and Rust, which will monomorphize code almost any time you use generics. The downside of monomorphization is that it increases the amount of code generated and increases compile times. The .NET runtime and the JVM are less eager to monomorphize code, preferring to generate polymorphic code with virtual calls.

Reflection isn't really a compiler topic, but it's another key part of this compiler. Reflection is a feature of some managed language runtimes, like .NET and the JVM. It allows you to deal with types dynamically at runtime. Reflection could allow us to load an assembly at runtime, search it for classes implementing some interface, then create an instance of each of those classes. S&box doesn't give us access to the entire .NET reflection library, but it provides wrappers for some useful features.
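As a minimal sketch of that idea using plain .NET reflection (outside S&box's wrappers), here's how you might scan the current assembly for implementations of an interface and instantiate each one. The `IGreeter` types are made up purely for illustration:

```csharp
using System;
using System.Linq;
using System.Reflection;

interface IGreeter { string Greet(); }

class English : IGreeter { public string Greet() => "hello"; }
class French : IGreeter { public string Greet() => "bonjour"; }

class Program {
    static void Main() {
        // Find every concrete class in this assembly that implements IGreeter...
        var greeters = Assembly.GetExecutingAssembly().GetTypes()
            .Where(t => typeof(IGreeter).IsAssignableFrom(t) && t.IsClass)
            // ...and create an instance of each one dynamically.
            .Select(t => (IGreeter)Activator.CreateInstance(t));

        foreach (var g in greeters)
            Console.WriteLine(g.Greet());
    }
}
```

Nothing in the program names `English` or `French` directly; the types are discovered and constructed entirely at runtime.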
Time for a bigger example!
using System.Runtime.CompilerServices;

interface Number {
    static abstract int Eval();
}

class ConstDead : Number {
    public static int Eval() {
        return 0xDEAD;
    }
}

class ConstBeef : Number {
    public static int Eval() {
        return 0xBEEF;
    }
}

class Concat<A,B> : Number
    where A: Number
    where B: Number
{
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static int Eval() {
        return (A.Eval() << 16) + B.Eval();
    }
}

class Example {
    int Run() {
        return Concat<ConstDead,ConstBeef>.Eval();
    }
}
This combines a few ideas from the previous section: the three calls to Eval are monomorphized, inlined, and folded. If you throw this code into SharpLab, you'll find that Example.Run() JIT-compiles to two instructions:
    L0000: mov eax, 0xdeadbeef
    L0005: ret

The [MethodImpl(MethodImplOptions.AggressiveInlining)] attribute is necessary to persuade .NET to do what we want. Sadly, it isn't always sufficient. We need to encourage the JIT to inline and monomorphize our code. Fortunately, there's a way to do this: structs! Generic code over reference types can all share a single polymorphic copy, but structs are value types whose size and layout vary, so the JIT isn't allowed to re-use shared code; it has to generate a specialized copy for each struct type argument. Our example code becomes:
using System.Runtime.CompilerServices;

interface Number {
    int Eval();
}

struct ConstDead : Number {
    public int Eval() {
        return 0xDEAD;
    }
}

struct ConstBeef : Number {
    public int Eval() {
        return 0xBEEF;
    }
}

struct Concat<A,B> : Number
    where A: struct, Number
    where B: struct, Number
{
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public int Eval() {
        return (default(A).Eval() << 16) + default(B).Eval();
    }
}

class Example {
    int Run() {
        return default(Concat<ConstDead,ConstBeef>).Eval();
    }
}
By some miracle, the JIT is just smart enough to optimize away struct initialization. Presumably it helps that the structs are empty, but I'm not actually sure whether it matters.
If you're familiar with brainfuck, you'll know the language itself has no notion of constants; however, they're really useful for some optimizations. My interpreter converts the brainfuck code into a bytecode which contains integer offsets. My compiler works by converting that bytecode into a disgusting ball of generic types. So I needed constants, and if I couldn't handle constants, the project was probably doomed anyway.

Behold:
interface Const {
    int Run();
}

// Single Hex Digits
struct D0 : Const { public int Run() => 0; }
struct D1 : Const { public int Run() => 1; }
struct D2 : Const { public int Run() => 2; }
// D3 - DD omitted
struct DE : Const { public int Run() => 0xE; }
struct DF : Const { public int Run() => 0xF; }

// Two Hex Digits
struct Num<A,B> : Const
    where A: struct, Const
    where B: struct, Const
{
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public int Run() {
        return default(A).Run()<<4 | default(B).Run();
    }
}

// Negatives
struct Neg<A> : Const
    where A: struct, Const
{
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public int Run() {
        return -default(A).Run();
    }
}
With this code, we can represent any constant in the -255 to 255 range, which is all we need for the very complex Mandelbrot program. It could easily be extended to handle larger constants. We can verify this works with SharpLab:
class Example {
    int Run() {
        return default(Neg<Num<D2,DF>>).Run();
    }
}
// compiles to
    L0000: mov eax, 0xffffffd1
    L0005: ret

As I mentioned before, the compiler consumes the interpreter's bytecode. The following instructions need to be converted:
  • UpdateCell(offset, inc): Adds inc to the cell at offset from the current pointer.
  • UpdatePointer(offset): Modifies the current pointer by offset.
  • ZeroCell(offset): Sets the cell at offset from the current pointer to zero.
  • LoopStart(offset): Jumps forward to offset if the current cell is zero.
  • LoopEnd(offset): Jumps backward to offset if the current cell is nonzero.
  • OutputCell(offset): Writes the cell at offset from the current pointer to the output.
I define an Op interface, and a struct for each instruction. Here's an example:
interface Op {
    int Run(int index, byte[] data, Output output);
}

struct ZeroCell<OFFSET,NEXT> : Op
    where OFFSET: struct, Const
    where NEXT: struct, Op
{
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public int Run(int index, byte[] data, Output output)
    {
        data[index + default(OFFSET).Run()] = 0;
        return default(NEXT).Run(index, data, output);
    }
}
Most operations have a NEXT parameter, which allows us to chain them together. The final operation in a chain sets this to a Stop type, which does nothing.
struct Stop : Op
{
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public int Run(int index, byte[] data, Output output)
    {
        return index;
    }
}
Instead of storing the current pointer in some state object, it is passed around and returned by each operation. My hope is that this will keep it in a register as much as possible, and avoid constantly loading and storing it.
struct UpdatePointer<OFFSET,NEXT> : Op
    where OFFSET: struct, Const
    where NEXT: struct, Op
{
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public int Run(int index, byte[] data, Output output)
    {
        index += default(OFFSET).Run();
        return default(NEXT).Run(index, data, output);
    }
}
The LoopStart and LoopEnd instructions are converted into a single Loop type:
struct Loop<BODY,NEXT> : Op
    where BODY: struct, Op
    where NEXT: struct, Op
{
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public int Run(int index, byte[] data, Output output)
    {
        var body = default(BODY);
        while (data[index] != 0) {
            output.CheckKill();
            index = body.Run(index, data, output);
        }
        return default(NEXT).Run(index, data, output);
    }
}
Both BODY and NEXT are chains of operations. The call to output.CheckKill() tests whether we want to pre-emptively terminate the program, and throws an exception if so.

If you want to see all the operations, you can take a look at the source here.
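To convince yourself the pieces actually compose, here's a self-contained sketch that chains two of the operations shown above and runs them on a small tape. The Output stub is an assumption on my part, since that type lives in the real project; everything else mirrors the structs from this section:

```csharp
using System;
using System.Runtime.CompilerServices;

// Stub: the real Output can kill the program and handles cell output.
class Output { public void CheckKill() { } }

interface Const { int Run(); }
struct D0 : Const { public int Run() => 0; }
struct D1 : Const { public int Run() => 1; }

interface Op { int Run(int index, byte[] data, Output output); }

struct Stop : Op {
    public int Run(int index, byte[] data, Output output) => index;
}

struct ZeroCell<OFFSET, NEXT> : Op
    where OFFSET : struct, Const where NEXT : struct, Op
{
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public int Run(int index, byte[] data, Output output) {
        data[index + default(OFFSET).Run()] = 0;
        return default(NEXT).Run(index, data, output);
    }
}

struct UpdatePointer<OFFSET, NEXT> : Op
    where OFFSET : struct, Const where NEXT : struct, Op
{
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public int Run(int index, byte[] data, Output output) {
        index += default(OFFSET).Run();
        return default(NEXT).Run(index, data, output);
    }
}

class Program {
    static void Main() {
        // Equivalent of the brainfuck "[-]>" : zero the current cell, move right.
        var program = default(ZeroCell<D0, UpdatePointer<D1, Stop>>);
        var data = new byte[] { 42, 7 };
        int finalIndex = program.Run(0, data, new Output());
        Console.WriteLine($"{data[0]} {data[1]} {finalIndex}");  // 0 7 1
    }
}
```

The whole program is one nested generic type, and "executing" it is just calling Run on a default instance.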

SharpLab refuses to dump assembly for anything more than very trivial generics. I don't have good debugger skills, and using a debugger on JIT'ed code is painful. I wrote a simple native module for dumping assembly, but even when it works, it's very difficult to make sense of the output. At this point, I mostly pray that it works and trust the benchmarks.

This is what the Mandelbrot program looks like when converted into a single huge type:
Beautiful.

So, we can convert a brainfuck program into a bunch of very unconventional C# code. But what about compiling at runtime? You were told this worked at runtime! This is where reflection saves the day. S&box exposes a simplified reflection library, but it's enough for our purposes.

Here's how we construct constants:
    private static Type MakeGeneric(Type base_ty, Type[] args) {
        return TypeLibrary.GetType(base_ty).MakeGenericType(args);
    }

    private static Type GetDigit(int n) {
        switch (n) {
            case 0: return typeof(D0);
            case 1: return typeof(D1);
            case 2: return typeof(D2);
            case 3: return typeof(D3);
            case 4: return typeof(D4);
            case 5: return typeof(D5);
            case 6: return typeof(D6);
            case 7: return typeof(D7);
            case 8: return typeof(D8);
            case 9: return typeof(D9);
            case 0xA: return typeof(DA);
            case 0xB: return typeof(DB);
            case 0xC: return typeof(DC);
            case 0xD: return typeof(DD);
            case 0xE: return typeof(DE);
            case 0xF: return typeof(DF);
        }
        throw new Exception("die");
    }

    private static Type GenerateConst(int n) {
        if (n < 0) {
            return MakeGeneric(typeof(Neg<>),[GenerateConst(-n)]);
        }
        if (n < 16) {
            return GetDigit(n);
        } else if (n < 256) {
            return MakeGeneric(typeof(Num<,>),[GetDigit(n>>4),GetDigit(n&0xF)]);
        } else {
            throw new Exception("const too large "+n);
        }
    }
Building instructions is basically the same. You can take a look at the full code if you want.
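Outside the sandbox, plain System.Reflection does the same job without the TypeLibrary wrapper: MakeGenericType is called directly on the open generic type. Here's a runnable sketch of the constant machinery being assembled and invoked at runtime (only the two digit structs needed for this example are included):

```csharp
using System;

interface Const { int Run(); }
struct D2 : Const { public int Run() => 2; }
struct DF : Const { public int Run() => 0xF; }

struct Num<A, B> : Const where A : struct, Const where B : struct, Const {
    public int Run() => default(A).Run() << 4 | default(B).Run();
}
struct Neg<A> : Const where A : struct, Const {
    public int Run() => -default(A).Run();
}

class Program {
    static void Main() {
        // Build the type Neg<Num<D2,DF>> dynamically, the way GenerateConst(-0x2F) would...
        Type ty = typeof(Neg<>).MakeGenericType(
            typeof(Num<,>).MakeGenericType(typeof(D2), typeof(DF)));
        // ...then box a default instance and call it through the interface.
        var c = (Const)Activator.CreateInstance(ty);
        Console.WriteLine(c.Run());  // -47
    }
}
```

The boxed interface call at the root is the only virtual dispatch; everything inside the generated type can still be monomorphized and inlined by the JIT.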

Is this actually useful?

Probably not. There are a ton of better ways to build a compiler. If you want to build one on .NET, and you aren't in a sandboxed jail, you should probably just use System.Reflection.Emit and save yourself most of the headache of hoping that the JIT will make sense of your insane garbage.

So why have you done this?

Because it's funny, and it's an opportunity to show off. I'd like to credit rileyzzz and their very cool project DExTr for some of the inspiration.

What next?

Brainfuck was really good for a prototype, given its limited instruction set and very simple control flow. But I really want to run WebAssembly. This is a much harder challenge. It will probably involve some real control-flow and data-flow analysis. It might not go anywhere. Stay tuned, I guess. Maybe if I'm successful we can see how much slower it is than DExTr.
