As I mentioned before, the compiler consumes the interpreter's bytecode. The following instructions need to be converted:
UpdateCell(offset, inc): Adds inc to the cell at offset from the current pointer.
UpdatePointer(offset): Modifies the current pointer by offset.
ZeroCell(offset): Sets the cell at offset from the current pointer to zero.
LoopStart(offset): Jumps forward to offset if the current cell is zero.
LoopEnd(offset): Jumps backward to offset if the current cell is nonzero.
OutputCell(offset): Writes the cell to output at offset from the current pointer.
I define an Operation interface, and a struct for each instruction. Here's an example:
interface Op {
int Run(int index, byte[] data, Output output);
}
struct ZeroCell<OFFSET,NEXT> : Op
where OFFSET: struct, Const
where NEXT: struct, Op
{
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public int Run(int index, byte[] data, Output output)
{
data[index + default(OFFSET).Run()] = 0;
return default(NEXT).Run(index, data, output);
}
}
Most operations have a NEXT parameter, which allows us to chain them together. The final operation in a chain sets this to a Stop type, which does nothing.
struct Stop : Op
{
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public int Run(int index, byte[] data, Output output)
{
return index;
}
}
Instead of storing the current pointer in some state object, it is passed around and returned by each operation. My hope is that this will keep it in a register as much as possible, and avoid constantly loading and storing it.
struct UpdatePointer<OFFSET,NEXT> : Op
where OFFSET: struct, Const
where NEXT: struct, Op
{
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public int Run(int index, byte[] data, Output output)
{
index += default(OFFSET).Run();
return default(NEXT).Run(index, data, output);
}
}
The LoopStart and LoopEnd instructions are converted into a single Loop type:
struct Loop<BODY,NEXT> : Op
where BODY: struct, Op
where NEXT: struct, Op
{
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public int Run(int index, byte[] data, Output output)
{
var body = default(BODY);
while (data[index] != 0) {
output.CheckKill();
index = body.Run(index, data, output);
}
return default(NEXT).Run(index, data, output);
}
}
Both BODY and NEXT are Operation chains. The call to output.CheckKill() tests whether we want to pre-emptively terminate the program, and throws an exception if we do.
If you want to see all the operations, you can take a look at the source here.
SharpLab refuses to dump assembly for anything more than very trivial generics. I don't have good debugger skills, and using a debugger on JIT'ed is painful. I wrote a simple native module for dumping assembly, but even when it works, it's very difficult to make sense of the output. I mostly pray that it works and trust the benchmarks at this point.
This is what the Mandelbrot program looks like when converted into a single huge type: Beautiful.