First, it may be useful to understand a few things about compilers. Beware that each of these topics could be its own blog post. If something is confusing, it might be a good idea to find a more in-depth resource to explain it. Also beware that I'm a fairly knowledgeable hobbyist, but not an expert, especially on .NET. If you see something that's wrong, feel free to yell at me.
Just in Time compilers, or JIT compilers, are compilers which generate machine code while the program is running. The .NET runtime is an example of this. When C# is initially compiled, it is only compiled to bytecode (called IL in .NET). When the program is executed, the bytecode may be interpreted directly, or it may be compiled to machine code. A JIT may even involve several tiers, where more frequently used code is re-compiled with more optimizations. Other examples of runtimes with JIT compilers are the Java Virtual Machine, LuaJIT, and most JavaScript runtimes. Language implementations which compile straight to machine code at build time are called Ahead of Time, or AOT.
Constant Evaluation, or Constant Folding, is an optimization which involves simplifying constant expressions. Any decent compiler should be able to do the following:
int x = 3 * 3 * 3 * 3 * 5 * 5;
// simplifies to:
int x = 2025;
When the code is executed, it won't have to perform the multiplications. It will simply load 2025 into x.
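One way to see folding happen in C# itself is a const declaration. A const initializer has to be a compile-time constant, so the C# compiler must evaluate the whole product before the program ever runs. This is just a small sketch; the class and method names are made up for illustration.

```csharp
using System;

static class ConstantFoldingDemo
{
    // A const initializer must be a compile-time constant, so the C# compiler
    // has to evaluate this product itself; it compiles as if we had written 2025.
    public const int Folded = 3 * 3 * 3 * 3 * 5 * 5;

    // Here the value depends on a run-time input, so it cannot be a const.
    // The multiplications stay in the code unless a later optimization removes them.
    public static int NotConstant(int three) => three * three * three * three * 5 * 5;

    static void Main()
    {
        Console.WriteLine(Folded);
        Console.WriteLine(NotConstant(3));
    }
}
```

Both lines print the same number. The difference is when the arithmetic gets done.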
Inlining is another very useful optimization, which replaces a call to a function with the body of the function. For example:
int Add(int x, int y) {
    return x + y;
}
int Add3(int x, int y, int z) {
    return Add(Add(x, y), z);
}
// the second function can be optimized to:
int Add3(int x, int y, int z) {
    return x + y + z;
}
The real power of inlining isn't just that it saves the cost of a call; it's that it can enable other optimizations. A function can be inlined, then constants in the resulting code can be folded:
int Five() {
    return 5;
}
int Fifteen() {
    return Five() + Five() + Five();
}
// the second function can be optimized to:
int Fifteen() {
    return 15;
}
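In .NET you can nudge the JIT's inlining decisions with the MethodImpl attribute. The sketch below uses the real MethodImplOptions.AggressiveInlining and MethodImplOptions.NoInlining flags; the class and method names are my own. Keep in mind that AggressiveInlining is only a hint, so the JIT is still free to ignore it.

```csharp
using System;
using System.Runtime.CompilerServices;

static class InliningDemo
{
    // A hint asking the JIT to inline this method where it is called.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static int Five() => 5;

    // Forbids inlining, so every use stays a real call.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int FiveNotInlined() => 5;

    // If the three calls are inlined, the body becomes 5 + 5 + 5,
    // which constant folding can then reduce to 15.
    public static int Fifteen() => Five() + Five() + Five();

    // Same result, but the JIT has to keep the three calls and two additions.
    public static int FifteenWithCalls() =>
        FiveNotInlined() + FiveNotInlined() + FiveNotInlined();

    static void Main()
    {
        Console.WriteLine(Fifteen());
        Console.WriteLine(FifteenWithCalls());
    }
}
```

Both methods return the same value. Only the generated machine code differs, which you would have to check with a disassembler or a benchmark.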
Monomorphization is a slightly more niche topic, but it is also very important. Monomorphization is the process of generating separate code for generic functions, depending on the type parameters. Consider the following function:
static T Min<T>(T a, T b) where T: IComparable {
    if (a.CompareTo(b) < 0) {
        return a;
    } else {
        return b;
    }
}
If monomorphization is used, then calling this function with ints, floats, and strings would generate separate code for each type. In each specialized copy the concrete type is known, so the call to CompareTo can be devirtualized. Much like inlining, this can be used to enable other optimizations: instead of just devirtualizing the call, the compiler could inline it.
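To make that concrete, here is a rough sketch of the generic Min next to a hand-written version of what a specialized copy for int could boil down to once CompareTo has been devirtualized and inlined. MinInt is purely illustrative and written by hand; it is not actual compiler output.

```csharp
using System;

static class MonomorphizationDemo
{
    // Generic version: one definition in the source, usable with any comparable T.
    public static T Min<T>(T a, T b) where T : IComparable
    {
        if (a.CompareTo(b) < 0) {
            return a;
        } else {
            return b;
        }
    }

    // Hand-written stand-in for a monomorphized int copy: with T known to be int,
    // the CompareTo call can be replaced by a plain integer comparison.
    public static int MinInt(int a, int b)
    {
        return a < b ? a : b;
    }

    static void Main()
    {
        Console.WriteLine(Min(3, 7));
        Console.WriteLine(Min(2.5f, 1.5f));
        Console.WriteLine(Min("pear", "apple"));
        Console.WriteLine(MinInt(3, 7));
    }
}
```

The generic and hand-specialized versions agree on every int input. The specialized one just gives the optimizer much less work to do.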
Monomorphization is a more popular topic in AOT compiled languages like C++ and Rust, which will monomorphize code almost any time you use generics. The downside of monomorphization is that it increases the amount of code generated, and increases compile times. The .NET runtime and JVM are less eager to monomorphize code, preferring to generate polymorphic code with virtual calls. (.NET does generate specialized code when a type parameter is a value type such as int, but it shares one copy of the code across all reference types.)
Reflection isn't really a compiler topic, but it's another key part of this compiler. Reflection is a feature of some managed language runtimes, like .NET and the JVM. It allows you to deal with types dynamically at runtime. Reflection could allow us to load an assembly at runtime, search it for classes implementing some interface, then create an instance of each of those classes. S&box doesn't give us access to the entire .NET reflection library, but it provides some wrappers for some useful features.
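Here is a minimal sketch of that "find every class implementing an interface and instantiate it" idea, using the standard .NET reflection API rather than S&box's wrappers (which I'm not showing here). IGreeter and the two greeter classes are invented for the example, and the code assumes every implementation has a public parameterless constructor.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical interface and implementations, used only for this example.
public interface IGreeter { string Greet(); }
public class EnglishGreeter : IGreeter { public string Greet() => "Hello"; }
public class FrenchGreeter : IGreeter { public string Greet() => "Bonjour"; }

public static class ReflectionDemo
{
    // Scans an assembly for concrete classes implementing IGreeter
    // and creates one instance of each, ordered by type name.
    public static IGreeter[] CreateAllGreeters(Assembly assembly) =>
        assembly.GetTypes()
            .Where(t => t.IsClass && !t.IsAbstract && typeof(IGreeter).IsAssignableFrom(t))
            .OrderBy(t => t.Name)
            .Select(t => (IGreeter)Activator.CreateInstance(t))
            .ToArray();

    static void Main()
    {
        // The greeters are discovered at runtime; nothing here names them directly.
        foreach (var greeter in CreateAllGreeters(typeof(IGreeter).Assembly))
            Console.WriteLine(greeter.Greet());
    }
}
```

Nothing in Main refers to EnglishGreeter or FrenchGreeter by name, so adding a third implementation would be picked up without touching the loop.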