Compilers

Notes

  • A compiler is a program that takes human-readable source code, analyses it, and produces machine-readable binary code.

  • The processing steps involved in compiling are:

    • Preprocessing: read the input file and replace specific macros, symbols and directives with their definitions;

    • Lexical analysis (or tokenization): read the input character by character and group the characters into words, identifiers, symbols, and so on; these groups are called tokens;

    • Parsing: take the tokens, check whether they match certain patterns (usually defined by a grammar), then associate those patterns with constructs such as function calls, variable references, or arithmetic operations; the output of the parser is the abstract syntax tree (AST);

    • Optimisation: take the AST and try to evaluate constant expressions, remove unused variables or unreachable code, unroll loops if possible, etc.;

    • Code generation: take the AST and emit the equivalent assembly or machine code, usually into an object file (or emit source code in another high-level language in the case of a transpiler);

    • Linking: take all the object files and combine them into an executable or a shared library (a static library is usually produced by a separate archiver tool rather than by the linker itself).
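The lexical analysis and parsing steps above can be sketched for a tiny arithmetic language. This is only an illustration, not how any particular compiler does it: the token names, the grammar, and the nested-tuple AST shape are all assumptions made for this example.

```python
import re

# Hypothetical token specification for a tiny arithmetic language.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/()]"),
    ("SKIP", r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Lexical analysis: group characters into (kind, text) tokens."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":  # whitespace separates tokens but is not one
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


def parse(tokens):
    """Parsing: recursive descent over the (assumed) grammar
        expr   := term (("+" | "-") term)*
        term   := factor (("*" | "/") factor)*
        factor := NUMBER | IDENT | "(" expr ")"
    Returns a nested-tuple AST such as ("+", ("num", 1), ("var", "x"))."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("EOF", "")

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def factor():
        kind, text = advance()
        if kind == "NUMBER":
            return ("num", int(text))
        if kind == "IDENT":
            return ("var", text)
        if text == "(":
            node = expr()
            if advance()[1] != ")":
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {text!r}")

    def term():
        node = factor()
        while peek()[1] in ("*", "/"):
            op = advance()[1]
            node = (op, node, factor())
        return node

    def expr():
        node = term()
        while peek()[1] in ("+", "-"):
            op = advance()[1]
            node = (op, node, term())
        return node

    tree = expr()
    if peek()[0] != "EOF":
        raise SyntaxError(f"unexpected token {peek()[1]!r}")
    return tree
```

Calling parse(tokenize("1 + x * 2")) yields a tree in which the multiplication sits below the addition, because the grammar gives "*" a higher precedence than "+".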

  • The front end is responsible for taking in the source code and turning it into an intermediate representation.

  • The middle end is responsible for performing various optimisations on the intermediate representation.

    • these optimisations are independent of the target platform, so they benefit the generated code regardless of which back end is used

    • some examples of optimisations are constant folding, reachability analysis and dead code elimination
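Two of the optimisations named above can be sketched on a toy intermediate representation. The tuple-based IR, the statement shapes ("assign" and "return"), and the function names are assumptions made for this example; real middle ends work on much richer representations. Division is deliberately left out of the folding table so the sketch does not have to deal with division by zero.

```python
import operator

# Hypothetical tuple-based IR: ("num", 3), ("var", "x"), or (op, left, right).
FOLDABLE = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def fold_constants(node):
    """Constant folding: evaluate operators whose operands are both literals."""
    if node[0] in ("num", "var"):
        return node
    op, left, right = node
    left, right = fold_constants(left), fold_constants(right)
    if op in FOLDABLE and left[0] == "num" and right[0] == "num":
        return ("num", FOLDABLE[op](left[1], right[1]))
    return (op, left, right)


def remove_unreachable(statements):
    """Dead code elimination in its simplest form: within one straight-line
    block, statements after a "return" can never run, so drop them."""
    kept = []
    for statement in statements:
        kept.append(statement)
        if statement[0] == "return":
            break
    return kept
```

For instance, fold_constants applied to the tree for 2 + 3 * 4 collapses it to a single ("num", 14) node, while a subtree that contains a variable is kept and only its constant parts are folded.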

  • The back end is responsible for taking the optimised intermediate representation and generating machine code for a specific CPU architecture, or bytecode for a virtual machine.
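As a rough illustration of the bytecode case, a back end for a stack machine can walk the tree in post-order and emit one instruction per node. The instruction names (PUSH, LOAD, ADD, SUB, MUL), the tuple-based IR, and the small interpreter standing in for the target machine are all assumptions made for this sketch.

```python
import operator

# Hypothetical instruction set: each operator maps to (opcode name, implementation).
BINARY_OPS = {
    "+": ("ADD", operator.add),
    "-": ("SUB", operator.sub),
    "*": ("MUL", operator.mul),
}


def generate(node):
    """Code generation: post-order walk emitting stack-machine instructions."""
    kind = node[0]
    if kind == "num":
        return [("PUSH", node[1])]
    if kind == "var":
        return [("LOAD", node[1])]
    op, left, right = node
    # Operands first, then the operator that consumes them from the stack.
    return generate(left) + generate(right) + [(BINARY_OPS[op][0],)]


def run(code, env):
    """Toy interpreter standing in for the target machine."""
    handlers = {name: fn for name, fn in BINARY_OPS.values()}
    stack = []
    for instruction in code:
        opcode = instruction[0]
        if opcode == "PUSH":
            stack.append(instruction[1])
        elif opcode == "LOAD":
            stack.append(env[instruction[1]])
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append(handlers[opcode](left, right))
    return stack.pop()
```

Generating code for the tree of 1 + x * 2 gives PUSH 1, LOAD x, PUSH 2, MUL, ADD, and running that sequence with x bound to 5 leaves 11 on the stack.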

Resources

Articles

Books

Courses

GitHub repositories

Websites

YouTube Playlists

YouTube Videos
