Bootstrapping (compilers)

This article is about bootstrapping compilers. For the general concept, see Bootstrapping.

In computer science, bootstrapping is the process of writing a compiler (or assembler) in the source programming language that it intends to compile. Applying this technique leads to a self-hosting compiler. An initial minimal core version of the compiler is generated in a different language (which could be assembly language); from that point, successive expanded versions of the compiler are run using the minimal core of the language.

Many compilers for many programming languages are bootstrapped, including compilers for BASIC, Algol, C, D, Pascal, PL/I, Factor, Haskell, Modula-2, Oberon, OCaml, Common Lisp, Scheme, Go, Java, Rust, Python, Scala, Nim, Eiffel, and more.

Advantages

Bootstrapping a compiler has the following advantages:[1][2]

The chicken and egg problem

If one needs to obtain a compiler for language X (which is written in language X), there is the issue of how the first compiler can be written. The different methods that are used in practice to solving this chicken or the egg problem include:

Methods for distributing compilers in source code include providing a portable bytecode version of the compiler, so as to bootstrap the process of compiling the compiler with itself. Such methods are also one way of detecting or avoiding (or both) the potential problem pointed out in Reflections on Trusting Trust. The T-diagram is a notation used to explain these compiler bootstrap techniques.[2] In some cases, the most convenient way to get a complicated compiler running on a system that has little or no software on it involves a series of ever more sophisticated assemblers and compilers.[3]

History

Assemblers were the first language tools to bootstrap themselves.

The first high level language to provide such a bootstrap was NELIAC in 1958. The first widely used languages to do so were Burroughs B5000 Algol in 1961 and LISP in 1962.

Hart and Levin wrote a LISP compiler in LISP at MIT in 1962, testing it inside an existing LISP interpreter. Once they had improved the compiler to the point where it could compile its own source code, it was self-hosting.[4]

The compiler as it exists on the standard compiler tape is a machine language program that was obtained by having the S-expression definition of the compiler work on itself through the interpreter.
AI Memo 39[4]

This technique is only possible when an interpreter already exists for the very same language that is to be compiled. It borrows directly from the notion of running a program on itself as input, which is also used in various proofs in theoretical computer science, such as the proof that the halting problem is undecidable.

List of languages having self-hosting compilers

The following programming languages have self-hosting compilers:

See also

References

  1. Compilers and Compiler Generators: An Introduction With C++. Patrick D. Terry 1997. International Thomson Computer Press. ISBN 1-85032-298-8
  2. 1 2 "Compiler Construction and Bootstrapping" by P.D.Terry 2000. HTML. PDF Archived December 14, 2010, at the Wayback Machine..
  3. "Bootstrapping a simple compiler from nothing" Archived March 3, 2010, at the Wayback Machine. by Edmund GRIMLEY EVANS 2001
  4. 1 2 Tim Hart and Mike Levin. "AI Memo 39-The new compiler" (PDF). Retrieved 2008-05-23.
  5. https://code.google.com/p/virgil/
This article is issued from Wikipedia - version of the 11/18/2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.