ludocode/onramp

Welcome to the INFORMATION SUPERHIGHWAY!!1

Onramp

Onramp is a virtualized implementation of C that can be bootstrapped from scratch on arbitrary hardware.

Starting in machine code, we implement...

The resulting toolchain can (soon) bootstrap a native C compiler (e.g. TinyCC), which can bootstrap GCC, which can compile an entire system.

Only the first two steps are platform-specific. The entire rest of the process operates on a platform-independent bytecode. Onramp bytecode is simple to implement, simple to hand-write, and simple to compile to, making the entire bootstrap process as simple and portable as possible.

The platform independence of Onramp makes present-day C trivially compilable by future archaeologists, alien civilizations, collapse recovery efforts and more. The goal of Onramp is to maintain a timeless and universal bootstrapping path to C.

What is self-bootstrapping?

Most compilers are self-hosting: they are written in the language they compile. C compilers tend to be written in C, so to compile a C compiler, you need to already have a C compiler. This is a chicken-and-egg problem.

Onramp is instead self-bootstrapping: it can compile itself from scratch. Onramp is written in stages and broken up into small discrete tools. Each stage of each tool can be compiled by the stages before it. All stages are plain text, human-readable and heavily documented to make the entire process auditable.

All you need to compile and use Onramp are the initial stages: the hex tool and virtual machine. These can easily be implemented by anyone in anything. Onramp includes implementations in handwritten machine code as well as in high-level languages like Python and C.
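As a rough illustration of how small the hex tool can be, here is a minimal Python sketch that turns commented plain-text hexadecimal into raw bytes. It is not Onramp's actual implementation. The function name `hex_to_bytes` and the choice of `;` and `#` as comment characters are assumptions made for this example; the real .ohx format is defined by the Onramp documentation and may include features this sketch ignores.

```python
def hex_to_bytes(text: str, comment_chars: str = ";#") -> bytes:
    """Convert commented plain-text hexadecimal to raw bytes.

    Hypothetical sketch: the comment characters are an assumption,
    not taken from the Onramp hex specification.
    """
    out = bytearray()
    for lineno, line in enumerate(text.splitlines(), start=1):
        # Drop everything from the first comment character to end of line.
        for ch in comment_chars:
            line = line.split(ch, 1)[0]
        # Each whitespace-separated token must be whole bytes (two digits each).
        for token in line.split():
            if len(token) % 2 != 0:
                raise ValueError(f"line {lineno}: odd number of hex digits in {token!r}")
            try:
                out += bytes.fromhex(token)
            except ValueError:
                raise ValueError(f"line {lineno}: invalid hex digits in {token!r}") from None
    return bytes(out)


if __name__ == "__main__":
    sample = "48 65 6C 6C 6F ; the word Hello\n0A # a newline\n"
    print(hex_to_bytes(sample))
```

A tool of this size is easy to audit by eye, which is the point of starting the bootstrap from it.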

Once you have a VM, Onramp bootstraps itself. Read the full bootstrapping path for details. Onramp follows in the footsteps of the bcompiler and stage-0 bootstrapping projects; see the inspiration page for more.

Why bootstrap?

Security: Compiler binaries can contain malware and backdoors that insert viruses into programs they compile. Malicious code in a compiler can even recognize its own source code and propagate itself. Recompiling a compiler with itself therefore does not eliminate the threat. The only compiler that can truly be trusted is one that you've bootstrapped from scratch.

Preservation: We have a duty to preserve information and media about our culture, our history, and our world for future generations. We need to make it possible for contemporary codecs and compression algorithms to run on hardware we can't even conceive of. The best way to do that is to preserve the code, along with the tools to compile it on anything.

Education: Bootstrapping demonstrates the entire stack from machine code to a high-level language. Students can observe how every step of the process works, and in the case of Onramp, on a simplified machine with simplified tools and languages.

Fun: Modern languages and frameworks have little connection to how hardware really works. Layers of complexity and waste continue to build upon one another, and goals of simplicity and efficiency have been abandoned by the industry. Bootstrapping is a respite from this. Working with the bare metal, writing low-level code, and understanding every part of the machine and the toolchain can reignite our passion for software and bring much-needed joy back into programming.

Under Construction!

Onramp is not yet complete. It can compile Doom, but not much else at the moment. It is missing floating point support and most libc functionality.

A near-term goal is to compile native compilers and tools: TinyCC, cproc+QBE, chibicc/Kefir+binutils, etc. A medium-term goal is to be able to boot a computer directly into a freestanding Onramp VM in order to bootstrap a modern OS kernel from source.

Onramp is an experiment in implementing C completely from scratch on a custom architecture, retaining all bootstrap stages in between. It is essentially three compilers, three preprocessors/assemblers/linkers, a multi-stage libc, a custom instruction set, several virtual machines, a debugger... It will take a long time to complete and there are many directions it can take in the future.

I welcome bug reports, but I am not currently accepting contributions. Feel free to fork this and implement VMs, but be aware that the bytecode format will change. (Also, I don't have good VM tests at the moment, so you might have a lot of trouble debugging your VM.)

Project Status

Current status of Onramp components.

Build Status

Tests

Quick Start

WARNING: The libc is incomplete and there is no support for floating point math. Onramp is not yet ready for real world use.

On POSIX systems, run the build script and put the results on your PATH.

scripts/posix/build.sh
export PATH=$PWD/build/posix/bin:$PATH

That's it! You can now compile C programs with onrampcc.

You'll need this PATH to run programs since they depend on onrampvm. If you'd like to install Onramp in ~/.local/bin instead, run this:

scripts/posix/install.sh

Since Onramp is self-bootstrapping, this works even on a system that does not have a C compiler, binutils, make or any other build tools. Try it on a barebones x86_64 Linux with nothing but coreutils.

See the Setup Guide for more information on how to build Onramp and the Usage Guide for how to use it.

Project Organization

  • core/ - Source code of the platform-independent parts of Onramp. Contains the compiler, linker, driver, libc, etc.
  • platform/ - Source code implementations of the platform-specific components of Onramp for various platforms.
  • scripts/ - Scripts for building and installing Onramp on various platforms.
  • docs/ - Specifications of Onramp's languages and other documentation. Defines the Onramp subsets of C, Onramp Assembly, etc.
  • test/ - Test cases and scripts for testing the various Onramp components.

Documentation Index

Components

Platform-specific:

| Program | Description | Operation |
|---------|-------------|-----------|
| hex | Hex tool | Converts hexadecimal .ohx to raw bytes |
| vm | Virtual machine | Executes .oe bytecode, bridges filesystem when hosted |

Platform-independent:

| Program | Description | Operation |
|---------|-------------|-----------|
| cc | Driver | Performs any or all phases of translation |
| cpp | Preprocessor | Preprocesses .c to .i |
| cci | Compiler | Compiles .i to .os |
| as | Assembler | Assembles .os to .oo object file |
| ar | Archiver | Combines .oo object files into .oa library |
| ld | Linker | Links .oo and .oa into .oe executable |
| libc | Standard library | Provides C and POSIX library functions |
| sh | Shell | Runs scripts |
| os | Operating System | Implements a filesystem and syscalls |
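
The translation phases listed above form a chain from .c to .oe. The Python sketch below shows one way a driver could decide which programs to run for a given input and output type. It only illustrates the table and is not how Onramp's cc is actually implemented. The names `PHASES` and `plan` are hypothetical, and the sketch ignores .oa archives and any driver options.

```python
# Phases in translation order: (program, input extension, output extension).
# The program names and extensions come from the table above; the data
# structure itself is a hypothetical illustration.
PHASES = [
    ("cpp", ".c", ".i"),    # preprocess
    ("cci", ".i", ".os"),   # compile
    ("as", ".os", ".oo"),   # assemble
    ("ld", ".oo", ".oe"),   # link
]


def plan(source_ext: str, target_ext: str = ".oe") -> list[str]:
    """Return the programs needed to get from source_ext to target_ext."""
    steps = []
    ext = source_ext
    for program, src, dst in PHASES:
        if ext == target_ext:
            break  # already have the requested file type
        if ext == src:
            steps.append(program)
            ext = dst
    if ext != target_ext:
        raise ValueError(f"no path from {source_ext} to {target_ext}")
    return steps


if __name__ == "__main__":
    print(plan(".c"))           # full build of a C source file
    print(plan(".os", ".oo"))   # assemble only
```

Because every intermediate file type is plain text apart from the final .oe executable, each step in such a plan can be inspected before the next one runs.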

File Types

| Extension | Description |
|-----------|-------------|
| .ohx | Onramp Hexadecimal, plain-text hexadecimal with comments |
| .oe | Onramp Executable, an Onramp bytecode program in binary |
| .oo | Onramp Object File, plain-text bytecode with labels |
| .oa | Onramp Archive, a static library of .oo files |
| .os | Onramp Assembly, our custom assembly language |
| .i | Preprocessed C source code (no comments, no preprocessor directives) |
| .c | C source code, an Onramp Subset (omC or opC) or a standard version |
| .sh | Onramp Shell, our subset of POSIX shell |
| .od | Onramp Debug Info, the debug symbols for an Onramp executable |

Limitations

The Onramp C compiler targets a simple virtual machine with its own runtime environment and libc. This means it can't link against native libraries, and it can't do graphics, sound, networking, etc.

Onramp is not intended to be a general-purpose native compiler. It is intended (among other things) to bootstrap such a compiler.

The virtual machine therefore implements only those features that would be useful to a compilation environment. These features should nevertheless be sufficient to emulate much of a POSIX-like system, to support some basic coreutils and to run configure scripts and build tools and so on. A goal of Onramp is to be able to compile an entire native system including a kernel from a freestanding Onramp VM.

About

A portable self-bootstrapping C compiler
