An assembler for the Bolverk machine emulator

After I finished writing the Bolverk emulator (source, implementation) in early 2009, I immediately had ideas about designing an assembly language for its native instruction set. It sounded like an interesting project.

After reading the first few hundred pages of Michael L. Scott's brilliant book "Programming Language Pragmatics" during February of this year, I felt that I knew enough about writing simple compilers (most assemblers are not really considered compilers, but shhhh) to get started. In fact, the whole process turned out to be far more challenging and rewarding than I had expected.

An assembler provides a level of abstraction above the native language of the machine at hand. Most basic assemblers provide a one-to-one mapping between named instructions and native commands (ultimately ones and zeroes). Many modern assemblers also support more advanced features such as macro expansion and control structures built on top of basic jumps.
With an assembler, we can express instructions to Bolverk in at least slightly more memorable terms. This is why the primitive operations defined in an assembly language are often referred to as "mnemonics".

Another great advantage of an assembler is its ability to define abstractions that go beyond the architectural restrictions of its target machine. In particular, the Bolverk machine specifies that only 4 bits are used to identify an instruction. This leaves us with only sixteen (0..F) possible instructions. If chosen well, these sixteen native instructions are more than enough to define useful abstractions by combining and naming them. As an example, the PVDS mnemonic used in the example below actually compiles to three lines of machine code.
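To make the 4-bit limit concrete, here is a minimal Python sketch that splits an instruction word into its opcode and operand bits. It assumes that an instruction is a 16-bit word and that the opcode sits in the most significant nibble. That layout is only inferred from the four-hex-digit words in the listing in this post, not taken from the Bolverk specification, and the name `split_word` is made up for illustration.

```python
from typing import Tuple


def split_word(word: int) -> Tuple[int, int]:
    """Split a 16-bit instruction word into (opcode, operands).

    Assumption: the opcode is the most significant nibble of the word.
    """
    if not 0 <= word <= 0xFFFF:
        raise ValueError("expected a 16-bit word")
    opcode = word >> 12        # top 4 bits: one of sixteen possible values
    operands = word & 0x0FFF   # remaining 12 bits carry the operands
    return opcode, operands


opcode, operands = split_word(0x21F6)
print(f"opcode={opcode:x} operands={operands:03x}")  # opcode=2 operands=1f6
```

Whatever the real layout is, a 4-bit field can only ever take the values 0 through F, which is where the limit of sixteen native instructions comes from.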

With the finished product, we can now write something like this:

-- Print the sum of two decimal numbers (-10 and 120)
-- followed by an exclamation mark.
VALL 1, -10
VALL 2, 120
PVDS 1, 2
PVCH '!'

Instead of the very-hard-to-remember machine code equivalent:

21f6
2278
2421
5123
33a0
34a1
d1a0
d0a1
c000

The assembly version required only four instructions, as opposed to the nine required in the machine code version. It is also presented in a much more readable manner.
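Comparing the two listings, the words 21f6 and 2278 look like direct translations of the two VALL lines: opcode 2, then the register number, then the value as an 8-bit two's complement byte (-10 becomes f6, 120 becomes 78). The Python sketch below encodes VALL under that assumption. The layout is my reading of the listing rather than something taken from the assembler's source, and `encode_vall` is a hypothetical helper name.

```python
def encode_vall(register: int, value: int) -> str:
    """Encode `VALL register, value` as a four-hex-digit word.

    Assumed layout (inferred from the listing, not from the spec):
    opcode 2, one register nibble, then the value as an 8-bit byte.
    """
    if not 0 <= register <= 0xF:
        raise ValueError("register must fit in 4 bits")
    if not -128 <= value <= 255:
        raise ValueError("value must fit in 8 bits")
    # Masking with 0xFF yields the two's complement byte for negative values.
    word = (0x2 << 12) | (register << 8) | (value & 0xFF)
    return f"{word:04x}"


print(encode_vall(1, -10))  # 21f6
print(encode_vall(2, 120))  # 2278
```

Under the same assumption, `encode_vall(4, ord('!'))` gives 2421, which also appears in the listing. That suggests PVCH first loads the character into a register before printing it, though this is again only an inference.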

The three mnemonics used above have the following semantics:

  • VALL: Load into register X the value Y
  • PVDS: Print to STDOUT the sum of register X and register Y
  • PVCH: Print to STDOUT the value of X as though it were an ASCII character

Once we know these, the assembly version becomes very easy to understand.
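These three definitions are enough to trace the example by hand. As a rough illustration, here is a toy Python interpreter for just these mnemonics. It is not how the real assembler works (that one emits machine code for the emulator). The function name `run`, the comment handling and the output format with no separators are assumptions of this sketch, and it ignores 8-bit overflow.

```python
def run(source: str) -> str:
    """Interpret VALL, PVDS and PVCH directly and return what would be printed."""
    registers = {}
    output = []
    for raw in source.splitlines():
        line = raw.split("--", 1)[0].strip()  # drop "--" comments (naive)
        if not line:
            continue
        mnemonic, _, rest = line.partition(" ")
        args = [a.strip() for a in rest.split(",")]
        if mnemonic == "VALL":    # load into register X the value Y
            registers[int(args[0])] = int(args[1])
        elif mnemonic == "PVDS":  # print the sum of register X and register Y
            total = registers[int(args[0])] + registers[int(args[1])]
            output.append(str(total))
        elif mnemonic == "PVCH":  # print X as an ASCII character
            arg = args[0]
            if len(arg) == 3 and arg[0] == arg[2] == "'":
                output.append(arg[1])         # quoted literal such as '!'
            else:
                output.append(chr(int(arg)))  # numeric character code
        else:
            raise ValueError(f"unknown mnemonic: {mnemonic}")
    return "".join(output)


PROGRAM = """\
-- Print the sum of two decimal numbers (-10 and 120)
-- followed by an exclamation mark.
VALL 1, -10
VALL 2, 120
PVDS 1, 2
PVCH '!'
"""

print(run(PROGRAM))  # 110!
```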

The source code is openly available from my git repository here: http://github.com/buntine/Bolverk-Assembler. I will add a front-end for it into the web-based implementation of Bolverk (linked above) in the coming days. I've also published the grammar and a language specification in the same git repository.

I feel that my work on this set of projects could be genuinely useful for teaching the ideas behind a von Neumann machine to beginning computer science students. If found by the right person in the right position, I think they could do wonders with it. If you are that person, please send me an email or some feedback! I'd happily donate all of my work to your organisation if it can be put to good use.