8-bit CPU: Generating machine code

I developed some demo programs using an assembly-like constructs in Python. All is nice and it utilizes real hardware for calculations. Still that is just Python code running on PC and sending control signals to my registers, ALU and memory. Eventually, I'll have to run the code on the 8-bit computer autonomously. For that I need to convert it to sequence of binary instruction opcodes and arguments - machine code. I've also tried to hand-assemble some simple programs, but it feels quite tedious and error-prone.

I did not want to write an assembler utility, decided that I might try to convert the existing demo programs into machine code instead.

Compiling Python into machine code may appear to be even more complicated, but it's not really Python I need to translate. The demo programs are already programmed in assembly-like manner. For example, there's an "instruction":

    ldi (A, 42)

What it really does is up to the backend code. Currently it sends a sequence of commands to Arduino, instructing it to toggle control wires. I can write another backend that writes byte sequence 01 2A into file instead.

That would be enough for simple programs without any branching, but what about something more complex, involving conditionals and loops? Python does not have labels nor goto statements, so I emulate loops and conditionals using while loops and if statements. That does not translate to assembly at all.

Sample code

Well, yes and no. It can not be used as-is, but if we apply some transformations to the code, it works out. Let's take a simple program, that loops through a memory area of 32 bytes and outputs all bytes that are non-zero:

    # initialize i = 0, store it
    ldi (A, 0)
    st (i, A)

    while True:
        # each iteration begins with A holding value of i

        # calculate address of data[i]
        ldi (B, data)   # load address of data into B
        add (B, A)      # add offset

        # load value from address=B into A
        # this also updates flags
        ldabs (A, B)

        # output if flag Z is not set (value not zero)
        if not beq():
            out(A)

        # A was changed by ldabs(), load back index
        ld (A, i)

        # increment it (current ISA is a bit limited)
        ldi (B, 1)
        add (A, B)

        # store i for later
        st (i, A)

        # check if a != 32
        ldi (B, 32)
        cmp (A, B)

        # and jump back to start
        if bne(): continue
        break # or exit loop

    hlt()
 

There's a conditional block and a loop. How can we translate those into jumps?

Conditional blocks

If block is quite simple: you should just jump past it. How to do that? Let's add a variable for label and transform the if statement:

        skip_output = Label()

        # output if flag Z is not set (value not zero)
        beq(skip_output)
        out(A)

        skip_output.here()

The 'if not' is removed, beq() function gets an argument, and there's a mysterious .here() method call. What's going on?

In Python you have to initialize variable before using it, so we are creating an instance of Label()  earlier. Argument for the beq() indicates the target of conditional jump. But to what location exactly? The code-generating backend keeps track of the current memory address and resolves the contents of label when .here() is called.

Loops

What about loops? I do not support any conditions on 'while' statement (all loops are 'while True:'), iterations should be controlled by 'if' statements, containing conditional jump "instructions". As a result there are only few cases to consider:

  • start the next iteration - conditional jump to label at the beginning of loop
  • exit loop - conditional jump to label after the loop
  • reach end of the loop, and there's break statement - do nothing, fallthrough
  • reach end of the loop, and there's no break - unconditional jump to label at the beginning

Jumps to arbitrary locations in the loop are not allowed (I suppose that could be an exotic optimization in assembly, but not achievable in Python anyway), just to the beginning of the loop or to a location past it. Let's transform the "essence" of the loop:

    loop_start = Label()
    loop_end = Label()

    # while True:
    loop_start.here()

    # body of the loop

    # and jump back to start
    bne(loop_start)
    #break # or exit loop

    loop_end.here()

The loop_end label is unused here, but it does not do any harm to insert it anyway. The last statement in the loop was break, it is a fall-through case. Note that code is "flattened" out - there's no indentation of the block of while True: anymore.

Transforming

That was theory, but these transformations feels complicated. One have to search for ifs and whiles, insert lines, add function arguments, unindent blocks. Certainly a lot more work than a simple parser for assembly syntax. Unless...

That's where Python AST module comes in. It parses the code, and its NodeTransformer handles all the heavy lifting. Only a small amount of code is needed to recognize various cases and apply transformations. Once transformed, the result can be dumped it back into .py file.

Transformed sample program looks like this:

    l_2_skip = Label()
    l_1_loop = Label()
    l_1_end = Label()
    ldi(A, 0)
    st(i, A)
    l_1_loop.here()
    ldi(B, data)
    add(B, A)
    ldabs(A, B)
    beq(l_2_skip)
    out(A)
    l_2_skip.here()
    ld(A, i)
    ldi(B, 1)
    add(A, B)
    st(i, A)
    ldi(B, 32)
    cmp(A, B)
    bne(l_1_loop)
    l_1_end.here()
    hlt()

All the pretty formatting and comments are lost, but it still contains all the original ldi(), add(), etc. "instructions". Loops and conditionals are substituted with references to Label() objects in a way I described earlier.

When this code is executed against code-generating backend, it supplies all the necessary information for machine code generation. What remains is just to match instructions against opcodes, resolve addresses and write everything to a file.

Conclusions

The "compilation" procedure described may seem a bit odd and error-prone. There's no denying that, but it was worth to try. It works, after all! The resulting binary appears to run fine on the CPU in its current state.

Will I continue to develop this way? Probably not, it feels bit too "hacky" for me. I think I'll either try to write a proper assembler utility or port an existing one for my architecture. Spoiler from future: I've found out that CustomAsm completely covers my needs. Also, it was really easy to adopt.

Comments

Popular posts from this blog

8-bit CPU: ALU and Flags

8-bit CPU: Clock module

8-bit CPU: Memory