Brian Robert Callahan

academic, developer, with an eye towards a brighter techno-social life

2021-04-14
Demystifying programs that create programs, part 8: Finishing opcode processing

All source code for this blog post can be found here.

One last day of teaching our assembler about Intel 8080 opcodes. For the last time, let's pull up our Intel 8080 opcode table. Today we will tackle the first quarter of the table.

Instructions in the first quarter of the opcode table

Eight of the instructions are nops, which our assembler already knows. What remains are logical shifts and rotates, incrementers and decrementers, loads and stores, and moving immediates into registers. Let's get to it.

Logical shifts and rotates

There are eight: rlc, rrc, ral, rar, daa, cma, stc, and cmc. They're not all shifts and rotates, but they are organized together. All of them take no arguments and are one byte in size. At this point, I think we know how to code them up:

/**
 * rlc (0x07)
 */
private void rlc()
{
    argcheck(a1.empty && a2.empty);
    passAct(1, 0x07);
}

/**
 * rrc (0x0f)
 */
private void rrc()
{
    argcheck(a1.empty && a2.empty);
    passAct(1, 0x0f);
}

/**
 * ral (0x17)
 */
private void ral()
{
    argcheck(a1.empty && a2.empty);
    passAct(1, 0x17);
}

/**
 * rar (0x1f)
 */
private void rar()
{
    argcheck(a1.empty && a2.empty);
    passAct(1, 0x1f);
}

/**
 * daa (0x27)
 */
private void daa()
{
    argcheck(a1.empty && a2.empty);
    passAct(1, 0x27);
}

/**
 * cma (0x2f)
 */
private void cma()
{
    argcheck(a1.empty && a2.empty);
    passAct(1, 0x2f);
}

/**
 * stc (0x37)
 */
private void stc()
{
    argcheck(a1.empty && a2.empty);
    passAct(1, 0x37);
}

/**
 * cmc (0x3f)
 */
private void cmc()
{
    argcheck(a1.empty && a2.empty);
    passAct(1, 0x3f);
}
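
As an aside, the eight opcodes above are spaced exactly eight apart, starting at 0x07. Here is a minimal sketch, not part of the assembler, that uses that regularity to compute the opcode from a mnemonic's position in a list. The names noArgNames and noArgOpcode are hypothetical, made up for this illustration.

```d
import std.stdio;

/// The eight no-argument instructions from this section, in opcode order.
/// (Hypothetical name; not part of the assembler.)
immutable string[] noArgNames =
    ["rlc", "rrc", "ral", "rar", "daa", "cma", "stc", "cmc"];

/**
 * Return the opcode for one of the eight instructions above,
 * or -1 if the mnemonic is not in the list.
 */
int noArgOpcode(string op)
{
    foreach (i, name; noArgNames) {
        if (op == name)
            return 0x07 + cast(int)(i * 8);
    }

    return -1;
}

void main()
{
    /* Print each mnemonic next to its computed opcode.  */
    foreach (name; noArgNames)
        writefln("%s\t0x%02x", name, noArgOpcode(name));
}
```

The assembler itself sticks with one small function per instruction, which keeps every opcode easy to find when reading the source next to the opcode table.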

Incrementers and decrementers

Next up are the incrementers and decrementers: inx, inr, dcr, dad, and dcx. With the exception of dad, if it ends with an x it is a 16-bit operation and if it ends with an r it is an 8-bit operation. dad happens to be 16-bit. All are one byte in size and follow the usual register formula.

A small tweak to regMod16

If you compare the regMod16 function to the inx, dad, and dcx instructions, you will notice that these instructions use an sp register rather than the psw register that regMod16, and indeed the push and pop instructions, refer to. It turns out these two groups of instructions accept different, only partially overlapping, sets of registers. The good news is that sp and psw encode to the same offset, 0x30. What we need to do, then, is accept both in our register list and make sure neither is used with the wrong instruction. That looks like this:

/**
 * Return the 16 bit register offset.
 */
private int regMod16()
{
    if (a1 == "b") {
        return 0x00;
    } else if (a1 == "d") {
        return 0x10;
    } else if (a1 == "h") {
        return 0x20;
    } else if (a1 == "psw") {
        if (op == "pop" || op == "push")
            return 0x30;
        else
            err("psw may not be used with " ~ op);
    } else if (a1 == "sp") {
        if (op != "pop" && op != "push")
            return 0x30;
        else
            err("sp may not be used with " ~ op);
    } else {
        err("invalid register for " ~ op);
    }

    /* This will never be reached, but quiets gdc.  */
    return 0;
}
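
To see the sp and psw rule in isolation, here is a small stand-alone sketch. It assumes a hypothetical variant named regOffset16 that takes the mnemonic and register as parameters and returns -1 for a rejected combination instead of calling err. The base opcodes 0x03 (inx) and 0xc5 (push) are the ones used elsewhere in this series.

```d
import std.stdio;

/**
 * Hypothetical stand-alone variant of regMod16.
 * Returns the 16 bit register offset, or -1 if the register
 * is not valid for the given mnemonic.
 */
int regOffset16(string op, string reg)
{
    /* Only push and pop accept psw; everything else accepts sp.  */
    immutable stackOp = (op == "push" || op == "pop");

    switch (reg) {
    case "b":
        return 0x00;
    case "d":
        return 0x10;
    case "h":
        return 0x20;
    case "psw":
        return stackOp ? 0x30 : -1;
    case "sp":
        return stackOp ? -1 : 0x30;
    default:
        return -1;
    }
}

void main()
{
    /* inx sp is 0x03 + 0x30; push psw is 0xc5 + 0x30.  */
    writefln("inx sp   -> 0x%02x", 0x03 + regOffset16("inx", "sp"));
    writefln("push psw -> 0x%02x", 0xc5 + regOffset16("push", "psw"));

    /* The mismatched combinations are rejected.  */
    writeln("push sp  -> ", regOffset16("push", "sp"));
    writeln("inx psw  -> ", regOffset16("inx", "psw"));
}
```

Both accepted cases add the same 0x30 offset, giving 0x33 and 0xf5. Only the mnemonic decides which register name is legal.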

Coding the incrementers and decrementers

Let's code the 16-bit incrementers and decrementers up first now that we improved the regMod16 function:

/**
 * inx (0x03 + 16 bit register offset)
 */
private void inx()
{
    argcheck(!a1.empty && a2.empty);
    passAct(1, 0x03 + regMod16());
}

/**
 * dad (0x09 + 16 bit register offset)
 */
private void dad()
{
    argcheck(!a1.empty && a2.empty);
    passAct(1, 0x09 + regMod16());
}

/**
 * dcx (0x0b + 16 bit register offset)
 */
private void dcx()
{
    argcheck(!a1.empty && a2.empty);
    passAct(1, 0x0b + regMod16());
}

And now the 8-bit incrementers and decrementers:

/**
 * inr (0x04 + (8 bit register offset << 3))
 */
private void inr()
{
    argcheck(!a1.empty && a2.empty);
    passAct(1, 0x04 + (regMod8(a1) << 3));
}

/**
 * dcr (0x05 + (8 bit register offset << 3))
 */
private void dcr()
{
    argcheck(!a1.empty && a2.empty);
    passAct(1, 0x05 + (regMod8(a1) << 3));
}
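
If the shift in the 8-bit formula feels opaque, a worked table may help. The sketch below assumes a hypothetical helper named regCode8 that mirrors the register numbering in regMod8 (b is 0 through a is 7), and prints the opcode that 0x04 + (code << 3) and 0x05 + (code << 3) produce for each register.

```d
import std.stdio;

/// Register names in the order regMod8 numbers them (hypothetical name).
immutable string[] regNames8 = ["b", "c", "d", "e", "h", "l", "m", "a"];

/**
 * Hypothetical stand-in for regMod8: return the 8-bit register
 * code, or -1 if the register name is unknown.
 */
int regCode8(string reg)
{
    foreach (i, name; regNames8) {
        if (reg == name)
            return cast(int)i;
    }

    return -1;
}

void main()
{
    /* The register code lands in bits 3 through 5 of the opcode.  */
    foreach (name; regNames8) {
        immutable code = regCode8(name);
        writefln("inr %s -> 0x%02x    dcr %s -> 0x%02x",
            name, 0x04 + (code << 3), name, 0x05 + (code << 3));
    }
}
```

For example, register a has code 7, so inr a is 0x04 + 0x38 = 0x3c, and register m has code 6, so dcr m is 0x05 + 0x30 = 0x35.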

Storing and loading the accumulator

Since all of our arithmetic happens on register a, it would be good to have easy ways to store and load accumulator values. Fortunately, we have such instructions: stax, ldax, shld, lhld, sta, and lda. The middle two are really for storing and loading hl, but close enough for us.

Interestingly, stax and ldax are both one byte in size and take one argument. But that argument can only be register b or register d. The rest are all three bytes in size and take a 16-bit address as their single argument. We can use our a16 function for those.

Let's code up stax and ldax first:

/**
 * stax (0x02 + 16 bit register offset)
 */
private void stax()
{
    argcheck(!a1.empty && a2.empty);
    if (a1 == "b")
        passAct(1, 0x02);
    else if (a1 == "d")
        passAct(1, 0x12);
    else
        err("stax only takes b or d");
}

/**
 * ldax (0x0a + 16 bit register offset)
 */
private void ldax()
{
    argcheck(!a1.empty && a2.empty);
    if (a1 == "b")
        passAct(1, 0x0a);
    else if (a1 == "d")
        passAct(1, 0x1a);
    else
        err("ldax only takes b or d");
}

I decided to take the direct route rather than add special case logic to regMod16 for these two instructions.

Now let's code up the rest of the loads and stores:

/**
 * shld (0x22)
 */
private void shld()
{
    argcheck(!a1.empty && a2.empty);
    passAct(3, 0x22);
    a16();
}

/**
 * lhld (0x2a)
 */
private void lhld()
{
    argcheck(!a1.empty && a2.empty);
    passAct(3, 0x2a);
    a16();
}

/**
 * sta (0x32)
 */
private void sta()
{
    argcheck(!a1.empty && a2.empty);
    passAct(3, 0x32);
    a16();
}

/**
 * lda (0x3a)
 */
private void lda()
{
    argcheck(!a1.empty && a2.empty);
    passAct(3, 0x3a);
    a16();
}
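
As a quick illustration of what these three-byte instructions look like once assembled, here is a sketch that builds the bytes by hand. The function name encodeAddrOp is hypothetical. The byte order follows what a16 does: low byte of the address first, then the high byte.

```d
import std.stdio;

/**
 * Hypothetical helper: build the three bytes for an instruction
 * that takes a 16-bit address, low byte first.
 */
ubyte[] encodeAddrOp(ubyte opcode, ushort address)
{
    return [opcode,
            cast(ubyte)(address & 0xff),
            cast(ubyte)((address >> 8) & 0xff)];
}

void main()
{
    /* sta 1234h */
    writefln("sta 1234h  -> %(%02x %)", encodeAddrOp(0x32, 0x1234));

    /* lhld 0100h */
    writefln("lhld 0100h -> %(%02x %)", encodeAddrOp(0x2a, 0x0100));
}
```

So sta 1234h assembles to the bytes 32 34 12, with the address stored little-endian.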

The two odd instructions: `lxi` and `mvi`

We have only two instructions left to teach our assembler: lxi and mvi. These two follow a different pattern from every other instruction, so we need to make some special cases for them.

Every other instruction that takes an immediate or an address takes only one argument, and that argument is the immediate or address. Instead, lxi and mvi take two arguments and it is the second argument that is the immediate. We will need to update our imm function to adapt to either possibility:

/**
 * Get an 8-bit or 16-bit immediate.
 */
private void imm(int type)
{
    ushort num;
    string arg;
    bool found = false;

    if (op == "lxi" || op == "mvi")
        arg = a2;
    else
        arg = a1;

    if (isDigit(arg[0])) {
        num = numcheck(arg);
    } else {
        if (pass == 2) {
            for (size_t i = 0; i < stab.length; i++) {
                if (arg == stab[i].lab) {
                    num = stab[i].value;
                    found = true;
                    break;
                }
            }

            if (!found)
                err("label " ~ arg ~ " not defined");
        }
    }

    if (pass == 2) {
        output ~= cast(ubyte)(num & 0xff);
        if (type == IMM16)
            output ~= cast(ubyte)((num >> 8) & 0xff);
    }
}

We added a check to see if the current op is lxi or mvi and if it is, we want to work with a2. Otherwise, we want to work with a1. The rest of the logic stays the same except at the very end, where we check whether we should output an 8-bit or a 16-bit number and then output one or two bytes accordingly.

The big change is that we added an argument to the imm function. I am using fancy constants IMM8 and IMM16 to mean 8-bit immediate and 16-bit immediate. But we need to add all this to our code. At the top of our code, where all the global variables are, let's add these two lines:

/**
 * 8 and 16 bit immediates
 */
enum IMM8 = 8;
enum IMM16 = 16;

This defines IMM8 and IMM16 as constants with their respective values. We must also now look through all our code for every instance where we call the imm function. Since every existing call to the imm function asks for an 8-bit immediate, we change imm(); to imm(IMM8);. mvi will also take an 8-bit immediate, but lxi is the one instruction that takes a 16-bit immediate.
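
To make the effect of the new argument concrete, here is a stand-alone sketch of just the output step at the end of imm. The function name immBytes is hypothetical. It returns the bytes rather than appending them to the global output array so that it can run on its own.

```d
import std.stdio;

enum IMM8 = 8;
enum IMM16 = 16;

/**
 * Hypothetical stand-alone version of the tail of imm():
 * one byte for IMM8, two bytes (low byte first) for IMM16.
 */
ubyte[] immBytes(ushort num, int type)
{
    ubyte[] bytes;

    bytes ~= cast(ubyte)(num & 0xff);
    if (type == IMM16)
        bytes ~= cast(ubyte)((num >> 8) & 0xff);

    return bytes;
}

void main()
{
    writefln("IMM16 1234h -> %(%02x %)", immBytes(0x1234, IMM16));
    writefln("IMM8  0034h -> %(%02x %)", immBytes(0x0034, IMM8));

    /* With IMM8, only the low byte of a larger value survives.  */
    writefln("IMM8  1234h -> %(%02x %)", immBytes(0x1234, IMM8));
}
```

Note the last case: with IMM8, a value above 0xff is reduced to its low byte without any error, which matches how the imm function shown above behaves.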

Other facts: lxi is 3 bytes in size and uses a 16-bit register offset as its first argument and a 16-bit immediate as its second argument. mvi is 2 bytes in size, uses the 8-bit register formula for its first argument, and takes an 8-bit immediate as its second argument. See if you can code these up before looking at my functions:

/**
 * lxi (0x01 + 16 bit register offset)
 */
private void lxi()
{
    argcheck(!a1.empty && !a2.empty);
    passAct(3, 0x01 + regMod16());
    imm(IMM16);
}

/**
 * mvi (0x06 + (8 bit register offset << 3))
 */
private void mvi()
{
    argcheck(!a1.empty && !a2.empty);
    passAct(2, 0x06 + (regMod8(a1) << 3));
    imm(IMM8);
}
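
If you want to check your work by hand, here is a sketch that spells out the bytes for one lxi and one mvi. The helpers encodeLxi and encodeMvi are hypothetical. They take the register offset or register code directly as a number, using the same values regMod16 and regMod8 return.

```d
import std.stdio;

/**
 * Hypothetical helper: lxi is 0x01 plus the 16 bit register
 * offset, followed by the 16-bit immediate, low byte first.
 */
ubyte[] encodeLxi(int regOffset16, ushort value)
{
    return [cast(ubyte)(0x01 + regOffset16),
            cast(ubyte)(value & 0xff),
            cast(ubyte)((value >> 8) & 0xff)];
}

/**
 * Hypothetical helper: mvi is 0x06 plus the 8-bit register
 * code shifted left by 3, followed by the 8-bit immediate.
 */
ubyte[] encodeMvi(int regCode8, ubyte value)
{
    return [cast(ubyte)(0x06 + (regCode8 << 3)), value];
}

void main()
{
    /* lxi h, 1234h: h has 16 bit offset 0x20.  */
    writefln("lxi h, 1234h -> %(%02x %)", encodeLxi(0x20, 0x1234));

    /* mvi a, 0ffh: a has 8-bit register code 7.  */
    writefln("mvi a, 0ffh  -> %(%02x %)", encodeMvi(7, 0xff));
}
```

That gives 21 34 12 for lxi h, 1234h and 3e ff for mvi a, 0ffh, which you can compare against the opcode table.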

Finally, let's hook up all these new instructions to the mnemonic list in the process function. And with that, we have taught our assembler all the Intel 8080 instructions! You can write programs using any and all of the instructions and you will get a correctly assembled executable.

Current state of the assembler

As always, to wrap up, here is the assembler as it stands now:

import std.stdio;
import std.file;
import std.algorithm;
import std.string;
import std.conv;
import std.exception;
import std.ascii;

/**
 * Line number.
 */
private size_t lineno;

/**
 * Pass.
 */
private int pass;

/**
 * Output stored in memory until we're finished.
 */
private ubyte[] output;

/**
 * Address for labels.
 */
private ushort addr;

/**
 * 8 and 16 bit immediates
 */
enum IMM8 = 8;
enum IMM16 = 16;

/**
 * Intel 8080 assembler instruction.
 */
private string lab;      /// Label
private string op;       /// Instruction mnemonic
private string a1;       /// First argument
private string a2;       /// Second argument
private string comm;     /// Comment

/**
 * Individual symbol table entry.
 */
struct symtab
{
    string lab;         /// Symbol name
    ushort value;       /// Symbol value
};

/**
 * Symbol table is an array of entries.
 */
private symtab[] stab;

/**
 * Top-level assembly function.
 * Everything cascades downward from here.
 * Repeat the parsing twice.
 * Pass 1 gathers symbols and their addresses/values.
 * Pass 2 emits code.
 */
private void assemble(string[] lines, string outfile)
{
    pass = 1;
    for (lineno = 0; lineno < lines.length; lineno++) {
        parse(lines[lineno]);
        process();
    }

    pass = 2;
    for (lineno = 0; lineno < lines.length; lineno++) {
        parse(lines[lineno]);
        process();
    }

    fileWrite(outfile);
}

/**
 * After all code is emitted, write it out to a file.
 */
private void fileWrite(string outfile) {
    import std.file : write;

    write(outfile, output);
}

/**
 * Parse each line into (up to) five tokens.
 */
private void parse(string line) {
    /* Reset all our variables.  */
    lab = null;
    op = null;
    a1 = null;
    a2 = null;
    comm = null;

    /* Remove any whitespace at the beginning of the line.  */
    auto preprocess = stripLeft(line);

    /* Split comment from the rest of the line.  */
    auto splitcomm = preprocess.findSplit(";");
    if (!splitcomm[2].empty)
        comm = strip(splitcomm[2]);

    /* Split second argument from the remainder.  */
    auto splita2 = splitcomm[0].findSplit(",");
    if (!splita2[2].empty)
        a2 = strip(splita2[2]);

    /* Split first argument from the remainder.  */
    auto splita1 = splita2[0].findSplit("\t");
    if (!splita1[2].empty) {
        a1 = strip(splita1[2]);
    } else {
        splita1 = splita2[0].findSplit(" ");
        if (!splita1[2].empty) {
            a1 = strip(splita1[2]);
        }
    }

    /* Split op from label.  */
    auto splitop = splita1[0].findSplit(":");
    if (!splitop[1].empty) {
        op = strip(splitop[2]);
        lab = strip(splitop[0]);
    } else {
        op = strip(splitop[0]);
    }

    /**
     * Fixup for the label: op case.
     */
    auto opFix = a1.findSplit("\t");
    if (!opFix[1].empty) {
        op = strip(opFix[0]);
        a1 = strip(opFix[2]);
    } else {
        opFix = a1.findSplit(" ");
        if (!opFix[1].empty) {
            op = strip(opFix[0]);
            a1 = strip(opFix[2]);
        } else {
            if (op.empty && !a1.empty && a2.empty) {
                op = a1;
                a1 = null;
            }
        }
    }
}

/**
 * Figure out which op we have.
 */
private void process()
{
    /**
     * Special case for if you put a label by itself on a line.
     * Or have a totally blank line.
     */
    if (op.empty && a1.empty && a2.empty) {
        passAct(0, -1);
        return;
    }

    /**
     * List of all valid mnemonics.
     */
    if (op == "nop")
        nop();
    else if (op == "lxi")
        lxi();
    else if (op == "stax")
        stax();
    else if (op == "inx")
        inx();
    else if (op == "inr")
        inr();
    else if (op == "dcr")
        dcr();
    else if (op == "mvi")
        mvi();
    else if (op == "rlc")
        rlc();
    else if (op == "dad")
        dad();
    else if (op == "ldax")
        ldax();
    else if (op == "dcx")
        dcx();
    else if (op == "rrc")
        rrc();
    else if (op == "ral")
        ral();
    else if (op == "rar")
        rar();
    else if (op == "shld")
        shld();
    else if (op == "daa")
        daa();
    else if (op == "lhld")
        lhld();
    else if (op == "cma")
        cma();
    else if (op == "sta")
        sta();
    else if (op == "stc")
        stc();
    else if (op == "lda")
        lda();
    else if (op == "cmc")
        cmc();
    else if (op == "mov")
        mov();
    else if (op == "hlt")
        hlt();
    else if (op == "add")
        add();
    else if (op == "adc")
        adc();
    else if (op == "sub")
        sub();
    else if (op == "sbb")
        sbb();
    else if (op == "ana")
        ana();
    else if (op == "xra")
        xra();
    else if (op == "ora")
        ora();
    else if (op == "cmp")
        cmp();
    else if (op == "rnz")
        rnz();
    else if (op == "pop")
        pop();
    else if (op == "jnz")
        jnz();
    else if (op == "jmp")
        jmp();
    else if (op == "cnz")
        cnz();
    else if (op == "push")
        push();
    else if (op == "adi")
        adi();
    else if (op == "rst")
        rst();
    else if (op == "rz")
        rz();
    else if (op == "ret")
        ret();
    else if (op == "jz")
        jz();
    else if (op == "cz")
        cz();
    else if (op == "call")
        call();
    else if (op == "aci")
        aci();
    else if (op == "rnc")
        rnc();
    else if (op == "jnc")
        jnc();
    else if (op == "out")
        i80_out();
    else if (op == "cnc")
        cnc();
    else if (op == "sui")
        sui();
    else if (op == "rc")
        rc();
    else if (op == "jc")
        jc();
    else if (op == "in")
        i80_in();
    else if (op == "cc")
        cc();
    else if (op == "sbi")
        sbi();
    else if (op == "rpo")
        rpo();
    else if (op == "jpo")
        jpo();
    else if (op == "xthl")
        xthl();
    else if (op == "cpo")
        cpo();
    else if (op == "ani")
        ani();
    else if (op == "rpe")
        rpe();
    else if (op == "pchl")
        pchl();
    else if (op == "jpe")
        jpe();
    else if (op == "xchg")
        xchg();
    else if (op == "cpe")
        cpe();
    else if (op == "xri")
        xri();
    else if (op == "rp")
        rp();
    else if (op == "jp")
        jp();
    else if (op == "di")
        di();
    else if (op == "cp")
        cp();
    else if (op == "ori")
        ori();
    else if (op == "rm")
        rm();
    else if (op == "sphl")
        sphl();
    else if (op == "jm")
        jm();
    else if (op == "ei")
        ei();
    else if (op == "cm")
        cm();
    else if (op == "cpi")
        cpi();
    else
        err("unknown mnemonic: " ~ op);
}

/**
 * Take action depending on which pass this is.
 */
private void passAct(ushort size, int outbyte)
{
    if (pass == 1) {
        /* Add new symbol if we have a label.  */
        if (!lab.empty)
            addsym();

        /* Increment address counter by size of instruction.  */
        addr += size;
    } else {
        /**
         * Output the byte representing the opcode.
         * If the opcode carries additional information
         *   (e.g., immediate or address), we will output that
         *   in a separate helper function.
         */
        if (outbyte >= 0)
            output ~= cast(ubyte)outbyte;
    }
}

/**
 * Add a symbol to the symbol table.
 */
private void addsym()
{
    for (size_t i = 0; i < stab.length; i++) {
        if (lab == stab[i].lab)
            err("duplicate label: " ~ lab);
    }

    symtab newsym = { lab, addr };
    stab ~= newsym;
}

/**
 * nop (0x00)
 */
private void nop()
{
    argcheck(a1.empty && a2.empty);
    passAct(1, 0x00);
}

/**
 * lxi (0x01 + 16 bit register offset)
 */
private void lxi()
{
    argcheck(!a1.empty && !a2.empty);
    passAct(3, 0x01 + regMod16());
    imm(IMM16);
}

/**
 * stax (0x02 + 16 bit register offset)
 */
private void stax()
{
    argcheck(!a1.empty && a2.empty);
    if (a1 == "b")
        passAct(1, 0x02);
    else if (a1 == "d")
        passAct(1, 0x12);
    else
        err("stax only takes b or d");
}

/**
 * inx (0x03 + 16 bit register offset)
 */
private void inx()
{
    argcheck(!a1.empty && a2.empty);
    passAct(1, 0x03 + regMod16());
}

/**
 * inr (0x04 + (8 bit register offset << 3))
 */
private void inr()
{
    argcheck(!a1.empty && a2.empty);
    passAct(1, 0x04 + (regMod8(a1) << 3));
}

/**
 * dcr (0x05 + (8 bit register offset << 3))
 */
private void dcr()
{
    argcheck(!a1.empty && a2.empty);
    passAct(1, 0x05 + (regMod8(a1) << 3));
}

/**
 * mvi (0x06 + (8 bit register offset << 3))
 */
private void mvi()
{
    argcheck(!a1.empty && !a2.empty);
    passAct(2, 0x06 + (regMod8(a1) << 3));
    imm(IMM8);
}

/**
 * rlc (0x07)
 */
private void rlc()
{
    argcheck(a1.empty && a2.empty);
    passAct(1, 0x07);
}

/**
 * dad (0x09 + 16 bit register offset)
 */
private void dad()
{
    argcheck(!a1.empty && a2.empty);
    passAct(1, 0x09 + regMod16());
}

/**
 * ldax (0x0a + 16 bit register offset)
 */
private void ldax()
{
    argcheck(!a1.empty && a2.empty);
    if (a1 == "b")
        passAct(1, 0x0a);
    else if (a1 == "d")
        passAct(1, 0x1a);
    else
        err("ldax only takes b or d");
}

/**
 * dcx (0x0b + 16 bit register offset)
 */
private void dcx()
{
    argcheck(!a1.empty && a2.empty);
    passAct(1, 0x0b + regMod16());
}

/**
 * rrc (0x0f)
 */
private void rrc()
{
    argcheck(a1.empty && a2.empty);
    passAct(1, 0x0f);
}

/**
 * ral (0x17)
 */
private void ral()
{
    argcheck(a1.empty && a2.empty);
    passAct(1, 0x17);
}

/**
 * rar (0x1f)
 */
private void rar()
{
    argcheck(a1.empty && a2.empty);
    passAct(1, 0x1f);
}

/**
 * shld (0x22)
 */
private void shld()
{
    argcheck(!a1.empty && a2.empty);
    passAct(3, 0x22);
    a16();
}

/**
 * daa (0x27)
 */
private void daa()
{
    argcheck(a1.empty && a2.empty);
    passAct(1, 0x27);
}

/**
 * lhld (0x2a)
 */
private void lhld()
{
    argcheck(!a1.empty && a2.empty);
    passAct(3, 0x2a);
    a16();
}

/**
 * cma (0x2f)
 */
private void cma()
{
    argcheck(a1.empty && a2.empty);
    passAct(1, 0x2f);
}

/**
 * sta (0x32)
 */
private void sta()
{
    argcheck(!a1.empty && a2.empty);
    passAct(3, 0x32);
    a16();
}

/**
 * stc (0x37)
 */
private void stc()
{
    argcheck(a1.empty && a2.empty);
    passAct(1, 0x37);
}

/**
 * lda (0x3a)
 */
private void lda()
{
    argcheck(!a1.empty && a2.empty);
    passAct(3, 0x3a);
    a16();
}

/**
 * cmc (0x3f)
 */
private void cmc()
{
    argcheck(a1.empty && a2.empty);
    passAct(1, 0x3f);
}

/**
 * mov (0x40 + (8-bit register offset << 3) + 8-bit register offset)
 * We allow mov m, m (0x76)
 * But that will result in HLT.
 */
private void mov()
{
    argcheck(!a1.empty && !a2.empty);
    passAct(1, 0x40 + (regMod8(a1) << 3) + regMod8(a2));
}

/**
 * hlt (0x76)
 */
private void hlt()
{
    argcheck(a1.empty && a2.empty);
    passAct(1, 0x76);
}

/**
 * add (0x80 + 8-bit register offset)
 */
private void add()
{
    argcheck(!a1.empty && a2.empty);
    passAct(1, 0x80 + regMod8(a1));
}

/**
 * adc (0x88 + 8-bit register offset)
 */
private void adc()
{
    argcheck(!a1.empty && a2.empty);
    passAct(1, 0x88 + regMod8(a1));
}

/**
 * sub (0x90 + 8-bit register offset)
 */
private void sub()
{
    argcheck(!a1.empty && a2.empty);
    passAct(1, 0x90 + regMod8(a1));
}

/**
 * sbb (0x98 + 8-bit register offset)
 */
private void sbb()
{
    argcheck(!a1.empty && a2.empty);
    passAct(1, 0x98 + regMod8(a1));
}

/**
 * ana (0xa0 + 8-bit register offset)
 */
private void ana()
{
    argcheck(!a1.empty && a2.empty);
    passAct(1, 0xa0 + regMod8(a1));
}

/**
 * xra (0xa8 + 8-bit register offset)
 */
private void xra()
{
    argcheck(!a1.empty && a2.empty);
    passAct(1, 0xa8 + regMod8(a1));
}

/**
 * ora (0xb0 + 8-bit register offset)
 */
private void ora()
{
    argcheck(!a1.empty && a2.empty);
    passAct(1, 0xb0 + regMod8(a1));
}

/**
 * cmp (0xb8 + 8-bit register offset)
 */
private void cmp()
{
    argcheck(!a1.empty && a2.empty);
    passAct(1, 0xb8 + regMod8(a1));
}

/**
 * rnz (0xc0)
 */
private void rnz()
{
    argcheck(a1.empty && a2.empty);
    passAct(1, 0xc0);
}

/**
 * pop (0xc1 + 16-bit register offset)
 */
private void pop()
{
    argcheck(!a1.empty && a2.empty);
    passAct(1, 0xc1 + regMod16());
}

/**
 * jnz (0xc2)
 */
private void jnz()
{
    argcheck(!a1.empty && a2.empty);
    passAct(3, 0xc2);
    a16();
}

/**
 * jmp (0xc3)
 */
private void jmp()
{
    argcheck(!a1.empty && a2.empty);
    passAct(3, 0xc3);
    a16();
}

/**
 * cnz (0xc4)
 */
private void cnz()
{
    argcheck(!a1.empty && a2.empty);
    passAct(3, 0xc4);
    a16();
}

/**
 * push (0xc5 + 16-bit register offset)
 */
private void push()
{
    argcheck(!a1.empty && a2.empty);
    passAct(1, 0xc5 + regMod16());
}

/**
 * adi (0xc6)
 */
private void adi()
{
    argcheck(!a1.empty && a2.empty);
    passAct(2, 0xc6);
    imm(IMM8);
}

/**
 * rst (0xc7 + offset)
 */
private void rst()
{
    argcheck(!a1.empty && a2.empty);
    auto offset = to!int(a1, 10);
    if (offset >= 0 && offset <= 7)
        passAct(1, 0xc7 + (offset * 8));
    else
        err("invalid reset vector: " ~ to!string(offset));
}

/**
 * rz (0xc8)
 */
private void rz()
{
    argcheck(a1.empty && a2.empty);
    passAct(1, 0xc8);
}

/**
 * ret (0xc9)
 */
private void ret()
{
    argcheck(a1.empty && a2.empty);
    passAct(1, 0xc9);
}

/**
 * jz (0xca)
 */
private void jz()
{
    argcheck(!a1.empty && a2.empty);
    passAct(3, 0xca);
    a16();
}

/**
 * cz (0xcc)
 */
private void cz()
{
    argcheck(!a1.empty && a2.empty);
    passAct(3, 0xcc);
    a16();
}

/**
 * call (0xcd)
 */
private void call()
{
    argcheck(!a1.empty && a2.empty);
    passAct(3, 0xcd);
    a16();
}

/**
 * aci (0xce)
 */
private void aci()
{
    argcheck(!a1.empty && a2.empty);
    passAct(2, 0xce);
    imm(IMM8);
}

/**
 * rnc (0xd0)
 */
private void rnc()
{
    argcheck(a1.empty && a2.empty);
    passAct(1, 0xd0);
}

/**
 * jnc (0xd2)
 */
private void jnc()
{
    argcheck(!a1.empty && a2.empty);
    passAct(3, 0xd2);
    a16();
}

/**
 * out (0xd3)
 */
private void i80_out()
{
    argcheck(!a1.empty && a2.empty);
    passAct(2, 0xd3);
    imm(IMM8);
}

/**
 * cnc (0xd4)
 */
private void cnc()
{
    argcheck(!a1.empty && a2.empty);
    passAct(3, 0xd4);
    a16();
}

/**
 * sui (0xd6)
 */
private void sui()
{
    argcheck(!a1.empty && a2.empty);
    passAct(2, 0xd6);
    imm(IMM8);
}

/**
 * rc (0xd8)
 */
private void rc()
{
    argcheck(a1.empty && a2.empty);
    passAct(1, 0xd8);
}

/**
 * jc (0xda)
 */
private void jc()
{
    argcheck(!a1.empty && a2.empty);
    passAct(3, 0xda);
    a16();
}

/**
 * in (0xdb)
 */
private void i80_in()
{
    argcheck(!a1.empty && a2.empty);
    passAct(2, 0xdb);
    imm(IMM8);
}

/**
 * cc (0xdc)
 */
private void cc()
{
    argcheck(!a1.empty && a2.empty);
    passAct(3, 0xdc);
    a16();
}

/**
 * sbi (0xde)
 */
private void sbi()
{
    argcheck(!a1.empty && a2.empty);
    passAct(2, 0xde);
    imm(IMM8);
}

/**
 * rpo (0xe0)
 */
private void rpo()
{
    argcheck(a1.empty && a2.empty);
    passAct(1, 0xe0);
}

/**
 * jpo (0xe2)
 */
private void jpo()
{
    argcheck(!a1.empty && a2.empty);
    passAct(3, 0xe2);
    a16();
}

/**
 * xthl (0xe3)
 */
private void xthl()
{
    argcheck(a1.empty && a2.empty);
    passAct(1, 0xe3);
}

/**
 * cpo (0xe4)
 */
private void cpo()
{
    argcheck(!a1.empty && a2.empty);
    passAct(3, 0xe4);
    a16();
}

/**
 * ani (0xe6)
 */
private void ani()
{
    argcheck(!a1.empty && a2.empty);
    passAct(2, 0xe6);
    imm(IMM8);
}

/**
 * rpe (0xe8)
 */
private void rpe()
{
    argcheck(a1.empty && a2.empty);
    passAct(1, 0xe8);
}

/**
 * pchl (0xe9)
 */
private void pchl()
{
    argcheck(a1.empty && a2.empty);
    passAct(1, 0xe9);
}

/**
 * jpe (0xea)
 */
private void jpe()
{
    argcheck(!a1.empty && a2.empty);
    passAct(3, 0xea);
    a16();
}

/**
 * xchg (0xeb)
 */
private void xchg()
{
    argcheck(a1.empty && a2.empty);
    passAct(1, 0xeb);
}

/**
 * cpe (0xec)
 */
private void cpe()
{
    argcheck(!a1.empty && a2.empty);
    passAct(3, 0xec);
    a16();
}

/**
 * xri (0xee)
 */
private void xri()
{
    argcheck(!a1.empty && a2.empty);
    passAct(2, 0xee);
    imm(IMM8);
}

/**
 * rp (0xf0)
 */
private void rp()
{
    argcheck(a1.empty && a2.empty);
    passAct(1, 0xf0);
}

/**
 * jp (0xf2)
 */
private void jp()
{
    argcheck(!a1.empty && a2.empty);
    passAct(3, 0xf2);
    a16();
}

/**
 * di (0xf3)
 */
private void di()
{
    argcheck(a1.empty && a2.empty);
    passAct(1, 0xf3);
}

/**
 * cp (0xf4)
 */
private void cp()
{
    argcheck(!a1.empty && a2.empty);
    passAct(3, 0xf4);
    a16();
}

/**
 * ori (0xf6)
 */
private void ori()
{
    argcheck(!a1.empty && a2.empty);
    passAct(2, 0xf6);
    imm(IMM8);
}

/**
 * rm (0xf8)
 */
private void rm()
{
    argcheck(a1.empty && a2.empty);
    passAct(1, 0xf8);
}

/**
 * sphl (0xf9)
 */
private void sphl()
{
    argcheck(a1.empty && a2.empty);
    passAct(1, 0xf9);
}

/**
 * jm (0xfa)
 */
private void jm()
{
    argcheck(!a1.empty && a2.empty);
    passAct(3, 0xfa);
    a16();
}

/**
 * ei (0xfb)
 */
private void ei()
{
    argcheck(a1.empty && a2.empty);
    passAct(1, 0xfb);
}

/**
 * cm (0xfc)
 */
private void cm()
{
    argcheck(!a1.empty && a2.empty);
    passAct(3, 0xfc);
    a16();
}

/**
 * cpi (0xfe)
 */
private void cpi()
{
    argcheck(!a1.empty && a2.empty);
    passAct(2, 0xfe);
    imm(IMM8);
}

/**
 * Get an 8-bit or 16-bit immediate.
 */
private void imm(int type)
{
    ushort num;
    string arg;
    bool found = false;

    if (op == "lxi" || op == "mvi")
        arg = a2;
    else
        arg = a1;

    if (isDigit(arg[0])) {
        num = numcheck(arg);
    } else {
        if (pass == 2) {
            for (size_t i = 0; i < stab.length; i++) {
                if (arg == stab[i].lab) {
                    num = stab[i].value;
                    found = true;
                    break;
                }
            }

            if (!found)
                err("label " ~ arg ~ " not defined");
        }
    }

    if (pass == 2) {
        output ~= cast(ubyte)(num & 0xff);
        if (type == IMM16)
            output ~= cast(ubyte)((num >> 8) & 0xff);
    }
}

/**
 * Get a 16-bit address.
 */
private void a16()
{
    ushort num;
    bool found = false;

    if (isDigit(a1[0])) {
        num = numcheck(a1);
    } else {
        for (size_t i = 0; i < stab.length; i++) {
            if (a1 == stab[i].lab) {
                num = stab[i].value;
                found = true;
                break;
            }
        }

        if (pass == 2) {
            if (!found)
                err("label " ~ a1 ~ " not defined");
        }
    }

    if (pass == 2) {
        output ~= cast(ubyte)(num & 0xff);
        output ~= cast(ubyte)((num >> 8) & 0xff);
    }
}

/**
 * Return the 16 bit register offset.
 */
private int regMod16()
{
    if (a1 == "b") {
        return 0x00;
    } else if (a1 == "d") {
        return 0x10;
    } else if (a1 == "h") {
        return 0x20;
    } else if (a1 == "psw") {
        if (op == "pop" || op == "push")
            return 0x30;
        else
            err("psw may not be used with " ~ op);
    } else if (a1 == "sp") {
        if (op != "pop" && op != "push")
            return 0x30;
        else
            err("sp may not be used with " ~ op);
    } else {
        err("invalid register for " ~ op);
    }

    /* This will never be reached, but quiets gdc.  */
    return 0;
}

/**
 * Return the 8-bit register offset.
 */
private int regMod8(string reg)
{
    if (reg == "b")
        return 0x00;
    else if (reg == "c")
        return 0x01;
    else if (reg == "d")
        return 0x02;
    else if (reg == "e")
        return 0x03;
    else if (reg == "h")
        return 0x04;
    else if (reg == "l")
        return 0x05;
    else if (reg == "m")
        return 0x06;
    else if (reg == "a")
        return 0x07;
    else
        err("invalid register " ~ reg);

    /* This will never be reached, but quiets gdc.  */
    return 0;
}

/**
 * Check arguments.
 */
private void argcheck(bool passed)
{
    if (passed == false)
        err("arguments not correct for mnemonic: " ~ op);
}

/**
 * Check if a number is decimal or hex.
 */
private ushort numcheck(string input)
{
    ushort num;

    if (input[input.length - 1] == 'h')
        num = to!ushort(chop(input), 16);
    else
        num = to!ushort(input, 10);

    return num;
}

/**
 * Nice error messages.
 */
private void err(string msg)
{
    stderr.writeln("a80: " ~ to!string(lineno + 1) ~ ": " ~ msg);
    enforce(0);
}

/**
 * All good things start with a single function.
 */
void main(string[] args)
{
    /**
     * Make sure the user provides only one input file.
     */
    if (args.length != 2) {
        stderr.writeln("usage: a80 file.asm");
        return;
    }

    /**
     * Create an array of lines from the input file.
     */
    string[] lines = splitLines(cast(string)read(args[1]));

    /**
     * Name output file the same as the input but with .com ending.
     */
    auto split = args[1].findSplit(".asm");
    auto outfile = split[0] ~ ".com";

    /**
     * Do the work.
     */
    assemble(lines, outfile);
}

Next time

Somehow, we are still not finished with our assembler. But we have done most of the work. There are a couple of pseudo-ops, convenience instructions that the assembler understands, that will take this assembler from a toy to something actually usable. We will tackle those next time.
