Opcodes and Instruction Decoding

Programs are specified to the computer as a series of data (grouped into bytes or words). These data are sent to a decoding unit (related to a line decoder but not precisely the same), which uses the bit values of each compiled instruction to drive microcode to run the program.

This means that there is a fair amount of latitude when choosing values for instructions. For the purposes of this discussion, we shall create a new set of opcodes. (These opcodes may differ from those used by your textbook or those that your professor uses. When in doubt, use their values on homework or exams.)

We shall set the instruction length to 32 bits. The first 8 bits shall uniquely determine the instruction (allowing 256 total instructions, more than enough for a simple machine). The remaining 24 bits shall be used for data, such as operands for the instruction.
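
As a minimal sketch of this layout, the following Python splits a 32-bit instruction word into its 8-bit opcode and 24-bit data field. The function name split_instruction is hypothetical, and the sketch assumes that the "first" 8 bits are the most significant ones; neither is fixed by the description above.

```python
OPCODE_BITS = 8   # instruction ID
DATA_BITS = 24    # operands and other data


def split_instruction(word):
    """Split a 32-bit instruction word into (opcode, data).

    Assumption: the opcode occupies the most significant 8 bits.
    """
    if not 0 <= word < (1 << (OPCODE_BITS + DATA_BITS)):
        raise ValueError("instruction must fit in 32 bits")
    opcode = word >> DATA_BITS               # keep only the top 8 bits
    data = word & ((1 << DATA_BITS) - 1)     # mask off the low 24 bits
    return opcode, data


opcode, data = split_instruction(0xD8123456)
print(f"opcode={opcode:08b} data={data:06x}")
```

Only the opcode half of the result matters for the rest of this discussion; the data half would be handed to the data decoder.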

Since we are talking about decoding opcodes, we really don't care what the data says at the moment. As long as it conforms to some specification, the hardware or microcode that implements that instruction ought to be able to decode the data properly. The opcode decoder merely needs to be able to pass the instruction to the correct section of the machine. So we shall now only discuss the first 8 bits. (The other 24 bits get decoded by the data decoder, which works in a similar way, except that it doesn't care what the instruction is.)

There are three main types of instructions: ALU operations (algebraic instructions), branches (GOTOs and subroutine calls), and loads & stores.

Since there are three types, we shall use the two most significant bits of the instruction ID to differentiate them. (This leaves 64 possible instructions of each type.) To be more concrete, 11xxxxxx indicates ALU operations, 10xxxxxx indicates branches, 01xxxxxx indicates loads and stores, and 00xxxxxx shall be reserved. (For simplicity, any instruction that starts with 00 will be considered a no-op by the computer. That is, it will do nothing for a clock cycle and then execute the next instruction.)
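
The first stage of the opcode decoder can be sketched in a few lines of Python. This is only an illustration: the name instruction_class and the string labels are hypothetical, and the 01 prefix is assumed to cover both loads and stores, following the three instruction types listed earlier.

```python
def instruction_class(opcode):
    """Classify an 8-bit opcode by its two most significant bits."""
    classes = {
        0b11: "alu",
        0b10: "branch",
        0b01: "load/store",  # assumption: 01 covers both loads and stores
        0b00: "no-op",       # reserved prefix, treated as a no-op
    }
    return classes[(opcode >> 6) & 0b11]


for op in (0b11011000, 0b10000001, 0b01000000, 0b00111111):
    print(f"{op:08b} -> {instruction_class(op)}")
```

In hardware, the same selection would be made by a 2-to-4 line decoder on the top two bits; the dictionary lookup plays that role here.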

ALU operations could have a memory reference as an operand. That is, instead of using only registers as operands, an instruction could have one operand in a register and one somewhere in memory. Let's use the third most significant bit of all ALU instructions to mark those operations that need a memory fetch. So 111xxxxx uniquely defines one of 32 ALU instructions that need a memory access, while 110xxxxx uniquely defines one of 32 ALU instructions that don't need to access memory.
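
Testing that bit is a single mask operation. The sketch below assumes an 8-bit opcode laid out as described; the helper name alu_needs_memory is hypothetical.

```python
MEMORY_BIT = 0b00100000  # third most significant bit of the opcode


def alu_needs_memory(opcode):
    """Return True if an ALU opcode (11xxxxxx) requests a memory fetch."""
    if (opcode >> 6) & 0b11 != 0b11:
        raise ValueError("not an ALU opcode")
    return bool(opcode & MEMORY_BIT)


print(alu_needs_memory(0b11111000))  # 111xxxxx pattern
print(alu_needs_memory(0b11011000))  # 110xxxxx pattern
```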

The remaining 5 bits of the ALU instruction ID can likewise be defined. Let's say 11x110xx is an ADD, 11x100xx is a SUBTRACT (therefore requiring complementing), 11x010xx is a MULTIPLY, and 11x000xx is a DIVIDE. The other four patterns can be used for logical operations: 11x111xx can be XOR, 11x101xx can be OR, 11x011xx can be AND, and 11x001xx can be NOT. This leaves 2 bits at the end for the size of the data the instruction operates on. 11xxxx11 could refer to data one byte wide, 11xxxx10 to data two bytes wide (a halfword), 11xxxx01 to data one word (four bytes) wide, and 11xxxx00 to doubleword data (8 bytes wide).
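
Putting the ALU fields together, a decoder for the full 8-bit ALU opcode might look like the following Python sketch. The table and function names (ALU_OPS, DATA_WIDTH_BYTES, decode_alu) are hypothetical, and the NOT pattern is assumed to be 11x001xx, matching the form of the other seven operations.

```python
# Bits 4..2 of the opcode select the operation.
ALU_OPS = {
    0b110: "ADD",
    0b100: "SUBTRACT",
    0b010: "MULTIPLY",
    0b000: "DIVIDE",
    0b111: "XOR",
    0b101: "OR",
    0b011: "AND",
    0b001: "NOT",  # assumption: pattern is 11x001xx
}

# Bits 1..0 of the opcode select the data width in bytes.
DATA_WIDTH_BYTES = {0b11: 1, 0b10: 2, 0b01: 4, 0b00: 8}


def decode_alu(opcode):
    """Decode an 8-bit ALU opcode into its three fields."""
    if (opcode >> 6) & 0b11 != 0b11:
        raise ValueError("not an ALU opcode")
    return {
        "memory_operand": bool(opcode & 0b00100000),
        "operation": ALU_OPS[(opcode >> 2) & 0b111],
        "width_bytes": DATA_WIDTH_BYTES[opcode & 0b11],
    }


print(decode_alu(0b11111001))  # memory-operand ADD on word data
print(decode_alu(0b11000111))  # register-only NOT on byte data
```

Because every 3-bit operation code and every 2-bit width code is assigned, any opcode beginning with 11 decodes to a defined ALU instruction in this scheme.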

Notice that we have now specified all ALU operations. The source and destination are stored in the data portion of the instruction, so the data decoder can determine whether the operands are registers or constants, or where to go in memory if they are memory references.

A similar process can be used for the other types of instructions. It is important to realize that not all possible values must be used. Leaving values unused not only allows for future expansion of the instruction set, but also makes the internal circuitry less complex. For example, most machines don't have more than 8 or 16 kinds of branches.

