## Basic Processor Design



This lecture deals with Instruction Set Design.

## Instruction Set Terminology

#### Mnemonic (Instruction Name)

SUBI

|                             |                            | Example                 |             |
|-----------------------------|----------------------------|-------------------------|-------------|
| Syntax (Language Structure) | Assembly Language          |                         |             |
| SUBI Rx, Ry, imm4           |                            | SUBI R5, R3, 10         |             |
| Semantics (Meaning)         | Register Transfer Language |                         |             |
| $Rx \leftarrow Ry - imm4$   |                            | $R5 \leftarrow R3 - 10$ |             |
| Coding                      | Machine Code               | Binary                  | Hexadecimal |
| 100101 x y imm4             |                            | 100101 101 011 1010     | 96BA        |
|                             |                            | └ <                     |             |
| Imr                         | nediate Data Field         | Instruction Length      |             |
| Register Addres             | s Fields                   | Fields                  |             |
| Opcode Field                | )                          |                         |             |

The *instruction set* for a processor is the set of all the instructions supported by that processor.

Start with RISC<sup>1</sup> assumptions since we want a simple processor.

- Fixed instruction length
  - keeps control simple (very useful for pipelining<sup>2</sup>)
- Load/store architecture & a large number of general purpose registers
  - avoids slow memory access
- Very few addressing modes
  - complex addressing modes are seldom used
- No complex instructions
  - keep everything simple

<sup>&</sup>lt;sup>1</sup>RISC = Reduced Instruction Set Computer

<sup>&</sup>lt;sup>2</sup>RISC machines were initially designed to increase performance through pipelining - your processor will not use pipelining

- All instructions are the same length
  - typically 1 word (i.e. 16 bits)
- Load/Store Architecture
  - All arithmetic and logic instructions deal only with registers and immediate values<sup>3</sup>

| e.g. | ADD R3,R2,R5 | $\text{R5} \leftarrow \text{R3} + \text{R2}$ |
|------|--------------|----------------------------------------------|
|      | OR R3,13,R5  | $\text{R5} \leftarrow \text{R3} + 13$        |

- Separate instructions are needed for access to locations in memory.

e.g. LD [R4+13], R7  $R7 \leftarrow mem(R4 + 13)$ ST R7, [R4+13]  $mem(R4 + 13) \leftarrow R7$ 

*mem*(*nnn*) is shorthand for the data location in memory with address *nnn*.

– Instruction set is maximally orthogonal

<sup>&</sup>lt;sup>3</sup>an immediate (or literal) value is a data value encoded in the instruction word.

# Q1<sub>*RISC*</sub> How many Register Addresses in an Arithmetic/Logic Instruction?

• Usually 2 or 3 for RISC

| 2: ADD Rx, Ry     | $\mathbf{R}\mathbf{x} \leftarrow \mathbf{R}\mathbf{x} + \mathbf{R}\mathbf{y}$ |
|-------------------|-------------------------------------------------------------------------------|
| 3: ADD Rx, Ry, Rz | $Rz \leftarrow Rx + Ry$                                                       |

### Q2<sub>*RISC*</sub> How many General Purpose Registers?

• Usually  $2^n - 1$  for RISC

This gives  $2^n$  addressable registers including the dummy register,  $R0^4$ .

We then need n bits per register address in the instruction.

With a 16 bit instruction length, n = 2 (i.e. 3 registers + R0) or n = 3 (i.e. 7 registers + R0) are sensible values.

<sup>&</sup>lt;sup>4</sup>R0 is always zero

## Basic Processor Design – RISC

Q3<sub>*RISC*</sub> How many Bits do we use for Short Immediates?

• Used in instructions like

| ADD R3,5,R2   | $\text{R2} \leftarrow \text{R3} + 5$ |
|---------------|--------------------------------------|
| ST R7,[R4+13] | $mem(R4 + 13) \leftarrow R7$         |

Sensible values for a 16 bit instruction length are in the range  $4 \le s \le 9$  giving 2's complement values in the range  $-2^{s-1} \le imm \le 2^{s-1} - 1$ 

### $Q3A_{RISC}$

Do we support All Arithmetic/Logic instructions and All Load/Store instructions in both Register-Register and Register-Immediate forms?

- Some RISC processors support only Register-Immediate form for Load/Store instructions.
- Some RISC processors support Register-Register form for all Arithmetic/Logic instructions and Register-Immediate form for a subset of these instructions.

## Q4<sub>*RISC*</sub> What Instruction Fields do we provide? How are they arranged?

- RISC instruction coding is highly orthogonal any instruction may use any registers.
- Requirement for maximum length short immediate makes RISC coding tight.

Assume  $Q1_{RISC}$ : 3,  $Q2_{RISC}$ :  $2^n - 1$ ,  $Q3_{RISC}$ : *s*,  $Q3A_{RISC}$ : YES A suitable coding for Arithmetic/Logic and Load/Store instructions is:



Example: If n = 2 and s = 7 then x = 4 giving up to  $2^x$  (=16) instructions.

Most RISC processors support an instruction to set the upper bits of a register. The MIPS processor calls it LUI (load upper immediate) while the SPARC processor calls it SETHI.

For SPARC, the sequence of instructions required to set upper and lower parts of a register is:

| SETHI 200,Rx | $\mathbf{Rx} \leftarrow 200 \times 2^{10}$ |
|--------------|--------------------------------------------|
| ADD Rx,5,Rx  | $Rx \leftarrow Rx + 5$                     |

Note that the  $\times 2^{10}$  (i.e. shift left by 10) value comes from the SPARC word length (32) less the length of the long immediate used for SETHI (22). In our example it will be  $\times 2^{16-l}$  where *l* is the length of our long immediate.

To code a SETHI or LUI instruction we need fewer fields:



Example: If n = 2 and x = 4 then l = 10.

Note:  $s + l \ge 16$  for the SETHI/ADD sequence to produce a 16 bit result.

### Example Coding #3

Assume Q1<sub>*RISC*</sub>: 3, Q2<sub>*RISC*</sub>: 3 (n = 2), Q3<sub>*RISC*</sub>: 7 (s = 7), Q3A<sub>*RISC*</sub>: YES This gives x = 4 and l = 10 with the coding shown below<sup>5</sup>:



In this case up to 16 opcodes are supported, each of which supports either coding A (i.e. A0 and A1), coding B or coding C (which supports conditional branch).

This is just one of many possible codings (this one is loosely based on the SPARC instruction coding).

<sup>&</sup>lt;sup>5</sup>note that in this example short and long immediates (simm7 and simm10) are signed numbers

### Q5<sub>*RISC*</sub> What Instructions will we support?<sup>6</sup>

| Туре          | Function             | Mnemonic | Function             | Mnemonic |
|---------------|----------------------|----------|----------------------|----------|
| Arithmetic    | Add                  | ADD      | Subtract             | SUB      |
|               | Add with Carry       | ADDX     | Subtract with Borrow | SUBX     |
| Logic         | Bitwise AND          | AND      | Bitwise OR           | OR       |
|               | Bitwise Exclusive OR | XOR      | Logical Shift Right  | LSR      |
| Data Movement | Load                 | LD       | Store                | ST       |
|               | Set High             | SETHI    |                      |          |

• a complete set of common arithmetic and logical functions

(note that multiply is too complex to be included, while shift left is accomplished by adding a number to itself).

• all arithmetic and logic functions update the condition flags:

Zero (Z) is set if the result is zero and cleared otherwise.

Carry (C) is updated by ADD/ADDX/SUB/SUBX/LSR and is set to zero by other logical instructions.

<sup>&</sup>lt;sup>6</sup>This set has been chosen to match example coding #3

### Q5<sub>*RISC*</sub> What Instructions will we support?

| Туре             | Function                   | Mnemonic | SubFunction                  | Mnemonic |
|------------------|----------------------------|----------|------------------------------|----------|
| Control Transfer | Branch if equal (Z==1)     | BEQ      | Branch if not equal (Z==0)   | BNE      |
|                  | Branch if carry set (C==1) | BCS      | Branch if carry clear (C==0) | BCC      |
|                  | Branch and Link            | BAL      | Jump and Link                | JMPL     |

- This processor (like the SPARC on which it is based) uses condition code flags (C,Z) to support conditional branch<sup>7</sup>.
- Branch instructions are PC relative with a limited range (+511/-512).
- BAL and JMPL support subroutine call by saving the old PC value in a link register. If R0 (dummy register) is specified as the link register, then BAL can be used as a simple unconditional PC relative branch while JMPL can be used for jumping a long way or for returning from a subroutine.

<sup>&</sup>lt;sup>7</sup>other RISC processors such as MIPS save some complexity by not supporting condition code flags

| $Q5A_{RISC}$     | $_{C}$ Define Assembly Language Syntax <sup>8</sup> and Semantics. |                                  |                 |                                                |
|------------------|--------------------------------------------------------------------|----------------------------------|-----------------|------------------------------------------------|
|                  | Mnemonic                                                           |                                  |                 | Semantics                                      |
| Arithmetic       | ADD*                                                               | А                                | ADD Rs1,0p2,Rd  | $Rd \leftarrow Rs1 + Op2$                      |
|                  |                                                                    | where Op2 is either Rs2 or simm7 |                 |                                                |
|                  | SUB*                                                               | А                                | SUB Rs1,Op2,Rd  | $Rd \leftarrow Rs1 - Op2$                      |
|                  | ADDX*                                                              | А                                | ADDX Rs1,Op2,Rd | $Rd \leftarrow Rs1 + Op2 + C$                  |
|                  | SUBX*                                                              | А                                | SUBX Rs1,Op2,Rd | $Rd \leftarrow Rs1 - Op2 - C$                  |
| Logic            | AND*                                                               | А                                | AND Rs1,Op2,Rd  | $Rd \leftarrow Rs1 \& Op2$                     |
| -                | OR*                                                                | А                                | OR Rs1,Op2,Rd   | $Rd \leftarrow Rs1 + Op2$                      |
|                  | XOR*                                                               | А                                | XOR Rs1,Op2,Rd  | $Rd \leftarrow Rs1 \  \ Op2$                   |
|                  | LSR*                                                               | А                                | LSR Rs1,Rd      | $Rd \leftarrow Rs1 >>1$                        |
| Data Movement    | LD                                                                 | А                                | LD [Rs1+Op2],Rd | $Rd \leftarrow mem(Rs1 + Op2)$                 |
|                  | ST                                                                 | А                                | ST Rd,[Rs1+Op2] | $mem(Rs1 + Op2) \leftarrow Rd$                 |
|                  | SETHI                                                              | В                                | SETHI simm10,Rd | $Rd \leftarrow simm10 << 6$                    |
| Control Transfer | BEQ                                                                | С                                | BEQ simm10      | if $(Z == 1)$ then $PC \leftarrow PC + simm10$ |
|                  | BNE                                                                | С                                | BNE simm10      | if $(Z == 0)$ then $PC \leftarrow PC + simm10$ |
|                  | BCS                                                                | С                                | BCS simm10      | if $(C == 1)$ then $PC \leftarrow PC + simm10$ |
|                  | BCC                                                                | С                                | BCC simm10      | if $(C == 0)$ then $PC \leftarrow PC + simm10$ |
|                  | BAL                                                                | В                                | BAL simm10,Rd   | $Rd \leftarrow PC; PC \leftarrow PC + simm10$  |
|                  | JMPL                                                               | А                                | JMPL Rs1+Op2,Rd | $Rd \leftarrow PC; PC \leftarrow Rs1 + Op2$    |

<sup>8</sup>operand order follows SPARC convention

\*marked commands update the flags (C,Z)

## Basic Processor Design – RISC



Instructions are grouped so that decoding is simple.

Unfortunately this example processor does not meet your specification...

## Basic Processor Design

Consider CISC<sup>9</sup> features that might be included in the design.

- Variable instruction length
  - allows full 16-bit values to be included in your instructions
- Abilitity to act on data in memory
  - gives reduced instruction count
- Registers with specific roles
  - e.g. 8 registers but only 2 may act as address registers for load and store leads to shorter register address fields
- More addressing modes
  - e.g. store with pre-decrement or load with post-increment
- More complex instructions
  - e.g. CALL and RETURN instuctions which store and retrieve the return address on the stack

<sup>&</sup>lt;sup>9</sup>CISC = Complex Instruction Set Computer