#### Imperial College London

#### **Lecture 7**

#### **Microarchitecture of a simplified RISC-V**

Peter Cheung Imperial College London

URL: www.ee.imperial.ac.uk/pcheung/teaching/EE2\_CAS/

E-mail: p.cheung@imperial.ac.uk

#### What is microarchitecture?

- Microarchitecture: how to implement an architecture in hardware
- Processor:
  - Datapath: functional blocks
  - Control: control signals

#### **RISC-V State Elements**

- State elements determine everything about a processor:
  - Architectural state:
    - 32 registers
    - Program Counter (PC)
    - Memory



#### **Example Program**

- Design datapath
- View example program executing

| Address    | Instruction    | Type | Fields                                     |                     |                     |                  |                                         |                      | Machine Language |
|------------|----------------|------|--------------------------------------------|---------------------|---------------------|------------------|-----------------------------------------|----------------------|------------------|
| 0x1000 L7: | lw x6, -4(x9)  | I    | <b>imm<sub>11:0</sub></b><br>111111111100  |                     | <b>rs1</b><br>01001 | <b>f3</b><br>010 | <b>rd</b><br>00110                      | <b>op</b><br>0000011 | FFC4A303         |
| 0x1004     | sw x6, 8(x9)   | S    | <b>imm<sub>11:5</sub></b><br>0000000       | <b>rs2</b><br>00110 | <b>rs1</b><br>01001 | <b>f3</b><br>010 | <b>imm<sub>4:0</sub></b><br>01000       | <b>op</b><br>0100011 | 0064A423         |
| 0x1008     | or x4, x5, x6  | R    | <b>funct7</b><br>0000000                   | <b>rs2</b><br>00110 | <b>rs1</b><br>00101 | <b>f3</b><br>110 | <b>rd</b><br>00100                      | <b>op</b><br>0110011 | 0062E233         |
| 0x100C     | beq x4, x4, L7 | B    | <b>imm<sub>12,10:5</sub></b><br>1111111    | <b>rs2</b><br>00100 | <b>rs1</b><br>00100 | <b>f3</b><br>000 | <b>imm<sub>4:1,11</sub></b><br>10101    | <b>op</b><br>1100011 | FE420AE3         |
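The fields in the table can be recovered from each machine word by shifting and masking. The sketch below is only an illustration in Python; the function name `decode_fields` is hypothetical and does not come from the lecture.

```python
def decode_fields(instr: int) -> dict:
    """Slice a 32-bit RISC-V instruction word into its fixed-position fields."""
    return {
        "op":     instr & 0x7F,          # bits 6:0
        "rd":     (instr >> 7) & 0x1F,   # bits 11:7 (imm[4:0] in S-type)
        "funct3": (instr >> 12) & 0x7,   # bits 14:12
        "rs1":    (instr >> 15) & 0x1F,  # bits 19:15
        "rs2":    (instr >> 20) & 0x1F,  # bits 24:20 (part of imm in I-type)
        "funct7": (instr >> 25) & 0x7F,  # bits 31:25 (part of imm in I/S/B-type)
    }


lw_fields = decode_fields(0xFFC4A303)    # lw x6, -4(x9)
or_fields = decode_fields(0x0062E233)    # or x4, x5, x6
print(lw_fields)
print(or_fields)
```

Not every field is meaningful for every instruction type: for `lw`, the `rs2` and `funct7` slices are really the two halves of the 12-bit immediate.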





### **Step 1: Instruction Fetch**





| Address    | Instruction   | Type | Fields                                    |                     |                  |                    |                      | Machine Language |
|------------|---------------|------|-------------------------------------------|---------------------|------------------|--------------------|----------------------|------------------|
| 0x1000 L7: | lw x6, -4(x9) | I    | <b>imm<sub>11:0</sub></b><br>111111111100 | <b>rs1</b><br>01001 | <b>f3</b><br>010 | <b>rd</b><br>00110 | <b>op</b><br>0000011 | FFC4A303         |

### **Step 2: Read Source Operand (rs1)**





| Address    | Instruction   | Type | Fields                                    |                     |                  |                    |                      | Machine Language |
|------------|---------------|------|-------------------------------------------|---------------------|------------------|--------------------|----------------------|------------------|
| 0x1000 L7: | lw x6, -4(x9) | I    | <b>imm<sub>11:0</sub></b><br>111111111100 | <b>rs1</b><br>01001 | <b>f3</b><br>010 | <b>rd</b><br>00110 | <b>op</b><br>0000011 | FFC4A303         |

### **Step 3: Extend the immediate constant**



| Address    | Instruction   | Type | Fields                                    |                     |                  |                    |                      | Machine Language |
|------------|---------------|------|-------------------------------------------|---------------------|------------------|--------------------|----------------------|------------------|
| 0x1000 L7: | lw x6, -4(x9) | I    | <b>imm<sub>11:0</sub></b><br>111111111100 | <b>rs1</b><br>01001 | <b>f3</b><br>010 | <b>rd</b><br>00110 | <b>op</b><br>0000011 | FFC4A303         |

### **Step 4: Calculate memory address**



### **Step 5: Read data from memory and write to register**



| Address    | Instruction   | Type | Fields                                    |                     |                  |                    |                      | Machine Language |
|------------|---------------|------|-------------------------------------------|---------------------|------------------|--------------------|----------------------|------------------|
| 0x1000 L7: | lw x6, -4(x9) | I    | <b>imm<sub>11:0</sub></b><br>111111111100 | <b>rs1</b><br>01001 | <b>f3</b><br>010 | <b>rd</b><br>00110 | <b>op</b><br>0000011 | FFC4A303         |

### **Step 6: Determine address of next instruction**



| Address    | Instruction   | Type | Fields                                    |                     |                  |                    |                      | Machine Language |
|------------|---------------|------|-------------------------------------------|---------------------|------------------|--------------------|----------------------|------------------|
| 0x1000 L7: | lw x6, -4(x9) | I    | <b>imm<sub>11:0</sub></b><br>111111111100 | <b>rs1</b><br>01001 | <b>f3</b><br>010 | <b>rd</b><br>00110 | <b>op</b><br>0000011 | FFC4A303         |
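The six steps for `lw` can be traced in software. The following Python sketch is a behavioural model only, not the hardware design: the function names, the dictionary-based memories and the initial register and memory contents are all assumptions made for illustration.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of `value` as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def execute_lw(pc: int, instr_mem: dict, regs: list, data_mem: dict) -> int:
    """Run one lw instruction and return the address of the next instruction."""
    instr = instr_mem[pc]                      # Step 1: fetch instruction at PC
    src_a = regs[(instr >> 15) & 0x1F]         # Step 2: read rs1 from the register file
    imm_ext = sign_extend(instr >> 20, 12)     # Step 3: sign-extend instr[31:20]
    address = (src_a + imm_ext) & 0xFFFFFFFF   # Step 4: ALU adds base and offset
    regs[(instr >> 7) & 0x1F] = data_mem[address]  # Step 5: read memory, write rd
    return pc + 4                              # Step 6: next instruction is at PC + 4


# Hypothetical initial state: x9 holds 0x2004 and the word at 0x2000 is 10.
regs = [0] * 32
regs[9] = 0x2004
next_pc = execute_lw(0x1000, {0x1000: 0xFFC4A303}, regs, {0x2000: 10})
print(hex(next_pc), regs[6])
```

With these assumed values the effective address is 0x2004 - 4 = 0x2000, so x6 receives 10 and the PC advances to 0x1004.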

### Implementation of the "sw" instruction

- **Immediate:** now in {instr[31:25], instr[11:7]}
- Add control signals: ImmSrc, MemWrite





## Immediate offsets for I-type and S-type are different

| Instruction Format | 31:25                      | 24:20 | 19:15 | 14:12  | 11:7     | 6:0    |
|--------------------|----------------------------|-------|-------|--------|----------|--------|
| Immediate          | imm[11:0] (spans 31:20)    |       | rs1   | funct3 | rd       | opcode |
| Store              | imm[11:5]                  | rs2   | rs1   | funct3 | imm[4:0] | opcode |

| ImmSrc | ImmExt                                       | Instruction Type |
|--------|----------------------------------------------|------------------|
| 0      | {{20{instr[31]}}, instr[31:20]}              | I-Type           |
| 1      | {{20{instr[31]}}, instr[31:25], instr[11:7]} | S-Type           |
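The two rows of the ImmSrc table translate directly into bit manipulation. Below is a minimal Python sketch of the extend unit for these two cases; the function name `extend` is an assumption, and the result is returned as an unsigned 32-bit pattern to mirror the hardware bus.

```python
def extend(instr: int, imm_src: int) -> int:
    """Return the 32-bit ImmExt value selected by ImmSrc (0 = I-type, 1 = S-type)."""
    if imm_src == 0:
        imm12 = (instr >> 20) & 0xFFF                      # instr[31:20]
    elif imm_src == 1:
        upper = (instr >> 25) & 0x7F                       # instr[31:25]
        lower = (instr >> 7) & 0x1F                        # instr[11:7]
        imm12 = (upper << 5) | lower                       # {instr[31:25], instr[11:7]}
    else:
        raise ValueError("only ImmSrc 0 and 1 are modelled here")
    sign_bits = 0xFFFFF000 if (instr >> 31) & 1 else 0     # {20{instr[31]}}
    return sign_bits | imm12


print(hex(extend(0xFFC4A303, 0)))   # lw x6, -4(x9)
print(hex(extend(0x0064A423, 1)))   # sw x6, 8(x9)
```

The first call yields 0xFFFFFFFC (that is, -4) and the second yields 0x8, matching the offsets in the example program.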

### Implementation of the "or" instruction

- Read from rs1 and rs2 (instead of imm)
- Write ALUResult to rd



### Implementation of the "beq" instruction

- Calculate target address: PCTarget = PC + imm
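As a worked check of PCTarget = PC + imm, the Python sketch below rebuilds the B-type immediate from the scattered fields imm<sub>12,10:5</sub> and imm<sub>4:1,11</sub> and adds it to the PC. The function names are hypothetical; the bit positions follow the standard RISC-V B-type layout.

```python
def b_type_imm(instr: int) -> int:
    """Reassemble the signed 13-bit branch offset of a B-type instruction."""
    imm = (
        (((instr >> 31) & 0x1) << 12)     # imm[12]   from instr[31]
        | (((instr >> 7) & 0x1) << 11)    # imm[11]   from instr[7]
        | (((instr >> 25) & 0x3F) << 5)   # imm[10:5] from instr[30:25]
        | (((instr >> 8) & 0xF) << 1)     # imm[4:1]  from instr[11:8]
    )                                     # imm[0] is always 0
    return imm - (1 << 13) if imm & (1 << 12) else imm


def pc_target(pc: int, instr: int) -> int:
    """PCTarget = PC + imm, wrapped to 32 bits."""
    return (pc + b_type_imm(instr)) & 0xFFFFFFFF


print(b_type_imm(0xFE420AE3))             # beq x4, x4, L7
print(hex(pc_target(0x100C, 0xFE420AE3)))
```

For the `beq` at address 0x100C the offset is -12, so the target is 0x1000, the address of label L7.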



# **Adding the Control Unit**



#### Two different views of the Control Unit

# **High-Level View**



#### **Low-Level View**



#### Main decoder



| Instruction | Op      | RegWrite | ImmSrc | ALUSrc | MemWrite | ResultSrc | Branch | ALUOp |
|-------------|---------|----------|--------|--------|----------|-----------|--------|-------|
| lw          | 0000011 | 1        | 00     | 1      | 0        | 1         | 0      | 00    |
| sw          | 0100011 | 0        | 01     | 1      | 1        | x         | 0      | 00    |
| R-type      | 0110011 | 1        | xx     | 0      | 0        | 0         | 0      | 10    |
| beq         | 1100011 | 0        | 10     | 0      | 0        | x         | 1      | 01    |

#### **ALU Decoder**



| ALUOp | funct3 | {op<sub>5</sub>, funct7<sub>5</sub>} | ALUControl          | Instruction |
|-------|--------|--------------------------------------|---------------------|-------------|
| 00    | x      | x                                    | 000 (add)           | lw, sw      |
| 01    | x      | x                                    | 001 (subtract)      | beq         |
| 10    | 000    | 00, 01, 10                           | 000 (add)           | add         |
| 10    | 000    | 11                                   | 001 (subtract)      | sub         |
| 10    | 010    | x                                    | 101 (set less than) | slt         |
| 10    | 110    | x                                    | 011 (or)            | or          |
| 10    | 111    | x                                    | 010 (and)           | and         |
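The ALU decoder table can be expressed as a small function. The Python sketch below is one possible rendering; the function and parameter names are hypothetical, and only the rows listed in the table are modelled.

```python
def alu_decoder(alu_op: int, funct3: int = 0, op5: int = 0, funct7_5: int = 0) -> int:
    """Return the 3-bit ALUControl value for the rows of the ALU decoder table."""
    if alu_op == 0b00:
        return 0b000                      # add (lw, sw address calculation)
    if alu_op == 0b01:
        return 0b001                      # subtract (beq comparison)
    if alu_op == 0b10:
        if funct3 == 0b000:
            # subtract only when {op5, funct7_5} == 11, otherwise add
            return 0b001 if (op5 == 1 and funct7_5 == 1) else 0b000
        if funct3 == 0b010:
            return 0b101                  # set less than
        if funct3 == 0b110:
            return 0b011                  # or
        if funct3 == 0b111:
            return 0b010                  # and
    raise ValueError("combination not covered by the table")


# and x5, x6, x7 is R-type: ALUOp = 10, funct3 = 111
print(f"{alu_decoder(0b10, funct3=0b111, op5=1, funct7_5=0):03b}")
```

For `and x5, x6, x7` the main decoder supplies ALUOp = 10 and the instruction supplies funct3 = 111, giving ALUControl = 010.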

#### Example - Control for and x5, x6, x7

| op | Instruction | RegWrite | ImmSrc | ALUSrc | MemWrite | ResultSrc | Branch | ALUOp |
|----|-------------|----------|--------|--------|----------|-----------|--------|-------|
| 51 | R-type      | 1        | xx     | 0      | 0        | 0         | 0      | 10    |



# Lab 4 – A Very Basic RISC-V CPU

Start working as a Team – 2 pairs allocated by me

#### Lab objectives:

1. To get to know your teammates.
2. To establish a GitHub repo for your team to which everyone contributes.
3. To learn about TWO RISC-V instructions in great detail.
4. To design a simple CPU that executes these two instructions.
5. To execute a short program using only these two instructions. The program implements the binary counter from Lab 1, but in software.

- Stretch goal: to implement a third instruction that accesses data memory. With this, implement the sinewave generator in software.

#### Lab 4 – Program to execute

```
main:
    addi    t1, zero, 0xff      # load t1 with 255
    addi    a0, zero, 0x0       # a0 is used for output
mloop:
    addi    a1, zero, 0x0       # a1 is the counter, init to 0
iloop:
    addi    a0, a1, 0           # load a0 with a1
    addi    a1, a1, 1           # increment a1
    bne     a1, t1, iloop       # if a1 != 255, branch to iloop
    bne     t1, zero, mloop     # else always branch to mloop
```
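To see what the program puts on a0, its control flow can be mirrored in Python. This is a rough behavioural sketch, not a CPU model: the function name `run_counter` and the `max_outputs` cut-off are assumptions added so that the otherwise endless loop terminates.

```python
def run_counter(max_outputs: int) -> list:
    """Mirror the Lab 4 program and record each value written to a0."""
    outputs = []
    t1 = 0xFF                      # addi t1, zero, 0xff
    a0 = 0                         # addi a0, zero, 0x0
    while True:                    # mloop: bne t1, zero, mloop is always taken
        a1 = 0                     # addi a1, zero, 0x0
        while True:                # iloop
            a0 = a1                # addi a0, a1, 0
            outputs.append(a0)
            if len(outputs) == max_outputs:
                return outputs
            a1 += 1                # addi a1, a1, 1
            if a1 == t1:           # bne a1, t1, iloop falls through at 255
                break


values = run_counter(257)
print(values[:4], values[253:257])
```

Under this model a0 counts from 0 up to 254 and then restarts at 0, because the inner loop exits as soon as a1 reaches 255, before that value is copied to a0.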

#### Online RISC-V Assembler:

https://riscvasm.lucasteske.dev

| Hex Dump |
|----------|
| 0ff00313 |
| 00000513 |
| 00000593 |
| 00058513 |
| 00158593 |
| fe659ce3 |
| fe0318e3 |
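The hex dump can be cross-checked by encoding the seven instructions by hand. The Python sketch below packs I-type and B-type fields in the standard RISC-V layout; the helper names `encode_i` and `encode_b` are hypothetical, and the branch offsets (-8 and -16 bytes) assume the program starts at address 0.

```python
ZERO, T1, A0, A1 = 0, 6, 10, 11          # ABI names mapped to x0, x6, x10, x11
OP_IMM, OP_BRANCH = 0b0010011, 0b1100011
F3_ADDI, F3_BNE = 0b000, 0b001


def encode_i(imm: int, rs1: int, funct3: int, rd: int, op: int) -> int:
    """Pack an I-type instruction: imm[11:0] | rs1 | funct3 | rd | op."""
    return ((imm & 0xFFF) << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | op


def encode_b(offset: int, rs2: int, rs1: int, funct3: int, op: int) -> int:
    """Pack a B-type instruction from a signed byte offset."""
    imm = offset & 0x1FFF
    return (
        (((imm >> 12) & 0x1) << 31) | (((imm >> 5) & 0x3F) << 25)
        | (rs2 << 20) | (rs1 << 15) | (funct3 << 12)
        | (((imm >> 1) & 0xF) << 8) | (((imm >> 11) & 0x1) << 7) | op
    )


program = [
    encode_i(0xFF, ZERO, F3_ADDI, T1, OP_IMM),    # addi t1, zero, 0xff
    encode_i(0, ZERO, F3_ADDI, A0, OP_IMM),       # addi a0, zero, 0x0
    encode_i(0, ZERO, F3_ADDI, A1, OP_IMM),       # addi a1, zero, 0x0
    encode_i(0, A1, F3_ADDI, A0, OP_IMM),         # addi a0, a1, 0
    encode_i(1, A1, F3_ADDI, A1, OP_IMM),         # addi a1, a1, 1
    encode_b(-8, T1, A1, F3_BNE, OP_BRANCH),      # bne a1, t1, iloop (0x14 -> 0x0c)
    encode_b(-16, ZERO, T1, F3_BNE, OP_BRANCH),   # bne t1, zero, mloop (0x18 -> 0x08)
]
hex_words = [f"{word:08x}" for word in program]
print("\n".join(hex_words))
```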

#### Lab 4 – Pseudoinstructions are easier to read

```
main:
    addi    t1, zero, 0xff
    addi    a0, zero, 0x0
mloop:
    addi    a1, zero, 0x0
iloop:
    addi    a0, a1, 0
    addi    a1, a1, 1
    bne     a1, t1, iloop
    bne     t1, zero, mloop
```

```
0000000000000000 <main>:
   0:   0ff00313    li      t1,255
   4:   00000513    li      a0,0

0000000000000008 <mloop>:
   8:   00000593    li      a1,0

000000000000000c <iloop>:
   c:   00058513    mv      a0,a1
  10:   00158593    addi    a1,a1,1
  14:   fe659ce3    bne     a1,t1,c <iloop>
  18:   fe0318e3    bnez    t1,8 <mloop>
```
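The disassembler prints `li`, `mv` and `bnez` where the source used `addi` and `bne`, because these pseudoinstructions are shorthand for the same machine words. The Python sketch below shows that mapping for the three cases seen here; `expand_pseudo` is a hypothetical helper, and the `li` expansion shown only holds when the constant fits in a signed 12-bit immediate.

```python
def expand_pseudo(line: str) -> str:
    """Rewrite li, mv and bnez as the base instruction they stand for."""
    parts = line.split(None, 1)
    mnemonic = parts[0]
    args = [a.strip() for a in parts[1].split(",")] if len(parts) > 1 else []
    if mnemonic == "li":                 # assumes the constant fits in 12 bits
        rd, imm = args
        return f"addi {rd}, zero, {imm}"
    if mnemonic == "mv":
        rd, rs = args
        return f"addi {rd}, {rs}, 0"
    if mnemonic == "bnez":
        rs, target = args
        return f"bne {rs}, zero, {target}"
    return line.strip()                  # anything else is left unchanged


for text in ["li t1,255", "mv a0,a1", "bnez t1,mloop"]:
    print(f"{text:16s} -> {expand_pseudo(text)}")
```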

## Lab 4 – Overall block diagram

