CS2200 Intro to Systems and Networks
Project 1
Spring 2003

This project has you finish a simulator for a single-bus implementation of the LC-2200 computer. You'll finish some elements of the hardware, design the states of the controller FSM (you could call this microcode), and then explore how to extend the machine for I/O.

The rest of this document consists of an introduction to the simulator skeleton code and then two problems:
  1. Introduction (nothing to turn in)
  2. Single-bus simulator.
  3. Cycles-per-instruction
Useful information:

Problem 1: Introduction

There is nothing to turn in for this problem

The goal of this project is to implement the Little Computer (LC-2200) using a single-bus methodology. In this project you will model the action of the LC-2200's circuits on a cycle-by-cycle basis. Modules of the simulator (written in C) will have an identifiable, one-to-one correspondence with elements on the circuit diagram. Modeling a circuit in C this way may seem artificial but is actually reasonably common in the computer design world. Consider the architect's options when trying out a new design: The rest of this introduction discusses the single-bus circuit to be simulated, the coding conventions to follow to assure that the C code you write correctly represents the physical structure of the circuit, an overview to the (extensive) skeleton simulator code provided and to the FSM ROM contents not provided.


Single-Bus Circuit

The two figures below show the single-bus implementation of the LC-2200 to be used for this project.


Coding Conventions

As mentioned above, the main problem with writing a custom simulator in a full-blown language like C is that it is all too easy to overuse the power of C and end up embedding computations beyond the physical capabilities of the circuit you are trying to simulate. In particular, there are three main pitfalls:
  1. The simulator must obey the physical limits imposed by the circuit. Obviously, the ALU can only compute what the circuit diagram says it can compute (addition, subtraction, NAND). More subtly, data transfers are only possible between functional units that have wires between them; the ALU can't perform addition with the PC as an operand. Also, exactly one bus action is possible in any clock cycle.
  2. Combinational logic must be purely combinational, i.e. it cannot store any internal state across clock cycles.
  3. Sequential components, i.e. the registers on the bus, the register file and memory, must be handled carefully so that they are updated exactly once per clock cycle.
We attack the second and third problems by adhering to a coding convention that separates the combinational logic from the sequential logic. This convention mirrors the standard circuit design convention of carefully sequestering circuit state into well-understood primitives (edge-triggered registers) and forbidding feedback loops in combinational logic.

The conventions we demand divide the source code into three section, a combinational logic section, a sequential logic section, and an "other" section (debug printfs, etc). Of these, the only section you need to modify for this project is the combinational logic section.
  1. Combinational logic is represented using zero-parameter procedures that compute the value at a particular set of wires in circuit based on the current state (and calls to other such procedures). These procedures have no internal state. For instance, the procedure below represents the ALU and the input wiring to the ALU. The return value of the procedure is exactly the value on the 32-bit bus at the output of the ALU. The procedure has no parameters, but takes logical inputs from other procedures a(), b(), and alufunc() representing the value on wires coming from registers A and B and a field from the output of the control FSM. Since the functions representing combinational logic have no state themselves and produce values that are based on register state that doesn't change during a clock cycle, they can be safely called in any order during a clock cycle and will always return the same results. Note that this programming style may generate a lot of redundant procedure calls during the execution of the program, but ignore that effect (consider it the compiler's fault...)
    static int alu(void)
    {
      switch(alufunc())
        {
        case 0: return(a() + b());		/* ADD */
        case 1: return(~(a() & b()));	/* NAND */
        case 2: return(a() - b());		/* SUB */
        case 3: return(a() + 1);		/* INC */
        default: assert(0);
        }
    }
    
  2. Sequential logic is simulated by having the simulator mirror the internals of a register which is built from two latches. Rather than go into too much gory explanation, We've just given you the simulation code for all the registers and memory. Read the comments surrounding the curstate structure and the procedure doclock() in sim-onebus.c
  3. Utility routines to initialize state, print state, etc. This code is under no restrictions. DO NOT MODIFY THE print_state() CODE -- our grading methodology depends on it!

Skeleton Code and FSM ROM

This project comes with extensive skeleton code embodying the coding conventions above. Read through the comments in sim-onebus.c before adding anything. There are two major pieces missing: (1) most of the combinational logic blocks and (2) the contents of the FSM's ROM (the microcode).

The missing combinational logic blocks are easier. The code as provided essentially represents the circuit in the two figures below. Compare that to the completed circuit in the two figures above. mission in the first part of Problem 2 is to construct the missing pieces.


The second major missing piece is the bulk of the ROM contents in the FSM. Since the FSM implements the microcode for the machine, filling in the ROM will require the bulk of your design time and debugging effort.

The FSM is built with an 8-bit register and one big ROM like this:
              -------------
              |           |----- halt
           8  |           |----- LdPC
   state --/--|           |----- LdA
           4  |           |----- LdB
  opcode --/--|           |----- LdMAR
              |           |----- LdIR
       z -----|           |----- LdZ
              |   FSM     |----- WrREG
              |   ROM     |----- WrMEM
              |           |----- DrPC
              |           |----- DrALU
              |           |----- DrREG
              |           |----- DrMEM
              |           |----- DrOFF
              |           |  2
              |           |--/-- alufunc
              |           |  2
              |           |--/-- rsel
              |           |  8
              |           |--/-- nextstate
              |           |
              -------------
Note that despite being built with one big ROM, Mealy-machine style, we insist that the FSM follow Moore machine conventions: the outputs are a function of the current state only: The outputs of the FSM control the datapath. Here's more detail about the meaning of the output bits:
     25: halt                halt signal (to the simulator!)
     24: LdPC                load PC register from the bus
     23: LdA                 load ALU A input
     22: LdB                 load ALU B input
     21: LdMAR               load memory address register
     20: LdIR                load instruction register
     19: LdZ                 load zero bit
     18: WrREG               write register file
     17: WrMEM               write memory
     16: DrPC                drive the PC value onto the bus
     15: DrALU               drive the ALU
     14: DrREG               drive the register file output
     13: DrMEM               drive the memory output
     12: DrOFF               drive the sign-extended IR offset field
  11-10: alufunc             0b00 = ADD, 0b01 = NAND, 0b10 = SUB, 0b11 = INC
    9-8: rsel                0b00 = RA, 0b01 = RB, 0b10 = RDST
    7-0: nextstate           these can be a function of inputs also
You can think of a ROM as a block of combinational logic or as a block of memory. We could specify the contents of the ROM as a block of memory but it would be really hard to read! Instead, we specify it symbolically like this:
    case FETCH4: return(FSM_LDPC  | FSM_DRALU  | FSM_INC        | dispatch());

Problem 2: Single-Bus Circuit Simulator

In one sentence: design the microcode for the circuit, finish writing the simulator and then test it.

A. [20 points] Design a sequences of states for the FSM in the LC-2200 control logic such that it implements the LC-2200 instruction set. Draw and turn in a detailed state transition diagram for your FSM. The FSM must be a Moore type (outputs depend only on the current state).

Hint: We used about 30 states. You can draw the whole state transition diagram on one sheet of paper.

A. [60 points] Read and understand the skeleton code, then design and implement procedures to finish the simulator. The simulator skeleton, sim-onebus.c already includes code needed to simulate the registers in the datapath and all the logic in the control unit. That leaves you with two main pieces of work as described in the introduction: (1) design and implement the missing combinational logic in the data path and (2) finish implementing the FSM by filling in your microcode from part A. Turn in the source code to your finished simulator, sim-onebus.c.

B. [0 points] Test your simulator, e.g. with the test.s provided and your fib.s solution from Homework 1. Convince yourself that it works correctly. A solution version of fib.s will be posted under Homework 1 on the course web page after the Homework 1 due date. Nothing to turn in for this question.

There are obviously many correct ways of designing the control FSM for this datapath. Since you want the project to work, I recommending designing the FSM (at least initially) so that it is as clear and understandable as possible. However, since different FSMs give different performance, it's impossible to resist declaring the following CONTEST:

C. [0 points] What is the execution time (in cycles) of your simulator running your fib.s from Homework 1? You may improve your execution time by modifying the control FSM, but don't modify the datapath. In other words, we should be able to cut-and-paste your microcode into our simulator and have our simulator work correctly. Nothing to turn in for this question (we'll just run your simulator to find out).


Problem 3: Cycles per Instruction (CPI)

This problem has you investigate the Cycles per Instruction (CPI) achieved by your LC-2200 implementation.

A. [5 points] Make a table listing the CPI achieved for each LC-2200 instruction (except HALT) in your implementation. Include the time to fetch the instruction. For BEQ, assume the branch is taken 50% of the time.

B. [5 points] Compute the average CPI for your LC-2200 implementation executing the programs gcc and spice by using the instruction frequencies listed in the table below (adapted from Figure 3.38 of Patterson and Hennessy). Ignore for the moment that the LC-2200 doesn't have enough memory (or instructions) to execute either gcc or spice! Again, for BEQ, assume the branch is taken 50% of the time.

Instruction class LC-2200 examples gcc spice
Arithmetic add, nand, addi 48% 50%
Data transfer lw, sw 33% 41%
Conditional branch beq 17% 8%
Jump jalr 2% 1%


C. [5 points] The cycles used for instruction fetch are added to every single instruction. What additional hardware would be needed to overlap instruction fetch with instruction execution? Describe the additional hardware components needed and where they would be wired into the existing LC-2200 datapath.

D. [5 points] One of the most conspicuously missing instructions in the LC-2200 is any way to test an inequality. What hardware changes would it take to implement the MIPS SLT (set-if-less-than) instruction? Describe the additional hardware components needed and where they would be wired into the existing LC-2200 datapath.


End of CS2200 Spring 2003 Project 1