;
; 20-Apr-98
;
; RCS: $Id$
;

Lecture notes for Friday, 17-Apr-98

The point of this lecture was to introduce the first implementation of
a circuit capable of simulating the LC instruction set.  This is cool
because it is the first time we bridge the gap between assembly
language and circuits.  We've brought together water & earth and will
spend the rest of the term happily wallowing in the mud thus created.
Or something like that.

1. Last Friday we talked about performing computation with circuits.
   The computation was a simple equation.  There were two circuit
   implementations:

   a. All combinational logic.  Buy combinational-logic versions of
      the functional units required (e.g. adders and multipliers) and
      wire them up as indicated by the equation.

   b. Single-bus approach.  This approach has three steps:

      i.   Buy exactly one each of the combinational-logic versions of
           the functional units required.  I.e. even if the equation
           has a dozen multiply operations in it, you buy only one
           multiplier.

      ii.  Wire 'em up with one bus, using tri-state buffers and
           registers.

      iii. Perform the computation as a sequence of steps, using an
           FSM to provide the control signals in the correct sequence.

   The single-bus approach is one example of a structured way of
   attacking a circuit problem.  We'll eventually dive into others
   (e.g. with multiple buses), but the idea of splitting the circuit
   into a "datapath" part and a "control" part will persist.

2. The single-bus circuit is more flexible than the pure-combinational
   circuit because you can potentially support a variety of equations
   by just reprogramming the ROMs in the FSM, without changing any
   wiring.

   Ideally, we want to support any computable function, and we want
   the machine to be programmable as well.  If the computation needs
   storage space for temporary variables or whatever, we can add an
   arbitrarily large RAM to the single-bus circuit to expand the range
   of things that are computable.

   The next step is to make the computation running on the circuit be
   an "interpreter".
   An interpreter implements an abstraction by reading the
   "instructions" of the abstraction and executing them in some other,
   presumably more primitive, setting.  For us that means we'll be
   reading machine instructions stored in memory...

   The state-transition diagram for the FSM will be the usual loop for
   an interpreter:

      fetch an instruction.
      execute an instruction.
      repeat.

   That's all there is to it!  After this lecture you can run out and
   build your own computer.  However, we'll spend the rest of the term
   learning how to make it "better" (which usually means "faster").

3. We'll build a computer using the LC instruction set from Project 1
   and the single-bus methodology.  The LC is pretty simple: it has
   only 8 instructions (add, nand, lw, sw, beq, jalr, noop, halt).
   Still, it's a complete machine that you can write real programs in.

   A single-bus circuit for the LC is given in this diagram:

      http://www.cc.gatech.edu/~kenmac/3760/notes/lc-onebus.ps

   True to the single-bus methodology, the circuit includes one of
   each item needed to interpret the guts of each LC instruction.  For
   instance, an "add" instruction will need to extract two registers
   from a register file, add them up and put the result back.  So we'd
   better at least have an adder and a register file.

   -- An adder, a NAND unit and a subtractor (the subtractor is for
      the BEQ instruction).  These three are encapsulated in one ALU
      block with a two-bit "func" code used to select which one to
      use.  In effect, the ALU has inside it three units followed by a
      4-to-1 multiplexor with "func" wired to the select inputs.

   -- A register file with the 16 32-bit registers of the LC.

   -- A memory unit with 4G (2^32) words of RAM.  The address used for
      memory comes from a special register, the MAR ("Memory Address
      Register"), loadable from the bus.

   A couple of new items need more explanation.

   -- The PC register by itself is intended to maintain the program
      counter (address of the next instruction).
   -- The NOR gate (32 inputs, one output) and Z register are used
      for testing equality for the BEQ instruction (i.e. compute A-B
      with the ALU and check whether the result is zero with the big
      NOR gate).

   Finally, there's one key item that enables interpretation by tying
   the datapath back to the control FSM.

   -- The IR ("Instruction Register") will be loaded with the current
      instruction as we go through the process of interpreting the
      program.  An LC instruction has only 20 bits.  Assume that the
      IR register is connected to the lower 20 bits of the (32-bit)
      bus.  The various 4-bit fields of the instruction are broken out
      into named items on the circuit diagram (OP, RA, RB, RD) in a
      way that matches the definition of the instruction format, e.g.:

         R-type instructions (add, nand):
            bits 19-16: opcode
            bits 15-12: reg A
            bits 11-8:  reg B
            bits 7-4:   unused (should be all 0s)
            bits 3-0:   reg DST

   The control part of the circuit is an FSM, as usual, with one
   twist.  The control circuit takes inputs from the datapath and
   provides all the control signals to the datapath as outputs.

   The one twist is that we take advantage of the (convenient)
   instruction layout so that the whole IR doesn't need to be passed
   as inputs to the FSM.  Since (it will turn out that) the register
   file address will always be one of the three instruction fields
   (RA/RB/RD) in the IR, we can select one of those three fields as
   the address to the register file using a mux.  In other words,
   instead of adding 12 inputs and 4 outputs to the FSM for RA/RB/RD
   and "raddr", we can get away with just adding the mux and two
   outputs ("rsel") to the FSM.

4. The state transition diagram for the FSM will be in the form of a
   loop that reads an instruction, interprets it and then goes back to
   read the next instruction.  More specifically, we'll read an
   instruction, perform an n-way branch based on the OP field of the
   machine code for the instruction, then, in each of the n branches,
   perform actions specific to the instruction.
   Like this:

            read instruction <-------------\
                   |                       |
                   |                       |
              branch on OP                 |
            / /    |    \ \    ...         |
           / /     |     \ \   ...         |
          /  |     |     |  \  ...         |
        code code code code code ...       |
        for  for  for  for  for  ...       |
        add  nand lw   sw   beq  ...       |
          \  |     |     |  /              |
           \ \     |    / /                |
            \ \    |   / /                 |
              increment PC                 |
                   \                       |
                    -----------------------/

   Here's a fragment of a state transition table for the FSM.  The
   state transition table is essentially a merging of the truth tables
   for the Next State ROM and the Output ROM in the FSM.  The OP field
   and Z bit are inputs to the FSM and are given in binary.  I don't
   know how big the State/Next State fields are going to be, so I'll
   write them as decimal numbers.  The outputs of the FSM are given as
   a symbolic bus action, as usual (it should be obvious how to
   convert these to individual LdXXX/DrXXX signals).

       OP   Z  Current | Bus                Next
      3210  0   State  | Action             State
      ----------------------------------------------
      XXXX  X     0    | MAR <- PC            1
      XXXX  X     1    | IR <- RAM            2    ! i.e. read memory
      0000  X     2    | -                   17    ! ADD
      0001  X     2    |                      ?    ! NAND
      0010  X     2    |                      ?    ! LW
      0011  X     2    |                      ?    ! SW
      0100  X     2    |                      ?    ! BEQ
      0101  X     2    |                      ?    ! JALR
      0110  X     2    |                      ?    ! HALT
      0111  X     2    | -                   57    ! NOOP
      1000  X     2    |                      ?    !
      1001  X     2    |                      ?    !
      1010  X     2    |                      ?    !
      1011  X     2    |                      ?    !
      1100  X     2    |                      ?    !
      1101  X     2    |                      ?    !
      1110  X     2    |                      ?    !
      1111  X     2    |                      ?    !
      [...]
      XXXX  X    17    | A <- REG[RA]        18    ! do right thing
      XXXX  X    18    | B <- REG[RB]        19    ! for ADD
      XXXX  X    19    | REG[RD] <- A+B      57    ! (i.e. ALU does +)
      [...]
      XXXX  X    57    | A <- PC             58    ! point to the
      XXXX  X    58    | B <- 1              59    ! next instruction
      XXXX  X    59    | PC <- A+B            0

   Oops, I think the 32-bit constant "1", with its tri-state buffer
   enabled by "DrOne", is missing from the diagram...

   The table fragment covers the path for the ADD instruction only
   (well, also NOOP, which is trivial).  To complete the machine, we
   have to fill in the paths for the rest of the instructions.  We'll
   come back to this circuit in a subsequent lecture and also (in gory
   detail) in Project 2.
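To make the fetch/branch/execute loop concrete, here is a minimal Python sketch of the state table fragment, covering only ADD and NOOP.  It is a software model under stated assumptions, not the circuit itself: the names Datapath, encode_rtype, fields, step and run are made up for illustration, memory is modeled as a sparse dictionary, and the ADD result is assumed to go to the register named by the DST field (RD) of the R-type format.  The state numbers (0, 1, 2, 17-19, 57-59) and opcodes follow the table.

```python
ADD, NOOP = 0b0000, 0b0111   # opcodes as listed in the table fragment
MASK32 = 0xFFFFFFFF          # the bus and the registers are 32 bits wide
MASK20 = 0xFFFFF             # the IR sees only the lower 20 bits of the bus


def encode_rtype(op, ra, rb, rd):
    """Pack an R-type instruction: OP 19-16, RA 15-12, RB 11-8, RD 3-0."""
    return (op << 16) | (ra << 12) | (rb << 8) | rd


def fields(ir):
    """Split the 20-bit IR into its (OP, RA, RB, RD) fields."""
    return (ir >> 16) & 0xF, (ir >> 12) & 0xF, (ir >> 8) & 0xF, ir & 0xF


class Datapath:
    """Hypothetical container for the registers hanging off the bus."""

    def __init__(self, program):
        self.mem = dict(enumerate(program))  # sparse RAM: address -> word
        self.reg = [0] * 16                  # register file
        self.pc = self.mar = self.ir = self.a = self.b = 0


def step(dp, state):
    """Perform the bus action for `state` and return the next state."""
    op, ra, rb, rd = fields(dp.ir)
    if state == 0:                           # MAR <- PC
        dp.mar = dp.pc
        return 1
    if state == 1:                           # IR <- RAM
        dp.ir = dp.mem.get(dp.mar, 0) & MASK20
        return 2
    if state == 2:                           # n-way branch on OP
        if op == ADD:
            return 17
        if op == NOOP:
            return 57
        raise NotImplementedError(f"opcode {op:04b} not filled in yet")
    if state == 17:                          # A <- REG[RA]
        dp.a = dp.reg[ra]
        return 18
    if state == 18:                          # B <- REG[RB]
        dp.b = dp.reg[rb]
        return 19
    if state == 19:                          # REG[RD] <- A+B (assumed DST)
        dp.reg[rd] = (dp.a + dp.b) & MASK32
        return 57
    if state == 57:                          # A <- PC
        dp.a = dp.pc
        return 58
    if state == 58:                          # B <- 1 (the "DrOne" constant)
        dp.b = 1
        return 59
    if state == 59:                          # PC <- A+B
        dp.pc = (dp.a + dp.b) & MASK32
        return 0
    raise ValueError(f"unknown state {state}")


def run(dp, instructions):
    """Interpret `instructions` instructions; return the clock ticks used."""
    state, ticks = 0, 0
    while instructions > 0:
        state = step(dp, state)
        ticks += 1
        if state == 0:                       # back at fetch: one more done
            instructions -= 1
    return ticks
```

For example, loading an ADD of registers 1 and 2 into register 3 followed by a NOOP, with registers 1 and 2 preset to 5 and 7, leaves 12 in register 3 and 2 in the PC after 15 ticks (9 states for the ADD, 6 for the NOOP).  The tick count is a handy reminder of why the rest of the term is about making the machine faster.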