;-*- Text -*-
; 16-May-98
;
; RCS: $Id$
;
Lecture notes for CS3760, Wednesday, 13-May-98

Started on pipelines.  Many details are summarized in the note:

  http://www.cc.gatech.edu/~kenmac/3760/notes/pipe.ps

----------------

Story:

1. Recognize the difference between latency (how long one job takes
   from start to finish) and throughput (how many jobs finish per unit
   time).  It's not clear which one matters until you think about the
   application.

2. Here's an application: multiply a number by 15%.  Ordinarily a
   latency-critical operation, since the waiter is looking over your
   shoulder when you do it, but...

3. You can improve throughput through parallelism if you have
   multiple, independent operations to perform.  Parallel execution
   can take a number of forms.  One is out-and-out duplication.  Show
   a circuit with registers, N boxes, a mux and a state machine.

4. However, if your operation has multiple *internal* operations, you
   can apply the "pipelining" or assembly-line trick: start the first
   unit on the next job before the subsequent units have finished the
   first job.

5. For circuits, the trick is this: add registers between units.  You
   can then clock the whole circuit faster, because the clock period
   only has to cover the slowest stage between two banks of registers
   rather than the whole chain of units.  It takes a little applied
   creativity to pick an optimal pipeline, since not every possible
   register placement pays off.

6. Pad registers: you have to make sure that intermediate results
   stay synchronized, even if there's no operation on a particular
   path.

   Pipeline analysis rule:

   1. Check that you have the same number of registers along every
      path from input to output.

7. Pipeline synthesis rules:

   0. Always start with a bank of registers on the output and none on
      the input.

   1. First rule: it's safe to "duplicate" any bank of registers.

   2. Second rule: it's safe to "slide" a bank across a unit by
      removing a register from every output and adding a register to
      every input (or vice versa).

   If you're cool, write a quick proof that pipelines synthesized
   this way satisfy the analysis rule.  I didn't put this on the
   homework (it's too obvious).
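
The analysis and synthesis rules above can be tried out on a toy model.
What follows is only a sketch under stated assumptions, not anything
from the lecture: the Circuit class, its method names, and the example
units A and B are all hypothetical.  A circuit is modeled as an acyclic
graph of units whose wires each carry a register count.  The sketch
only duplicates the bank at the output, which is a simplification: a
bank in the sense of the notes spans every wire of a cut, and the
output is the one cut this model can identify without extra
bookkeeping.

```python
class Circuit:
    """Toy model: units are nodes, each wire (src, dst) holds a register count."""

    def __init__(self, wires, source="in", sink="out"):
        self.source, self.sink = source, sink
        # Synthesis rule 0: one bank of registers on the output, none elsewhere.
        self.regs = {(s, d): (1 if d == sink else 0) for s, d in wires}

    def inputs_of(self, unit):
        return [w for w in self.regs if w[1] == unit]

    def outputs_of(self, unit):
        return [w for w in self.regs if w[0] == unit]

    def duplicate_output_bank(self):
        """Synthesis rule 1, restricted to the output bank (a simplification)."""
        for wire in self.inputs_of(self.sink):
            self.regs[wire] += 1

    def slide_to_inputs(self, unit):
        """Synthesis rule 2: move one register from every output of `unit`
        to every input of `unit`."""
        ins, outs = self.inputs_of(unit), self.outputs_of(unit)
        if not ins or not outs:
            raise ValueError("can only slide across an internal unit")
        if any(self.regs[w] == 0 for w in outs):
            raise ValueError("every output of the unit needs a register to give up")
        for w in outs:
            self.regs[w] -= 1
        for w in ins:
            self.regs[w] += 1

    def path_register_counts(self):
        """Register totals over all source-to-sink paths (assumes no cycles)."""
        totals = set()

        def walk(unit, count):
            if unit == self.sink:
                totals.add(count)
                return
            for wire in self.outputs_of(unit):
                walk(wire[1], count + self.regs[wire])

        walk(self.source, 0)
        return totals

    def is_balanced(self):
        """Analysis rule: every path carries the same number of registers."""
        return len(self.path_register_counts()) == 1


def build_example():
    # Hypothetical circuit: unit B takes A's result and also the raw input.
    c = Circuit([("in", "A"), ("A", "B"), ("in", "B"), ("B", "out")])
    c.duplicate_output_bank()   # two banks on the wire B -> out
    c.slide_to_inputs("B")      # one bank moves to both inputs of B
    return c


c = build_example()
print(c.regs)
print(c.path_register_counts(), c.is_balanced())
```

In this example the slide leaves one register on the wire from the
input straight to B.  No unit sits on that wire, so this is the pad
register of item 6.  Setting that count back to zero by hand gives
paths with one and two registers, and is_balanced() reports the
mismatch.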