;
; 29-Apr-98
;
; RCS: $Id$
;
Lecture notes for Wednesday, 29-Apr-98

In this lecture, I discussed carry-lookahead adders and then
multipliers. Chapter 4 in the book has a great discussion of these
topics, so the details here are sketchy. The discussion of the
multipliers in the book takes the tack of building the sequential
multiplier.

1. carry lookahead

   a. motivation: computing all the bits in one big two-level unit
      gives O(1) delay but takes far too much logic, while our
      one-bit-at-a-time circuit has O(n) delay and will be too slow
      for a big adder. If each full-adder cell is implemented as a
      sum-of-products in the conventional way (2 layers of logic),
      then the propagation delay of a 32-bit adder will be 64 times
      the delay of an individual AND or OR gate (a "layer" of logic).
      This one-bit-at-a-time circuit is called a "ripple-carry" adder,
      because the carry information propagates from the LSB to the
      MSB like a ripple across water.

   b. The carry-lookahead approach takes advantage of the fact that
      the equation for the carry through a stage can be decomposed
      into "propagate" and "generate" terms, P and G:

          C_(i+1) = G_i + P_i*C_i

      For one full-adder cell, G = AB and P = A+B.

   c. More usefully, though, P and G can be defined for larger chunks
      of the circuit, like 4-bit adder units. Given P & G for four
      4-bit adder units, a second circuit can take those 8 signals
      (along with the carry-in to the whole adder, C_0) and compute
      the individual carry-in signals to the 4-bit units in constant
      time. This approach doesn't accelerate the C_0 or even the C_4
      input, but it does help the C_8 and C_12 inputs, which would
      otherwise have to ripple through all the lower-order adders.

   d. Finally (& best), the P & G trick can be applied hierarchically
      in a tree, reducing the delay of the whole circuit to O(log(n))
      time.

   e. The lookahead approach is used in real processors, although one
      can play low-level transistor circuit tricks to speed up the
      ripple-carry chain considerably.

2. arithmetic overflow

   a.
      We use a fixed number of bits for arithmetic. We want some
      indication of when we've exceeded the representable range. For
      instance, if you add two positive 2s-complement numbers and the
      result is too big, the answer comes out *negative*.

3. combinational multiplier

   a. similar initial problem to the adder: we have this unit that
      takes two 32-bit numbers as input and gives a 64-bit output.
      Computing this function in one big ROM is infeasible.

   b. attack the problem by analyzing its structure: draw out a
      multiplication and look at what operation is common to every
      bit. Like a VLSI designer, we are happiest if we can define a
      1-bit cell carefully, then replicate it as mindlessly as
      possible...

   c. The 1-bit cell is an AND gate (a 1-bit multiplier!) and a full
      adder.

   d. The replication pattern can take advantage of the fact that we
      only really add 32 bits at a time, even though the result is
      going to be 64 bits.

   e. still, at the end we have 32x32=1024 cells, which is a lot.

4. sequential multiplier

   a. "fold" the combinational multiplier: use a single 32-bit adder
      in 32 steps, with shift registers to move the multiplier,
      multiplicand and product around appropriately. The book
      develops this approach in a really nice way.

   b. You could go further and use a _1_ bit adder in 1024 steps.
      This is the approach your calculator uses.

   c. Conventional processors use various approaches. One approach is
      to build several layers of the full-combinational multiplier
      and then reuse them. E.g. build 8x32 cells and use them four
      times.
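
The propagate/generate scheme from section 1 can be sketched in
software. This is only a bit-level model, not a circuit description,
and it assumes a 16-bit adder built from four 4-bit units as in 1c. The
names bit_pg, group_pg and lookahead_add are made up for this
illustration. The loop over blocks evaluates C_(k+1) = G_k + P_k*C_k
one block at a time; in hardware that recurrence would be expanded into
two-level logic so that all the block carries appear in constant time.

```python
def bit_pg(a, b, width):
    """Per-bit generate and propagate lists, LSB first: G = A AND B, P = A OR B."""
    g = [(a >> i) & (b >> i) & 1 for i in range(width)]
    p = [((a >> i) | (b >> i)) & 1 for i in range(width)]
    return g, p


def group_pg(g, p):
    """Fold the per-bit (g, p) signals of one block into a block-level (G, P)."""
    G, P = 0, 1
    for gi, pi in zip(g, p):
        G = gi | (pi & G)  # generated at this bit, or generated lower and propagated
        P = P & pi         # the block propagates only if every bit propagates
    return G, P


def lookahead_add(a, b, c0=0, width=16, block=4):
    """Add two width-bit values; return (sum mod 2**width, carry out)."""
    g, p = bit_pg(a, b, width)
    blocks = [group_pg(g[i:i + block], p[i:i + block])
              for i in range(0, width, block)]

    # Carry into each block, from block-level G and P only.
    carries = [c0]
    for G, P in blocks:
        carries.append(G | (P & carries[-1]))

    # Inside a block the carry still ripples, but only across `block` bits.
    total = 0
    for k in range(len(blocks)):
        c = carries[k]
        for i in range(k * block, (k + 1) * block):
            ai, bi = (a >> i) & 1, (b >> i) & 1
            total |= (ai ^ bi ^ c) << i
            c = g[i] | (p[i] & c)
    return total, carries[-1]
```

For example, lookahead_add(0xFFFF, 1) returns (0, 1): the low block
generates a carry and the three higher blocks each propagate it.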
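
The overflow case in section 2 can be checked with a rule that is
standard for 2s-complement addition, though it is not spelled out in
these notes: overflow has occurred exactly when both operands have the
same sign and the result has the opposite sign. A minimal sketch,
assuming 32-bit operands by default; add_signed is a hypothetical
helper name.

```python
def add_signed(a, b, width=32):
    """Add two signed width-bit integers; return (wrapped result, overflow flag)."""
    mask = (1 << width) - 1
    sign = 1 << (width - 1)
    raw = (a + b) & mask  # keep only the low `width` bits, as the hardware would

    # The sign bit of (a ^ raw) is set when a and the result differ in sign;
    # overflow needs that to be true for both operands.
    overflow = ((a ^ raw) & (b ^ raw) & sign) != 0

    # Reinterpret the bit pattern as a signed value.
    result = raw - (1 << width) if raw & sign else raw
    return result, overflow
```

With the default width, add_signed(0x7FFFFFFF, 1) returns
(-2147483648, True): two positive inputs, a negative answer, and the
flag set.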
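
The folded multiplier of 4a can be modeled as follows. This is a sketch
of one possible register arrangement, not necessarily the one developed
in the book: it assumes unsigned operands, a product register whose
upper half starts at zero and whose lower half starts out holding the
multiplier, and one add plus one right shift per step. The name
sequential_multiply is made up for this illustration.

```python
def sequential_multiply(multiplicand, multiplier, width=32):
    """Unsigned multiply using one width-bit add per step, for `width` steps."""
    mask = (1 << width) - 1
    multiplicand &= mask
    hi = 0                  # upper half of the product register
    lo = multiplier & mask  # lower half; multiplier bits shift out as product bits shift in

    for _ in range(width):
        carry = 0
        if lo & 1:
            # Current multiplier bit is 1: add the multiplicand into the upper half.
            s = hi + multiplicand
            hi, carry = s & mask, s >> width
        # Shift the combined {carry, hi, lo} register right by one bit.
        lo = (lo >> 1) | ((hi & 1) << (width - 1))
        hi = (hi >> 1) | (carry << (width - 1))

    return (hi << width) | lo
```

With width=4, sequential_multiply(15, 15, width=4) returns 225, which
needs all 8 bits of the double-width result.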