;
; 16-Apr-98
;
; RCS: $Id$
; Lecture notes for Monday, 13-Apr-98

This lecture was the first of two reviewing assembly language and
machine code.  Since you know MIPS pretty well, the second point (aside
from review) was to talk about _why_ assembly language features are the
way they are and what the alternatives might be.

First, the discussion of MIPS in chapter 3 and the first part of
chapter 4 is great.  Even if you think you know this stuff cold it's
well worth reading.

1. My main observation about assembly languages is that, grossly,
   they're all the same...  Once you've seen one you should be able to
   pick up any other.  They're the same because they have the same goal
   and the same mechanisms:

   -- Everyone has the same processor + memory model.

   -- You start by taking a source program, blowing away all the pretty
      control structure and reducing it to a sequence of simple
      operations.

   -- The simple operations have to be things that can be done by
      hardware (e.g. by the single-bus datapath we looked at), so
      they're things like "add", "subtract", "shift".

   -- The operations have to be strung together somehow.  Since source
      languages are (mostly) sequential, processors (mostly) assume
      that operations are sequential.  In other words, you encode flow
      of control by encoding only the exceptions to sequential order
      (the GOTOs).

   So, when presented with a new assembly language you should ask three
   things:

   a. What set of operations do I get?
   b. How do I get my program variables to be operands of these
      operations?
   c. How exactly do the "gotos" work (particularly the conditional
      ones)?

2. Operations.  As suggested, these are all pretty similar, e.g.

      arithmetic:  add/sub/mult/div
      logic:       and/or/not/xor
      bit shifts:  sll/sra/srl

   On a MIPS, all instructions are fixed-length at 32 bits.  They
   devote 6 of those bits to the "opcode" that encodes the basic
   function of the instruction.

3. Operands.  This is the most fun part because there are lots of
   possibilities.
   I can think of two dimensions in classifying how machines pass
   operands to operations.  One is the number of operands specified
   explicitly (3, 2, 1, 0).  The other is how much visible "structure"
   there is to the storage system (e.g. do you have registers?).

   3-operand in memory:              3-operand in registers:
     ++ very flexible                  e.g. MIPS
     -- very big                       RISC conclusion: registers
     ex: VAX                           w/completely separate lw/sw
                                       operations is the sweet spot.

   2-operand in memory:              2-operand in registers:
     One operand is both the dst and one src;
     less flexible, less big

   1-operand in memory:              1-operand + registers:
     An implicit accumulator is the dst and one src.  This one was
     very common early on and is still common in low-cost
     implementations.

   0-operand:
     All operands implicit on a stack.  Easy to compile to, but not
     mainstream in hardware at this time.

   There are historical examples of all types.  The judgement of
   history is that providing 3-operand instructions operating on
   registers and keeping load/store instructions completely separate
   is the "sweet spot", at least for high performance.  The ultimate
   reason is that instructions can be easily identified as
   "independent" and thus safe to execute in parallel.  Think of a
   two-instruction sequence:

        add R1, R2, R3
        add R4, R5, R6

   Neither instruction reads or writes a register the other one uses,
   so the instructions can be executed in parallel.  More about that
   before the end of the quarter...

   Another aside: compilers often use a 3-operand-in-memory language as
   their intermediate format.  It turns out to be easy to convert that
   into 3-operands-in-registers at the last stage of compilation.  If
   you have more variables than registers, then you pick some variables
   to "spill" to memory.

   The MIPS gives two formats for basic arithmetic instructions, the
   "R" format and the "I" format:

        R:   op    rs    rt    rd    shamt  fcode
   (bits)    (6)   (5)   (5)   (5)   (5)    (6)

        I:   op    rs    rt    -- immediate --
   (bits)    (6)   (5)   (5)        (16)

   The R format takes three registers.  The I format allows some
   instructions to take an immediate (signed 16-bit number) operand
   instead of one of the source registers.
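
   To make the R and I field layouts above concrete, here is a minimal
   Python sketch that packs the fields into a 32-bit word.  The helper
   names (encode_r, encode_i) are made up for illustration, and the
   opcode and function-code values used in the examples (0 and 0x20
   for add, 8 for addi) come from the standard MIPS encoding rather
   than from these notes.

```python
def _check(name, value, width):
    """Reject a field value that does not fit in `width` unsigned bits."""
    if not 0 <= value < (1 << width):
        raise ValueError(f"{name}={value} does not fit in {width} bits")


def encode_r(op, rs, rt, rd, shamt, fcode):
    """Pack R-format fields: op(6) rs(5) rt(5) rd(5) shamt(5) fcode(6)."""
    for name, value, width in (("op", op, 6), ("rs", rs, 5), ("rt", rt, 5),
                               ("rd", rd, 5), ("shamt", shamt, 5),
                               ("fcode", fcode, 6)):
        _check(name, value, width)
    return ((op << 26) | (rs << 21) | (rt << 16)
            | (rd << 11) | (shamt << 6) | fcode)


def encode_i(op, rs, rt, imm):
    """Pack I-format fields: op(6) rs(5) rt(5) immediate(16, signed)."""
    _check("op", op, 6)
    _check("rs", rs, 5)
    _check("rt", rt, 5)
    if not -(1 << 15) <= imm < (1 << 15):
        raise ValueError(f"imm={imm} does not fit in 16 signed bits")
    # Two's-complement: keep only the low 16 bits of the immediate.
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


# add  $3, $1, $2   (rd=3, rs=1, rt=2; op=0, fcode=0x20 assumed from MIPS)
add_word = encode_r(op=0, rs=1, rt=2, rd=3, shamt=0, fcode=0x20)
# addi $2, $1, -1   (rt=2, rs=1; op=8 assumed from MIPS)
addi_word = encode_i(op=8, rs=1, rt=2, imm=-1)

print(f"add : 0x{add_word:08X}")
print(f"addi: 0x{addi_word:08X}")
```

   Note that in the I format the rt field names the destination, since
   there is no rd field to hold it.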
   As mentioned, load and store are separate.  In MIPS, they use the I
   format, where the immediate value serves as a signed offset from an
   address held in a register.

4. The GOTOs.  "Branches" are conditional, "jumps" are unconditional
   (at least in MIPS terminology).  These instructions change the value
   in the program counter (PC).  There are two things to ask about
   gotos:

   a. How are conditions communicated to the conditional branches?
   b. How do you specify the target (the value to be put into the PC)?

   a. There are several popular options for conditional branches.

      i.  The branch instructions themselves perform a compare
          operation, so testing and branching takes one instruction.
          You then have a large family of branch instructions, e.g.
          BEQ, BLT, BLE, etc.  This is what the MIPS does.  (Strictly,
          the hardware branches that compare two registers are BEQ and
          BNE; the rest compare one register against zero, and the
          assembler builds BLT and friends out of SLT plus a branch.)

      ii. The compare and branch instructions are separate, so it takes
          two instructions.  Then there's the issue of how the result
          of the compare is communicated to the branch.  Two more
          options:

          x. The result of the compare is written to a general
             register.  Typically you have a large family of compare
             instructions and only one or two branch instructions,
             e.g. BTRUE, BFALSE.  This is what the Alpha does.

          y. The results of the compare are encoded into a special set
             of bits called condition codes in a condition code
             register (CCR) that's separate from the register file.
             Branch instructions then check the condition codes.  This
             option was very popular (e.g. x86) but is limiting in the
             same way that special-purpose registers are limiting --
             it introduces unnecessary dependencies in code.

   b. The target can be specified in several ways.  Usually there's
      more than one: at least one immediate and one from a register.
      MIPS has three:

      -- Branch instructions use the I format (two registers to be
         compared plus a 16-bit offset).  The offset is interpreted as
         signed, multiplied by 4 (since all instructions are four
         bytes) and added to the PC (strictly, to the address of the
         instruction after the branch).  Branches thus have a range of
         +2^15-1 to -2^15 instructions (+2^17-4 to -2^17 bytes).
      -- There's a jump-register instruction (JR) that uses the R
         format.

      -- Finally, there's a plain jump instruction (J) that takes a
         26-bit number in a special J (as in jump) format:

              J:   op    -------- immediate -------
         (bits)    (6)              (26)

         This puppy is pretty weird.  What it does is multiply the
         immediate value by 4 (as you'd expect) and then _replace_ the
         bottom 28 bits of the PC with the result.  The jump thus lets
         you jump anywhere within a 256MB (2^28) "segment" of address
         space, but _not_ out of it (since the top 4 bits of the PC
         remain unchanged).  This is weird behavior but not a practical
         drawback.  User code sits in the bottom of memory and is not
         likely to grow beyond 256MB.

   Finally, there are variants of JR and J that do a jump-and-link,
   i.e. they copy the PC of the instruction following the jump
   instruction into a register (implicitly R31).  These instructions
   (JALR, JAL) are useful for calling procedures.  That's the topic
   for Wednesday.
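
   The branch and jump target rules described above can be sketched in
   a few lines of Python.  This is an illustration under stated
   assumptions, not a model of any particular implementation: the
   function names are made up, and `pc` stands for whatever PC value
   the hardware uses when it forms the target (on a real MIPS, the
   address of the instruction after the branch or jump).

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(imm16):
    """Interpret the low 16 bits of imm16 as a signed two's-complement value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def branch_target(pc, imm16):
    """Branch: sign-extend the 16-bit offset, multiply by 4, add to the PC."""
    return (pc + (sign_extend16(imm16) << 2)) & MASK32


def jump_target(pc, imm26):
    """J: multiply the 26-bit immediate by 4 and replace the bottom 28 bits
    of the PC with it; the top 4 bits of the PC are left unchanged."""
    return (pc & 0xF0000000) | ((imm26 & 0x03FFFFFF) << 2)


pc = 0x00400010                                  # hypothetical PC value
print(hex(branch_target(pc, 3)))                 # three instructions forward
print(hex(branch_target(pc, 0xFFFF)))            # offset -1: one instruction back
print(hex(jump_target(0x50000008, 0x03FFFFFF)))  # stays inside segment 0x5
```

   The last call shows why J cannot leave its 256MB segment: however
   large the immediate, the top hex digit of the result is always the
   top hex digit of the PC.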