;
; 16-Apr-98
;
; RCS: $Id$
;
; Lecture notes for Wednesday, 15-Apr-98

This lecture was the second of two reviewing assembly language and
machine code.  I talked about procedures in what turned out to be a
pretty confusing way.  However, again, the book does a really nice job
of explaining this (Sections 3.6 and A.6).

1. Procedures are a key abstraction for structuring programs.  If you
   step back for a moment, it may seem odd that they're still required
   in machine language, since so many abstractions (data types, control
   structures) can be compiled away by the compiler.  In fact the
   compiler can get rid of a lot of procedures as well, but:

   a. They're still useful for "compressing" code that's reused.

   b. Not all the code you need to interoperate with is available to
      the compiler (e.g. libraries).

   The key to the procedure abstraction is the "calling convention".
   This is the set of rules for calling and returning, passing
   arguments, and using local variables within a procedure.

   The definition of the procedure abstraction (or the "calling
   convention") is really a software issue (defined by the
   programmer/compiler), but the ability to interoperate is so crucial
   that the calling convention is ordinarily specified by the operating
   system and often even suggested by the processor designers.

   In fact, pre-RISC processors conventionally included extensive
   hardware support for procedures: at least support for pushing and
   popping return addresses on a stack, and often much more (read
   about the VAX or, better yet, the Intel i432).  RISCs have scaled
   things back to a minimum.

2. When you want to define a calling convention, you need to define
   three things:

   a. control transfer (e.g. call/return)
   b. data transfer (arguments and return values)
   c. procedure-local data (automatic variables, etc.)

   a. Control transfer.  There are two things.  When you call a
      procedure, you have to pass it a return address.  When you
      return from a procedure, you have to jump to the return address.
      Of these, the second is the more important restriction when
      designing an instruction set:

      i.  you _must_ have a jump-to-variable capability, i.e. a jump
          to an address held in a register, e.g. JR.

      ii. it's _nice_ to have a jump-and-link so that it only takes
          one instruction to record the return address and jump.

      MIPS uses JAL to call a procedure.  JAL writes the return
      address into register 31 ("$ra").

   b. Data transfer.  You have to pass arguments and return values
      somehow.  Registers are fast, so that's the obvious place.  MIPS
      defines four argument registers and two result registers.  It's
      a little unusual to have the result registers separate from the
      argument registers.  If you have more arguments or more results
      (e.g. you're passing around big structures), then these things
      must be passed in memory.

   c. Procedure-local data.  Now we get to the interesting part.
      Obviously, procedure-local data should go into the registers
      because the registers are fast.  But then you have to figure out
      how to "reuse" the registers when you call a procedure, since
      there's only one set of registers!  MIPS designates 18 registers
      (T0-T9 and S0-S7) to be used for local variables/temporaries
      within a procedure.

          int foo(int a)
          {
              int x = 25;       /* want to put in R7 */
              return(a + x);
          }

          int blah()
          {
              int x = 37;       /* want to put in R7 */
              int y;            /* want to put in R8 */
              y = foo(x);
              return(x + y);
          }

      For instance, in the code above, both blah() and foo() have
      local variables.  Compiled independently, the compiler may well
      put the local variable x in the same register (e.g. R7) in both
      blah() and foo().  When blah() calls foo(), blah's copy of x
      must be saved for the duration of the call.  There are two
      questions: (i.) where in memory are such temporary items saved
      and (ii.) which procedure, blah (the "caller") or foo (the
      "callee"), performs the save/restore.

      i.  Where.  As you undoubtedly know, the standard data structure
          for procedure-local state is a stack.  The reason for this
          is that (for most languages) the call graph of a running
          program is always a tree: calls nest, so the most recently
          called procedure is always the first to return.

          Stacks may be implemented in innumerable ways.  The MIPS
          convention is to use a big block of memory and hold a
          pointer to the top of the stack in a register.  Even that
          description is ambiguous because there are at least four
          ways to do that:

            -- sp points to the last element, stack grows up
            -- sp points to the last element, stack grows down
            -- sp points to the next empty slot, stack grows up
            -- sp points to the next empty slot, stack grows down

          MIPS uses the second strategy.  So "push" and "pop"
          primitives look like this (assume x is a 32-bit integer):

            push(x)    SP <- SP - 4
                       MEMORY[SP] <- x

            pop(x)     x <- MEMORY[SP]
                       SP <- SP + 4

          MIPS stacks grow down because of the Unix convention of
          placing code/data/heap at the bottom of user memory (heap
          growing up) and the stack at the top of user memory (growing
          down).

          As a hint for your calling convention in Project 1, we don't
          have a heap, so it's useful to place code/data/stack at the
          bottom of memory and have the stack grow up.  Define your
          convention carefully.

          Typically, one doesn't use pop() and push() operations one
          at a time.  Instead, a procedure allocates a chunk of space
          for itself on the stack on entry and then deallocates it on
          exit.  Any data private to the procedure may be written to
          this chunk of space during the execution of the procedure by
          using the SP as a pointer.  See the example at the end of
          this note.

      ii. Which.  Now we have private space for each invocation of
          each procedure.  The remaining question is who saves the
          registers: the procedure doing the calling (the "caller") or
          the procedure called (the "callee").

          Some things have to be saved by the caller.  For instance,
          the return address register is written by the JAL
          instruction, so before blah() can execute the JAL to call
          foo(), blah must save RA (the return address for whoever
          called blah()) so that blah's RA isn't destroyed by the JAL.

          Other registers can be saved/restored by either blah() or
          foo().

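The push/pop primitives and the allocate-a-chunk usage described under "Where" can be sketched in C roughly as follows.  This is only an illustration of the "sp points to the last element, stack grows down" strategy, not anything MIPS-specific: the names MEMORY, SP, STACK_TOP, frame_alloc, frame_free, frame_store and frame_load are made up for this sketch, and memory is simulated with an ordinary array.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_TOP 256u            /* hypothetical: byte address just past the stack area */

int32_t  MEMORY[STACK_TOP / 4];   /* simulated memory, one int32_t per 4-byte word */
uint32_t SP = STACK_TOP;          /* empty stack: nothing has been pushed yet */

/* push(x): SP <- SP - 4, then MEMORY[SP] <- x */
void push(int32_t x)
{
    assert(SP >= 4);              /* otherwise the simulated stack is full */
    SP -= 4;
    MEMORY[SP / 4] = x;
}

/* pop(): x <- MEMORY[SP], then SP <- SP + 4 */
int32_t pop(void)
{
    assert(SP < STACK_TOP);       /* otherwise the simulated stack is empty */
    int32_t x = MEMORY[SP / 4];
    SP += 4;
    return x;
}

/* Frame-style use: reserve `words` slots in one step on entry... */
void frame_alloc(uint32_t words)
{
    assert(SP >= 4 * words);
    SP -= 4 * words;
}

/* ...and give them all back in one step on exit. */
void frame_free(uint32_t words)
{
    assert(SP + 4 * words <= STACK_TOP);
    SP += 4 * words;
}

/* Access a slot in the current frame by byte offset from SP,
   in the spirit of "sw reg, offset($sp)" and "lw reg, offset($sp)". */
void frame_store(uint32_t offset, int32_t x) { MEMORY[(SP + offset) / 4] = x; }
int32_t frame_load(uint32_t offset)          { return MEMORY[(SP + offset) / 4]; }
```

With this sketch, push(37) followed by push(25) leaves SP at STACK_TOP - 8, and the two pops return 25 and then 37.  A procedure that needs three private words would call frame_alloc(3) on entry, use offsets 0, 4 and 8, and call frame_free(3) on exit.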
          MIPS splits its registers roughly half and half (the "S"
          registers are callee-saved, the "T" registers and arguments
          are caller-saved).  I know of compilers that treat
          everything as caller-saved.

3. To wrap up, to define a calling convention, define the following
   things:

   a. define how the registers are to be used (args/returns & who
      saves the rest)

   b. define the stack convention (e.g. up/down)

   c. (slightly redundant) explain the mechanics:

      i.   what the caller does at call time
      ii.  what the callee does on entry
      iii. what the callee does on exit
      iv.  what the caller does after return

4. By way of example, here's an incredibly conservative way to compile
   blah() using the MIPS register convention (forgive me if I mess up
   the assembly conventions a little bit).  It's incredibly
   conservative in that it saves and restores every single register at
   the appropriate times.  A compiler would obviously optimize away
   90% of this code:

                                #  int blah()
                                #  {
                                #       on entry: allocate
                                #       stack & save callee-saved
                                #       stuff.
blah:   addi $sp, $sp, -92      # allocate private space (4 * 23 bytes):
                                #       1 return address
                                #     + 4 caller-saved A0-A3
                                #     + 10 caller-saved T0-T9
                                #     + 8 callee-saved S0-S7
        sw   $s0,  0($sp)       # save callee-saved
        sw   $s1,  4($sp)       # (all unnecessary...)
        sw   $s2,  8($sp)
        sw   $s3, 12($sp)
        sw   $s4, 16($sp)
        sw   $s5, 20($sp)
        sw   $s6, 24($sp)
        sw   $s7, 28($sp)
                                #       int x = 37;
                                #       int y;
                                #
        li   $t0, 37            # put 37 into $t0
                                # (picked $t0 arbitrarily)

                                #       y = foo(x);
                                #       at call: save caller-saved
                                #       stuff, then call.
        sw   $ra, 32($sp)       # save ra (necessary)
        sw   $a0, 36($sp)       # save args (unnecessary)
        sw   $a1, 40($sp)
        sw   $a2, 44($sp)
        sw   $a3, 48($sp)
        sw   $t0, 52($sp)       # save ts (only $t0, which holds x,
        sw   $t1, 56($sp)       # is necessary)
        sw   $t2, 60($sp)
        sw   $t3, 64($sp)
        sw   $t4, 68($sp)
        sw   $t5, 72($sp)
        sw   $t6, 76($sp)
        sw   $t7, 80($sp)
        sw   $t8, 84($sp)
        sw   $t9, 88($sp)
        move $a0, $t0           # write args
        jal  foo                # do the call
                                #
                                #       after return: restore
                                #       caller-saved, collect
                                #       the result from v0/v1
        lw   $ra, 32($sp)       # restore ra (necessary)
        lw   $a0, 36($sp)       # restore args (unnecessary)
        lw   $a1, 40($sp)
        lw   $a2, 44($sp)
        lw   $a3, 48($sp)
        lw   $t0, 52($sp)       # restore ts (only $t0, which holds x,
        lw   $t1, 56($sp)       # is necessary)
        lw   $t2, 60($sp)
        lw   $t3, 64($sp)
        lw   $t4, 68($sp)
        lw   $t5, 72($sp)
        lw   $t6, 76($sp)
        lw   $t7, 80($sp)
        lw   $t8, 84($sp)
        lw   $t9, 88($sp)
        move $t1, $v0           # recover the result (y)

                                #       return(x + y)
                                #       at exit: leave the result
                                #       in $v0, restore the
                                #       callee-saved stuff,
                                #       deallocate stack
                                #       and return
        add  $v0, $t0, $t1      # compute result into v0
        lw   $s0,  0($sp)       # restore callee-saved
        lw   $s1,  4($sp)       # (all unnecessary...)
        lw   $s2,  8($sp)
        lw   $s3, 12($sp)
        lw   $s4, 16($sp)
        lw   $s5, 20($sp)
        lw   $s6, 24($sp)
        lw   $s7, 28($sp)
        addi $sp, $sp, 92       # deallocate stack space (4 * 23 bytes)
        jr   $ra                # return using JR
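
To give a feel for how little of the conservative sequence is actually required, here is a small C simulation that keeps only the two saves marked necessary: $ra, and $t0 (which holds x across the call).  It is a sketch under assumptions, not compiler output: the register file and stack are plain arrays, and the names reg, mem, sw, lw, run_blah, RET_TO_CALLER and RET_INTO_BLAH are hypothetical.  The two RET_ constants are arbitrary stand-ins for return addresses, and the simulated foo() deliberately reuses $t0 for its own x to show why blah() must save it.

```c
#include <assert.h>
#include <stdint.h>

/* Only the registers this sketch touches. */
enum { R_A0, R_V0, R_T0, R_T1, R_RA, R_SP, NUM_REGS };

#define SIM_STACK_BYTES 64        /* hypothetical size of the simulated stack */
#define RET_TO_CALLER   0x1000    /* stand-in for "address in blah's caller" */
#define RET_INTO_BLAH   0x2000    /* stand-in for "address after the jal in blah" */

int32_t reg[NUM_REGS];            /* simulated register file */
int32_t mem[SIM_STACK_BYTES / 4]; /* simulated stack memory, one entry per word */

/* sw r, off($sp) and lw r, off($sp) */
void sw(int r, int32_t off) { mem[(reg[R_SP] + off) / 4] = reg[r]; }
void lw(int r, int32_t off) { reg[r] = mem[(reg[R_SP] + off) / 4]; }

/* foo: a leaf procedure, so it needs no frame; it is free to clobber
   caller-saved registers such as $t0. */
void foo(void)
{
    reg[R_T0] = 25;                      /* int x = 25; reuses $t0 */
    reg[R_V0] = reg[R_A0] + reg[R_T0];   /* return(a + x) */
}

void blah(void)
{
    reg[R_SP] -= 8;                      /* on entry: frame of 2 words */
    reg[R_T0] = 37;                      /* int x = 37 */

    sw(R_RA, 0);                         /* at call: save ra (jal overwrites it) */
    sw(R_T0, 4);                         /* save x, which is live across the call */
    reg[R_A0] = reg[R_T0];               /* write args */
    reg[R_RA] = RET_INTO_BLAH;           /* what jal does to $ra */
    foo();

    lw(R_RA, 0);                         /* after return: restore ra */
    lw(R_T0, 4);                         /* restore x */
    reg[R_T1] = reg[R_V0];               /* y = result of foo */

    reg[R_V0] = reg[R_T0] + reg[R_T1];   /* at exit: return(x + y) in $v0 */
    reg[R_SP] += 8;                      /* deallocate the frame */
}

/* Plays the role of blah's caller and checks that sp and ra survive. */
int32_t run_blah(void)
{
    reg[R_SP] = SIM_STACK_BYTES;
    reg[R_RA] = RET_TO_CALLER;
    blah();
    assert(reg[R_SP] == SIM_STACK_BYTES);
    assert(reg[R_RA] == RET_TO_CALLER);
    return reg[R_V0];
}
```

Calling run_blah() returns 37 + (37 + 25) = 99.  Removing the sw/lw pair for R_T0 makes the result 25 + 62 = 87 instead, and removing the pair for R_RA trips the second assertion, which is the simulated version of returning to the wrong place.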