Homework 2 Solutions
CS 3411 - Programming Language Concepts - Fall '98
Due: 1:30pm Thursday 5 November (Follow handin instructions on the class Web page.)

Got Questions? Try office hours, email or the class newsgroup.

  1. Develop access formulas for 0-based 3-d matrices stored in 1) row-major and 2) column-major order. In other words, given a three-dimensional array A, give a formula for computing the address of element A[i, j, k]. Assume that the array is stored in memory starting at location BASE and that the array consists of elements of size ESIZE (bytes).

  2. The terms "row-major" and "column-major" are actually insufficient to correctly describe the generalization to 3 (or more) dimensions. With 3 dimensions there are actually (at least) 3 distinct ways of laying out storage but real languages only use generalizations of row-major and column-major. Remember that one way of thinking about these words is "what dimension varies most quickly"? If you scan over the linear representation of the array in memory and have a "speedometer" showing the i, j and k indices, which one is spinning the quickest? With row-major, the column or "rightmost" index varies most quickly. With column-major, the row of "leftmost" index varies most quickly. These are the two generalizations we are interested in. For "row-major" storage of a 3d structure, the rightmost index still varies most quickly but now it is the 3rd dimension (k). 

    Think of a 2x2x2 cube facing you. Element 0,0,0 is in the upper-left corner. If we look at the very next position in the "linearized" stored array, which element do we find? The one to the right? The one below? Or the one behind? For "row-major" the righmost index (k) varies most quickly, so the next one in memory is the one BEHIND the first element. After we have stored all the elements behind 0,0,0 where do we go next? To the right or below? For row-major the k index varies most quickly, the j index varies the next most quickly and the i index stays the same for the longest period of time. That means we stay in the same row, so now we start storing elements to the right of the first element. That is we vary the column index. Since the leftmost index (the row) varies the least quickly, we will store the "top slice" of the 3d array first and then the slice below it. 

    For column-major, the leftmost index varies most quickly, that is, the row, then the column and the k dimension stays the same for the longest. Thus we store the "front" slice of the cube first, then the slice behind it and so forth.

    The third (weird) storage  possibility would be to store the left slice of the cube first and then the slice to its right and so forth but no language that I know of uses this scheme. This corresponds to the j index varying least quickly. We would still have to decide what varies the next most (i or k?).

    Here are the access formulas for "row-major" (rightmost index varies most quickly) and for "column-major" (leftmost index varies most quickly). Assume that there are I elements in the i dimension , J elements in the j dimension and K elements in the k dimension.

    rightmost index varies most quickly ("row-major"):

        BASE 
            + i*K*J*ESIZE     // slices above
            + j*K*ESIZE       // rows to the left
            + k*ESIZE         // elements in front

    leftmost index varies most quickly ("column-major"):

        BASE 
            + k*J*I*ESIZE     // slices in front
            + j*J*ESIZE       // columns to the left
            + i*ESIZE         // elements above


    Both of these can be expressed as a recurrence for the general case. (Whew!!)
     

  3. Consider the following C program:

  4.     int fun( int *i ) {
            *i += 5;
            return 4;
        }
        void main() {
            int x = 3;
            x = x + fun(&x);
        }

    What is the value of x after the assignment statement in main, assuming:

        1) operands are evaluated left to right.

            3 + the result of fun()
           3 + 4
           7

        2) operands are evaluated right to left.

            x + the result of fun
         x + 4 but fun adds 5 to x so x is now 8
         8 + 4
         12
     

  5. Design a scheme to allow record fields to be accessed using indices in addition to field names. Make sure your scheme allows you to uniquely refer to subfields in nested records and that record field "indices" can be computed at run-time using arbitrary expression (like array indices). (Hint: You may need to keep around additional "bookkeeping" information at run-time to make this possible.)

    We just need a special syntax that means "this expression is a field index". Let's just use the familiar square brackets. We will continue to use the field selector (. operator). But now we can say things like:

              r.foo    // give me the field named foo in record r
            r.[3]    // give me the 4th field in record r (assume that fields are 0-based
            r.[x+2]    // calculate the field index 

    This notation nests properly:

            r.foo.[3].[x+2]

    Notice there is no syntactic confusion with the array selector operator ([]) because of the continued use of the dot notation. So we can still have arrays of records containing arrays etc.

            a[x+3].[y+2]

    We will need to keep what your book calls a "dope vector" around at runtime to check for invalid "field indexes" at runtime. The dope vector will simply contain all the valid offsets of each record type. This is analogous to supporting array index "bounds checks" at runtime (like Java). Since arrays can be dynamically allocated in most languages, each individual array object must keep its bounds at runtime. (C does NOT do this.) The same will be true for our record indexing scheme.

  6.  
  7. Describe a notation for referring to arbitrary "slices" of two-dimensional arrays. You may describe a notation used in an existing language or create your own notation. Make sure that your notation allows you to describe "subregions" and not just entire rows and columns.

    Your book describes a notation that Fortran 90 uses. That would be fine. To describe a row or column subregion you need to "fix" one dimension and then provide begin:end indices of the subregion in the other dimension.

        M[ 2, 3:7 ]    // row 2, columns 3-7
        M[ 0:4, 5 ]    // rows 0-4 in column 5

    You can add a "wildcard" like an asterisk to mean "all rows" or "all columns":

        M[ 2, * ]      // all of row 2

    This notation generalizes to allow  rectangular subregions:

        M[ 2:5, 3:7 ] // intersection of rows 2:5 and 3:7 

    So how about those scary "diagonal regions" everyone was worried about? Just have two special values that indicate that a diagonal is desired (these will need to be reserved words) like d1 and d2. Then provide the appropriate begin:end dimensions as before:

        M[ d1, 3:7 ] // upper-left to lower-right, elements 3-7
       M[ d2, 2:6 ] // lower-left to upper-right, elements 2-6

    It is possible to allow negative values to indicate the oppositive order if this is useful or desirable:

        M[ -d1, 7:3 ] // lower-right to upper-left, elements 7, 6, 5, 4, 3

  8.  
  9. Tagged-variants (also known as discriminated unions) attempt to make unions typesafe by introducing a tag field that should always indicate which variant was last assigned. Pascal supports a tag field but it is up to the programmer to ensure that the tag has the correct value. Ada has a mechanism for forcing the tag and variant to always correspond. Devise a syntax that keeps tags and variants consistent. Can your scheme be statically type-checked or does it require dynamic type-checking. Explain.

  10. Here is an example of an Ada variant record:

        type NODE (TAG : BOOLEAN) is
            record
                case TAG is
                    when true => COUNT : INTEGER;
                    when false => SUM : FLOAT;
                end case;
            end record;

    You can think about this problem generally but one "easy" solution would have been to simply lookup Ada's mechanism for doing this and describe it! The basic idea is to not allow a "partial assignment". Whenever you change the tag you must change the variant and vice-versa. Attempts to alter the tag and a variant are always type-checked to make sure that the tag and variant being assigned match. A very simple solution would be to only allows assignments like this:

        MY_NODE.TAG, COUNT := true, 10;
       MY_NODE.TAG, SUM := false, 3.1415;

    "Partial assignments" are not allowed (like this):

        MY_NODE.TAG := true; // not allowed!
        MY_NODE.COUNT := 10; // record is possibly inconsistent in between

    Ada really likes compile-time type checking so it does not allow the tag value to be computed. It must a literal value, so you can't say things like:

       MY_NODE.TAG := FLAG, 3.1415;

    It would be possible to allow "computed tags" but you would need runtime type checking to ensure that the value actually assigned was correct and consistent.

    Ada goes a bit further and requires that you assign ALL FIELDS in the record (even non-variant fields) when you change the tag. Ada has a notation for "literal record values" so the actual notation might look like this (the tag value, which must a literal, comes first):

        MY_NODE := (true, 10);

  11. Some languages (SIMSCRIPT, Algol) have chosen to implement 2-d arrays using a 1-d array of pointers, each of which points to a 1-d array of elements. What are the advantages and disadvantages of this scheme compared to the more standard row-major and column-major representations?

    advantages:
        -- allows "ragged right" structures where different rows have differing number of arguments
        -- easy to manipulate row slices
        -- rows can be deallocated and reallocated independently
        -- easy to "swap rows" without moving data!
        -- easy to alter # of columns dynamically (at runtime)

    disadvantages:
        -- additional ptr indirection for element access
        -- simple references can't have offset computed at compile time (M[3,4])