Homework Assignment 4

CS1311-Scheme Spring 2001
Due 8:00 AM, Friday March 2, 2001

Objectives:

Story:
As you may remember from last time, our ship has been invaded by a large group of monkeys. Thankfully, James the Monkey Tamer has trained this group into a well-oiled monkey machine. They swab the decks, batten down the hatches and even cook a darn good five course meal. Everybody is happy: Captain Kurt is relaxing since he can count on the monkeys to take care of the ship, Madame Monica is plotting a way to get them onto Iron Chef, and the rest of the crew is having fun playing shuffleboard and drinking 1311X beverage. In fact, everything is going so well that Captain Kurt decides to give the crew shore leave on a tropical desert island. While the crew is delighted, the monkeys are offended. They think that they should be the ones getting the break since they're the ones doing all of the work. To make a long story short, the monkeys and the captain had an all-night bargaining session, and it was decided that the monkeys would man (monkey?) the ship while the crew went ashore, but they would do so only if the crew brought back fresh coconuts from the island. The captain agreed and the next day everybody went ashore. However, since coconuts grow on trees, the captain needs to make sure that everyone is confident working with trees.

Restrictions:
For this homework, you may use only non-side effecting functions (ie, you may not use let, hash-tables, display, etc... If you don't know what a side effect is, you're probably safe). Also, you must solve the problems with the algorithm we specify. No 'shortcuts' will be tolerated. (If you're thinking of using "flatten," sorry! You'll have to traverse the data structure the "right way").

We're going to be using the following format to represent data in trees:

((node-addr (node-data
             (left-child-addr right-child-addr)))
 (node-addr (node-data                         
            (left-child-addr right-child-addr)))
 ...
 (node-addr (node-data
            (left-child-addr right-child-addr))))
So, the tree:
             CS1311X
               ^
              / \
             /   \
            /     \
      pirates    antelope
        ^            ^
       / \          / \
      /   \        /   \
     /     \      /     \
   NULL   NULL  NULL    NULL
could be represented as:
with 2001 as the head node.
((2002 (pirates
        (0 0)))
 (2001 (CS1311X
        (2002 2003)))
 (2003 (antelope
        (0 0))))

Problem 1
5 points

Write an accessor function for these nodes called dereference. It will take in exactly two arguments: an address and the tree list. It should return the list that contains the address. For example:

> (dereference 1003 '((1002 (simon (1003 0)))
                      (1001 (alvin (1002 0)))
                      (1003 (theodore (0 0)))))
returns:
(1003 (theodore (0 0)))

Problem 2
5 points

Write another accessor function called get-node which will take in tan address and the tree list and return the node from the tree. For example:

> (get-node 1003 '((1002 (simon (1003 0)))
                   (1001 (alvin (1002 0 0)))
                   (1003 (theodore (0 0)))))
returns:
(theodore (0 0))

Problem 3
5 points

Write another accessor function called get-data which will take in an address and the tree list and return the portion of the node at that address that contains the node's data. For example:

> (get-data 1003 '((1002 (simon (1003 0)))
                   (1001 (alvin (1002 0)))
                   (1003 (theodore (0 0)))))
returns:
theodore

Problem 4
5 points

Write another accessor function called get-connections which will take in an address and the tree list and return the portion of the node at that address that contains the addresses of the node's children. For example:

> (get-connections 1001 '((1002 (simon (1003 0)))
                  	  (1001 (alvin (1002 0)))
	                  (1003 (theodore (0 0)))))
             
returns:
(1002 0)

Problem 5
5 points

Write two more accessor functions called get-left and get-right which will take in an address and the tree list and will return the left or right child of the node at that address. For example:

> (get-left 1001 '((1002 (simon (1003 0)))
                   (1001 (alvin (1002 0)))
                   (1003 (theodore (0 0)))))
             
returns:
1002

and
> (get-right 1001 '((1002 (simon (1003 0)))
                    (1001 (alvin (1002 0)))
                    (1003 (theodore (0 0)))))
returns:
0

For the next few examples let's use the following tree:

		A
       	       / \
             /     \
           B         C
          / \       / \
         /  NULL   /   \
        D         E     F
       / \       / \   / \
      / NULL   / NULL /  NULL
   NULL      NULL   NULL
Which could be represented as:
((1003 (C (1005 1006)))
 (1001 (A (1002 1003)))
 (1002 (B (1004 0)))
 (1005 (E (0 0)))
 (1004 (D (0 0)))
 (1006 (F (0 0))))
with 1001 as the head address.
For clarity, and to keep things from getting too cluttered, this tree will be represented as TREE in examples.

Problem 6
15 points

Given a node in a tree, sometimes it's useful to know how far down into a tree the node is. Write a function that will return the depth of a node in a tree. Return false if the node is not in the tree. Call your function find-depth. It will take in three parameters: the address of the head node, the tree to search, and the node to get the depth of. We'll say that the root has a depth of zero.

> (find-depth 1001 TREE 'A)
0
> (find-depth 1001 TREE 'B)
1
> (find-depth 1001 TREE 'E)
2
> (find-depth 1001 TREE 'barn)
#f

Problem 7
15 points

Sometimes it's useful to know how far from the 'bottom' of a tree a node is. This is usually referred to as a node's 'height'. We'll say that a leaf node has a height of one, and that the height of a non-leaf node is the maximum of the height of a node's left and right children plus one. Leaf nodes are nodes which have zero children. Write a function called get-height which will take in the head address of a tree, the tree and a node and will return the height of the node.

> (get-height 1001 TREE 'E)
1
> (get-height 1001 TREE 'B)
2
> (get-height 1001 TREE 'A)
3

Problem 8
15 points

Nodes in a tree which have no children are often referred to as "leaf nodes". Nodes which have one or more children are often referred to as "internal nodes." Write a function called number-of-internals which will take in a binary tree and return the number of internal nodes in the tree.

> (number-of-internals 1001  TREE)
3

Problem 9
15 points

Occasionally it's useful to collect all of the leaves of a tree. Write a function called collect-leaves which takes in the address of the head and the tree and returns a list of all of the data in the leaf nodes, in order, from leftmost node to rightmost node. For example:

> (collect-leaves 1001 TREE)
(D E F)

Problem 10
15 points

A node is said to be balanced if the difference between the height of its left subtree and the height of its right subtree is no greater than one. Write a function called balanced that will take in the address of a node and the list representation of the tree that node's in and will return the literal 'left if the node is unbalanced to the left, 'right if the node is unbalanced to the right, and 'balanced if the the node is balanced. A node is unbalanced to the left if its left subtree is higher; a node is unbalanced to the right if its right subtree is higher.


Once the crew arrives on the island, they notice something rather strange: all of the big coconuts grow on the right side of the coconut trees and the small coconuts grow on the left side of the coconut trees. "Jimminy Cricket!" Captain Kurt exclaimed, "this must be an island of binary search trees!"

Binary search trees (commonly abbreviated BST's) are binary trees with a special property that is probably best explained with a picture:

                  42
		/    \
	       /      \
	      36       54
	     /  \     /  \
	    12  38   48  99
	   /        /      \
          4        45      128

(assume the leaf nodes have NULL for children; they are omited to keep
the drawing clean)
Notice that all of the nodes to the right of 42 are greater than 42, while all of the nodes to the left of 42 are less than 42. In general, all of the nodes to the left of a given node are "bigger", while all nodes to the right of a given node are "smaller". Note that it would still be a BST if the nodes were sorted differently - for instance, if the larger nodes were to the left and the smaller nodes were to the right. However, for this class let's just say that "bigger" values will always go to the right. So what would this tree look like in our Scheme notation?
the root of the tree is at address 101
((101 (42 (102 103)))
 (102 (36 (104 105)))
 (103 (54 (106 107)))
 (104 (12 (108 0)))
 (105 (38 (0 0)))
 (106 (48 (109 0)))
 (107 (99 (0 110)))
 (108 (4 (0 0)))
 (109 (45 (0 0)))
 (110 (128 (0 0))))
We'll call this tree BSTREE in the examples that follow.

So why is this property so special? Let's say that you wanted to know if the value '137' is in the tree above. If this were just a plain binary tree, then you'd have to check all ten nodes for the value, but since it's a BST, you can do much less work. First you check the root node: 42, then since 137 is greater than 42 you move to its right child: 54. Continue like this to 99, then 128. Since 128 doesn't have any children, and since this is a BST, you know that the value you're looking for isn't in the tree. You only had to check a small portion of the tree instead of the whole thing. Neat, eh? While this may not seem so interesting on a small tree like the example above, consider really massive trees (like maybe a tree containing the Social Security database) and you'll see that you can really save some time with a BST.

Problem 11
10 points

The pirates want to pick only the largest coconuts. Since this island has coconuts growing on BST's, and since no pirate in his right mind likes doing more work than he has to, then it makes sense to exploit the BST's to go straight to the largest coconuts. Write a function called get-largest which will take in the head address of the tree and the list representation of the BST and will return the value of the largest element in the tree.

> (get-largest 101 BSTREE)
128

Problem 12
25 points

Now that you can easily get the largest coconut from each tree, Robert the Gashing has a realitization: "We're spending a lot of time climbing trees, but we're only getting the one largest coconut from each. What if we took several of the largest coconuts from each tree? That way we could climb fewer trees and I can get onto more important things... like working on my tan." Give our pal Robert a hand by writing a function called sum-largest that will take in the address of the head node, the list representation of the tree and a positive integer (let's call that integer k) and return the sum of the k largest coconuts in the tree. For example:

> (sum-largest 101 BSTREE 2)
227
> (sum-largest 101 BSTREE 4)
329
HINT:You may want to break this one up into some sub-problems. Perhaps a function that gets the addresses of the n largest elements in the tree and a function that sums the values contained at those addresses.

Of course, sometimes a tree branches out into more than two limbs at one place. In computer science this is usually referred to as an n-ary tree (since one node can have up to n child nodes). We'll represent these trees just like any other tree, but the node structure will look like this:

(data (child-1 child-2 ... child-n))
note that a node can have zero or more children (as opposed to having a whole bunch of zeros to signify a leaf node you would just have an empty list).

A n-ary tree could look something like this:
        		kurt
          ______________/|\________________
         /               |                 \
       jim              bill               david
      / | \                                /   \
  bruce | stevie                        drew   doug
        |                               / \
      danny                            /   \
                                    jared   jonathan
and would be represented something like:
the root node is 301
((301 (kurt (308 410 512)))
 (308 (jim (420 430 440)))
 (420 (bruce ()))
 (440 (stevie ()))
 (430 (danny  ()))
 (410 (bill ()))
 (512 (david (111 112)))
 (111 (drew (143 144)))
 (143 (jonathan ()))
 (144 (jared ()))
 (112 (doug ())))
In examples we'll refer to this as N-TREE to save space.

Problem 13
25 points

Since a n-ary tree is just a tree, you can do most of the same things to an n-ary tree that you can to a binary tree. (Since you used good abstraction it's pretty easy, right?) Write a function called nary-depth which will take in the head address of an n-ary tree, the tree in our list form, and an item to look for. It should traverse the tree and return the depth of the element. If the element is not in the tree then return #f. For example:

>(nary-depth 301 N-TREE 'bill)
1
>(nary-depth 301 N-TREE 'jonathan)
3
>(nary-depth 301 N-TREE 'kurt)
0
>(nary-depth 301 N-TREE 'rohan)
#f

A brief plot interlude
After picking coconuts from the trees, the crew of the Good Ship CS1311X settled down for a day of fun in the sun. After frolicking on the beach and drinking lots of frosty CS1311X beverage, the pirates are a happy crew. So it comes as quite a surprise when the re-board the ship and see that the monkeys have rearranged the rooms on the ship! They've moved rooms and hallways and kitchens and everything. Everybody is going crazy: some doors open out onto blank walls, the crow's nest is where the kitchen used to be, the kitchen is where the Captain's cabin used to be, and the Captain's cabin is at the top of the mast where the crow's nest used to be! Apparantly the ship's carpenter, Andrew the Absent-minded, left his toolchest unlocked during the day of beach vacationing. Until the ship can be fixed the crew is going to have to learn how to find their way around the new ship's layout.

Problem 14
35 points

Help the crew get used to their new ship arrangement. Write a function called find-path which will take in a starting address, an ending address, and a graph representing the ship. It should then do a depth-first search to get a path (without cycles) from the starting place to the ending place. The ship could look something like this:

                 
    kurt-cabin------------galley
    |    |               /     \________ 
    |     \___poopdeck__/               \
    |                                  crows-nest
    |
  main-hall
     \
      \
       \
      brig
Our representation for the graph will be slightly more complicated than the ones for the trees. The format for each node will be as follows:
(address (room-name (addresses of adjacent rooms)))
so the graph above could be represented as:
((42 (kurt-cabin (41 43 45)))
 (45 (poopdeck (42 43)))
 (43 (galley (42 45 50)))
 (50 (crows-nest (43)))
 (41 (main-hall (99 42)))
 (99 (brig (41))))
Once again, we'll abbreviate to save some space and call this graph SHIP. so an example search would be:
> (find-path 50 41 SHIP)

and a possible path that could be returned would look like:
(crows-nest galley poop-deck kurt-cabin main-hall)
or, equally valid:
(crows-nest galley kurt-cabin main-hall)

Problem 15
35 points

While a depth first search will always find a path, it might have many more nodes than is strictly necessary. A breadth first search, however, will always give a path with the fewest number of nodes. Let's simplify things and say that all rooms are the same distance apart. Let's help out the captain so he doesn't have to walk as far. Write a function called shortest-path which will perform a breadth-first-search to find the shortest path between two rooms. As before, it'll take in the addresses of two rooms and the graph, returning a shortest path (no cycles, order still matters) from the first room to the second. For example:

> (shortest-path 42 50 SHIP)
(kurt-cabin galley crows-nest)
but NOT
(kurt-cabin poopdeck galley crows-nest)