INTELLIGENT SYSTEMS QUALIFIER

Spring 2002

April 3, 2002

 

General Note: Please answer the section on Artificial Intelligence and either the section on Robotics or on Perception.

----------------------------------------------------------------------------

ARTIFICIAL INTELLIGENCE

Note: Please answer any four (4) of the following six (6) questions.

 

Q1. One can use consistent, inconsistent but admissible, or inadmissible heuristic values in conjunction with an A* search. What are the advantages and disadvantages of each of these choices compared to the other ones?

 

Q2. Some artificial intelligence researchers believe that partially observable Markov decision process models (POMDPs) provide a good foundation for planning under uncertainty. 1) Sketch some approaches that find optimal or approximately optimal solutions for POMDPs. 2) Explain what is known about the complexity of finding optimal or approximately optimal solutions for POMDPs. 3) Give an overview of current research directions in the area of planning with POMDPs.

 

Q3. Your task is to help a credit card company implement a fraud detection scheme. You are considering belief networks, Markov decision process models, and decision trees (from inductive learning) for this purpose. 1) How could each of these knowledge representation schemes be used to help detect unauthorized credit card use? (If you feel that some of them cannot be used for this purpose, explain why this is the case.) 2) What are the advantages and disadvantages of each of these knowledge representation schemes compared to the other ones?

 

Q4. When we perform analogies from old to new cases (e.g., from the solar system to an atom) we carry over a number of inferences (e.g., electrons revolve around the nucleus), but we do not do so in a delusional way (e.g., electrons have climates, the nucleus is a giant fireball). How do current systems of reasoning by analogy (such as case-based reasoning or structure mapping) avoid such "delusional" inferences? In other words, given that reasoning by analogy is inherently unsound, how can one inference be better than another?

 

Q5. You're designing a truth-maintenance system, which is a system that maintains a dependency network between all the facts it knows, and can update that network dynamically as new facts are introduced into the knowledge base. Thus the system knows, for any particular fact in the knowledge base, what other facts support that fact and whether that fact is currently believed (note that the system may not believe a fact if some supporting facts are also not believed). Contrast the bookkeeping that is necessary for a system that uses a monotonic logic with the bookkeeping necessary for nonmonotonic logics. Specifically, contrast how the system needs to respond to any change in its knowledge for 1) a monotonic logic, 2) a simple nonmonotonic logic utilizing default values, and 3) a more complex nonmonotonic logic where nonmonotonicities can result from more complex interactions between known facts.

 

Q6. What are the advantages of memory-based (aka instance-based) approaches to machine learning with respect to parametric density estimators or regression methods? Are they most suited for high or low-dimensional learning problems? In which of the following applications do you think a memory-based method could be put to good use: credit card fraud detection, diagnosis of diseases from a list of symptoms, fingerprint recognition, handwritten digit recognition. Why or why not?

 

ROBOTICS:

Note: Please answer any two (2) of the following three (3) questions.

 

Q1. Dynamic and unpredictable environments call for different types of robotic architectures than more structured settings. Consider these two different scenarios:

1) A humanoid entertainment robot that must work in the living room of a house with children present.

2) Robot manipulators that are used for laboratory tasks such as drug testing or for humane genome sequencing, dealing with very repetitive operations.

a) How do these tasks differ in terms of the knowledge that can be brought to bear in each case? Don't only simply discuss the specific knowledge required but the qualities of the knowledge and how it differs from one case to the other (the epistemological differences).

b) How about the same for perceptual requirements?

c) Describe the criteria you might use in making a good decision in how to create a robot architecture for one example or the other and why that choice is less appropriate for the non-selected task domain.

d) Finally, lay out the basics of the architecture (data flow, control flow, perception, etc.) for the example you chose in (c). Use figures to help present your ideas.

 

Q2. Describe the basic principle behind particle filters (aka the condensation algorithm, Monte Carlo localization) as used for localizing a robot in an environment. Under what circumstances would you prefer a particle filter as opposed to the more classical Kalman filter approach? When would you prefer a Kalman filter? Describe the basic difference in representational power between the two approaches.

 

Q3. Design a reinforcement learning system to control a two-link robot arm (see figure).

The arm moves only in two dimensions. It is anchored to a table at joint 1; the first link extends to joint 2, which connects the two links together. Motors control both joints, which can be set to any angle. There are four commands or "actions" that can be sent to the arm: rotate joint 1 CW one step; rotate joint 1 CCW one step; rotate joint 2 CW one step; rotate joint 2 CCW one step. The control system is able to read the angles at each joint and also to detect the X,Y location of the tip of the arm (via an overhead camera).

The task is to move the tip of the arm to a given location (GoalX,GoalY) from a randomly initialized starting location.

a. Design a reinforcement learning system that will learn to reach the goal location. Describe the algorithm and the training process. Describe the data structures you would use. What reward function would you use? Why?

b. After learning successfully to move to the goal location, suppose the system is given the task of moving to a different goal location. If you knew that this might occur (that the robot might have to learn different goals) would you prefer to program the system with a model-based or a model-free learning system? Why?

PERCEPTION:

Note: Please answer any two (2) of the following three (3) questions.

 

Q1. A well-known technique in computer vision is to decompose an image in terms of principal components, derived from a large set of training examples using PCA. Why is this done? What is the problem of such an 'eigenface' approach in the case of face detection and/or face recognition? Can it easily model spatial deformations in the image? In the same application, does a straight PCA approach lead to a well-behaved density in the original 'face-space'? If not, describe an extension that would have this characteristic.

 

Q2. One proposal for improving the quality of depth estimates in stereo vision is the use of multiple baselines. The basic argument is that matching is easier when the baseline is small, while depth estimates are more accurate when the baseline is large (Matthies, Szeliski, and Kanade, 1989). Suppose you have a linear array of N equally-spaced cameras. Using a standard SSD-based matching criteria, develop a recursive estimation algorithm which produces a sequence of depth maps of increasing accuracy by processing the camera images in linear order. Under appropriate simplifying conditions, prove that the estimation error decreases as more images are processed. What is the fundamental benefit of having multiple baselines? Under what conditions will the multi-baseline approach be superior to binocular stereo? If you were able to process the multi-baseline images in a batch framework (Okutomi and Kanade, 1993), rather than sequentially, is it possible to obtain a better result? How does the multi-baseline approach compare to more recent algorithms, such as space carving, which also employ multiple cameras?

 

Q3. A basic problem in early vision tasks such as stereo, motion estimation, segmentation, etc. is the selection of window size. Rectangular windows are often used because of their convenience, but this causes a breakdown at region boundaries of the assumption that all of the pixels inside the window were generated by the same process in the world. Briefly describe the deleterious effects of this problem on stereo depth maps and one other early vision task. One solution to this problem in the stereo case would be the use of the EM algorithm. Given a soft assignment of pixels inside a window to region of constant depth in the scene, a LS disparity estimate can be computed. Given a disparity estimate over the window, regions of constant disparity can be identified and segmented. Develop an EM-based algorithm for variable-window stereo matching and discuss its properties. How does the EM approach compare to others in the literature?