HOMEWORK 4: Implementing Recognizers

This is an INDIVIDUAL assignment.

Objective

In this assignment we'll shift gears away from low-level Swing stuff, and explore how to build recognizers for digital ink and gestures. You'll learn how to implement a simple recognizer that's robust enough to work for a wide range of gestures, and how to integrate such a recognizer into your application.

In this assignment you'll implement the SiGeR recognizer discussed in class and in the lecture notes, and integrate it into the Notebook application. You'll use the recognizer to detect command gestures that will let you do things like tag pages using the pen/mouse, and move and delete on-screen objects, without having to use the regular GUI controls.

The learning goals for this assignment are:

- Implementing a simple gesture recognizer that is robust enough to handle a range of gestures
- Integrating such a recognizer into an interactive application

Please see the note flagged IMPORTANT in the deliverables section at the end of this document.

Description

In this homework, we'll implement the SiGeR recognizer and integrate it into the Notebook application as a way to perform commands (meaning: certain gestures will be recognized as commands to control the application, rather than as simple digital ink that makes up the content of pages).

We'll use a modal method of telling the system which strokes should be interpreted as command gestures and which should be interpreted as ink. Strokes made with the "normal" (left) mouse button down should be interpreted as ink, while strokes made with the "context/menu" (right) button should be interpreted as command gestures. This means that in the code that handles strokes you'll now need to look at the modifiers to figure out whether to process the strokes as ink or pass them off to the recognizer.
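
For example, here's a minimal sketch of that check, assuming your stroke handling lives in a mouse listener; startGestureStroke() and startInkStroke() are hypothetical names for whatever your code already does with the two kinds of strokes. SwingUtilities.isRightMouseButton() saves you from picking apart the modifier bits yourself.

    import java.awt.event.MouseEvent;
    import javax.swing.SwingUtilities;

    public void mousePressed(MouseEvent e) {
        if (SwingUtilities.isRightMouseButton(e)) {
            startGestureStroke(e.getPoint());   // right button: start a command gesture
        } else {
            startInkStroke(e.getPoint());       // left button: start ordinary ink
        }
    }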

Command gestures should appear on the notebook page with a different visual appearance than ink, and then disappear once the command has been completed. For example, if your ink is black, you might draw command gestures in red, and then have them disappear once the gesture has been completed.

You should define at least 6 templates for different sorts of command gestures. These gestures should let you do the following:

- Apply (or remove) each of the page tags that correspond to your tagging buttons
- Delete objects on the page
- Select objects on the page so that they can be moved

Implementing the Recognizer

See the slides for details on how to implement SiGeR. Here are a few additional tips.

Decide on the representations you want to use first. By this I mean, figure out how you'll specify templates in your code, and how you'll represent the direction vector of input strokes.

I'd suggest defining a bunch of constants for the 8 ordinal directions SiGeR uses (N, S, E, W, NE, NW, SE, SW). Both direction vectors and templates will be defined as lists of these. You may also want to define some special "combination" constants (a "generally east" constant that means either E, SE, or NE for example). These latter combination constants will only be used in the definition of templates, not in the vector strings you will produce from input gesture strokes. In other words, they allow you to define your templates a bit more loosely than just the specific 8 directions.

While defining such a set of human-readable constants isn't specifically necessary (you could just do everything in terms of the strings and regexp patterns described below), it can be very helpful for debugging to be able to write a template as a set of directions, rather than a raw regexp pattern.
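
As a purely illustrative example, one way to define these constants is with bit flags, so that the combination constants can be built with a bitwise OR (none of these names are required; they're just a sketch):

    public interface Dir {
        int N  = 1;    int NE = 2;
        int E  = 4;    int SE = 8;
        int S  = 16;   int SW = 32;
        int W  = 64;   int NW = 128;

        // Looser "combination" constants, used only when defining templates.
        int GENERALLY_EAST  = NE | E | SE;
        int GENERALLY_WEST  = NW | W | SW;
        int GENERALLY_NORTH = NW | N | NE;
        int GENERALLY_SOUTH = SW | S | SE;
    }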

Next, write a routine that takes an input gesture and produces the direction vector from it. In other words, given a series of points, it'll spit out a vector containing a list of the 8 ordinal direction constants. This direction vector represents the true shape of the input gesture.
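
Here's a sketch of what that routine might look like, assuming the Dir constants above and a stroke stored as a list of java.awt.Points; the 45-degree sector math is just one way to bucket each segment into the nearest of the eight directions. In practice you may also want to skip very short segments (or resample the stroke) so that pixel-level jitter doesn't add spurious directions.

    import java.awt.Point;
    import java.util.ArrayList;
    import java.util.List;

    public static List<Integer> directionVector(List<Point> points) {
        // Directions ordered by sector: 0 = E, 1 = NE, 2 = N, ... 7 = SE.
        int[] bySector = { Dir.E, Dir.NE, Dir.N, Dir.NW, Dir.W, Dir.SW, Dir.S, Dir.SE };
        List<Integer> dirs = new ArrayList<Integer>();
        for (int i = 1; i < points.size(); i++) {
            int dx = points.get(i).x - points.get(i - 1).x;
            int dy = points.get(i - 1).y - points.get(i).y;    // flip y: screen y grows downward
            if (dx == 0 && dy == 0) continue;                   // skip repeated points
            double angle = Math.toDegrees(Math.atan2(dy, dx));  // 0 = east, 90 = north
            if (angle < 0) angle += 360;
            int sector = (int) Math.round(angle / 45.0) % 8;
            dirs.add(bySector[sector]);
        }
        return dirs;
    }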

Here's the only tricky part: you'll need to write a routine that turns the direction vector into an actual string of characters that contain the same information as in the vector, and another routine that takes the template info and produces a regexp pattern from it. The idea is that you'll see if the regexp pattern for the template actually matches the stringified representation of the direction vector.

There's a lot of flexibility in how you define the symbols in these strings. For the direction vector string, you'll probably just use 8 separate letters, each representing one of the 8 ordinal directions.

For the regexp pattern, you'll want to generate a pattern that can match occurrences of any of these 8 letters, as well as "either-or" combinations of them ("generally east," for example, might be a pattern that matches any of the letters representing E, SE, or NE). You'll also need to generate a pattern that can deal with "noise" at the start and end of the input string. The slides have some examples that show how to do this.
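
Here is one way those two routines might look, again assuming the Dir constants from earlier and a single-letter encoding per direction; the exact amount of noise tolerated at either end (.{0,2} here) is a knob you can tune:

    // Map each single direction constant to one letter for the stroke string.
    private static char letterFor(int dir) {
        switch (dir) {
            case Dir.N:  return 'n';    case Dir.NE: return 'a';
            case Dir.E:  return 'e';    case Dir.SE: return 'c';
            case Dir.S:  return 's';    case Dir.SW: return 'd';
            case Dir.W:  return 'w';    case Dir.NW: return 'b';
            default: throw new IllegalArgumentException("not a single direction: " + dir);
        }
    }

    public static String strokeString(List<Integer> dirs) {
        StringBuilder sb = new StringBuilder();
        for (int d : dirs) sb.append(letterFor(d));
        return sb.toString();
    }

    // Turn a template (which may contain combination constants) into a regexp:
    // each element becomes a character class repeated one or more times, and the
    // pattern tolerates a couple of stray directions at the start and end.
    public static String templatePattern(int[] template) {
        StringBuilder sb = new StringBuilder(".{0,2}");
        int[] singles = { Dir.N, Dir.NE, Dir.E, Dir.SE, Dir.S, Dir.SW, Dir.W, Dir.NW };
        for (int element : template) {
            sb.append('[');
            for (int s : singles) {
                if ((element & s) != 0) sb.append(letterFor(s));
            }
            sb.append("]+");
        }
        sb.append(".{0,2}");
        return sb.toString();
    }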

The actual matching process will then just compare an input stroke string to the list of template patterns, and report which (if any) it matches.
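
The matching loop itself can then be quite small. Here's a sketch, where Template is a hypothetical little holder class for a gesture's name and its generated pattern:

    import java.util.regex.Pattern;

    class Template {
        final String name;
        final String pattern;
        Template(String name, String pattern) { this.name = name; this.pattern = pattern; }
    }

    // Returns the name of the first matching template, or null if nothing matches.
    public static String recognize(String strokeString, List<Template> templates) {
        for (Template t : templates) {
            if (Pattern.matches(t.pattern, strokeString)) {
                return t.name;
            }
        }
        return null;
    }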

Defining the Templates

Your templates will be defined in your code (unless you get really fancy), most likely as a set of declarations that look something like this:
int[] QUESTION_MARK = { UP, RIGHT, DOWN, LEFT, DOWN };
Remember that you'll need to define templates for each of your tagging buttons, as well as delete and move. It may take a bit of tweaking to come up with a gesture set that's distinguishable, and may also require some tweaks to define the templates at the proper level of specificity.
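
Purely as an illustration (your gesture set will be your own), here are two templates written against the Dir constants sketched earlier; note how the looser combination constants make the circle easier to match:

    // Illustrative only -- design and document your own gesture set.
    int[] SELECT = { Dir.GENERALLY_EAST, Dir.GENERALLY_SOUTH,
                     Dir.GENERALLY_WEST, Dir.GENERALLY_NORTH };   // clockwise circle, started at the top
    int[] DELETE = { Dir.GENERALLY_EAST, Dir.SW,
                     Dir.GENERALLY_EAST, Dir.SW };                // zig-zag "scratch out"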

Integrating the Recognizer

Remember that we're using a mode to distinguish ink (left mouse button) input versus gesture input (right mouse button). Gesture input should be drawn on screen while the gesture is being made, so that it provides feedback to the user. The gesture should disappear once the mouse is released.

One way to get this effect is to augment your BasicNotepageUI slightly, to keep a reference to the single current gesture being drawn (which may be null if no gesture is in progress). The paint code then draws the stroke and text display lists from the model, then the current gesture (if there is one), so that the gesture appears over the top of everything else.

Note that since the gesture is only a transient feature of the page, it's ok to put this in the BasicNotepageUI, and not the model.
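
Here's a sketch of that arrangement, assuming BasicNotepageUI is a Swing UI delegate whose paint() method already draws the model's contents; currentGesture, paintStrokes(), paintText() and drawStroke() are stand-ins for whatever your code actually uses, and Stroke here means the Notebook's own ink-stroke class:

    import java.awt.Color;
    import java.awt.Graphics;
    import javax.swing.JComponent;

    private Stroke currentGesture;            // null when no gesture is in progress

    public void paint(Graphics g, JComponent c) {
        paintStrokes(g);                      // the model's stroke display list
        paintText(g);                         // the model's text display list
        if (currentGesture != null) {
            g.setColor(Color.RED);            // a distinct look for command gestures
            drawStroke(g, currentGesture);    // drawn last, so it sits on top of everything
        }
    }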

When the gesture is complete, it can be removed from the set of items to be displayed, and handed off to the recognizer to be processed.

If the gesture is not recognized, you should indicate this by displaying a message in the status bar that says "unrecognized gesture" or something.

If the gesture is recognized, what you do next depends on exactly what command was recognized.

For one of the tagging gestures, you should just update the tag data associated with that page; make sure that any changes are reflected in the state of the status buttons also. Making a tag gesture on a page that already has that tag should remove it from the page.

For the delete gesture, once you've recognized the gesture you need to figure out what object(s) the gesture was drawn over. The way I'd suggest doing this is to take the bounding box of the gesture and then identifying which on-page objects have bounding boxes that are strictly contained in the gesture's box. Deleted items should simply be taken out of the display list(s) so that they do not appear; be sure to notify your listeners when the model changes so that the display is updated correctly.
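
For example, the containment test might look something like the following, where getBounds(), getStrokes() and fireStrokeDeleted() are hypothetical names for whatever your stroke class and model actually provide; you'd do the same for text items:

    import java.awt.Rectangle;
    import java.util.Iterator;

    Rectangle gestureBounds = gesture.getBounds();
    for (Iterator<Stroke> it = model.getStrokes().iterator(); it.hasNext(); ) {
        Stroke s = it.next();
        if (gestureBounds.contains(s.getBounds())) {   // stroke lies entirely inside the gesture's box
            it.remove();                                // take it out of the display list
            model.fireStrokeDeleted(s);                 // notify listeners so the page repaints
        }
    }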

The select-to-move gesture is perhaps the weirdest, because it introduces a new mode into the UI. First, when items are selected (perhaps through a circling gesture), they should be drawn differently on screen to indicate that they are selected. Again, you probably want to compare the bounding box of the gesture to the bounding boxes of the on-page objects to determine what objects the user intends to select.

You need to keep track of the selected objects in some way; I'd suggest adding yet another item to the BasicNotepageUI, which is a list of the selected objects (strokes or text); the paint() code will then display anything in this list differently (through color or a highlighted bounding box or whatever). Again, since this is transient state it does not need to go in the model, so can live in the BasicNotepageUI.
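
For instance, the transient selection state might look something like this inside BasicNotepageUI, with boundsOf() as a hypothetical helper that returns an item's bounding box:

    import java.awt.Color;
    import java.awt.Graphics;
    import java.awt.Rectangle;
    import java.util.ArrayList;
    import java.util.List;

    private List<Object> selected = new ArrayList<Object>();   // selected strokes and/or text

    // Called at the end of paint(), after everything has been drawn normally.
    private void paintSelection(Graphics g) {
        for (Object item : selected) {
            Rectangle r = boundsOf(item);                       // hypothetical bounding-box helper
            g.setColor(Color.BLUE);
            g.drawRect(r.x, r.y, r.width, r.height);            // highlight the selection
        }
    }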

As long as something is selected, you're potentially in "move mode." My suggestion for how to implement this is to look at any mouse press that happens; if something is selected, and the mouse press is inside one of the selected objects, then dragging the mouse moves the object (which should just be a matter of updating the X,Y coordinates of the items in the model, and notifying any listeners). If the press happens outside of a selected item, you can "de-select" the selected stuff (take it out of the selected list and just draw it normally). This ends "move mode." The basic behavior here should be much like any paint program--when something is selected you can click into it to drag it; but as soon as you click outside, the object is de-selected.
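
Here's a rough sketch of that logic in the mouse handlers, living in the same listener as the earlier ink/gesture mode check; dragging, lastPoint, component, hitsSelected() and moveItems() are all hypothetical names:

    private boolean dragging;
    private Point lastPoint;

    public void mousePressed(MouseEvent e) {
        if (!selected.isEmpty() && hitsSelected(e.getPoint())) {
            dragging = true;                        // press inside a selected item starts a drag
            lastPoint = e.getPoint();
        } else if (!selected.isEmpty()) {
            selected.clear();                       // press outside de-selects; "move mode" ends
            component.repaint();
        }
    }

    public void mouseDragged(MouseEvent e) {
        if (dragging) {
            int dx = e.getX() - lastPoint.x;
            int dy = e.getY() - lastPoint.y;
            model.moveItems(selected, dx, dy);      // update X,Y in the model; the model notifies listeners
            lastPoint = e.getPoint();
        }
    }

    public void mouseReleased(MouseEvent e) {
        dragging = false;
    }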

Extra Credit

As usual, there are a lot of ways to make this fancier than described.

As usual, if you do something else that's above and beyond the call of duty, let us know in your README file and we may assign some extra credit for it.

Deliverable

This is an INDIVIDUAL assignment; while you may ask others for help on Java or Swing details, please write the code on your own.

To turn in the assignment, please follow the same process as last time:

Create a new directory using your last name as the name of the directory. Please compile your application and place both the sources and classfiles, as well as a README.txt file (described below) into this directory; ZIP this directory and mail it to the TA.

The README.txt file should contain your name and email address, the version of Java you used (either 1.4.x or 1.5.x, please) as well as any special info I might need in order to run your code. (Command line arguments, etc.)

If you do any extra credit work, be sure to let us know about it in the README file so that we don't miss it!!

IMPORTANT: I'm letting people come up with their own gesture vocabulary for this project. What this means, though, is that YOU MUST provide a description to us of what that gesture set is. Remember that SiGeR gestures are directional also (a square bracket started at the top is a different gesture than a square bracket started at the bottom). You need to provide us with enough detail that we're not having to reverse engineer your code to figure out how to make your gestures. A short graphical cheat-sheet that shows the actual gestures you're using would be a great way to do this. We will deduct points if we have to spend a lot of time figuring out your gestures!

Please take care to remove any platform dependencies, such as hardcoded Windows path names or dependence on a particular look-and-feel that may not exist on all platforms. Also, if you use any images in your application, please make sure that you include these in your ZIP file and that your code will refer to them and load them properly when run from inside the directory that's created when I unZIP your code.

Grading for this assignment, and future assignments, will roughly follow this breakdown:

Please let the TA or me know if you have any questions.