CS8803: Visual Data Analysis

Alex Endert, Instructor
Spring 2015
9:35 - 10:55am, Clough 278

Course Schedule

Note: schedule and readings subject to adjustments before the start of the semester.

Syllabus

Visual data analysis explores visual analytics from an HCI perspective. The course focuses on the cognitive processes involved in gaining insights and understanding from data through interactive visual interfaces. The goal is to learn about analysis and how interactive analytic interfaces can enhance these processes.

Visualization

An understanding of how to bind data values and attributes to visual encodings and metaphors to illuminate insights from data.

Analysis

Cognitive processes of gaining understanding from data.

Interaction

Designed visual controls and actions enabling people to act upon their cognitive processes to reason about, and explore, their data.



The readings, discussions, and assignments in this course will aim to meet the following learning objectives:

Summary of weight for course assignments:
Class participation 25%
.....Panel Presentation 15%
.....In-Class Discussion Participation 10%
HW Assignments 20%
.....HW 1 5%
.....HW 2 5%
.....HW 3 10%
Semester project 35%
Final Exam 20%

Course grades follow the standard grading scale:
A = [90-100], B = [80-90), C = [70-80), D = [60-70), F = less than 60
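
As a rough illustration of how the weights and the grading scale above combine, here is a minimal sketch. It assumes every component is scored on a 0-100 scale; the function names and the sample scores are hypothetical and not part of the course materials.

```python
# Weights (in percent) taken from the summary above.
WEIGHTS = {
    "panel_presentation": 15,
    "discussion_participation": 10,
    "hw1": 5,
    "hw2": 5,
    "hw3": 10,
    "semester_project": 35,
    "final_exam": 20,
}


def course_score(scores):
    """Weighted average of component scores, each assumed to be on a 0-100 scale."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


def letter_grade(score):
    """Map a 0-100 score to a letter using the half-open intervals above."""
    if score >= 90:
        return "A"
    if score >= 80:
        return "B"
    if score >= 70:
        return "C"
    if score >= 60:
        return "D"
    return "F"


# Hypothetical scores for one student.
example_scores = {
    "panel_presentation": 90,
    "discussion_participation": 100,
    "hw1": 80,
    "hw2": 90,
    "hw3": 85,
    "semester_project": 88,
    "final_exam": 92,
}
total = course_score(example_scores)
print(total, letter_grade(total))  # 89.7 B
```

Because the intervals are half-open, a score of exactly 90 is an A while 89.7 remains a B.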

How to turn in assignments:
Assignments are due at the beginning of the class in which they are due. IMPORTANT: Turn in two hard copies of your assignment. Unless told otherwise during class or on the individual assignment description, please do this for all your assignments this term. Emailed assignments will not be accepted (we get enough email already). You must turn in hard copy.

Late Turn-in:
For each class period late, 25% of the total grade will be deducted from an assignment's score.

Class Participation
It is expected that students will come to class, be prepared by doing the readings, and will pay attention and participate in discussions. Doing all three regularly will earn full credit. I particularly value active participation in discussion during, and outside of, class. I understand that laptops, tablets, and smartphones are a valuable part of our learning experience. I expect that you will use them for productive reasons during class time. If you want to surf the internet on your laptop in class, take another course.

Group Panel Presentations
When leading a discussion of a paper as a group, you will be evaluated based on your ability to foster a discussion about the paper(s) assigned. Metrics will include:

An effective way to break up the group presentation is to have one team member be responsible for a specific set of points to be made about the paper. Think about timing. You probably don't want to spend more than 15-20 minutes giving a summary of the paper (feel free to use videos or demos if they are available online). Have questions ready in your presentation. Be prepared to have questions asked of you. You, the panel, should have a deep understanding of the work as it is presented in the paper.

Assignments

There are two types of assignments in the course: homework, and the group project.

HW1 - What makes data analysis complex?
In 1 page (single spaced, 12 pt. font), describe what makes analysis complex. Be sure to cite at least 6 readings from Weeks 1-3 to help make your claim. Be sure to include a discussion of the information that studies have presented so far. Be sure to include at least 3 reasons why analysis is complex, and use the readings to back up your claims. The goal of this assignment is to synthesize the readings and discussion we have had in the first part of the course. What analytic techniques have been developed to try to ease this process of analysis?
Remember, this is a hard limit of 1 page, so ensure that your points are made precisely and accurately. Your ability to integrate the readings as citations to substantiate your arguments is an important part of your grade.

HW2 - Create an ACH Table
We have discussed in class what an ACH (Analysis of Competing Hypotheses) Table is. This assignment will have you create one. Using the fictitious intelligence analysis (text) data on t-square, perform the sensemaking task of finding any suspicious activity that someone should investigate further (i.e., hypotheses), and why (i.e., evidence). You will turn in the ACH table, as well as a short, 2-paragraph description of what your final findings are. You will be graded based on how well your ACH table supports the decision and findings in your write-up. Also, your ACH table will be evaluated based on criteria including: Did you correctly score your evidence? Did you catalog your hypotheses? Do you have a set of exhaustive and exclusive hypotheses, where only one can be true?
You may use any software that is not directly intended to be an ACH software. For example, using Excel is allowed, but software like ACH2.0, competinghypotheses.org, or other ACH software is not allowed. Please turn in your final ACH table, and your 2-paragraph writeup digitally on t-square.
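
To make the shape of an ACH table concrete, here is a minimal sketch of one as a plain data structure. Everything in it is hypothetical: the hypothesis and evidence labels are placeholders, and the scoring convention (rate each piece of evidence as consistent, inconsistent, or neutral, then favor the hypothesis with the fewest inconsistencies) is an assumption for illustration. The criteria discussed in class take precedence.

```python
# Placeholder hypotheses and evidence; none of this comes from the course dataset.
HYPOTHESES = ["H1", "H2", "H3"]

# One row per piece of evidence, one rating per hypothesis.
# "C" = consistent, "I" = inconsistent, "N" = neutral / not applicable.
ACH_TABLE = {
    "E1": {"H1": "C", "H2": "I", "H3": "N"},
    "E2": {"H1": "C", "H2": "C", "H3": "I"},
    "E3": {"H1": "N", "H2": "I", "H3": "I"},
}


def inconsistency_counts(table, hypotheses):
    """Count, for each hypothesis, how many pieces of evidence contradict it."""
    counts = {h: 0 for h in hypotheses}
    for ratings in table.values():
        for h in hypotheses:
            if ratings[h] == "I":
                counts[h] += 1
    return counts


def rank_hypotheses(table, hypotheses):
    """Order hypotheses from least to most contradicted by the evidence."""
    counts = inconsistency_counts(table, hypotheses)
    return sorted(hypotheses, key=lambda h: counts[h])


print(inconsistency_counts(ACH_TABLE, HYPOTHESES))
print(rank_hypotheses(ACH_TABLE, HYPOTHESES))
```

The same rows and columns can be laid out in a spreadsheet, which is the kind of general-purpose tool the assignment permits.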

HW3 - Capturing, Interpreting, and Visualizing User Interaction
In 2 pages (single spaced, 12 pt. font), give an overview of how user interaction has been captured, interpreted/analyzed, and visualized. Use and cite at least 6 of the readings we've covered in class. Think about categories or trends that have emerged. How would you categorize the approaches so far? What defines your categories? What areas are open to further research? Further, why is it so important to analyze user interaction logs? What is analytic provenance, and what makes it important?
Remember, this is a hard limit of 2 pages, so ensure that your points are made precisely and accurately. Your ability to integrate the readings as citations to substantiate your arguments is an important part of your grade.

Group Project
The group project for this semester will consist of building a visual data analysis prototype to use for a sensemaking task. The task is to analyze a simulated intelligence analysis dataset (one of the VAST challenge datasets, available on t-square). There are two primary components to this project:
Component 1: Developing an Analysis Prototype. Your team will design and develop a prototype for visual data analysis. Specifically, you will pick 1 visual metaphor, 1 analytic model, and at least 1 interaction. For example, you could use a clustered spatial layout generated by PCA that shows relationships between documents spatially, and give people the ability to steer the PCA decomposition using a set of graphical controls (direct manipulation, dynamic querying).
Think about the methods that you will use to visualize the output of your model. You will take that information, bind it to a visual encoding, and place it in a visual metaphor. What is the visualization telling the user about the data, and about the model? Think carefully about how you choose to display specific outputs of specific models. There are many types of models out there, and many visual metaphors, so there are plenty of combinations.
For the interaction, think about what the most usable method for having users interact with the visualization and the model may be. Is it to give users direct, graphic controls over values of the parameters of the model? Is it to allow them to perform interactions in the visualization, and steer the model based on inferences computed on the interaction logs (e.g., semantic interaction, v2pi, etc.)?
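
As one possible reading of the PCA example in Component 1, the sketch below computes a 2-D document layout with PCA and lets per-dimension weights stand in for the graphical controls a user might adjust. It assumes NumPy is available; the function name, the weighting scheme, and the tiny document-by-term matrix are all hypothetical, and other steering mechanisms are equally valid.

```python
import numpy as np


def weighted_pca_layout(features, weights):
    """Project the rows of `features` to 2-D with PCA after scaling each column by `weights`."""
    X = np.asarray(features, dtype=float) * np.asarray(weights, dtype=float)
    X = X - X.mean(axis=0)  # center each (weighted) dimension
    # SVD of the centered data: the rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return X @ Vt[:2].T  # coordinates along the top two components


# Hypothetical document-by-term counts (4 documents, 3 terms).
docs = [[3, 0, 1], [2, 1, 0], [0, 4, 1], [0, 3, 2]]

# Default layout: every term contributes equally.
layout = weighted_pca_layout(docs, weights=[1.0, 1.0, 1.0])

# "Steered" layout: the user raises the weight of the third term, e.g. with a slider.
steered = weighted_pca_layout(docs, weights=[1.0, 1.0, 5.0])

print(layout.shape, steered.shape)
```

Each row of the result is an (x, y) position for one document. Re-running the projection whenever a weight changes is the simplest form of steering; inferring the weights from interactions in the visualization itself is the alternative raised above.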
Component 2: Describe your Analytic Provenance. Your team will go through the dataset and "solve" the challenge (i.e., detect suspicious activity in the data). Drawing on the content we talked about in class, your team will prepare a document (max 10 pages) describing your process. It should include information regarding what structured analytic techniques you used, what hypotheses and evidence you found (and discarded), how you worked collaboratively, how your tool helped foster sensemaking activities, how you handled cognitive biases, etc.

The milestones for this project include:
(Milestone 1, due Feb. 17&19) Project proposal presentation: your team will present a short, 10-12 minute presentation to the class regarding your design and implementation plan for your prototype. Your presentation should convey information about your project including: What visual metaphor have you chosen? What analytic model have you chosen? What interactions will you design for? What analytic processes will this support?
(Milestone 2, due Apr. 16&21) Final Demonstration and Analytic Provenance Document: Your team will prepare to demo your prototype to the class. Prepare a scenario/use case to show how your tool was able to help create insights for this dataset. Plan for a 10 minute presentation/demo during class. Additionally, you will hand in your analytic provenance document to the instructor at this time (max 10 pages).
(Milestone 3, due Apr. 23) Video Demonstration of Prototype: Your team will prepare a short 2-5 minute video demonstrating the functionality of your system. This is intended to give the audience a short, concise description of what your team has done over the course of the semester. Please email a link to the instructor (Alex Endert) by 11:59pm on Apr. 23.


Instructor & TA

Instructor:
Alex Endert
endert@gatech.edu
TSRB 341
Office hours: Mondays 2:00 - 3:00pm and Tuesdays 11:00am - noon (other times available by appointment)

TA:
Jonathan Bidwell
bidwej@gatech.edu
Office hours: Thursday, Friday 5:00 - 6:30pm (Broadband Institute, 479 10th St. 30318, 1st Floor)