Due March 7 CS 7450 - Information Visualization Spring 2006

Homework 4: Critiquing Commercial InfoVis Systems

This assignment will familiarize you with a number of systems that have been built for analyzing multivariate data sets. You will be working with the Spotfire Pro tool from Spotfire, the SeeIt tool from Visual Insights (company no longer exists), the InfoZoom tool from Siemens, and the InfoScope tool from Macrofocus. The first two will be installed on the MS Windows cluster in the College of Computing. The last two are available in Prof. Stasko's lab, and reader tools for them are available on the web.

The goals of the assignment are for you to learn the capabilities provided by these types of systems, learn the basic visualization methods that they provide, and assess their utility in analyzing information repositories. You will work with some provided data sets in the assignment. Think about the kinds of questions that an analyst would be asking about the data sets.

The assignment has four parts:

1. Gain familiarity with the systems
Familiarize yourself with the visualization techniques and the user interfaces of the different systems. Each one has a tutorial that you should try out with a sample data set. Work your way through the tutorial and become familiar with the system, its interface and its capabilities.

2. Examine the sample data sets
Each tool includes a few sample data sets, but often it's best to learn with something new. Five data sets are supplied below for you to consider: foods' nutritional data (5976 items, 32 vars.), stocks (500 items, 30 vars.), baseball statistics (322 items, 24 vars.), college information (51 items, 22 vars.) and professors' salaries (1160 items, 16 vars.). You must work with the food nutrition data set, and you are free to pick one other set that is most interesting to you. Briefly scan the text of the files and familiarize yourself with the variables. Write down a few hypotheses to be considered, tasks to be performed, or questions to be asked about the data elements. Think about all the different kinds of analysis tasks that a person might want to perform in working with data sets such as these. For instance, someone working with a data set about breakfast cereals might have tasks like:

Try not to make all of your questions about correlations, which is a common tendency.
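
If you would rather scan the variables programmatically than read the raw text, a minimal Python sketch such as the following could help. It assumes the tab-delimited file has a single header row; the sample rows and column names in SAMPLE are invented for illustration and are not taken from the supplied data sets.

```python
import csv
import io

# Made-up sample in the same tab-delimited layout (header row, then one row per item).
SAMPLE = "name\tcalories\tfat_g\nApple\t52\t0.2\nBread\t265\t3.2\n"


def _is_number(text):
    """Return True if the text parses as a floating-point number."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def summarize_columns(tsv_text):
    """Map each column name to (kind, count of non-empty values)."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    columns = {name: [] for name in reader.fieldnames}
    for row in reader:
        for name in columns:
            value = (row.get(name) or "").strip()
            if value:
                columns[name].append(value)
    summary = {}
    for name, values in columns.items():
        # A column counts as numeric only if every non-empty value parses as a number.
        numeric = bool(values) and all(_is_number(v) for v in values)
        summary[name] = ("numeric" if numeric else "text", len(values))
    return summary


if __name__ == "__main__":
    for name, (kind, count) in summarize_columns(SAMPLE).items():
        print(name, kind, count)
```

To run this on one of the supplied files, read the file's text (for example with open(path).read(), where path is wherever you saved the download) and pass it to summarize_columns. Knowing which variables are numeric and which are categorical can suggest questions that go beyond correlations.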

3. Load the data sets into the systems and examine them
Load the nutrition data set and the other data set that you selected into each of the visualization tools that we're studying, then consider your hypotheses, tasks, and questions. Also use the systems to explore the data sets and see if you can discover other interesting or unexpected findings. Put yourself in the shoes of a data analyst, and consider questions that such a person would confront.

4. Write a report on your findings
Write up a summary of your findings. Include your hypotheses/tasks/questions and what you found. Furthermore, critique the different tools in a general sense. (Feel free to include screenshots to help explain your analyses and critiques.) What are the systems' strengths and weaknesses? How do their visualization capabilities differ? For what kinds of user tasks is each tool suited? Focus more here on the visualization techniques than on particular user interface quirks, though you should feel free to comment on UI aspects when they are particularly good or bad. Additionally, for each tool, list one unanticipated finding or discovery made with that tool, and include a screenshot that shows how the tool facilitated the finding.

Please prepare your write-up so that we can put it on our web site. Your document should be in PDF format and is limited to a maximum of 12 pages. Additionally, please bring two hard copies to class on the day that it is due.

Food (tab delimited)
Food (Excel)
Stocks (tab delimited)
Stocks (Excel)
Universities (tab delimited)
Universities (Excel)
Professor salaries (tab delimited)
Professor salaries (Excel)
Baseball (tab delimited)
Baseball (Excel)

Acknowledgments: Special thanks go out to Dominique Brodbeck and Luc Girardin of Macrofocus, Bill Wright of Oculus (formerly of Visible Decisions), Michael Spenke and Christian Beilken of the Fraunhofer Institute for Applied Information Technology, and Christopher Ahlberg of Spotfire for assisting with software acquisition for this assignment.