| Due February 5 | CS 7450 - Information Visualization | Spring 2002 |
This assignment will familiarize you with a number of systems that have been built for analyzing multivariate data sets. You will be working with the Spotfire Pro tool from Spotfire, the SeeIt tool from Visual Insights, the Eureka tool from Inxight, and the InfoZoom tool from HumanIT. Spotfire and SeeIt are installed on the Windows NT cluster in the College of Computing. You must download Eureka and InfoZoom. You will receive email about how to download Eureka. InfoZoom can be found at www.humanit.com/en/products/infozoom/download.html.
The goals of the assignment are for you to learn the capabilities provided by these types of systems, learn the basic visualization methods that they provide, and assess their utility in analyzing information repositories. You will work with some provided data sets in the assignment. Think about the kinds of questions that an analyst would be asking about the data sets.
The assignment has four parts:
1. Gain familiarity with the systems
Familiarize yourself with the visualization techniques and the user
interfaces of the different systems. Each one has a tutorial that you can try
out with a sample data set. Work your way through the tutorial and
become familiar with the system, its interface and its capabilities.
2. Examine the sample data sets
Each tool includes a few sample data sets, but often it's best to
learn with something new. We have supplied five data sets with the
assignment for you to consider: Cereals (78 items, 15 vars.), Mutual
funds (987 items, 14 vars.), Cars (407 items, 10 vars.), Films (1742
items, 10 vars.) and Grocery store surveys (5164 items, 8 vars.).
Pick the two that are most interesting to you to use in the
assignment. Briefly scan the text of the files and familiarize
yourself with the variables. Write down a few hypotheses to be
considered about the data elements. Recall all the different kinds of
analysis tasks that we discussed last week (outliers, correlations,
clusters, trends, associations, presentation, etc.).
3. Load and examine the data sets in the systems
Load the two data sets that you selected into each of the
visualization tools that we're studying. Test your hypotheses from
part 2. Can you confirm or deny the hypotheses? Also use the systems
to explore the data sets and see if you can uncover other interesting
or unexpected attributes of the data sets.
Put yourself in the
shoes of a data analyst, and consider questions that person would
confront.
4. Write a report on your findings
Write up a summary of your findings. Include your hypotheses and what
you confirmed/denied. Also report on any other interesting findings
that you made. Furthermore, critique the different tools in a general
sense. What are their
strengths and weaknesses? How do their visualization capabilities
differ? For what kinds of user tasks is each tool suited? Focus more
here on the visualization techniques as opposed to the particular user
interface quirks, though you should feel free to comment on UI aspects
when they are particularly good or bad. Include screenshots to help
explain your analyses and critiques.
Please prepare your write-up in HTML format so that we can put it on our web site. Additionally, please bring two hard copies to class on the day that it is due.
Cars (comma separated text)
Cars (Excel)
Cereals (comma separated text)
Cereals (Excel)
Films (comma separated text)
Films (Excel)
Grocery store survey (comma separated text)
Grocery store survey (Excel)
Mutual funds (comma separated text)
Mutual funds (Excel)
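When scanning one of the comma separated files to get familiar with its variables (part 2), a short script can list each variable along with a rough summary. The following is only a minimal sketch in Python, not part of the assignment or of any of the tools. It assumes the first row of the file holds the variable names, which may not be true of the supplied files, and the sample rows and column names in it are made up for illustration.

```python
import csv
import io


def summarize_columns(lines):
    """Summarize each column of comma-separated text.

    Assumption: the first row is a header row with the variable names.
    Returns a dict mapping column name to a small summary dict.
    """
    reader = csv.DictReader(lines)
    values = {name: [] for name in (reader.fieldnames or [])}
    for row in reader:
        for name in values:
            cell = (row.get(name) or "").strip()
            if cell:  # skip empty cells
                values[name].append(cell)

    summary = {}
    for name, cells in values.items():
        try:
            numbers = [float(c) for c in cells]
        except ValueError:
            numbers = None  # at least one cell is not a number
        if numbers:
            summary[name] = {
                "type": "numeric",
                "count": len(numbers),
                "min": min(numbers),
                "max": max(numbers),
            }
        else:
            summary[name] = {
                "type": "categorical",
                "count": len(cells),
                "distinct": len(set(cells)),
            }
    return summary


# Hypothetical sample data; these are not rows from the supplied files.
SAMPLE = """name,calories,shelf
Alpha Flakes,110,1
Bran Bits,70,3
Corn Pops,110,2
"""

if __name__ == "__main__":
    for column, info in summarize_columns(io.StringIO(SAMPLE)).items():
        print(column, info)
```

To try it on a real file, pass an open file object instead of the sample text, for example `summarize_columns(open("cars.csv", newline=""))`, where `cars.csv` is a hypothetical local file name. Knowing the range of each numeric variable and the number of distinct values of each categorical one can suggest hypotheses to test in the visualization tools.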
Acknowledgments: Special thanks to Christopher Ahlberg of Spotfire and Ramana Rao of Inxight for assisting us with software acquisition for this assignment.