Jigsaw: Visual Analytics for Exploring and Understanding Document Collections

Example Jigsaw Datafiles
Below is a collection of example Jigsaw datafiles. The Jigsaw datafile format is simple XML in which each document is represented by a node. Each node contains the document's metadata, then the text of the document, and finally its associated entities. These files use the ".jig" suffix. Please feel free to use these datafiles in your own research. We simply ask that you acknowledge where you acquired them.

  • Autism webpages - Top ~500 web pages about autism
  • Bible - King James version of the Bible
  • CHI papers - The title, abstract and metadata for every ACM CHI conference paper from 1999 to 2010.
  • InfoVis and VAST papers - The title, abstract and metadata for every IEEE InfoVis and VAST conference paper from 1995 to 2017 (List of concept terms identified in titles and abstracts)
  • NSF awards - Award information including title, summary, and metadata from 2000-2009 (special thanks to Remco Chang for the original scrape of these)
    CISE/CCF - CISE Computing and Communication Foundations (keywords identified within CISE awards)
    CISE/CNS - CISE Computer and Network Systems
    CISE/IIS - CISE Interactive and Intelligent Systems
    EHR/DGE - EHR Graduate Education
    EHR/DRL - EHR Research on Learning in Formal and Informal Settings
    EHR/DUE - EHR Undergraduate Education
    EHR/HRD - EHR Human Resource Development
    ENG/IIP - ENG Industrial Innovation and Partnership
  • 9/11 Report - Each page as a separate document
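
As a rough illustration of the datafile layout described above (one node per document, holding metadata, text, and then entities), the following Python sketch reads such a file into plain dictionaries. The element names used here (`documents`, `document`, `docID`, `docSource`, `docText`, `person`, `location`) and the sample content are assumptions made for the example, not a statement of the actual Jigsaw schema; check a real ".jig" file and adjust the tag names accordingly.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample. The element names are assumptions for illustration,
# not the documented Jigsaw schema.
SAMPLE = """<documents>
  <document>
    <docID>doc-1</docID>
    <docSource>example</docSource>
    <docText>Alice met Bob in Atlanta.</docText>
    <person>Alice</person>
    <person>Bob</person>
    <location>Atlanta</location>
  </document>
</documents>"""

METADATA_TAGS = {"docID", "docSource"}  # assumed metadata element names
TEXT_TAG = "docText"                    # assumed text element name


def parse_jig(xml_text, doc_tag="document"):
    """Return one dict per document node: metadata, text, and entities."""
    root = ET.fromstring(xml_text)
    docs = []
    for node in root.iter(doc_tag):
        doc = {"metadata": {}, "text": "", "entities": {}}
        for child in node:
            value = (child.text or "").strip()
            if child.tag in METADATA_TAGS:
                doc["metadata"][child.tag] = value
            elif child.tag == TEXT_TAG:
                doc["text"] = value
            else:
                # Any other child is treated as an entity, grouped by its tag.
                doc["entities"].setdefault(child.tag, []).append(value)
        docs.append(doc)
    return docs


docs = parse_jig(SAMPLE)
print(docs[0]["metadata"]["docID"], docs[0]["entities"])
```

To read a downloaded file instead of the inline sample, the file's contents can be passed to `parse_jig` as a string. Treating every unrecognized child element as an entity keeps the sketch independent of the exact set of entity types in a given datafile.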

 

Last modified: November 4, 2013