Tornado Codes for Archival Storage
Matthew Woitaszek
University of Colorado in Boulder

This presentation examines the application of Tornado Codes, a class of low-density parity-check (LDPC) erasure codes, to archival storage systems based on massive arrays of idle disks (MAID). The fault tolerance of Tornado Code graphs is analyzed, and it is shown that worst-case failure scenarios in small (96-node) graphs can be identified and mitigated through simulations that find and eliminate critical node sets, which can cause Tornado Codes to fail even when almost all blocks are present. The resulting graph construction procedure is then used to build a 96-device Tornado Code stripe storage system that has capacity overhead equivalent to RAID 10 and tolerates any 4 device failures. This system is demonstrated to be superior to parity-based RAID.

After establishing the fault tolerance of Tornado Codes, a log-structured, extent-based file system is constructed on top of the Tornado Coded stripe storage. The file system is combined with a MAID simulator to emulate the behavior of a large-scale storage system, with the goal of employing Tornado Codes to increase fault tolerance while minimizing power consumption. The effect of power conservation constraints and design choices on system throughput is examined, and a policy of placing multiple data nodes on a single device is shown to increase read throughput with an identifiable decrease in fault tolerance. Finally, the system is implemented on a 100 TB Lustre storage cluster, providing GridFTP-accessible storage with higher reliability and availability than the underlying storage architecture.

Brief Biography: Matthew Woitaszek is a member of the Research Systems Evaluation Team at the National Center for Atmospheric Research in Boulder, CO. His research interests include high-performance systems, with an emphasis on fault-tolerant storage. He received his M.S. in Computer Engineering from the Rochester Institute of Technology and expects to receive his Ph.D. in Computer Science from the University of Colorado, Boulder, in May 2007.
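
As an illustration of the worst-case analysis described in the abstract, the following is a minimal sketch of how a search for critical node sets might look. It is not the speaker's simulator. It assumes a single-level parity graph (real Tornado Codes cascade several levels), a simple iterative "peeling" decoder, and a hypothetical 8-node toy graph. The names `CONSTRAINTS`, `recoverable`, and `smallest_critical_sets` are invented for this example.

```python
from itertools import combinations

# Hypothetical toy graph (an assumption, not from the talk):
# nodes 0-3 hold data, nodes 4-7 hold parity.
# Each tuple is one parity constraint: the XOR of its members is zero.
CONSTRAINTS = [
    (0, 1, 4),
    (1, 2, 5),
    (2, 3, 6),
    (0, 3, 7),
]
NUM_NODES = 8


def recoverable(constraints, lost):
    """Peeling decoder: return True if every lost node can be rebuilt."""
    missing = set(lost)
    progress = True
    while missing and progress:
        progress = False
        for members in constraints:
            unknown = [node for node in members if node in missing]
            # A constraint with exactly one missing member can rebuild it
            # by XOR-ing the members that are still present.
            if len(unknown) == 1:
                missing.discard(unknown[0])
                progress = True
    return not missing


def smallest_critical_sets(constraints, num_nodes, max_size):
    """Exhaustively find the smallest loss patterns that defeat the decoder."""
    for size in range(1, max_size + 1):
        failing = [
            lost
            for lost in combinations(range(num_nodes), size)
            if not recoverable(constraints, lost)
        ]
        if failing:
            return size, failing
    return None, []


size, failing = smallest_critical_sets(CONSTRAINTS, NUM_NODES, max_size=4)
print(size, failing)
```

On this toy graph the smallest failing sets contain three nodes, each consisting of one data node plus the two parity nodes that cover it, so the graph survives any two losses. A graph construction loop in this spirit would reject or rewire candidate graphs whose smallest critical set is too small. For a 96-node graph, checking every 4-node loss pattern this way means testing about 3.3 million subsets.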