Comprehensive Measurement and Analysis of the User-Perceived I/O Performance in a Production Leadership-Class Storage System
Lipeng Wan, Matthew Wolf, Feiyi Wang, Jong Youl Choi, George Ostrouchov and Scott Klasky
Oak Ridge National Laboratory, Oak Ridge National Laboratory, Oak Ridge National Laboratory, Oak Ridge National Laboratory, Oak Ridge National Laboratory, Oak Ridge National Laboratory

With the increasing scale and intensity of the parallel I/O workloads generated by scientific applications running on high-performance computing (HPC) facilities, understanding I/O dynamics, especially the root cause of I/O performance variability and degradation in HPC environments, has become critical to the HPC community. In this paper, we run extensive I/O measurement tests on a production leadership-class storage system to capture the performance variability of large-scale parallel I/O. Analyzing these results and their statistical correlations reveals valuable insights into the characteristics of the storage system and the root cause of I/O performance variability. Further, we leverage these findings to propose a refactoring of the I/O middleware design that can improve parallel I/O performance by optimizing data striping and placement. Our preliminary evaluation results demonstrate the effectiveness of the proposed approach.