As a platform for processing big data across multiple machines, Hadoop relies heavily on the network for its performance. Despite this, the behaviour of network traffic in Hadoop clusters is still poorly understood. This lack of understanding makes it difficult to explore and evaluate network-based Hadoop innovations. In this paper, we explore Hadoop traffic and present Keddah, a toolchain for capturing, modelling and reproducing Hadoop traffic for use in simulators. This paper provides researchers with an understanding of Hadoop network activity as well as the means to recreate the traffic for profiling network components.