Agar: A Caching System for Erasure-Coded Data
Raluca Halalai, Pascal Felber, Anne-Marie Kermarrec and François Taïani
University of Neuchâtel, University of Neuchâtel, INRIA, IRISA

Erasure coding is an established data protection mechanism. It provides high resiliency with low storage overhead, which makes it very attractive to storage system developers. Unfortunately, when used in a distributed setting, erasure coding hampers a storage system's performance, because it requires clients to contact several, possibly remote, sites in order to retrieve their data. This has hindered the adoption of erasure coding in practice, limiting its use to cold, archival data. Recent research showed that it is feasible to use erasure coding for hot data as well, thus opening up new perspectives for improving erasure-coded storage systems. In this paper, we address the problem of minimizing access latency in erasure-coded storage. We propose Agar, a novel caching system tailored for erasure-coded content. Agar optimizes the contents of the cache based on live information regarding data popularity and access latency to different data storage sites. Our system adapts a dynamic programming algorithm to optimize the choice of data blocks that are cached, using an approach akin to Knapsack algorithms. We compare Agar to the classical Least Recently Used and Least Frequently Used cache eviction policies, while varying the amount of data cached per object from a single data chunk to a whole replica of the object. We show that Agar can achieve 16% to 41% lower latency than systems that use classical caching policies.
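To make the Knapsack analogy concrete, the following is a minimal sketch of a generic multiple-choice knapsack solved with dynamic programming. It is not Agar's actual algorithm, whose details are not given in this abstract. The function name `choose_cache_config`, the input layout, and the idea of scoring each option as popularity times latency saved are illustrative assumptions.

```python
from typing import Dict, List, Tuple

# Hypothetical input: for each object, a list of caching options.
# Each option is (chunks_cached, value), where value is an assumed score
# such as popularity * latency saved by caching that many chunks.
Options = Dict[str, List[Tuple[int, float]]]


def choose_cache_config(options: Options, capacity: int) -> Tuple[float, Dict[str, int]]:
    """Pick at most one option per object so that the total number of
    cached chunks fits in `capacity` and the total value is maximal."""
    # best[c] = (total value, chosen chunks per object) using at most c slots
    best: List[Tuple[float, Dict[str, int]]] = [(0.0, {}) for _ in range(capacity + 1)]
    for obj, opts in options.items():
        # Start from "cache nothing for this object"
        new_best = list(best)
        for c in range(capacity + 1):
            for chunks, value in opts:
                if chunks <= c:
                    prev_value, prev_choice = best[c - chunks]
                    candidate = prev_value + value
                    if candidate > new_best[c][0]:
                        new_best[c] = (candidate, {**prev_choice, obj: chunks})
        best = new_best
    return best[capacity]


# Illustrative values only; they are not measurements from the paper.
example = {"a": [(1, 5.0), (3, 9.0)], "b": [(2, 8.0)]}
total_value, config = choose_cache_config(example, capacity=3)
```

In this toy instance, caching one chunk of `a` and two chunks of `b` fills the three available slots and scores higher than caching all three chunks of `a`, which illustrates why partially caching several objects can beat fully caching one.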