TITLE: Hardware Accelerators for Deep Learning: A Proving Ground for Specialization
The computing industry has a power problem: the days of ideal transistor scaling are over, and chips now have more devices than can be fully powered simultaneously, limiting performance. New architecture-level solutions are needed to continue scaling performance, and specialized hardware accelerators are one such solution. While accelerators promise to provide orders of magnitude more performance per watt, several challenges have limited their wide-scale adoption.
Deep learning has emerged as a proving ground for hardware acceleration. With its extremely regular compute patterns and widespread use, deep learning is an ideal target: if accelerators cannot work here, there is little hope for them elsewhere. For accelerators to be a viable solution, they must enable computation that cannot be done today and demonstrate mechanisms for performance scaling, so that they are not a one-off solution. This talk will present deep learning algorithm-hardware co-designs that address these challenges and quantify the efficiency gap between standard hardware design practices and full-stack co-design, enabling deep learning to be used with little restriction. To push the efficiency limits further, this talk will introduce principled unsafe optimizations. A principled unsafe optimization changes how a program executes without impacting accuracy: by breaking the contract between the algorithm, architecture, and circuits, efficiency can be greatly improved. To conclude, future research directions centered on hardware specialization will be presented: accelerator-centric architectures and privacy-preserving cloud computing.
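As a hypothetical illustration of the idea behind principled unsafe optimizations (not code from the talk): one well-known example of trading bit-exact execution for efficiency is reducing numeric precision. The sketch below quantizes the weights and activations of a toy layer to 8 bits; the output is no longer bit-identical to the floating-point result, but the deviation is small enough for a robust DNN to absorb. All names and values here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(0.0, 0.1, size=(10, 1000))  # toy layer weights
x = rng.normal(0.0, 1.0, size=1000)        # toy input activations

def quantize(v, bits=8):
    # Uniform symmetric quantization to `bits` bits, then dequantize.
    scale = np.max(np.abs(v)) / (2 ** (bits - 1) - 1)
    return np.round(v / scale) * scale

y_exact = W @ x
y_quant = quantize(W) @ quantize(x)

# The contract of bit-exact arithmetic is broken, but the output error is
# tiny relative to the signal -- the kind of slack a co-designed
# algorithm/circuit stack can exploit for efficiency.
rel_err = np.linalg.norm(y_exact - y_quant) / np.linalg.norm(y_exact)
print(f"relative output error at 8 bits: {rel_err:.3%}")
```

The point of the sketch is the asymmetry it exposes: the hardware saves substantially by narrowing datapaths and memories, while the algorithm's output moves only fractionally.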
Brandon Reagen is a computer architect focused on specialized hardware (i.e., accelerators) with applications in deep learning. He received his Ph.D. from Harvard in May 2018. Over the course of his Ph.D., Reagen made several research contributions to lower the barrier to using accelerators as general architectural constructs, including benchmarking, simulation infrastructure, and SoC design. Using his knowledge of accelerator design, he led the way in highly efficient and accurate deep learning accelerator design with his work on principled unsafe optimizations. In his thesis, he found that for DNN inference, intricate full-stack co-design between the algorithm's robustness and the circuits it executes on can yield nearly an order of magnitude greater power efficiency than standard ASIC design practices. His work has been published in conferences spanning architecture, ML, CAD, and circuits. Reagen is now a research scientist at Facebook on the AI Infrastructure team.