Recent advances in deep learning have enabled researchers across many disciplines to uncover new insights about large datasets. Deep neural networks have shown applicability to image, time-series, textual, and other data, all of which are available in a plethora of research fields. However, their computational complexity and large memory overhead requires advanced software and hardware technologies to train neural networks in a reasonable amount of time. To make this possible, there has been an influx in development of deep learning software that aim to leverage advanced hardware resources. In order to better understand the performance implications of deep learning frameworks over these different resources, we analyze the performance of three different frameworks, Caffe, TensorFlow, and Apache SINGA, over several hardware environments. This includes scaling up and out with single- and multi-node setups using different CPU and GPU technologies. Notably, we investigate the performance characteristics of NVIDIA’s state-of-the-art hardware technology, NVLink, and also Intel’s Knights Landing, the most advanced Intel product for deep learning, with respect to training time and utilization. To our best knowledge, this is the first work concerning deep learning bench-marking with NVLink and Knights Landing. Through these experiments, we provide analysis of the frameworks’ performance over different hardware environments in terms of speed and scaling. As a result of this work, better insight is given towards both using and developing deep learning tools that cater to current and upcoming hardware technologies.