Bridging the Gap Between Deep Learning Models and “the Metal” with Apache TVM & VTA
Luis Ceze, Professor @ UW-CSE, Co-founder and CEO @ OctoML (https://homes.cs.washington.edu/~luisceze/)
Room: Emerald 1
There is an increasing need to bring machine learning to a wide diversity of hardware devices. Current frameworks rely on vendor-specific operator libraries and optimize for a narrow range of server-class GPUs. Deploying workloads to new platforms, such as mobile phones, embedded devices, and accelerators (e.g., FPGAs, ASICs), requires significant manual effort. In this talk I will present our work on the TVM stack, which exposes graph-level and operator-level optimizations to provide performance portability to deep learning workloads across diverse hardware back-ends. TVM solves optimization challenges specific to deep learning, such as high-level operator fusion, mapping to arbitrary hardware primitives, and memory latency hiding. It also automates the optimization of low-level programs to hardware characteristics by employing a novel, learning-based cost modeling method for rapid exploration of code optimizations. Changes in algorithms, models, operators, or numerical systems threaten the viability of specialized hardware accelerators. To address this threat, we developed VTA, a programmable deep learning architecture template tightly coupled to TVM. VTA achieves this flexibility via a parametrizable architecture, a two-level ISA, and a JIT compiler. TVM/VTA was recently incubated as an Apache Software Foundation project and is benefiting from a thriving community of developers.
Luis Ceze is a Professor in the Paul G. Allen School of Computer Science and Engineering at the University of Washington, Co-founder and CEO at OctoML, and Venture Partner at Madrona Venture Group. His research focuses on the intersection of computer architecture, programming languages, machine learning, and biology. His current focus is on approximate computing for efficient machine learning and DNA-based data storage. He co-directs the Molecular Information Systems Lab (MISL), the Systems and Architectures for Machine Learning lab (SAML), and the Sampa Lab for HW/SW co-design. He has co-authored over 100 papers in these areas and has had several papers selected as IEEE Micro Top Picks and CACM Research Highlights. His research has been featured prominently in the media, including the New York Times, Popular Science, MIT Technology Review, and the Wall Street Journal, among others. He is a recipient of an NSF CAREER Award, a Sloan Research Fellowship, a Microsoft Research Faculty Fellowship, the IEEE TCCA Young Computer Architect Award, and a UIUC Distinguished Alumni Award.