It can be hard to imagine in today’s commodity landscape, in which every smartphone, tablet, laptop, desktop and server seems to run on a small set of fairly similar chips, but there was once an era of far greater diversity in the hardware forming the “heart” of our computing world. This legacy has historically lived on in the rarefied world of supercomputing, with all manner of exotic hardware used to squeeze out every available instruction cycle. As gaming and high-intensity graphics workloads proliferated, specialized accelerator hardware in the form of GPUs became similarly standardized. The rise of deep learning, with its exponentially more complex and specialized workloads, has brought us a new renaissance in specialized hardware, this time with companies building their own chips to achieve every bit of possible efficiency and speed.
From Google’s TPUs to Amazon’s Inferentia chip, it is almost a requirement for modern cloud companies to build their own deep learning accelerators based on their experiences at the edge of the AI frontier. These companies largely restrict their creations to in-house use, though Google has externalized a “light” version of its TPUs for outside use in the form of its Edge TPU.
Given its legacy in GPU design, Nvidia has long been a force in the deep learning community and has moved aggressively from repurposed GPUs to building dedicated deep learning machines.
Yet, if my inbox is any indicator, companies everywhere are popping up with their own custom silicon designs and creative assemblies of commercial off-the-shelf (COTS) hardware to speed the training and inferencing processes ever further and allow the creation of ever-larger models.
Simultaneously, as Google’s Edge TPU release reminds us, AI is increasingly moving to the edge. We are witnessing a spectrum of AI applications with different needs in terms of balancing accuracy, power consumption and mobility.
Applications that require maximal accuracy at all costs still outsource their inferencing to the cloud, using wired, cellular or Wi-Fi connectivity to stream input data to GCP, AWS and other cloud vendors for processing.
This is less practical for continuous live video processing with limited available bandwidth, such as a remote surveillance camera network. Such deployments require inferencing models to execute entirely onboard the device.
Some utilize a hybrid model where onboard filtering is used to identify candidates for more sophisticated remote processing.
Power consumption is a primary limiting factor for such applications, requiring a careful balance between raw processing ability, electricity consumed and heat generated.
Autonomous vehicles place particularly demanding requirements upon deep learning hardware: fixed-window deadlines, high accuracy, high-bandwidth inputs and low power consumption, with all of it running entirely onboard the vehicle rather than being livestreamed back to the cloud for processing.
Tesla has become the latest company to roll out its own proprietary deep learning hardware, touting its new chips as the future of its autonomous driving capabilities and sparking a war of words with former vendor Nvidia.
Putting this all together, the proliferation of new AI hardware and the growing number of companies producing their own inferencing equipment remind us just how much of a Wild West the deep learning space is at the moment, with so many unknowns and a development cycle that can render today’s bleeding-edge chips next year’s obsolete designs.
In the end, as more and more of the work pushing the boundaries of deep learning occurs in the cloud, the best option for most companies may be simply to outsource their AI needs to the vast commercial cloud. There they can leverage everything from the latest-generation Nvidia GPUs to bleeding-edge designs like TPUs and Inferentia chips, as well as the mobile hardware those companies release to enable increasingly seamless transitions to the low-power mobile environment.
In some cases, the cloud even makes it possible to use the same training workflow to build a maximal accuracy model for the cloud and a minimized low-power version for mobile with just a mouse click, using the exact same tools and training data.
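A key ingredient behind that one-click cloud-to-mobile export is typically post-training quantization. As a rough, vendor-neutral illustration only (the function names here are hypothetical, not any cloud provider’s actual API), here is a minimal sketch of how float32 weights are shrunk to 8-bit integers for a low-power mobile variant:

```python
# Illustrative sketch of post-training weight quantization, the kind of
# transformation cloud tooling applies when exporting a "low-power"
# mobile version of a trained model. Names here are hypothetical.

def quantize_weights(weights, num_bits=8):
    """Map float weights to signed integers plus a scale factor.

    Storing int8 instead of float32 cuts model size roughly 4x and lets
    mobile accelerators use cheaper integer arithmetic.
    """
    qmax = 2 ** (num_bits - 1) - 1            # 127 for 8-bit signed
    scale = max(abs(w) for w in weights) / qmax or 1.0
    quantized = [round(w / scale) for w in weights]
    return quantized, scale

def dequantize(quantized, scale):
    """Recover approximate float weights from the integer form."""
    return [q * scale for q in quantized]

weights = [0.82, -1.27, 0.05, 0.3, -0.9]
q, scale = quantize_weights(weights)
approx = dequantize(q, scale)
# Each recovered weight is within one quantization step of the original,
# which is why accuracy usually drops only slightly after quantization.
assert all(abs(a - w) <= scale for a, w in zip(approx, weights))
```

Real toolchains go further (activation quantization, calibration data, hardware-specific operator sets), but the trade-off is the same: a small, bounded loss of precision in exchange for a model that fits the power and memory budget of an edge device.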
Most importantly, as deep learning remains a fluid and rapidly evolving space, the cloud offers insulation against this change, allowing companies to build and leverage state-of-the-art models without worrying about the underlying hardware. Much like ordinary CPU-based cloud virtual machines that are constantly transitioned to newer and better hardware without requiring any modification, deep learning in the cloud benefits from enormous investments by the cloud companies in abstracting as much as possible away from the challenges of running on ever-improving accelerators. For example, code built for Google’s previous generation TPU accelerators can run unmodified on their newest generation hardware, often achieving considerable performance improvements without changing a line of code, recompiling or rebuilding anything.
For those sitting at the edge of deep learning, today’s cloud offers perhaps the best environment for AI work, from the world’s most powerful inferencing hardware to execution stability in the face of constant hardware improvements to a seamless pipeline from the cloud to the edge to leading open development software environments.
Perhaps most importantly of all, the cloud is where the world’s leading AI companies themselves do all their work, meaning companies building their AI empires in the cloud are able to leverage the collective advances literally driving the deep learning world.
Read More: Forbes