It makes sense that established chipmakers like Intel are trying to carve out a slice of the AI chip market, which is anticipated to be worth over $90 billion by 2025.
While Intel's engineers have been adapting the chipmaker's existing Xeon server processor lineup to better handle deep learning workloads, they found that competing properly with other players also required dedicated AI accelerator solutions.
Machine learning (ML) essentially works in two steps: training and inference. These two steps require different computational approaches and, ideally, different chip architectures, too. That's why Intel is developing two different chips, each dedicated to one of these two aspects of deep learning.
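To make the distinction concrete, here is a minimal, self-contained sketch of the two phases using a one-parameter linear model trained by gradient descent. The model, data, and learning rate are all illustrative inventions, not tied to Intel's hardware or software:

```python
# Sketch of the two ML phases: training (many iterative weight updates,
# compute-heavy, the NNP-T's target workload) and inference (a single
# cheap forward pass, the NNP-I's target workload).

def train(samples, lr=0.1, epochs=100):
    """Training: repeatedly adjust the weight to minimise squared error."""
    w = 0.0
    for _ in range(epochs):
        for x, y in samples:
            pred = w * x
            grad = 2 * (pred - y) * x  # derivative of (w*x - y)^2 w.r.t. w
            w -= lr * grad
    return w

def infer(w, x):
    """Inference: one forward pass with the already-trained weight."""
    return w * x

# Learn y = 3x from a few samples, then run inference on unseen input.
weight = train([(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)])
print(round(weight, 2))           # converges to 3.0
print(round(infer(weight, 4.0), 2))  # predicts 12.0
```

Training dominates the total compute (hundreds of passes over the data), while inference is a single multiply here; at data-centre scale that asymmetry is exactly why the two workloads get different silicon.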
A statue of Intel's logo. Image courtesy of Flickr.
The training part of the ML process will be handled by Intel's Nervana NNP-T chip (the "T" stands for "training"). This system on chip (SoC) supports up to 24 tensor processing clusters and features on-die static RAM, built-in networking, 60MB of on-chip memory, and 4x8GB of high-bandwidth memory (HBM2), resulting in up to 119 tera operations per second (TOPS).
The chip isn't actually being produced by Intel itself: manufacturing is instead being outsourced to TSMC. It's built on a 16nm process, packs 27 billion transistors onto a 690mm² die, and can reach core frequencies of up to 1.1GHz. The SoC connects over a x16 PCIe 4.0 link, offers an aggregate chip-to-chip bandwidth of 3.6 terabits per second, and is expected to consume between 150 and 250W of power.
The product is designed from the ground up to handle AI-specific tasks, where it should greatly outperform current systems, but it won't replace traditional processors for the majority of computing tasks. Instead, both the NNP-T and NNP-I chips are meant to work in tandem with existing CPUs.
To deal with the inference side of the equation, Intel is developing its Nervana NNP-I SoC (the "I" stands for "inference"). This chip is being produced in-house, on Intel's own 10-nanometre process. It supports up to four 64GB LPDDR4X memory modules and features a shared 24MB L3 cache, two Ice Lake CPU cores, and 12 ICE (inference compute engine) cores that can work independently of one another.
Each ICE core is capable of delivering up to 4.8 TOPS. The SoC has a DRAM bandwidth of up to 68GB/s, can be plugged into an M.2 slot, and draws between 10 and 50W of power.
Despite all of the above, a powerful and efficient AI chip is worth nothing without good software support. That's why Intel is working on optimising existing open-source libraries, developing its own deep learning library for Apache Spark and Hadoop clusters (called BigDL), and offering its own distribution of the OpenVINO toolkit, which helps optimise pre-trained ML models.
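One of the main tricks inference toolkits apply to pre-trained models is post-training quantisation: converting float weights to low-precision integers to cut memory traffic and speed up arithmetic. The sketch below illustrates the idea in plain Python; it is not OpenVINO's actual API, and the weight values are made up:

```python
# Hypothetical illustration of symmetric 8-bit post-training quantisation,
# the kind of model optimisation inference toolkits perform. Not the
# OpenVINO API, just the underlying idea.

def quantize_int8(weights):
    """Map float weights into the int8 range via a single linear scale."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights to measure the error introduced."""
    return [v * scale for v in q]

w = [0.5, -1.27, 0.03, 1.27]
q, scale = quantize_int8(w)
approx = dequantize(q, scale)
print(q)  # integers in [-128, 127], 4x smaller than float32 weights
print(max(abs(a - b) for a, b in zip(w, approx)))  # small rounding error
```

The per-weight error is bounded by half the scale step, which is why quantised models usually lose little accuracy while fitting in far less memory, a property that matters for a 10-50W part like the NNP-I.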
Intel's goal is to make the transition to its hardware easy for developers by ensuring that its NNP chips integrate seamlessly with existing tools and libraries.
Intel is already supplying its chips to partners like Baidu and Facebook and is planning to make them broadly available in 2020.
Even though Intel is one of the last major players to develop dedicated AI-processing hardware, its newest offerings might just close the gap between itself and its early-bird competitors. After all, by developing both training and inference chips, Intel is offering a complete AI processing solution, which, not least thanks to its decades' worth of experience, could sway a large portion of the market into opting for its chips.