Intel Ponte Vecchio Early Silicon Puts Out 45 TFLOPs FP32 at 1.37 GHz, Already Beats NVIDIA A100 and AMD MI100
Intel, in its 2021 Architecture Day presentation, laid out fine technical details of its Xe HPC Ponte Vecchio accelerator, along with some [very] preliminary performance claims for its latest A0-silicon-based prototype. The prototype runs at 1.37 GHz, yet already puts out at least 45 TFLOPs of FP32 throughput. We calculated the clock speed with simple math, working back from the throughput figure. Intel obtained the 45 TFLOPs number on a machine running a single Ponte Vecchio OAM (one MCM with two stacks) and a Xeon "Sapphire Rapids" CPU. At 45 TFLOPs, the processor already beats the advertised 19.5 TFLOPs of the NVIDIA "Ampere" A100 Tensor Core 40 GB processor. AMD isn't faring any better, with its production Instinct MI100 processor offering only 23.1 TFLOPs FP32.
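
The clock-speed estimate can be reproduced with a short sketch. The article does not state the shader count, so the configuration below (128 Xe-cores, 8 vector engines per core, 16 FP32 lanes per 512-bit engine, one fused multiply-add counted as 2 operations per lane per clock) is an assumption drawn from Intel's published Xe HPC description, and the function and constant names are purely illustrative.

```python
# Back-of-the-envelope clock estimate from a peak FP32 throughput figure.
# ASSUMPTION (not stated in the article): lane count of the full Ponte Vecchio part.
XE_CORES = 128              # assumed Xe-cores across both stacks
VECTOR_ENGINES_PER_CORE = 8 # assumed vector engines per Xe-core
FP32_LANES_PER_ENGINE = 16  # 512-bit vector / 32-bit floats
PVC_FP32_LANES = XE_CORES * VECTOR_ENGINES_PER_CORE * FP32_LANES_PER_ENGINE


def implied_clock_ghz(tflops, fp32_lanes, ops_per_lane_per_clock=2):
    """Clock in GHz implied by a peak TFLOPs figure (FMA = 2 ops per clock)."""
    ops_per_clock = fp32_lanes * ops_per_lane_per_clock
    return tflops * 1e12 / ops_per_clock / 1e9


clock = implied_clock_ghz(45.0, PVC_FP32_LANES)
print(f"Implied clock: {clock:.2f} GHz")

# Relative FP32 throughput versus the figures quoted in the article.
for name, rival_tflops in [("NVIDIA A100", 19.5), ("AMD Instinct MI100", 23.1)]:
    print(f"vs {name}: {45.0 / rival_tflops:.2f}x")
```

Under these assumptions the result lands at roughly 1.37 GHz, matching the figure above. Since Intel describes 45 TFLOPs as a minimum, the real clock or the active lane count of the prototype may differ.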