Intel Ponte Vecchio Early Silicon Puts Out 45 TFLOPs FP32 at 1.37 GHz, Already Beats NVIDIA A100 and AMD MI100
Intel, in its 2021 Architecture Day presentation, shared fine technical details of its Xe HPC Ponte Vecchio accelerator, including some [very] preliminary performance claims for its current A0-silicon-based prototype. The prototype operates at 1.37 GHz, yet achieves at least 45 TFLOPs of FP32 throughput. We calculated the clock speed using simple math. Intel obtained the 45 TFLOPs number on a machine running a single Ponte Vecchio OAM (one MCM with two stacks) and a Xeon "Sapphire Rapids" CPU. At 45 TFLOPs, the processor already beats the advertised 19.5 TFLOPs of the NVIDIA "Ampere" A100 Tensor Core 40 GB processor. AMD isn't faring any better, with its production Instinct MI100 processor offering only 23.1 TFLOPs FP32.
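The clock-speed estimate can be reproduced with a rough back-of-the-envelope sketch. The article does not spell out the inputs, so the figures below are assumptions: 128 Xe-cores for a two-stack part, and 256 FP32 operations per clock per Xe-core (8 vector engines x 16 FP32 lanes x 2 for a fused multiply-add). The function and constant names are illustrative only.

```python
# Assumed configuration (not stated in the article): a two-stack Ponte Vecchio
# with 128 Xe-cores, each retiring 256 FP32 operations per clock.
XE_CORES = 128
FP32_OPS_PER_CORE_PER_CLOCK = 256  # 8 vector engines * 16 lanes * 2 (FMA)


def implied_clock_ghz(tflops: float, ops_per_clock: int) -> float:
    """Clock in GHz needed to reach `tflops` at `ops_per_clock` FP32 ops/cycle."""
    return tflops * 1e12 / ops_per_clock / 1e9


def speedup(tflops: float, baseline_tflops: float) -> float:
    """Ratio of one peak FP32 throughput figure to another."""
    return tflops / baseline_tflops


if __name__ == "__main__":
    ops_per_clock = XE_CORES * FP32_OPS_PER_CORE_PER_CLOCK
    clock = implied_clock_ghz(45.0, ops_per_clock)
    print(f"Implied clock: {clock:.2f} GHz")
    print(f"vs A100 (19.5 TFLOPs): {speedup(45.0, 19.5):.2f}x")
    print(f"vs MI100 (23.1 TFLOPs): {speedup(45.0, 23.1):.2f}x")
```

Under these assumptions the implied clock comes out at about 1.37 GHz, and the 45 TFLOPs figure works out to roughly 2.3x the A100's and 1.9x the MI100's advertised FP32 numbers. These are peak-throughput comparisons only and say nothing about sustained performance.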