AMD Announces World’s Fastest HPC Accelerator for Scientific Research¹

AMD Instinct™ MI100 accelerators revolutionize high-performance computing (HPC) and AI with industry-leading compute capabilities

First GPU accelerator with new AMD CDNA architecture engineered for the exascale era

SANTA CLARA, Calif., Nov. 16, 2020 (GLOBE NEWSWIRE) — AMD (NASDAQ: AMD) today announced the new AMD Instinct™ MI100 accelerator – the world’s fastest HPC GPU and the first x86 server GPU to surpass the 10 teraflops (FP64) performance barrier.1 Supported by new accelerated compute platforms from Dell, Gigabyte, HPE, and Supermicro, the MI100, combined with AMD EPYC™ CPUs and the ROCm™ 4.0 open software platform, is designed to propel new discoveries ahead of the exascale era.

Built on the new AMD CDNA architecture, the AMD Instinct MI100 GPU enables a new class of accelerated systems for HPC and AI when paired with 2nd Gen AMD EPYC processors. The MI100 offers up to 11.5 TFLOPS of peak FP64 performance for HPC and up to 46.1 TFLOPS peak FP32 Matrix performance for AI and machine learning workloads.2 With new AMD Matrix Core technology, the MI100 also delivers a nearly 7x boost in FP16 theoretical peak floating point performance for AI training workloads compared to AMD’s prior generation accelerators.3

“Today AMD takes a major step forward in the journey toward exascale computing as we unveil the AMD Instinct MI100 – the world’s fastest HPC GPU,” said Brad McCredie, corporate vice president, Data Center GPU and Accelerated Processing, AMD. “Squarely targeted toward the workloads that matter in scientific computing, our latest accelerator, when combined with the AMD ROCm open software platform, is designed to provide scientists and researchers a superb foundation for their work in HPC.”

Open Software Platform for the Exascale Era

The AMD ROCm developer software provides the foundation for exascale computing. As an open source toolset consisting of compilers, programming APIs and libraries, ROCm is used by exascale software developers to create high performance applications. ROCm 4.0 has been optimized to deliver performance at scale for MI100-based systems. ROCm 4.0 has upgraded the compiler to be open source and unified to support both OpenMP® 5.0 and HIP. PyTorch and TensorFlow frameworks, which have been optimized with ROCm 4.0, can now achieve higher performance with MI100.7,8 ROCm 4.0 is the latest offering for HPC, ML and AI application developers which allows them to create performance portable software.
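For context on what “performance portable software” looks like in practice, the sketch below is a minimal, generic HIP vector-add example; it is not code from this release, and it assumes a ROCm installation with the hipcc compiler and a supported GPU.

    // Minimal HIP vector-add sketch (illustrative only; assumes a ROCm install with hipcc).
    // Build: hipcc vector_add.cpp -o vector_add
    #include <hip/hip_runtime.h>
    #include <cstdio>
    #include <vector>

    __global__ void vector_add(const float* a, const float* b, float* c, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;  // one element per thread
        if (i < n) c[i] = a[i] + b[i];
    }

    int main() {
        const int n = 1 << 20;
        std::vector<float> ha(n, 1.0f), hb(n, 2.0f), hc(n, 0.0f);

        float *da, *db, *dc;
        hipMalloc((void**)&da, n * sizeof(float));
        hipMalloc((void**)&db, n * sizeof(float));
        hipMalloc((void**)&dc, n * sizeof(float));
        hipMemcpy(da, ha.data(), n * sizeof(float), hipMemcpyHostToDevice);
        hipMemcpy(db, hb.data(), n * sizeof(float), hipMemcpyHostToDevice);

        const int threads = 256;
        const int blocks  = (n + threads - 1) / threads;
        hipLaunchKernelGGL(vector_add, dim3(blocks), dim3(threads), 0, 0, da, db, dc, n);
        hipMemcpy(hc.data(), dc, n * sizeof(float), hipMemcpyDeviceToHost);

        printf("c[0] = %f\n", hc[0]);  // expect 3.0
        hipFree(da); hipFree(db); hipFree(dc);
        return 0;
    }

The same source also builds for CUDA targets with HIP’s portability layer, which is the sense in which ROCm-based code can be carried across platforms.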

“We’ve received early access to the MI100 accelerator, and the preliminary results are very encouraging. We have typically seen significant performance boosts, up to 2-3x compared to other GPUs,” said Bronson Messer, director of science, Oak Ridge Leadership Computing Facility. “What’s also important to recognize is the impact software has on performance. The fact that the ROCm open software platform and HIP developer tool are open source and work on a variety of platforms, it is something that we have been absolutely almost obsessed with since we fielded the very first hybrid CPU/GPU system.”

Key capabilities and features of the AMD Instinct MI100 accelerator include:

  • All-New AMD CDNA Architecture – Engineered to power AMD GPUs for the exascale era and at the heart of the MI100 accelerator, the AMD CDNA architecture offers exceptional performance and power efficiency.
  • Leading FP64 and FP32 Performance for HPC Workloads – Delivers industry-leading 11.5 TFLOPS peak FP64 performance and 23.1 TFLOPS peak FP32 performance, enabling scientists and researchers across the globe to accelerate discoveries in industries including life sciences, energy, finance, academics, government, defense and more.1
  • All-New Matrix Core Technology for HPC and AI – Supercharged performance for a full range of single and mixed precision matrix operations, such as FP32, FP16, bFloat16, Int8 and Int4, engineered to boost the convergence of HPC and AI.
  • 2nd Gen AMD Infinity Fabric™ Technology – Instinct MI100 provides ~2x the peer-to-peer (P2P) peak I/O bandwidth over PCIe® 4.0, with up to 340 GB/s of aggregate bandwidth per card with three AMD Infinity Fabric™ Links (see the bandwidth sketch after this list).4 In a server, MI100 GPUs can be configured with up to two fully-connected quad GPU hives, each providing up to 552 GB/s of P2P I/O bandwidth for fast data sharing.4
  • Ultra-Fast HBM2 Memory – Features 32GB high-bandwidth HBM2 memory at a clock rate of 1.2 GHz and delivers an ultra-high 1.23 TB/s of memory bandwidth to support large data sets and help eliminate bottlenecks in moving data in and out of memory.5
  • Support for the Industry’s Latest PCIe® Gen 4.0 – Designed with the latest PCIe Gen 4.0 technology support, providing up to 64 GB/s peak theoretical transport data bandwidth from CPU to GPU.6
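The aggregate bandwidth figures above follow from the per-link and PCIe numbers documented in footnotes 4 and 6. The short sketch below simply reproduces that arithmetic; it is an illustration rather than AMD tooling, and the 552 GB/s and 1.1 TB/s totals mirror footnote 4 without the release spelling out the exact link topology behind them.

    // Reproduces the peak-bandwidth arithmetic from footnotes 4 and 6 (illustrative only).
    #include <cstdio>

    int main() {
        const double link_gb_s     = 92.0;  // per Infinity Fabric link (footnote 4)
        const int    links_per_gpu = 3;     // MI100 has three links
        const double pcie4_gb_s    = 64.0;  // CPU-to-GPU PCIe Gen4 (footnote 6)

        double p2p_per_gpu  = link_gb_s * links_per_gpu;  // 92 * 3 = 276 GB/s GPU-to-GPU
        double aggregate_io = p2p_per_gpu + pcie4_gb_s;   // 276 + 64 = 340 GB/s per card
        double hive_p2p     = 2 * p2p_per_gpu;            // 552 GB/s per quad-GPU hive (footnote 4)
        double server_p2p   = 2 * hive_p2p;               // 1.1 TB/s for dual quad hives per server

        printf("P2P per GPU:          %.0f GB/s\n", p2p_per_gpu);
        printf("Aggregate card I/O:   %.0f GB/s\n", aggregate_io);
        printf("Quad-GPU hive P2P:    %.0f GB/s\n", hive_p2p);
        printf("Dual-hive server P2P: %.0f GB/s\n", server_p2p);
        return 0;
    }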

Available Server Solutions
The AMD Instinct MI100 accelerators are expected by end of the year in systems from major OEM and ODM partners in the enterprise markets, including:

Dell
“Dell EMC PowerEdge servers will support the new AMD Instinct MI100, which will enable faster insights from data. This would help our customers achieve more robust and efficient HPC and AI results rapidly,” said Ravi Pendekanti, senior vice president, PowerEdge Servers, Dell Technologies. “AMD has been a valued partner in our support for advancing innovation in the data center. The high-performance capabilities of AMD Instinct accelerators are a natural fit for our PowerEdge server AI & HPC portfolio.”

Gigabyte
“We’re delighted to once again work with AMD as a strategic partner offering customers server hardware for high performance computing,” said Alan Chen, assistant vice president in NCBU, GIGABYTE. “AMD Instinct MI100 accelerators represent the next level of high-performance computing in the data center, bringing greater connectivity and data bandwidth for energy research, molecular dynamics, and deep learning training. As a new accelerator in the GIGABYTE portfolio, our customers can look to benefit from improved performance across a range of scientific and industrial HPC workloads.”

Hewlett Packard Enterprise (HPE)
“Customers use HPE Apollo systems for purpose-built capabilities and performance to tackle a range of complex, data-intensive workloads across high-performance computing (HPC), deep learning and analytics,” said Bill Mannel, vice president and general manager, HPC at HPE. “With the introduction of the new HPE Apollo 6500 Gen10 Plus system, we are further advancing our portfolio to improve workload performance by supporting the new AMD Instinct MI100 accelerator, which enables greater connectivity and data processing, alongside the 2nd Gen AMD EPYC™ processor. We look forward to continuing our collaboration with AMD to develop our offerings with its latest CPUs and accelerators.”

Supermicro
“We’re excited that AMD is making a big impact in high-performance computing with AMD Instinct MI100 GPU accelerators,” said Vik Malyala, senior vice president, field application engineering and business development, Supermicro. “With the combination of the compute power gained with the new CDNA architecture, along with the high memory and GPU peer-to-peer bandwidth the MI100 brings, our customers will gain access to great solutions that will meet their accelerated compute requirements and critical enterprise workloads. The AMD Instinct MI100 will be a great addition for our multi-GPU servers and our extensive portfolio of high-performance systems and server building block solutions.”

MI100 Specifications

  • Compute Units: 120
  • Stream Processors: 7,680
  • FP64 TFLOPS (Peak): Up to 11.5
  • FP32 TFLOPS (Peak): Up to 23.1
  • FP32 Matrix TFLOPS (Peak): Up to 46.1
  • FP16/FP16 Matrix TFLOPS (Peak): Up to 184.6
  • INT4 | INT8 TOPS (Peak): Up to 184.6
  • bFloat16 TFLOPS (Peak): Up to 92.3
  • HBM2 ECC Memory: 32GB
  • Memory Bandwidth: Up to 1.23 TB/s
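As a rough cross-check, these peak figures can be reproduced from the stream-processor count and the 1,502 MHz boost clock quoted in footnotes 1-3. The sketch below is a back-of-envelope illustration only; the per-clock multipliers (FMA = 2 FLOPs, FP64 at half the FP32 rate, and the Matrix Core ratios) and the 4096-bit memory interface are inferred from the published numbers rather than taken from an AMD specification.

    // Back-of-envelope reconstruction of the MI100 peak figures above (illustrative only).
    #include <cstdio>

    int main() {
        const double stream_processors = 7680;   // from the specification table
        const double boost_ghz         = 1.502;  // peak boost engine clock (footnotes 1-3)
        const double mem_clock_ghz     = 1.2;    // HBM2 memory clock (footnote 5)
        const double bus_width_bits    = 4096;   // assumed HBM2 interface width (4 stacks x 1024 bits)

        double fp32        = stream_processors * 2 * boost_ghz / 1000.0;  // FMA = 2 FLOPs/clock -> ~23.1
        double fp64        = fp32 / 2;          // half the FP32 rate                            -> ~11.5
        double fp32_matrix = fp32 * 2;          // Matrix Core ratio inferred from the table     -> ~46.1
        double fp16_matrix = fp32_matrix * 4;   //                                               -> ~184.6
        double bf16_matrix = fp32_matrix * 2;   //                                               -> ~92.3
        double mem_tb_s    = mem_clock_ghz * 2 * bus_width_bits / 8 / 1000.0;  // double data rate -> ~1.23

        printf("FP64 %.1f | FP32 %.1f | FP32 Matrix %.1f | FP16 Matrix %.1f | bFloat16 %.1f TFLOPS\n",
               fp64, fp32, fp32_matrix, fp16_matrix, bf16_matrix);
        printf("Memory bandwidth ~%.2f TB/s\n", mem_tb_s);
        return 0;
    }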

Supporting Resources

About AMD
For more than 50 years AMD has driven innovation in high-performance computing, graphics and visualization technologies ― the building blocks for gaming, immersive platforms and the data center. Hundreds of millions of consumers, leading Fortune 500 businesses and cutting-edge scientific research facilities around the world rely on AMD technology daily to improve how they live, work and play. AMD employees around the world are focused on building great products that push the boundaries of what is possible. For more information about how AMD is enabling today and inspiring tomorrow, visit the AMD (NASDAQ: AMD) website, blog, Facebook and Twitter pages.

CAUTIONARY STATEMENT
This press release contains forward-looking statements concerning Advanced Micro Devices, Inc. (AMD) such as the features, functionality, performance, availability, timing and expected benefits of AMD products including the AMD Instinct™ MI100 accelerator, which are made pursuant to the Safe Harbor provisions of the Private Securities Litigation Reform Act of 1995. Forward-looking statements are commonly identified by words such as “would,” “may,” “expects,” “believes,” “plans,” “intends,” “projects” and other terms with similar meaning. Investors are cautioned that the forward-looking statements in this press release are based on current beliefs, assumptions and expectations, speak only as of the date of this press release and involve risks and uncertainties that could cause actual results to differ materially from current expectations. Such statements are subject to certain known and unknown risks and uncertainties, many of which are difficult to predict and generally beyond AMD’s control, that could cause actual results and other future events to differ materially from those expressed in, or implied or projected by, the forward-looking information and statements. Material factors that could cause actual results to differ materially from current expectations include, without limitation, the following: Intel Corporation’s dominance of the microprocessor market and its aggressive business practices; the ability of third party manufacturers to manufacture AMD’s products on a timely basis in sufficient quantities and using competitive technologies; expected manufacturing yields for AMD’s products; the availability of essential equipment, materials or manufacturing processes; AMD’s ability to introduce products on a timely basis with features and performance levels that provide value to its customers; global economic uncertainty; the loss of a significant customer; AMD’s ability to generate revenue from its semi-custom SoC products; the impact of the COVID-19 pandemic on AMD’s business, financial condition and results of operations; political, legal, economic risks and natural disasters; the impact of government actions and regulations such as export administration regulations, tariffs and trade protection measures; the impact of acquisitions, joint ventures and/or investments on AMD’s business, including the announced acquisition of Xilinx, and the failure to integrate acquired businesses; AMD’s ability to complete the Xilinx merger; the impact of the announcement and pendency of the Xilinx merger on AMD’s business; potential security vulnerabilities; potential IT outages, data loss, data breaches and cyber-attacks; uncertainties involving the ordering and shipment of AMD’s products; quarterly and seasonal sales patterns; the restrictions imposed by agreements governing AMD’s notes and the revolving credit facility; the competitive markets in which AMD’s products are sold; market conditions of the industries in which AMD products are sold; AMD’s reliance on third-party intellectual property to design and introduce new products in a timely manner; AMD’s reliance on third-party companies for the design, manufacture and supply of motherboards, software and other computer platform components; AMD’s reliance on Microsoft Corporation and other software vendors’ support to design and develop software to run on AMD’s products; AMD’s reliance on third-party distributors and add-in-board partners; the potential dilutive effect if the 2.125% Convertible Senior Notes due 2026 are converted; future impairments of goodwill and technology license purchases; AMD’s ability to attract and retain qualified personnel; AMD’s ability to generate sufficient revenue and operating cash flow or obtain external financing for research and development or other strategic investments; AMD’s indebtedness; AMD’s ability to generate sufficient cash to service its debt obligations or meet its working capital requirements; AMD’s ability to repurchase its outstanding debt in the event of a change of control; the cyclical nature of the semiconductor industry; the impact of modification or interruption of AMD’s internal business processes and information systems; compatibility of AMD’s products with some or all industry-standard software and hardware; costs related to defective products; the efficiency of AMD’s supply chain; AMD’s ability to rely on third party supply-chain logistics functions; AMD’s stock price volatility; worldwide political conditions; unfavorable currency exchange rate fluctuations; AMD’s ability to effectively control the sales of its products on the gray market; AMD’s ability to adequately protect its technology or other intellectual property; current and future claims and litigation; potential tax liabilities; and the impact of environmental laws, conflict minerals-related provisions and other laws or regulations. Investors are urged to review in detail the risks and uncertainties in AMD’s Securities and Exchange Commission filings, including but not limited to AMD’s Quarterly Report on Form 10-Q for the quarter ended September 26, 2020.

©2020 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo, EPYC, AMD Instinct, Infinity Fabric, ROCm and combinations thereof are trademarks of Advanced Micro Devices, Inc. The OpenMP name and the OpenMP logo are registered trademarks of the OpenMP Architecture Review Board. PCIe is a registered trademark of PCI-SIG Corporation. Python is a trademark of the Python Software Foundation. PyTorch is a trademark or registered trademark of PyTorch. TensorFlow, the TensorFlow logo and any related marks are trademarks of Google Inc. Other product names used in this publication are for identification purposes only and may be trademarks of their respective companies.
_______________________________

  1. Calculations performed by AMD Performance Labs as of Sep 18, 2020 for the AMD Instinct™ MI100 (32GB HBM2 PCIe® card) accelerator at 1,502 MHz peak boost engine clock resulted in 11.54 TFLOPS peak double precision (FP64), 46.1 TFLOPS peak single precision matrix (FP32), 23.1 TFLOPS peak single precision (FP32), and 184.6 TFLOPS peak half precision (FP16) peak theoretical, floating-point performance. Published results on the NVidia Ampere A100 (40GB) GPU accelerator resulted in 9.7 TFLOPS peak double precision (FP64), 19.5 TFLOPS peak single precision (FP32), and 78 TFLOPS peak half precision (FP16) theoretical, floating-point performance. Server manufacturers may vary configuration offerings yielding different results. MI100-03
  2. Calculations performed by AMD Performance Labs as of Sep 3, 2020 on the AMD Instinct™ MI100 (32GB HBM2 PCIe® card) accelerator at 1,502 MHz peak engine clock resulted in 46.1 TFLOPS peak theoretical single precision (FP32 Matrix) Math floating-point performance. The Nvidia Ampere A100 (40GB) GPU accelerator published results are 19.5 TFLOPS peak single precision (FP32) floating-point performance. Nvidia results found at: https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/nvidia-ampere-architecture-whitepaper.pdf. Server manufacturers may vary configuration offerings yielding different results. MI100-01
  3. Calculations performed by AMD Performance Labs as of Sep 18, 2020 for the AMD Instinct™ MI100 accelerator at 1,502 MHz peak boost engine clock resulted in 184.57 TFLOPS peak theoretical half precision (FP16) and 46.14 TFLOPS peak theoretical single precision (FP32 Matrix) floating-point performance. The results calculated for the Radeon Instinct™ MI50 GPU at 1,725 MHz peak engine clock resulted in 26.5 TFLOPS peak theoretical half precision (FP16) and 13.25 TFLOPS peak theoretical single precision (FP32 Matrix) floating-point performance. Server manufacturers may vary configuration offerings yielding different results. MI100-04
  4. Calculations as of Sep 18th, 2020. AMD Instinct™ MI100 accelerators, built on AMD CDNA technology, support PCIe® Gen4 providing up to 64 GB/s peak theoretical transport data bandwidth from CPU to GPU per card. AMD Instinct™ MI100 accelerators include three Infinity Fabric™ links providing up to 276 GB/s peak theoretical GPU to GPU or Peer-to-Peer (P2P) transport rate bandwidth performance per GPU card. Combined with PCIe Gen4 support, this provides an aggregate GPU card I/O peak bandwidth of up to 340 GB/s. MI100s have three links: 92 GB/s * 3 links per GPU = 276 GB/s. Four GPU hives provide up to 552 GB/s peak theoretical P2P performance. Dual 4 GPU hives in a server provide up to 1.1 TB/s total peak theoretical direct P2P performance per server. With AMD Infinity Fabric link technology not enabled, four GPU hives provide up to 256 GB/s peak theoretical P2P performance with PCIe® 4.0. Server manufacturers may vary configuration offerings yielding different results. MI100-07
  5. Calculations by AMD Performance Labs as of Oct 5th, 2020 for the AMD Instinct™ MI100 accelerator designed with AMD CDNA 7nm FinFET process technology at 1,200 MHz peak memory clock resulted in 1.2288 TB/s peak theoretical memory bandwidth performance. The results calculated for the Radeon Instinct™ MI50 GPU designed with “Vega” 7nm FinFET process technology with 1,000 MHz peak memory clock resulted in 1.024 TB/s peak theoretical memory bandwidth performance. CDNA-04
  6. Works with PCIe® Gen 4.0 and Gen 3.0 compliant motherboards. Performance may vary from motherboard to motherboard. Refer to the system or motherboard provider for individual product performance and features.
  7. Testing conducted by AMD performance labs as of Oct 30th, 2020, on three platforms and software versions typical for the launch dates of the Radeon Instinct MI25 (2018), MI50 (2019) and AMD Instinct MI100 GPU (2020) running the benchmark application Quicksilver. MI100 platform (2020): Gigabyte G482-Z51-00 system comprised of Dual Socket AMD EPYC™ 7702 64-Core Processor, AMD Instinct™ MI100 GPU, ROCm™ 3.10 driver, 512GB DDR4, RHEL 8.2. MI50 platform (2019): Supermicro® SYS-4029GP-TRT2 system comprised of Dual Socket Intel Xeon® Gold® 6132, Radeon Instinct™ MI50 GPU, ROCm 2.10 driver, 256 GB DDR4, SLES15SP1. MI25 platform (2018): Supermicro SYS-4028GR-TR2 system comprised of Dual Socket Intel Xeon CPU E5-2690, Radeon Instinct™ MI25 GPU, ROCm 2.0.89 driver, 246GB DDR4 system memory, Ubuntu 16.04.5 LTS. MI100-14
  8. Testing conducted by AMD performance labs as of October 30th, 2020, on three platforms and software versions typical for the launch dates of the Radeon Instinct MI25 (2018), MI50 (2019) and AMD Instinct MI100 GPU (2020) running the benchmark application TensorFlow ResNet 50 FP16 batch size 128. MI100 platform (2020): Gigabyte G482-Z51-00 system comprised of Dual Socket AMD EPYC™ 7702 64-Core Processor, AMD Instinct™ MI100 GPU, ROCm™ 3.10 driver, 512GB DDR4, RHEL 8.2. MI50 platform (2019): Supermicro® SYS-4029GP-TRT2 system comprised of Dual Socket Intel Xeon® Gold® 6254, Radeon Instinct™ MI50 GPU, ROCm 3.0.6 driver, 338 GB DDR4, Ubuntu® 16.04.6 LTS. MI25 platform (2018): Supermicro SYS-4028GR-TR2 system comprised of Dual Socket Intel Xeon CPU E5-2690, Radeon Instinct™ MI25 GPU, ROCm 2.0.89 driver, 246GB DDR4 system memory, Ubuntu 16.04.5 LTS. MI100-15

Contacts:
Gary Silcott
AMD Communications
+1 512-602-0889
Gary.Silcott@amd.com

Laura Graves
AMD Investor Relations
+1 408-749-5467
Laura.Graves@amd.com


Source: Advanced Micro Devices, Inc.
