The specifications of the Kestrel supercomputer, which HPE is building for the National Renewable Energy Laboratory (NREL) under the US Department of Energy, have been officially announced. NREL announced the plan last year, and now we finally know that the system will use AMD EPYC Genoa and Intel Sapphire Rapids processors together with NVIDIA H100 accelerators, and can provide up to 44 petaflops of computing power.
(via WCCFTech)
Backed by the latest hardware and software from all three technology giants, Kestrel is meant to replace the existing Eagle supercomputer. HPE revealed the system's hardware specifications for the first time at a recent meeting.
Kestrel combines standard nodes with accelerated nodes and has a peak performance of 44 petaflops, 5.5 times that of Eagle.
● The standard nodes use Intel's latest Sapphire Rapids Xeon Scalable CPUs (in this case, a 52-core / 104-thread SKU).
● Each of the 2,304 standard nodes has two CPUs (4,608 Sapphire Rapids-SP processors in total, with 239,616 cores / 479,232 threads).
● 75 PB of Lustre data storage, plus 256 GB of DDR5 memory in each of the 2,304 nodes (reported as 560 TB of system memory in total, although 2,304 × 256 GB works out to roughly 590 TB).
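The standard-node totals follow directly from the per-node figures. A minimal Python sketch of that arithmetic, using only numbers quoted above; the names are illustrative, and decimal units (1 TB = 1,000 GB) are an assumption:

```python
# Standard-node totals derived from the per-node figures quoted in the article.
STANDARD_NODES = 2304
CPUS_PER_NODE = 2        # dual-socket Sapphire Rapids nodes
CORES_PER_CPU = 52       # the 52-core / 104-thread SKU
THREADS_PER_CORE = 2
DDR5_GB_PER_NODE = 256


def standard_node_totals():
    """Return CPU, core, thread and memory totals for the standard partition."""
    cpus = STANDARD_NODES * CPUS_PER_NODE
    cores = cpus * CORES_PER_CPU
    threads = cores * THREADS_PER_CORE
    # Assumption: decimal units, 1 TB = 1000 GB.
    memory_tb = STANDARD_NODES * DDR5_GB_PER_NODE / 1000
    return {"cpus": cpus, "cores": cores, "threads": threads, "memory_tb": memory_tb}


if __name__ == "__main__":
    for name, value in standard_node_totals().items():
        print(f"{name}: {value}")
```

The CPU, core and thread counts reproduce the figures in the list. Under these assumptions the memory product comes to roughly 590 TB, somewhat above the reported total.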
There are also 132 accelerated nodes:
● Each accelerated node combines NVIDIA H100 GPU accelerators, based on the Hopper architecture, with two AMD EPYC Genoa server processors.
● In total there are 528 NVIDIA Hopper H100 GPUs and 264 AMD EPYC Genoa chips (up to 96 cores / 192 threads each).
The exact CPU and GPU models used in the accelerated nodes are not yet known. If the top configurations are used, Kestrel would have a total of 8,921,088 CUDA cores (H100 SXM5) plus 25,344 Zen 4 CPU cores.
In addition, the accelerated nodes have 42 TB of HBM3 high-bandwidth memory and 20 TB of system memory, supplemented by 8 DAV nodes (with up to 16 NVIDIA A40 GPUs).
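The accelerated-node totals can be checked the same way. The sketch below is illustrative only. It assumes 4 GPUs per node (528 GPUs across 132 nodes), the top-end 96-core Genoa part, and an H100 SXM5 with 16,896 CUDA cores and 80 GB of HBM3. The two per-GPU figures are not stated in the article; they are assumptions that happen to match the quoted totals.

```python
# Accelerated-node totals under a "top configuration" assumption.
ACCEL_NODES = 132
GPUS_PER_NODE = 4                  # assumption: 528 GPUs / 132 nodes
CPUS_PER_NODE = 2                  # dual AMD EPYC Genoa
CUDA_CORES_PER_GPU = 16896         # assumption: H100 SXM5
ZEN4_CORES_PER_CPU = 96            # top-end Genoa (96C / 192T)
HBM3_GB_PER_GPU = 80               # assumption: 80 GB H100


def accelerated_node_totals():
    """Return GPU, CPU, core and HBM3 totals for the accelerated partition."""
    gpus = ACCEL_NODES * GPUS_PER_NODE
    cpus = ACCEL_NODES * CPUS_PER_NODE
    return {
        "gpus": gpus,
        "cpus": cpus,
        "cuda_cores": gpus * CUDA_CORES_PER_GPU,
        "zen4_cores": cpus * ZEN4_CORES_PER_CPU,
        # Assumption: decimal units, 1 TB = 1000 GB.
        "hbm3_tb": gpus * HBM3_GB_PER_GPU / 1000,
    }


if __name__ == "__main__":
    for name, value in accelerated_node_totals().items():
        print(f"{name}: {value}")
```

With these inputs the GPU, CPU, CUDA-core and Zen 4 core counts match the figures quoted above, and the HBM3 total comes to about 42 TB.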
All of this is connected by HPE's Slingshot interconnect in a Dragonfly topology. Some highlights of HPE Slingshot:
● Industry-leading performance and scalability
● 100 GbE and 200 GbE high-speed interfaces
● High-performance, high-radix switches with 64 ports and 12.8 Tb/s of bandwidth
● Scales to more than 250,000 host ports with at most 3 hops
● Innovative hardware congestion management, adaptive routing and quality-of-service (QoS) control
● Standard Ethernet protocol, supplemented by optimized high-performance computing (HPC) features
● Link-level retry and low-latency forward error correction
● Open, standardized management API
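Two of the figures in this list can be sanity-checked with simple arithmetic. The sketch below multiplies the port count by the 200 Gb/s port speed, and then sizes a textbook "balanced" dragonfly for a 64-port switch. The balanced-dragonfly rule (a = 2p = 2h) is an assumption taken from the general dragonfly literature, not HPE's actual configuration, and the function names are hypothetical.

```python
# Back-of-the-envelope checks for the switch figures in the list above.
PORTS_PER_SWITCH = 64
PORT_SPEED_GBPS = 200


def switch_bandwidth_tbps(ports=PORTS_PER_SWITCH, gbps=PORT_SPEED_GBPS):
    """Aggregate switch bandwidth in Tb/s (one direction)."""
    return ports * gbps / 1000


def balanced_dragonfly_hosts(radix=PORTS_PER_SWITCH):
    """Maximum host count of a textbook balanced dragonfly.

    Assumption: a = 2p = 2h, where p is hosts per switch, a is switches
    per group and h is global links per switch. Each switch then needs
    p + (a - 1) + h = 4p - 1 ports, which must fit within the radix.
    """
    p = (radix + 1) // 4       # hosts per switch
    h = p                      # global links per switch
    a = 2 * p                  # switches per group
    groups = a * h + 1         # every pair of groups shares one global link
    return p * a * groups


if __name__ == "__main__":
    print(f"switch bandwidth: {switch_bandwidth_tbps()} Tb/s")
    print(f"balanced dragonfly hosts: {balanced_dragonfly_hosts()}")
```

Under these assumptions, 64 ports at 200 Gb/s give 12.8 Tb/s, and the balanced layout tops out at 262,656 hosts, consistent with the "more than 250,000 host ports" claim. Slingshot's real group sizes may differ.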
Finally, Kestrel's energy efficiency of 10.4 GFLOPS per watt is far below that of the recently announced Frontier supercomputer, which exceeds 50 GFLOPS per watt, and its cost is still quite high (even higher than that of exaflops systems).
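The efficiency figure also implies a rough power budget. The following sketch derives it from the numbers quoted above. It assumes that the 10.4 GFLOPS per watt applies to the 44-petaflops peak, which the article does not confirm, and it uses 50 GFLOPS per watt as a lower bound for Frontier.

```python
# Figures implied by the peak-performance and efficiency numbers in the article.
KESTREL_PEAK_PFLOPS = 44.0
EAGLE_SPEEDUP = 5.5                # Kestrel peak relative to Eagle
KESTREL_GFLOPS_PER_WATT = 10.4
FRONTIER_GFLOPS_PER_WATT = 50.0    # "more than 50", so treated as a lower bound


def implied_power_mw(peak_pflops, gflops_per_watt):
    """Power in MW if the efficiency figure applies to peak performance."""
    peak_gflops = peak_pflops * 1e6      # 1 PFLOPS = 1e6 GFLOPS
    watts = peak_gflops / gflops_per_watt
    return watts / 1e6


def implied_eagle_pflops(peak_pflops=KESTREL_PEAK_PFLOPS, speedup=EAGLE_SPEEDUP):
    """Eagle's peak implied by the stated speedup."""
    return peak_pflops / speedup


if __name__ == "__main__":
    power = implied_power_mw(KESTREL_PEAK_PFLOPS, KESTREL_GFLOPS_PER_WATT)
    gap = FRONTIER_GFLOPS_PER_WATT / KESTREL_GFLOPS_PER_WATT
    print(f"implied Kestrel power: {power:.2f} MW")
    print(f"implied Eagle peak: {implied_eagle_pflops():.1f} PFLOPS")
    print(f"Frontier efficiency advantage: at least {gap:.1f}x")
```

Under that assumption Kestrel would draw a little over 4 MW at peak, Eagle's implied peak is 8 petaflops, and Frontier is at least about 4.8 times more efficient.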
If all goes well, NREL's Kestrel supercomputer is expected to be deployed in 2024.