
Department Manager, Scalable Systems Group, Sandia National Laboratories, US.
“The Last Differentiating Capability of HPC Systems is Fading”
Abstract
Over the last three decades, high-performance computing (HPC) systems have evolved from highly specialized hardware running custom software environments to platforms that are almost entirely composed of commodity components. While some aspects of large-scale HPC systems continue to be enhanced to meet performance and scalability demands, HPC systems have been distinguished primarily by their interconnect technology. The emergence of cloud computing and hyperscalers has led to the deployment of massive data centers containing systems much larger than the fastest HPC systems. Until recently, however, these systems were easily differentiated from HPC machines by their use of commodity Ethernet networks. It appears that these two worlds are now converging and may be headed towards a common solution. This talk will describe how interconnect hardware and software technologies for HPC systems have been impacted by cloud computing and offer a perspective on future challenges that will need to be addressed to ensure that interconnect technology continues to meet the requirements of extreme-scale HPC systems.
Bio
Ron Brightwell received his BS in mathematics in 1991 and his MS in computer science in 1994 from Mississippi State University. He joined Sandia National Laboratories in 1995 after serving as a research assistant in the system software group at the MSU/NSF Engineering Research Center for Computational Field Simulation. While at Sandia, he has designed and developed high-performance implementations of the Message Passing Interface (MPI) standard on several large-scale, massively parallel computing platforms, including the Cray T3D and T3E, the Intel Paragon and TeraFLOPS (ASCI/Red), and Sandia’s Computational Plant clusters. His research interests include high-performance, scalable communication interfaces and protocols for system area networks, operating systems for massively parallel processing machines, and parallel program performance analysis libraries and tools. He is also currently pursuing a PhD in the Department of Computer Science at the University of New Mexico.
Click to download 2024 Slides in pdf.
