
Balazs Gerofi

Bio

Dr. Balazs Gerofi is an expert in system software and parallel / distributed computing. In particular, Balazs is interested in operating systems (kernel architectures for many-core CPUs, memory management, file systems), HPC (parallel and distributed I/O, resiliency), virtualisation, and fault-tolerant computing (replication, checkpoint-restart, message-logging).

Balazs is a Research Scientist in the System Software Research Team, part of the RIKEN Advanced Institute for Computational Science (AICS), Tokyo, Japan.

He is the co-editor of Operating Systems for Supercomputers and High Performance Computing (1st edition, October 2019).

Page at Multicore World 2018



 

Towards Dynamic Resource Management in Next Generation HPC Environments

Balazs Gerofi – Research Scientist

System Software Research Team

RIKEN Center for Computational Science (RIKEN-CCS) – Tokyo, Japan

Thursday 20 February 2020 – 9:30 am

Abstract

Workload diversity in high-performance computing (HPC) environments has exploded in recent years. The increasing prevalence of Big Data processing, in-situ analytics, artificial intelligence (AI) and machine learning (ML) workloads, as well as multi-component workflows, is pushing the limits of supercomputing systems that were designed primarily to serve parallel simulations.

In addition, with the growing complexity of the hardware, there is also a growing interest in multi-tenancy and in a more dynamic, cloud-like execution environment. All these trends bring together a large variety of runtime components that do not cooperate well with each other, which in turn can lead to suboptimal performance.

This talk will enumerate a number of representative workloads that stress the limitations of the traditional HPC center. We then highlight some of the underlying forces that shape the requirements of next-generation systems and propose a cross-stack coordination layer that aims to resolve these conflicts. Finally, we demonstrate the benefits of the overall approach through some of our previous efforts in this space.

 

SLIDES

VIDEO


Balazs Gerofi (Riken) & Nicolás Erdödy (Open Parallel) at Multicore World 2018

 

 
