
Abstracts 2026

Luca Carloni.
Columbia University, USA.

Professor and chair of Computer Science at Columbia University in the City of New York.

Semiconductors made AI possible, and AI will continue to advance at the speed of hardware. In the era of sustainable AI, future systems must integrate diverse computing resources under increasingly strict constraints on energy, area, and design complexity, with chiplet-based assembly in advanced packages emerging as a key enabler. Meeting these demands requires new computing platforms built on modular abstractions, such as a tile-based organization, enabling agility, reuse, and collaboration at scale.

The talk presents ESP, an open-source SoC platform developed to support rapid integration of heterogeneous components, including specialized accelerators, within a tile-based architectural framework. Its capabilities are illustrated through a heterogeneous SoC implemented in a 12 nm process and running Linux-SMP. Building on this experience, the talk distills broader lessons on how open, agile platforms can accelerate research, promote design reuse and shared infrastructure, and support a collaborative approach to sustainable AI.

Taisuke Boku.
Director, Advanced HPC-AI Research and Development Support Center, Japan

Professor, Center for Computational Sciences, University of Tsukuba

In Japan, the National Flagship Systems (NFS) have been developed as massively parallel CPU systems based on multi-core or many-core general-purpose CPUs with tens of thousands of computation nodes. In line with these systems, the governmental programs for code development and the promotion of scientific computation have not primarily supported accelerated computing such as GPU computing. On the other hand, GPU-ready systems have been introduced at several supercomputing centers in national universities, which are categorized as National Infrastructure Systems (NIS) in HPCI, Japan's collaborative supercomputer use framework. However, the government (MEXT) has finally decided to step into GPU computing even for the next-generation NFS after Fugaku, under the project name “Post-Fugaku”, and the basic design of the system has been started by RIKEN R-CCS under the code name “Fugaku-NEXT”, based on Fujitsu's next-generation CPU (MONAKA-X) and NVIDIA GPUs. In this situation, we need rapid code development toward large-scale GPU-ready systems for both NFS and NIS, supported by a governmental program. Responding to the governmental call, we launched a new national center named the “Advanced HPC-AI Research and Development Support Center”, or HAIRDESC for short, in Kobe, Japan. Governmental support for the center runs 4.5 years, from Oct. 2025 to Mar. 2030, aligned with the planned start of Fugaku-NEXT operation. In HAIRDESC, we will construct a standard set of GPU codes with a wide variety of coding styles and application fields, at multiple levels from novice to expert, and for multiple vendors' GPUs (AMD and NVIDIA). HAIRDESC is also supported by three core organizations, the Univ. of Tsukuba, the Univ. of Tokyo, and the Inst. of Science Tokyo, where top-level GPU researchers gather and which operate the largest-scale GPU supercomputers under MEXT.

In this talk, I will present the current plan for Fugaku-NEXT (by courtesy of R-CCS) and the progress of GPU system installation at NIS centers, followed by the HAIRDESC plan and activities, including the advanced GPU research at the three core organizations. I will also discuss the differences in memory-system architecture between two CPU-GPU coupling technologies, NVIDIA GH200 and AMD MI300A, with a performance analysis.

Karen E. Willcox.
The University of Texas at Austin, USA.

Director, Oden Institute for Computational Engineering and Sciences | Associate Vice President for Research | Professor of Aerospace Engineering and Engineering Mechanics | W. A. “Tex” Moncrief, Jr. Chair in Simulation-Based Engineering and Sciences | Peter O’Donnell, Jr. Centennial Chair in Computing Systems

In silico experimentation is the way of the future: computing enables engineering designers to explore new ideas beyond what is possible in physical experiments. But simulating complex physics is computationally expensive: a single simulation can take days on a supercomputer, making it practically impossible for a designer to fully explore the high-dimensional space of design options. That's where reduced-order models come in: surrogate models that are empirically learned but firmly grounded in the underlying physics. These reduced-order models give the designer the predictive power of a sophisticated physics simulation, but at lightning speed, compressing days of computation into seconds. In this talk, I will show how these computational speedups are a game changer for the design of complex engineering systems, such as next-generation rocket engines.
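To make the idea concrete, here is a minimal sketch of one common flavor of reduced-order modeling: build a low-dimensional basis from simulation snapshots via proper orthogonal decomposition (POD), then fit a small reduced operator to the projected trajectories by least squares, in the non-intrusive spirit of operator inference. The toy linear dynamics, dimensions, and variable names are ours for illustration, not any particular production code.

```python
import numpy as np

# Toy full-order model: linear dynamics x_{k+1} = A x_k (a stand-in for an
# expensive PDE solver whose state dimension would be millions in practice).
rng = np.random.default_rng(0)
n, steps, r = 500, 200, 10                 # full dim, snapshot count, reduced dim
A = np.eye(n) + 0.01 * rng.standard_normal((n, n)) / np.sqrt(n)
X = np.empty((n, steps))
X[:, 0] = rng.standard_normal(n)
for k in range(steps - 1):
    X[:, k + 1] = A @ X[:, k]              # the "days on a supercomputer" happen here

# POD: leading left singular vectors of the snapshot matrix give a basis
# that is learned from data yet grounded in the simulated physics.
U, _, _ = np.linalg.svd(X, full_matrices=False)
V = U[:, :r]                               # n x r basis

# Non-intrusive fit: learn the r x r reduced operator from projected
# snapshots by least squares, without modifying the solver itself.
Xr = V.T @ X
A_r = np.linalg.lstsq(Xr[:, :-1].T, Xr[:, 1:].T, rcond=None)[0].T

# Fast surrogate: march in r dimensions (seconds), then lift back to full space.
xr = V.T @ X[:, 0]
for _ in range(steps - 1):
    xr = A_r @ xr
err = np.linalg.norm(V @ xr - X[:, -1]) / np.linalg.norm(X[:, -1])
print(f"relative error of the reduced model at the final step: {err:.2e}")
```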

Andrew Wheeler.
HPE, US.

SVP and Director, Hewlett Packard Labs, Hewlett Packard Enterprise.
HPE Fellow / VP, HPC & AI Advanced Development, HPE.

Andrew Jones.
Microsoft, UK.

Future AI Infrastructure, Supercomputing & HPC
Azure Specialized Engineering.

Harnessing Responsible AI for Science: Taming Open Data

Manish Parashar.
University of Utah, USA.

Inaugural Chief AI Officer at the University of Utah | Executive Director of the Scientific Computing and Imaging (SCI) Institute | Presidential Professor in the Kahlert School of Computing.

Artificial intelligence (AI) and open data have become essential engines for scientific discovery and innovation. However, realizing this transformative potential requires a transdisciplinary approach that ensures research and development can effectively and responsibly leverage the diversity of data sources. Despite the exponential growth of available digital data sources and the ubiquity of non-trivial computational power for processing this data, realizing data-driven, AI-enabled science workflows remains challenging. In this talk, I will discuss the importance of democratizing AI R&D, including access to open data and advanced cyberinfrastructure. I will introduce the University of Utah’s One-U Responsible AI Initiative, which aims to catalyze an innovation ecosystem at the University of Utah and across the state. I will also present the vision, architecture, and deployment of the National Data Platform project, as part of a broader national cyberinfrastructure, aimed at catalyzing an open and extensible data ecosystem for science.

Jason Trout.
Senior Engineer, SADRAM Inc, USA/New Zealand.

Symbolically Addressed DRAM, or SADRAM, is an augmentation of DDR memory in which data can be addressed via any arbitrary symbol, instead of solely by the physical address at which it is stored. When a new record is written to SADRAM, the record's indexing key is extracted, sorted, and stored hierarchically by augmented row-buffer logic within one of the DRAM chips. This provides database-like access to memory, offloading from the main processor the pointer arithmetic and data-organisation tasks associated with sorting data, and also makes novel storage methods possible (such as compact storage of sparsely populated arrays). For algorithms modified to take advantage of SADRAM, we expect to achieve 1) increased performance via memory-wall circumvention, and 2) reduced energy consumption from a reduction in the amount of data that must travel between CPU and DDR.

We are actively prototyping the SADRAM concept using the AMD Alveo V80 FPGA card, equipped with HBM memory, to simulate a typical 8192-bit-wide access to a DRAM's cell array by its row buffer. The SADRAM algorithm is similar to B-trees, featuring a very wide ‘node’ fanout of 128 for a x16 DRAM chip and a 32-bit key. SADRAM logic compares all nodes in the row buffer against new data simultaneously; thus, there is no performance tradeoff for the very wide fanout, in contrast to a similarly wide processor implementation.
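As a rough illustration of why the wide fanout costs nothing extra here, the following Python sketch models a single row-buffer ‘node’ whose entire key array is compared against an incoming key in one vectorized call, a software stand-in for the parallel comparators in the row-buffer logic. The class, the record payloads, and the split signaling are hypothetical simplifications, not the actual SADRAM design.

```python
import numpy as np

FANOUT = 128  # keys per node for a x16 DRAM chip with 32-bit keys (per the abstract)

class WideNode:
    """Hypothetical stand-in for one row buffer's worth of sorted 32-bit keys.
    Hardware would compare every slot with parallel comparators in one step;
    the single vectorized np.searchsorted call plays that role here."""

    def __init__(self):
        self.keys = np.empty(0, dtype=np.uint32)
        self.payloads = []

    def insert(self, key, payload):
        pos = int(np.searchsorted(self.keys, key))   # one wide compare, no per-key loop
        self.keys = np.insert(self.keys, pos, np.uint32(key))
        self.payloads.insert(pos, payload)
        return len(self.keys) <= FANOUT              # False => node would need a split

    def lookup(self, key):
        pos = int(np.searchsorted(self.keys, key))
        if pos < len(self.keys) and self.keys[pos] == key:
            return self.payloads[pos]
        return None                                  # symbol not present

node = WideNode()
for k, rec in [(42, "record-a"), (7, "record-b"), (99, "record-c")]:
    node.insert(k, rec)
print(node.lookup(42))   # "record-a": addressed by symbol, not by physical address
```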

While the primary application for SADRAM is sorting and storage-by-symbol, it features a general purpose processor, opening the door for the implementation of other algorithms. We invite the Multicore World global community to provide feedback on applications and explore together further developments to be incorporated into our five-year roadmap.

Ewa Deelman.
University of Southern California, Los Angeles, USA.

Research Professor and Research Director
USC Information Sciences Institute
AAAS Fellow, IEEE Fellow, USC/ISI Fellow

As scientific applications increasingly span HPC clusters, cloud resources, GPUs, and edge systems, managing workflows across these heterogeneous environments has become a critical challenge. Traditional centralized workflow scheduling approaches struggle to adapt to dynamic resource availability, varying data locality, and performance heterogeneity. To address these limitations, the talk presents SWARM, a multi-agent orchestration framework that enables decentralized, adaptive workflow scheduling for heterogeneous computing systems.

SWARM’s architecture decomposes the scheduling problem into interdependent components — job selection, job scheduling, consensus algorithms, and overlay network formation. It integrates simulation, emulation, and system prototyping to evaluate algorithmic and architectural trade-offs on real systems including the NSF FABRIC testbed and DOE computing facilities at ANL, LBNL, and ORNL. Through this layered methodology, SWARM enables systematic exploration of both mathematical formulations and intelligent coordination strategies for distributed resource management.

This talk describes SWARM's advances and its design of self-managing heterogeneous systems that dynamically coordinate resources under uncertainty. This approach provides a new architectural foundation for heterogeneous computing, one that prioritizes adaptability, decentralized intelligence, and co-evolution between scheduling algorithms and system state.

Ilkay Altintas.
University of California, San Diego, USA.

Chief Data Science Officer, San Diego Supercomputer Center
Founder and Director, Societal Computing Innovation Lab (SCIL)
Division Director, Cyberinfrastructure and Convergence Research (CICORE)

Societal Computing reframes computing as an innovation engine for collective resilience and impact, linking cutting-edge science, data, models, AI systems, and communities to solve complex challenges at scale. In this talk, I will outline a vision for building AI-ready data ecosystems that empower researchers, educators, policymakers, and the public to work from a shared digital fabric. Drawing on lessons from wildfire science, resilient agriculture, public health, and education, I will describe how structured collaboration, national cyberinfrastructure, and responsible AI create a new kind of societal operating system. Through examples from the National Data Platform and the Wildfire Science and Technology Commons, I will show how convergence research becomes actionable when we bridge data stewardship, computational workflows, multi-modal AI, and community-centered design. The talk will highlight emerging opportunities for building trustworthy, inclusive, and durable socio-technical systems that enable science and society to learn, adapt, and innovate together.

Simon McIntosh-Smith.
University of Bristol, UK.

Professor of High Performance Computing
Director of the Bristol Centre for Supercomputing (BriCS), including Isambard-AI and Isambard 3

The UK’s Isambard-AI national supercomputer service was built and deployed in record time, from breaking ground in June 2024 to going into production in August 2025. Optimised specifically for artificial intelligence applications, and ranked #11 on the current Top500, Isambard-AI has already grown to over 1,400 users and 400 projects in its first six months. In this talk, we’ll summarise some of the early learnings from running an AI national service, as well as our experience of using technologies such as modular data centres to enable the rapid deployment of this new 5MW supercomputer.

Nathan DeBardeleben.
Los Alamos National Laboratory, USA.

High Perf. Computing Design (HPC-DES)
UltraScale Systems Research Center (Co-Exec Director Technical Operations)
Senior Research Scientist

This talk provides an overview of the ArtIMis (AI for Mission) project at LANL. ArtIMis is an institutionally funded, lab-leadership-supported AI initiative that brings together over 100 LANL scientists on focused AI R&D for LANL's various missions. In this talk, Dr. DeBardeleben will cover the goals, accomplishments, and plans of ArtIMis in its second year, including multiphysics foundation models, AI for material discovery and fracture, agentic AI uses, AI for therapeutics, and other topics. Grand challenges around AI include how to accelerate scientists' workflows through agentic control of simulation workflows and fast surrogate models that enable exploration of parameter spaces for design and discovery.

Will Kamp.
Kamputed Ltd., New Zealand

Director – Senior FPGA/HPC Engineer.

It's been a big year for the SKA: 2025 has seen receptors emerging from the desert terrain, pointing skyward to collect the faintest of signals from the galactic plane, and feeding the hungry correlators that masticate the data into visibilities for the science data pipeline to digest.

For the Mid Correlator team it has been a year of testing, verification, bug fixing, testing, verification, …, factory acceptance test completion, deployment, and integration with the other components of the telescope, culminating in the {embargoed news}. 2026 will see the deployment of an eight-receptor Mid Telescope.

Meanwhile, the FPGA team has started work on a new technology platform, replacing the custom Stratix10 FPGA boards and passive optical-fibre interconnect with a more data-centre-like configuration: Agilex7 PCIe cards with HBM in x86 servers, interconnected with 400GbE switches. With this new architecture and upgraded technology, the correlator shrinks to half the space and half the power.

The AXIoE (Advanced eXtensible Interconnect over Ethernet) technology that I developed for the SKA correlator, and have since published, is also making its way into 6G cellular base stations and quantum computers. The world is building larger multi-FPGA systems and needs a way to control and coordinate them over a reliable networked connection. This demonstrates the value of being involved in the development of mega-science projects such as the SKA, which push on the boundaries of what is possible.

Alok N. Choudhary.
Northwestern University, US.

Harold Washington Professor, ECE and CS Departments

Electronic Health Records (EHRs) contain rich temporal and multimodal information that can be used in AI/ML-based models for clinical decision-making, yet their complex, heterogeneous nature presents significant challenges. These dataset characteristics, in size, volume, heterogeneity, and complexity, are quite different from those of the data used for training LLMs (what is commonly known as AI models today). Not making the right decision (or making even a suboptimal one) can have adverse outcomes. This talk will present challenges, approaches, results, and lessons from applying AI in critical care as well as chronic disease management.

Kento Sato.
RIKEN Center for Computational Science (R-CCS), Japan.

Team Principal, High Performance Big Data Research Team.

The FugakuNEXT project advances Japan’s high-performance computing infrastructure, aiming to extend the capabilities of the current Fugaku system while exploring new architectures for the AI-for-Science era. This talk introduces ongoing research and development on system software solutions that will underpin FugakuNEXT and beyond.

Emily Casleton.
Los Alamos National Laboratory, USA.

Statistical Sciences Group, CAI-4

Large-scale AI models are now incorporated into many of our workflows, including search, coding, and national security, yet research on defensible methods to probe and test what they can and should do has lagged behind model development. However, Stanford AI experts predict that “The era of AI evangelism is giving way to evaluation”.

In this talk I will discuss, with examples, three interconnected trends that are reshaping evaluation practice. 1) Bespoke metrics for model assessment, and how they can inform the loss function, i.e., what you optimize during training. Mean squared error and cross-entropy are the most common loss functions, and accuracy/precision/recall are the default metrics, but in AI models built for science, bespoke metrics can be more informative. 2) Capability-based benchmark suites. By organizing benchmarks around what a model should do rather than what it knows, the evaluation can better quantify the model's usefulness to potential users. 3) Design of experiments for benchmark construction. This approach yields smaller but more informative benchmarks and leads to accuracy estimates tied to metadata.
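As a toy illustration of trend 1), the sketch below defines a hypothetical bespoke metric, a conservation-gap penalty for a physics model, and folds it into the training loss alongside mean squared error. The weighting, data, and function names are invented for illustration.

```python
import numpy as np

def mse(pred, target):
    return float(np.mean((pred - target) ** 2))

def conservation_gap(pred, target):
    """Hypothetical bespoke metric: how badly each predicted field violates a
    conserved per-sample total (think mass or energy), averaged over samples."""
    return float(np.mean(np.abs(pred.sum(axis=1) - target.sum(axis=1))))

def bespoke_loss(pred, target, lam=0.1):
    # Fold the domain metric into the optimization target instead of only
    # checking it after training: minimizing this trades error against physics.
    return mse(pred, target) + lam * conservation_gap(pred, target)

rng = np.random.default_rng(1)
target = rng.random((32, 8))                         # 32 samples, 8 field values
pred = target + 0.05 * rng.standard_normal((32, 8))  # a slightly-off "model"
print(f"MSE={mse(pred, target):.4f}  "
      f"conservation gap={conservation_gap(pred, target):.4f}  "
      f"combined loss={bespoke_loss(pred, target):.4f}")
```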

Anirban Mandal.
University of North Carolina at Chapel Hill, US.

Director of the Network Research and Infrastructure Group (NRIG) at RENCI (Renaissance Computing Institute).

The FABRIC network testbed is an indispensable “research instrument”, functioning as a crucial enabler for experimentation and evaluation of distributed scientific workflow technologies on next-generation cyberinfrastructure. This presentation will focus on Edge-to-Core workflows, which are critical for science domains like disaster response using UAVs, requiring efficient orchestration and management of sensor data across edge devices, the network, and core cloud resources. Research leveraging the FABRIC testbed provides tools for scientists to include edge computing devices in computational workflows, essential for low-latency applications.

The presentation will also delve into a radically alternative, fully decentralized approach to resilient resource management for scientific workloads, inspired by swarm intelligence (SI) and multi-agent systems. This research includes the development of a novel, greedy consensus algorithm for distributed job selection, with implementations utilizing hierarchical topologies deployed and evaluated directly on FABRIC. FABRIC acts as the essential “digital wind tunnel”, providing isolated and reproducible environments necessary to test complex workflow execution and resource management under controlled anomalous conditions that production systems cannot support.
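To give a flavor of what such a greedy consensus might look like, here is a small Python sketch in which agents greedily bid on the jobs they value most and merge their views with overlay neighbors until the views agree. The valuation function, ring topology, and round count are illustrative assumptions, not the published algorithm.

```python
import random

def greedy_consensus(agents, jobs, neighbors, rounds=5):
    """Each agent keeps a local view {job: (best_bid, winner)}, greedily bids
    on the most valuable job it could still win, and merges views with its
    overlay neighbors; after enough rounds the views agree. Valuations are
    random stand-ins for real cost/affinity models."""
    random.seed(0)
    value = {(a, j): random.random() for a in agents for j in jobs}
    view = {a: {} for a in agents}
    for _ in range(rounds):
        for a in agents:                       # greedy local bidding step
            winnable = [j for j in jobs
                        if view[a].get(j, (0.0, None))[0] < value[(a, j)]]
            if winnable:
                j = max(winnable, key=lambda j: value[(a, j)])
                view[a][j] = (value[(a, j)], a)
        for a in agents:                       # pairwise consensus on the overlay
            for b in neighbors[a]:
                for j in set(view[a]) | set(view[b]):
                    best = max(view[a].get(j, (0.0, None)),
                               view[b].get(j, (0.0, None)))
                    view[a][j] = view[b][j] = best
    return view[agents[0]]                     # all views have converged

agents, jobs = ["a0", "a1", "a2", "a3"], ["j0", "j1", "j2"]
ring = {a: [agents[(i + 1) % len(agents)]] for i, a in enumerate(agents)}
print(greedy_consensus(agents, jobs, ring))    # {job: (winning_bid, winner_id)}
```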

Paola Buitrago.
Pittsburgh Supercomputing Center, USA.

Director AI and Big Data Group.

AI workloads are increasingly shaped by the capabilities and constraints of specialized hardware. In this talk, we present the NSF Neocortex AI Supercomputer as a national-scale experimental testbed for AI-specialized architectures, and share practical experiences deploying and operating such systems for the research community. We discuss design choices, programming models, and operational lessons learned, and highlight how domain scientists and AI researchers have leveraged Neocortex to scale large models and data-intensive workflows.

The talk concludes with insights on the evolving role of AI-specialized hardware in advancing AI-for-science and informing future supercomputing infrastructures.

Ana Gainaru.
Oak Ridge National Laboratory, USA.

Computer Scientist (co-lead of the Data Understanding thrust in SciDAC RAPIDS and lead of the Self-improving AI models thrust in The Transformational AI Models Consortium (ModCon))

The next generation of HPC applications is represented by hybrid approaches that weave together traditional simulations and modern AI. However, a critical bottleneck in integrating HPC with AI is the “lack of awareness” between workflow components. The outputs of HPC applications are often analyzed only sparingly before archival, effectively becoming inaccessible to future training codes because the manual, time-consuming process of finding and processing datasets for each analysis purpose frequently outweighs the cost of re-running simulations. This fragmentation results in complex, brittle workflows where data management is treated as an afterthought. In this presentation, we propose a unified framework for managing the complex lifecycle of data in hybrid AI-HPC systems. We will address the limitations of current domain-specific solutions by introducing abstractions that map the relationships between raw simulation outputs, processed training sets, and surrogate model inference. By creating a system where data provenance and transformation history are persistent, we enable workflows that “learn” from previous executions. Attendees will learn how to design workflows that minimize redundant processing, facilitate cross-domain optimization transfer, and ensure that the massive datasets required for AI training remain accessible, structured, and reusable.
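One way to picture the kind of abstraction proposed here is a dataset wrapper that records, for every derived artifact, its parent artifacts and the transformation that produced it, so lineage stays queryable long after the original workflow has finished. The sketch below is a minimal illustration under our own naming; it is not the framework's actual API.

```python
from dataclasses import dataclass, field

@dataclass
class Artifact:
    """A dataset plus persistent provenance: which artifacts it came from and
    which transformation produced it (all names here are illustrative)."""
    name: str
    payload: object
    parents: list = field(default_factory=list)
    transform: str = "raw"

    def derive(self, name, fn):
        # Every derivation records its lineage, so a training set built long
        # after the run is still traceable back to the raw simulation output.
        return Artifact(name, fn(self.payload), parents=[self], transform=fn.__name__)

    def lineage(self):
        chain = [f"{self.name} <- {self.transform}"]
        for parent in self.parents:
            chain.extend(parent.lineage())
        return chain

def normalize(values):
    peak = max(values)
    return [v / peak for v in values]

raw = Artifact("sim_output_step_100", payload=[0.1, 0.9, 0.4])
train = raw.derive("training_set_v1", normalize)
print(train.lineage())  # ['training_set_v1 <- normalize', 'sim_output_step_100 <- raw']
```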

Giulia Guidi.
Cornell University, US.

Assistant Professor of Computer Science.

The diverse and non-trivial challenges of parallelism in data analytics require computing infrastructures that go beyond the demands of traditional simulation-based sciences. The growing data volume and complexity have outpaced the processing capacity of single-node machines in these areas, making massively parallel systems an indispensable tool. However, programming high-performance computing (HPC) systems poses significant productivity and scalability challenges. It is important to introduce an abstraction layer that provides programming flexibility and productivity while ensuring high system performance. As we enter the post-Moore's Law era, effective programming of specialized architectures is critical for improved performance in HPC. As large-scale systems become more heterogeneous, their efficient use for new, often irregular and communication-intensive, data analysis computations becomes increasingly complex. In this talk, we discuss how sparse linear algebra can be used to achieve performance and scalability on extreme-scale systems while maintaining productivity for emerging data-intensive scientific challenges. (TBC)
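As a small example of this pattern, the SciPy sketch below casts an irregular analytics task, counting shared k-mers between pairs of sequencing reads as in overlap detection, as a single sparse matrix-matrix multiplication (SpGEMM), the kind of kernel that distributed sparse linear algebra libraries can then scale. The toy matrix is invented for illustration.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Toy incidence matrix A: A[i, j] = 1 if read i contains k-mer j. At genomics
# scale this matrix is enormous and extremely sparse.
reads = np.array([0, 0, 1, 1, 2, 2, 3])
kmers = np.array([0, 1, 1, 2, 2, 3, 3])
A = csr_matrix((np.ones_like(reads), (reads, kmers)), shape=(4, 4))

# One SpGEMM gives all pairwise overlaps at once: (A @ A.T)[i, k] counts the
# k-mers that reads i and k share. An irregular, communication-heavy analytics
# problem becomes a single well-studied kernel that scales on HPC systems.
overlap = A @ A.T
print(overlap.toarray())
```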

Amal Gueroudji.
Argonne National Laboratory (ANL), US.

Postdoctoral researcher, Mathematics and Computer Science Division – ANL.

Contemporary scientific workflows operate at unprecedented scales of complexity and heterogeneity, and while they accelerate discovery, they challenge our ability to measure, reproduce, and trust them. In this talk, we present our work on three foundational directions that aim to bring stability to this constantly shifting landscape. First, we develop performance characterization and provenance methods that allow us to observe, understand, and ultimately improve distributed and hybrid workflows. Second, we incorporate resilience as a built-in property for experiment-driven systems where failures are costly and time-critical. Finally, we explore reliability in agentic workflows, where trustworthy autonomy depends on transparent provenance and verifiable decision making. Together, these directions outline a path toward scientific workflows that remain dependable even as they evolve, adapt, and scale.
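As one concrete reading of “resilience as a built-in property”, the sketch below wraps a workflow task so that it retries transient failures with backoff and checkpoints its result for reuse on restart. The decorator, the failure class, and the policy constants are illustrative assumptions, not the speaker's system.

```python
import functools, json, os, time

def resilient(checkpoint, retries=3, delay=0.5):
    """Make a workflow task restartable: reuse its checkpoint if one exists,
    otherwise run it, retrying transient failures with exponential backoff,
    and persist the result before returning (all policy here is illustrative)."""
    def wrap(task):
        @functools.wraps(task)
        def run(*args, **kwargs):
            if os.path.exists(checkpoint):            # finished earlier: reuse result
                with open(checkpoint) as f:
                    return json.load(f)
            for attempt in range(retries):
                try:
                    result = task(*args, **kwargs)
                    with open(checkpoint, "w") as f:  # checkpoint before returning
                        json.dump(result, f)
                    return result
                except OSError:                       # stand-in transient failure class
                    time.sleep(delay * 2 ** attempt)
            raise RuntimeError(f"{task.__name__} failed after {retries} attempts")
        return run
    return wrap

@resilient(checkpoint="calibration.json")
def calibrate(gain):
    # Imagine an expensive, failure-prone instrument or simulation step here.
    return {"gain": gain, "offset": 0.01}

print(calibrate(2.5))  # first call runs and checkpoints; reruns reuse the file
```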