
Abstracts 2020


Updated 16 January 2020

Preparing for Extreme Heterogeneity in High Performance Computing

Jeffrey S. Vetter

Distinguished R&D Staff Member. Leader, Future Technologies Group

Oak Ridge National Laboratory (ORNL), Tennessee, USA


While computing technologies have remained relatively stable for nearly two decades, new architectural features, such as heterogeneous cores, deep memory hierarchies, non-volatile memory (NVM), and near-memory processing, have emerged as possible solutions to address the concerns of energy-efficiency and cost.

However, we expect this ‘golden age’ of architectural change to lead to extreme heterogeneity and it will have a major impact on software systems and applications. Software will need to be redesigned to exploit these new capabilities and provide some level of performance portability across these diverse architectures.

In this talk, I will sample these emerging technologies, discuss their architectural and software implications, and describe several new approaches (e.g., domain specific languages, intelligent runtime systems) to address these challenges.


An AI-Guided Multiscale Modelling of Platelet Dynamics on Parallel Processors

Yuefan Deng

Professor of Applied Mathematics

Associate Director, Institute for Engineering-Driven Medicine

Stony Brook University, New York, USA


We present general methodologies for cell modelling that couple in vitro experiments, multiscale modelling, and artificial intelligence, and demonstrate their power in the fast and accurate modelling of platelet aggregation. Platelet aggregation stimulates blood clotting that causes heart attacks and strokes, resulting in 20 million deaths worldwide each year. To reduce such deaths, we must discover new drugs. To discover new drugs, we must understand platelet dynamics. Modelling these dynamics involves setting up the basic space and time discretisation over huge ranges of 5-6 orders of magnitude, resulting from the relevant fundamental interactions at atomic, molecular, cell, and fluid scales. To achieve the desired accuracy at minimal computational cost, we must use machine learning to select the correct physiological parameters in the force fields as well as the spatial and temporal discretisation.



PNNL’s Data-Model Convergence Initiative – 2020 Update

James A. Ang, Ph.D.
Chief Scientist for Computing, Physical and Computational Sciences Directorate
Pacific Northwest National Laboratory (PNNL), Richland, Washington, USA

Date: TBC


PNNL’s Data-Model Convergence (DMC) Initiative was launched in January 2019. The DMC Initiative is pursuing the integration of high performance computing (HPC) modelling and simulation, data/graph analytics, and domain-aware machine learning computing paradigms on multiple levels.

This five-year initiative is creating the next generation of scientific computing capability through a software and hardware co-design effort at the levels of:

1) heterogeneous workloads,

2) integrated system software stack, and

3) conceptual designs for heterogeneous system-on-chip processors.

This 2020 update will provide an overview of our portfolio of DMC projects in Application Domains, Data Sciences, Software Stack and Hardware Architectures. Computing workflows that use this converged DMC software and hardware architecture support laboratory objectives in accelerating scientific discovery, and real-time control of the power grid.


Literate Pair Programming:

Hippie Hacker (Chris McClimans) – CEO

ii, Tauranga, Bay of Plenty, New Zealand

Date: TBC


Creating and maintaining a culture of collaboration requires tooling to capture ‘the way we work’ in addition to ‘the work itself’.

By combining pairing and literate programming templates we can capture, modify, and improve the way we work together.

Literate programming (Knuth 1984) interweaves essay-style writing with blocks of code, treating programs as literature.

Pair programming encourages cross-discipline thinking. It accelerates onboarding and process improvement particularly when pairing sessions / workflows can be templated.

Literate Pair Programming (LPP) is similar to interweaving Google Docs and Jupyter notebooks. It allows the ‘way we work’ to evolve, in addition to providing meaningful output from each session. The ii approach to LPP is based on emacs + org-mode and can be run via the command line within a Kubernetes cluster.
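To make the idea of prose interwoven with code concrete, here is a minimal sketch of "tangling": pulling the code blocks out of an org-mode document so they can be run. It is plain Python rather than emacs, and the sample document, the `SRC_BLOCK` pattern, and the `tangle` helper are illustrative assumptions, not part of the ii tooling.

```python
import re

# A tiny org-mode document: prose paragraphs with two source blocks.
ORG_DOC = """\
* Say hello
We first define the greeting, then print it.

#+begin_src python
greeting = "hello"
#+end_src

Printing happens in a second block.

#+begin_src python
print(greeting)
#+end_src
"""

# Matches "#+begin_src <lang> ..." up to the next "#+end_src" line.
SRC_BLOCK = re.compile(
    r"^#\+begin_src[ \t]+(\w+)[^\n]*\n(.*?)^#\+end_src",
    re.IGNORECASE | re.MULTILINE | re.DOTALL,
)


def tangle(org_text, language):
    """Concatenate the bodies of all source blocks written in `language`."""
    return "".join(
        body
        for lang, body in SRC_BLOCK.findall(org_text)
        if lang.lower() == language.lower()
    )


program = tangle(ORG_DOC, "python")
exec(program)  # runs the extracted code and prints: hello
```

The prose stays in the document as the record of how the session went, while the extracted blocks remain executable.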

In this walkthrough / demo, audience participation is expected as we mob/pair our way through a few workflows for some real-time cloud-native collaboration together.

• Launching an iimacs pairing session in Kubernetes
• Sharing mob/pair session with the audience (web/ssh)
• Organizing our workflow
• Capturing the output of our code
• Exporting our work to various formats
• Exporting our work directly over terminal / web
• Example Templated Workflows
• Working in multiple languages at once
• Creating Kubernetes Conformance Tests


The Pegasus Workflow Management System: Current Applications and Future Directions

Ewa Deelman, Research Director, Science Automation Technologies Division

University of Southern California, Information Sciences Institute, Los Angeles, CA, USA

Date: TBC


The Pegasus Workflow Management System is designed to meet the needs of a wide variety of scientific applications. It automates the execution of complex and large-scale workflow task graphs operating on large amounts of data.  Since 2001 Pegasus has been working with a number of applications such as LIGO, the gravitational-wave physics experiment, to enable them to accomplish their scientific goals. In 2016, Pegasus was used by LIGO to analyze their experimental data, confirming the first ever direct detection of a gravitational wave. Pegasus also delivers robust automation capabilities to researchers at the Southern California Earthquake Center (SCEC) studying seismic phenomena, to astronomers seeking to understand the structure of the universe, to material scientists developing new drug delivery methods, and to students seeking to understand human population migration.  An example of societal impact is SCEC’s use of Pegasus to generate the world’s first physics-based probabilistic seismic hazard map that provides insight into why earthquakes in the Los Angeles basin can be so destructive. This information can inform civil engineering practices in the area.

This talk focuses on the current Pegasus capabilities and describes new research directions that will inform future Pegasus development.


All Tomorrow’s Memories

(with apologies to Lou Reed)

Bruce Jacob – Keystone Professor of Electrical and Computer Engineering

University of Maryland, College Park, MD, USA

Date: TBC


Memory and communication are the primary reasons that our time-to-solution is no better than it currently is … the memory system is slow; the communication overhead is high; and yet a significant amount of research is still focused on increasing processor performance, rather than decreasing (the cost of) data movement. I will discuss recent & near-term memory-system technologies including high-bandwidth DRAMs and nonvolatile main memories, as well as the impact of tomorrow’s memory technologies on tomorrow’s applications and operating systems. Modern multicore and manycore designs exacerbate the problem, but two solutions are on the horizon.


Speed Up Your Parallel Application Without Doing Much

Ruud van der Pas – Netherlands


Surprisingly, many developers ignore the low hanging fruit when it comes to performance tuning. Admittedly the word “low” is relative to how tall you are, but as we will demonstrate in this talk, a combination of the right tools and basic insights can deliver significant performance improvements.
We will illustrate this using a graph analysis application.


Why aren’t we there yet?  The journey to Exascale COTS computing

Duncan Hall – IMD Strategy and Planning Manager

Ministry of Foreign Affairs & Trade, Wellington, New Zealand


The Green500 list, published biannually alongside the Top500 list, ranks the energy efficiency of the top 500 or so supercomputers (whose data is made public) by FLOPS per Watt.

I continue to analyse Green500 data to forecast likely trajectories towards Exascale (~10^18 FLOPS) Commercial Off The Shelf (COTS) computing.
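The arithmetic behind such a forecast can be sketched in a few lines. The efficiency figures and the growth rate below are hypothetical placeholders chosen for illustration, not Green500 data, and the function names are assumptions of this sketch.

```python
import math

EXAFLOP = 1e18  # FLOPS, the exascale threshold quoted above


def power_megawatts(flops, gflops_per_watt):
    """Power draw in MW needed to sustain `flops` at a given efficiency."""
    watts = flops / (gflops_per_watt * 1e9)
    return watts / 1e6


def years_to_reach(current, target, annual_growth):
    """Years until `current` reaches `target` under constant compound growth."""
    return math.log(target / current) / math.log(1.0 + annual_growth)


# Hypothetical: a machine delivering 20 GFLOPS/W would need 50 MW for one exaflop.
print(power_megawatts(EXAFLOP, 20.0))

# Hypothetical: time to go from 15 to 50 GFLOPS/W at 25% improvement per year.
print(years_to_reach(15.0, 50.0, 0.25))
```

Fitting the growth rate to the published lists, rather than assuming it, is what turns this into a forecast.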


Building an open, safe, accessible AI & HPC ecosystem

Andrew Richards – CEO and co-founder

Codeplay Ltd, Edinburgh, United Kingdom


The world of AI & HPC is dominated by closed, proprietary software models. To get high performance today, systems need accelerators that have high levels of parallelism, but use closed programming models like CUDA. How do we open this up? How do we make these models safe enough to drive a car? How do we get an industry to work together with industry standards? Andrew and Codeplay have been working on these challenges for years. This talk will show the huge progress made today (SYCL, SPIR-V, oneAPI) and where we’re going next.


Addressing Challenges in Data movement and Communication

Samantika Sury – Principal Engineer

Intel Corp., Westford, Massachusetts, USA



While the last decade of computer architecture has established many novel compute solutions, we find that application performance is often dominated by data movement.

The convergence of HPC, AI and analytics, together with the emergence of edge computing, has furthered the trend of applications needing fast, energy-efficient access to large amounts of memory.

In this talk we demonstrate the performance and power impact of data movement on key parallel applications and explore architectural solutions like tightly coupled heterogeneity, moving compute to data and adaptive hardware to address the performance challenges due to data movement.


Towards Dynamic Resource Management in Next Generation HPC Environments

Balazs Gerofi – Research Scientist

System Software Research Team, RIKEN Center for Computational Science (RIKEN-CCS) – Tokyo, Japan


Workload diversity in high-performance computing (HPC) environments has experienced an explosion in recent years. The increasing prevalence of Big Data processing, in-situ analytics, artificial intelligence (AI) and machine learning (ML) workloads, as well as multi-component workflows is pushing the limits of supercomputing systems that have been primarily designed to serve parallel simulations. In addition, with the growing complexity of the hardware there is also a growing interest in multi-tenancy and in a more dynamic, cloud-like execution environment. All these trends bring together a large variety of runtime components that do not cooperate well with each other, which in turn can lead to suboptimal performance.

This talk will enumerate a number of representative workloads that stress the limitations of the traditional HPC center. We then highlight some of the underlying forces which shape requirements of next generation systems and propose a cross-stack coordination layer that aims to resolve these conflicts. Finally, through some of our previous efforts in this space we demonstrate the benefits of the overall approach.



Featured image – Stephan Friedl – Cisco – USA. Multicore World 2012

Photo Credit: Open Parallel Ltd




Abstracts 2019



Updated 10 February 2019



Day 1 – TUESDAY 12th FEBRUARY 2019


PNNL’s Data-Model Convergence Initiative

James A. Ang – Chief Scientist for Computing, Physical & Computational Sciences Directorate

Pacific Northwest National Laboratory  (PNNL), Richland, WA, USA



The Data-Model Convergence (DMC) Initiative is an opportunity for PNNL to integrate high performance computing (HPC) modelling and simulation, data/graph analytics, and domain-aware machine learning computing paradigms. 

The DMC Initiative is a five-year effort to create the next generation of scientific computing capability through a focused, integrated software and hardware co-design effort.  Our goal is to take the current approach for independent computing paradigms and integrate them into one converged computing capability.  Computing workflows that use this converged DMC architecture will support laboratory objectives in scientific discovery, and real-time control of the power grid.  

— — — — — —


Cosmic Rays and Computers: The Sky is Falling

Sean Blanchard, Linux and HPC expert, Systems Engineer

Ultrascale Systems Research Center, Los Alamos National Laboratory (LANL), New Mexico, USA



As HPC systems grow larger each year, new scaling challenges become evident that were not problematic in the past. What were once rare, one-in-a-million events have become common, everyday occurrences in data centers that contain tens to hundreds of thousands of computers. I will speak on one of these rare events: how the death of giant stars millions of years ago can crash your computers today. I will also discuss current efforts to understand the rates of these events compared to other similar events and how these problems can be mitigated in the future.

— — — — — —


Simulating Data Center Networks

Ariel Hendel, Infrastructure Technologist

Pallavi Shurpali, Infrastructure Engineer

Facebook, Inc. Menlo Park, California, USA



The massive scale of Compute and Storage capacity designed and deployed by Mega Data Center operators has naturally attracted much attention in terms of efficiency improvements in all its engineering aspects, be it power distribution, cooling, optimal compute building blocks, selective use of DRAM, flash, and spinning media for different storage tiers, or the network that binds all parts together.

At such scale, efficiency matters a lot. Unlike other technology innovations, operators view these efficiency gains as benefitting the industry in general and have collaborated to share them across the entire ecosystem and supply chain, for example within the Open Compute Project (OCP).

Ultimately the services hosted in Data Centers, owned by the Operator or not, come from semiconductors in the form of Processors, Memory subsystems, Non-Volatile Memories, I/O interfaces, and network switches. The innovation in such semiconductors has been the fuel behind the increase in Data Center Capacity applied to growing services.

We postulate that the efficiency gains, applied so far to system level aspects, may be getting into diminishing returns. However, semiconductor innovation has been limited to process transitions per Moore’s law, more than architectural innovation. Arguably architectural and certainly algorithmic innovation for compute and storage endpoints can be pursued at small scale, and then be deployed at scale. This is much harder to do for networking.

We combine the above observations with some recent network simulation work we performed to suggest a path forward: the development of a multi-party network simulation framework that can model a Data Center network and its endpoints at Data Center scale, and the application of such a framework to drive semiconductor-level innovation at the component level or even at the functional block level.

In our talk we present the driving forces behind the idea, some partial work done that leads us to our larger vision, and the role we see for technologists and academia joining and driving this vision forward.

— — — — — —



Accelerating The Data Center

Karen Schramm, VP Technology

Broadcom, Inc., San Jose, California, USA




Processing demands in Data Centers continue to grow, while Moore’s Law is slowing. Operators are looking to get more out of their Xeon servers and looking to alternative compute platforms.

This will be a discussion on accelerating processing in the Data Center, focusing on offload technology and dis-aggregation. Hardware offload has long been leveraged to free up CPU cycles, from relatively simple assist such as network checksum offloads through very specialized, complex offloads such as compression or a full network transport layer (e.g. TCP, RDMA).  Dis-aggregation is used to improve Data Center efficiency, enabling better utilization of resources.

Modern solutions combine the performance improvement and efficiency of hardware offload with the flexibility required to meet the fast pace of innovation. These solutions are being deployed today. Data will be shared from deployments for network vSwitch offload as well as dis-aggregation of storage and Xeon processors.

— — — — — —

Flexible and Scalable Domain Specific Architectures

Gavin Stark, Chief Scientist

Nic Viljoen, Associate Director, Software Engineering

Netronome, Inc. Santa Clara, California, USA – Cape Town, South Africa



In this talk we will first introduce the concept of a domain specific architecture (DSA), using the Netronome Flow Processor (NFP) as an example. We will cover the motivation, design and implementation.

Thereafter we will explore how this architecture’s flexibility has been leveraged in the past to handle unique platforms such as the Facebook Yosemite v2 Platform.

Finally, approaches for designing flexible chipsets in the future will be explored, including the value of system-wide computational modelling.

— — — — — —




Help, I Lost My Memory! What Now?

Ruud van der Pas, Distinguished Engineer in the Oracle Linux and Virtualization organization

Oracle, Inc.  Amsterdam, Netherlands


It is well known that memory access time is a common bottleneck in applications, but often that is also where the discussion ends. That is where this talk starts.

We will explore what happens under the hood when memory is accessed, and where things may go wrong from a performance perspective. This naturally leads to an exploration of Non-Uniform Memory Access (NUMA) systems and behaviour.

This talk concludes with various examples illustrating how bad it can get, and what can be done to crank up the performance. As we’ll show, there are often ways to at least make things better and such solutions are generic, not specific to a particular system architecture. That means they are longer lasting and survive system upgrades.

— — — — — —




Day 2 – WEDNESDAY 13th FEBRUARY 2019


Perfect Math Libraries Without Sacrificing Speed: The Minefield Method

John Gustafson, Professor

National University of Singapore and A*STAR, Singapore



Port any program using floating-point arithmetic from one platform to another, and you are likely to get different results. The most common reason is an issue that has been known for centuries: Elementary functions such as cosine, logarithm, exponential, etc. are excruciatingly difficult to round for certain input arguments, so the designers of math libraries ask us to accept a few errors in the last bit. The problem is that those errors are inconsistent from one library to another. While methods of assuring correct rounding for every value are known, they slow the function evaluations down by a huge factor. A recent breakthrough technique, the “Minefield Method,” demonstrates a new way to achieve perfect rounding with low-order approximations, eliminating the historical tradeoff between speed and correctness; you can have both. The Draft Posit Standard therefore requires all standard functions be correctly rounded for all input arguments so that posit calculations, unlike those using IEEE 754 Standard floats, can at last produce bitwise-identical results across platforms.

— — — — — —


Writing Big Data Pipelines: the Apache Beam Project

Neal Glew – Software Engineer

Google, Inc. Sunnyvale, California, USA.


Apache Beam is an open-source project for writing big-data pipelines (from TBs to PBs+). Its heart is a programming model that unifies batch and stream processing, allowing the programmer to separate the what, where, when, and how of processing: what processing is actually performed on the data; where in event time that processing is done (how event times are windowed); when in processing time results are materialised; and how updates of results (due e.g. to late data) are combined. Beam also provides several language-specific SDKs that instantiate the model for particular languages. Currently Java and Python are available and Go is under development. Beam also provides a portability framework that allows pipelines to be run on a variety of execution technologies. Beam itself provides a reference runner. There are also efforts to develop runners based on Apache Flink and Apache Spark. Google provides a commercial managed runner on its Google Cloud. Beam builds on the work of MapReduce, Hadoop, Flume, Spark, and Flink. In this talk I will give an overview of the Beam programming model and briefly describe the portability framework.
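The "where in event time" question can be illustrated without any Beam machinery. The sketch below is plain Python, not the Beam SDK, and the `fixed_windows` helper and sample events are assumptions made for illustration. It assigns each element to a fixed-size window by its event timestamp, so a late-arriving element still lands in the window it belongs to.

```python
from collections import defaultdict


def fixed_windows(events, size):
    """Sum values per fixed event-time window.

    events: iterable of (event_time_seconds, value) pairs, in arrival order.
    size:   window length in seconds.
    Returns a dict mapping window start time to the summed value.
    """
    totals = defaultdict(int)
    for event_time, value in events:
        window_start = (event_time // size) * size  # window is chosen by event time
        totals[window_start] += value
    return dict(totals)


# Arrival order differs from event-time order: the event at t=5 arrives late.
events = [(1, 10), (62, 7), (5, 3), (119, 1)]
print(fixed_windows(events, 60))  # {0: 13, 60: 8}
```

A batch job sees all events at once, whereas a streaming job must additionally decide when to emit each window's total and how to revise it when late data arrives, which is what the "when" and "how" questions address.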

— — — — — — —



Big Data System Environments: What are they perceived to do and what do they do?

Professor Geoffrey C. Fox, Director, Digital Science Center

Associate Dean for Research at IU School of Informatics and Computing

Professor of Informatics, Computing and Physics

Indiana University, Bloomington, IN, USA



We consider Big Data Systems such as Hadoop, Spark and TensorFlow and identify what they do well (which is a lot) and where they have omissions. We consider a programming model where “every call” is wrapped by a learning framework that configures execution (auto-tuning) and learns results. We describe our big data framework Twister2 and explain where it can offer improved capabilities over current systems.

— — — — — —


What’s Next After Six Years of New Zealand’s Participation in the SKA design

A/Prof Andrew Ensor, Director of the HPC Lab at Auckland University of Technology (AUT) and Director, New Zealand SKA Alliance (NZA)

AUT University, Auckland, New Zealand


The Square Kilometre Array (SKA) is both the world’s largest mega-science project and its largest big data computing project. With long-term and ambitious scientific goals, and a growing number of member countries, it might be surprising to see that New Zealand, as a founding member, still leads key parts of the computing work. The team recently completed six years’ design work on the SKA correlator, improvements in detecting and timing pulsars, supercomputing pipelines for generating images, and scalable middleware for operating a 260 PetaFLOP computer system.

This talk will provide an update on the project’s status as phase one design wraps up, outline its computing and political challenges, and discuss some of its spillovers and next steps.

— — — — — —



Are cloud and HPC mutually compatible?

Bruno Lago, Managing Director

Catalyst Cloud, Wellington, New Zealand



OpenStack and Kubernetes have introduced an open standard API for developers and researchers to interact with IT infrastructure. This standard is proving beneficial to foster collaboration between organisations worldwide and to improve the reproducibility of research experiments.

Teams that have been operating HPC and supercomputing clusters often struggle to understand how they could benefit from these cloud-native technologies while maximising the performance and benefits they get from their HPC clusters.

In this presentation, Bruno will highlight how HPC and cloud-native technologies can be brought together to deliver the best of both worlds. Some of the topics covered in the presentation include:

* Bare-metal hosts managed by OpenStack

* Hypervisor optimisations for near bare-metal performance

* Optimisation of cloud storage for HPC

* Exposing GPUs and FPGAs to guests

* Network latency, MPI, RDMA in cloud computing

— — — — — —



Learning Systems for Science 

Prof Ian Foster

Argonne National Laboratory and the University of Chicago, USA.



New learning technologies seem likely to transform much of science, as they are already doing for many areas of industry and society. We can expect these technologies to be used, for example, to obtain new insights from massive scientific data and to automate research processes. However, success in such endeavors will require new learning systems: scientific computing platforms, methods, and software that enable the large-scale application of learning technologies. These systems will need to enable learning from extremely large quantities of data; the management of large and complex data, models, and workflows; and the delivery of learning capabilities to many thousands of scientists. In this talk, I review these challenges and opportunities and describe systems that my colleagues and I are developing to enable the application of learning throughout the research process, from data acquisition to analysis.

— — — — — —



Post-K: A Game Changing Supercomputer for Convergence of HPC and Big Data / AI

Satoshi Matsuoka

Director Riken-CCS /

Professor, Tokyo Institute of Technology. Tokyo, Japan



With the rapid rise of Big Data and Artificial Intelligence (BD/AI) as a new breed of high-performance workloads on supercomputers, we need to accommodate them at scale. Hence the need for R&D on HW and SW infrastructures where traditional simulation-based HPC and BD/AI converge in a BYTES-oriented fashion. The TSUBAME3 supercomputer at Tokyo Institute of Technology, which came online in August 2017, embodies various BYTES-oriented features that allow such convergence to happen at scale, including significant scalable horizontal bandwidth, support for deep memory hierarchy and capacity, and high flops in low-precision arithmetic for deep learning. TSUBAME3’s technologies have been commoditized to construct one of the world’s largest BD/AI-focused open and public computing infrastructures, ABCI (AI Bridging Cloud Infrastructure), hosted by AIST-AIRC (AI Research Center), the largest publicly funded AI research center in Japan. Although not a supercomputer for HPC, its Linpack ranking is No.1 in Japan and No.5 in the world; it also delivers 550 AI-Petaflops and is extremely energy efficient, with a novel warm-water cooling pod design. Finally, Post-K is the flagship next-generation national supercomputer being developed in collaboration by Riken and Fujitsu. Post-K will have hyperscale-class resources in one exascale machine, with well over 100,000 nodes of server-class A64FX many-core Arm CPUs, realized through an extensive co-design process involving the entire Japanese HPC community.

Post-K is slated to perform 100 times faster on some key applications than its predecessor, the K-Computer, and is also likely to be the premier big data and AI/Machine Learning infrastructure. Currently, we are conducting research to scale deep learning to more than 100,000 nodes on Post-K, where we would obtain near top GPU-class performance on each node.

— — — — — —



Day 3 – THURSDAY 14th FEBRUARY 2019




The Reinvention of Edge-to-Cloud Computing

Pete Beckman, Co-Director, Northwestern-Argonne Institute for Science and Engineering. Chicago, USA

Lead, Argo project for extreme-scale operating systems and run-time software. Founder and leader of the Waggle project for smart sensors and edge computing.



Speed and scale define supercomputing. By some metrics, our supercomputers are the fastest, most capable systems on the planet. However, over the last twenty years, the HPC community has lost sight of the edge — where the data is collected and initially processed. Instead of leading the race for new architectures, methods, and edge-to-cloud software stacks, we have focused on the performance of a handful of hero computations in the machine room. An improved architecture would focus on edge-to-cloud infrastructures, computing models, and networking. From a sensor in a farmer’s field to the supercomputer, we must reinvent end-to-end data movement and computation. A new kind of edge-to-cloud infrastructure is needed.

— — — — — —



Title: TBC

Vic Crone – CEO Callaghan Innovation

Auckland, New Zealand

— — — — — — —




Exploring Emerging Memory Technologies in Extreme Scale High Performance Computing

Jeffrey S. Vetter, Distinguished R&D Staff Member, founding group leader of the Future Technologies Group in the Computer Science and Mathematics Division, and the founding director of the Experimental Computing Laboratory (ExCL)

Oak Ridge National Laboratory, Knoxville, Tennessee, USA




Concerns about energy-efficiency and cost are forcing our community to reexamine system architectures, and, specifically, the memory and storage hierarchy. While memory and storage technologies have remained relatively stable for nearly two decades, new architectural features, such as deep memory hierarchies, non-volatile memory (NVM), and near-memory processing, have emerged as possible solutions.

However, these architectural changes will have a major impact on HPC software systems and applications. To be effective, software and applications will need to be redesigned to exploit these new capabilities. In this talk, I will sample these emerging memory technologies, discuss their architectural and software implications, and describe several new approaches to programming these systems. One system is Papyrus (Parallel Aggregate Persistent -yru- Storage); it is a programming system that aggregates NVM from across the system for use as application data structures, such as vectors and key-value stores, while providing performance portability across emerging NVM hierarchies.

— — — — — —




Authenticated, Partial Data Structures for Blockchain Scalability, Sustainability and Security

Mark Moir, Architect

Oracle Labs, USA – New Zealand



Using our Haskell Authenticated Modular Maps (HAMM) framework, we can specify various implementations of authenticated modular maps that enable verifying and using _partial_ map (key-value store) data structures. I will present an overview of HAMM and results we have achieved with it. I will also discuss our motivation for building HAMM, which is to enable blockchain participants to quickly receive and verify part of a map representing a blockchain “world state”. This is important for addressing several practical concerns related to Blockchain Scalability, Sustainability and Security.
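For readers unfamiliar with authenticated maps, the general idea can be sketched with a plain Merkle tree. This is a generic illustration in Python, not the HAMM framework or its API; the hashing scheme, the helper names, and the sample "world state" are all assumptions. A participant holding only the root hash can check a single key-value entry from a short proof, without receiving the whole map.

```python
import hashlib


def h(data):
    return hashlib.sha256(data).digest()


def leaf_hash(key, value):
    return h(b"leaf:" + key.encode() + b"=" + value.encode())


def build_levels(leaves):
    """Return all tree levels, leaves first, root level last."""
    levels = [leaves]
    while len(levels[-1]) > 1:
        level = levels[-1]
        if len(level) % 2:  # simplistic padding: duplicate the last node
            level = level + [level[-1]]
        levels.append([h(b"node:" + level[i] + level[i + 1])
                       for i in range(0, len(level), 2)])
    return levels


def prove(levels, index):
    """Sibling hashes from leaf to root, each flagged if the sibling is on the left."""
    proof = []
    for level in levels[:-1]:
        if len(level) % 2:
            level = level + [level[-1]]
        sibling = index ^ 1
        proof.append((sibling < index, level[sibling]))
        index //= 2
    return proof


def verify(root, key, value, proof):
    """Recompute the root from one entry plus its proof and compare."""
    node = leaf_hash(key, value)
    for sibling_is_left, sibling in proof:
        node = h(b"node:" + (sibling + node if sibling_is_left else node + sibling))
    return node == root


# Hypothetical world state with three accounts.
state = [("alice", "5"), ("bob", "7"), ("carol", "9")]
levels = build_levels([leaf_hash(k, v) for k, v in state])
root = levels[-1][0]

proof = prove(levels, 2)                   # proof for the entry at index 2
print(verify(root, "carol", "9", proof))   # True
print(verify(root, "carol", "10", proof))  # False
```

The proof grows with the logarithm of the map size, which is why partial structures matter for participants who cannot store or download the full state.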

— — — — — — —



Security Versus Performance

Hugo Vincent, Principal Research Engineer. Head, Security Group

Arm Research, Cambridge, UK



For many in the security, computer architecture, and operating systems communities, 2018 was a tumultuous year thanks to the constant stream of new micro-architectural side channel attacks such as Spectre and Meltdown. Due to the emergence of these new attacks, and due to wider industry trends, developers are increasingly facing difficult tradeoffs between security and performance – tradeoffs that could previously be delegated to security specialists.

This talk will present recent security trends in computer architecture and operating systems and their implications, share insights into the performance costs of mitigations, and conclude by looking forward to how the hardware/software contract may change over the coming years to enable developers to better balance their performance and security goals.

— — — — — — —



A 36 Years Perspective of HPC’s 100 Billion Performance Improvement and Some Thoughts on What Comes Next

Mark Seager, Intel Fellow, Fellow in Residence for Intel China, Director of HPC Strategy, CTO for the Technical Computing Ecosystem.

Intel, Inc. San Francisco, California, USA


We will provide a historical perspective on the advances in HPC hardware and software over the last 36 years: from single-digit MegaFLOP/s to hundreds of PetaFLOP/s, and from proprietary or homegrown software stacks to open source almost everything.  We will also discuss applications that were enabled as a result of this 100-billion-fold increase in computational capability, and how this has fundamentally changed scientific discovery twice and enabled a vast number of industry advances, societal changes, and improvements to the human condition.
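As a quick check on these figures, the implied compound growth can be worked out directly. The endpoints below are round-number readings of the range quoted above (about 1 MegaFLOP/s to about 100 PetaFLOP/s), used here as assumptions for the arithmetic.

```python
import math

start_flops = 1e6   # about 1 MegaFLOP/s (assumed round figure)
end_flops = 1e17    # about 100 PetaFLOP/s (assumed round figure)
years = 36

improvement = end_flops / start_flops        # 1e11, i.e. 100 billion
annual_factor = improvement ** (1 / years)   # compound growth per year
doubling_time = years * math.log(2) / math.log(improvement)

print(f"improvement: {improvement:.0e}")
print(f"annual growth factor: {annual_factor:.2f}")
print(f"doubling time: {doubling_time:.2f} years")
```

Under these assumptions, performance roughly doubled every year over the period.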

Looking forward we will discuss the HPC+AI+HPDA converged workflows and how this is informing both computational scientific discovery and the broader coupling with and informing experimental and theoretical aspects of the scientific method.  The converged workflows are also being driven by the virtuous cycle dynamic between converged workflow advances and the digital economy transformation.  This converged workflow and the arrival of diverse computing architectures is profoundly challenging both system architecture and applications development practices.  Many industry participants are indicating that the rate of Moore’s law improvement is slowing down and will come to an inevitable near term end.  We will discuss several reasons why this alarm is not well founded, and is eerily similar to the inaccurate near term “peak oil” production predictions over the last 20+ years.

— — — — — —



Multicore World 2017 – Some speakers and participants: Pete Beckman, Victoria Maclennan, Dave Jaggar, Michael Kelly, Nathan DeBardeleben, John Gustafson, Andreas Wicenec, JC Guzman, Balazs Gerofi, Satoshi Matsuoka, Guy Kloss, Tony Hey, Paul McKenney, Piers Harding, Michelle Simmons, Duncan Hall and others



