Updated 27 January 2018
Wednesday 7 February 2018
Wed 7) 9:25 – 10:10 – Keynote
“Irreproducibility: The Causes and The Cure”
Visiting Scientist A*CRC, Singapore
The use of cloud computing resources is increasingly exposing users to the fact that results of large-scale computations are not identical across systems, or even on the same system running the same data set. Some have started using the term “floating point drift” to describe, say, how a physical simulation of the climate will produce two different results on two different HPC systems. Such inconsistency creates a lack of confidence that either simulation is correct, as well it should.
Aside from parallel programming errors that create race conditions or assume memory coherence that is not provided by the system, the causes of inconsistencies stem from key decisions in the IEEE 754 floating point standard. These include the decision that transcendental functions need not be properly rounded for all input arguments, and the encouragement of the use of “guard bits” to reduce rounding errors in a way completely invisible to the programmer. We show the sources of errors and then show a new paradigm that eliminates the irreproducibility problem, rendering calculations with real numbers as bit-for-bit reproducible as pure integer calculations. Furthermore, enforcing the paradigm actually can increase the speed of calculations instead of slowing them down.
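As a minimal illustration of why results can differ between runs or systems, the Python sketch below shows that IEEE 754 addition is not associative, so the order in which a parallel reduction combines its terms changes the answer. The helper name `sum_in_order` and the sample values are ours, chosen for illustration. `math.fsum` is used only as a standard-library example of an order-independent sum; it is not the paradigm proposed in the talk.

```python
import math

def sum_in_order(values):
    """Naive left-to-right accumulation, as a serial loop would do it."""
    total = 0.0
    for v in values:
        total += v
    return total

# Grouping matters: the same three numbers give two different doubles.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
print(left == right)  # False

# Ordering matters even more when magnitudes differ widely.
# Illustrative values only: the 1.0 is absorbed when added to 1e16 first.
order_a = [1e16, 1.0, -1e16]
order_b = [1e16, -1e16, 1.0]
print(sum_in_order(order_a), sum_in_order(order_b))  # 0.0 1.0

# An exactly rounded sum does not depend on the order of the terms.
print(math.fsum(order_a), math.fsum(order_b))  # 1.0 1.0
```

A reduction that splits the work differently across threads or nodes effectively changes the ordering above, which is one way the same code and data can produce different bits.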
— — — —
Wed 7) 10:45 – 11:15 – Talk
“An Update to the IHK/McKernel Lightweight Multi-kernel Operating System”
Dr. Balazs Gerofi, Research Scientist
System Software Research Team, RIKEN, Japan
RIKEN Advanced Institute for Computational Science has been leading the development of Japan’s next-generation flagship supercomputer, the successor to the K computer. Part of this effort is to design and develop a system software stack that suits the needs of future extreme-scale computing. IHK/McKernel is a multi-kernel operating system that runs Linux and a lightweight kernel (LWK) side by side on compute nodes, with the primary motivation of providing scalable, consistent performance for large-scale HPC simulations while retaining a fully Linux-compatible execution environment. We have already demonstrated McKernel’s superior performance compared to Linux on a number of HPC workloads. This talk will provide an update on recent advances of the project. We will discuss integration of multi-kernels with application containers, a crucial technology for supporting a converged HPC/Big Data/machine learning software stack. Furthermore, we provide an overview of PicoDriver, a novel device driver framework that allows unmodified Linux device drivers to be enhanced in a multi-kernel OS by adapting the fast-path driver code to LWK internals.
— — — —
Wed 7) 1:15 – 1:45 – Talk
“Market Engineering and Information Reuse: An overview of the New Zealand Data Commons”
Robert O’Brien, Distributed Systems Engineer, New Zealand
Technologies driving big data analytics are now mainstream commodities, and machine learning infrastructure is following suit. Yet making use of these technology stacks and the associated utility computing requires access to data and algorithmic know-how. The Data Commons aims to enable access to new classes of data: data often locked away due to concerns about privacy, misuse, quality, lack of consent, or anticipated proprietary interest. We want to create network markets for data and functions across different organisations, often ones with competing interests and concerns, to facilitate the flow of information. The aim is trusted and efficient reuse of data across many domains of interest, building an ecosystem of data services that can be dynamically composed in an accountable and transparent manner. The talk will survey various attempts at sharing data for public good, then cover the motivation, objectives, and architecture of the Data Commons, illustrating where it fits in the digital technology landscape.
— — — —
Wed 7) 1:50 – 2:25 – Talk
“Architecture for Participation”
Piers Harding, Senior Consultant
Catalyst IT, Wellington, New Zealand
As the SKA anticipates moving from a pre-construction phase to construction, focus is turning towards how a project on this scale can be managed efficiently, cost effectively, and successfully in an environment that will prove to be highly distributed both in terms of geography and team dispersal.
To achieve this, the SKA is looking increasingly towards modern project methodologies, including the Agile principles of SAFe. However, there is an opportunity to learn from successes outside the traditional lines of comparison, such as large-scale enterprise and existing scientific and research endeavours. The new places to look for inspiration are large-scale open source projects such as OpenStack, Linux, Moodle, Python, Docker, and GitLab, where distributed participation, rigorous quality control, and regular release cycles are the norm.
This talk looks at how these projects achieve their goals, and what SKA can draw from these experiences to build a successful, collaborative, and enduring community project.
— — — —
Wed 7) 2:30 – 3:00 – Talk
“Future data demands for agriculture, food production and environmental management”
Ian Yule, Professor in Precision Agriculture
Massey University, Palmerston North, New Zealand
Adoption of technology in our food production and environmental management processes is becoming increasingly evident. Precision Agriculture (PA) demonstrates how increased productivity and improved quality outcomes can be achieved. One of the important aspects of PA is measurement of performance both in terms of production and environmental impact. PA used to be seen in isolation but now tracking products through the value chain is becoming more important as is demonstrating product provenance. This has allowed the transition to Digital Agriculture (DA) where groups such as food manufacturers, environmental agencies, policy makers, farmers and agronomists need to be digitally linked to the final consumer, with the objectives of each clearly understood. Examples of information flows will be demonstrated.
Developments in sensing have meant that more advanced analytical techniques are required to process information in a timely and cost-effective manner for a number of end-use purposes. The level of complexity appears to be growing exponentially, as is the quantity of data included, with increasing temporal, spatial and spectral resolution as examples. Big data is with us, and this is again driving further requirements for integration across time and spatial scales.
Scalability is also hugely important for a number of groups, where IT can be directed to take out cost and provide an appropriate service. Technology is also changing in terms of its ability to reach over physical distances with technology trends such as 5G allowing highly complex distributed systems to be operated. IoT also appears to be coming of age with low cost wireless technology. The level of automation is also likely to grow and the technology for autonomous tractors and equipment is here now.
It will be interesting to observe how consumer behaviour is altered by technology. We are already seeing many retail developments, and the most intensive level of investment in the AgTech space is going into retail. Will we see distributed systems growing food in cities or even in our own apartments, or will convenience win the day for a time-poor consumer?
Lots to ponder.
— — — —
Wed 7) 4:30 – 5:00 – Talk
“HPC and nanotech: a tale of cooperative exponential technologies”
Deputy Director of the MacDiarmid Institute for Advanced Materials and Nanotechnology, New Zealand
In this talk I’ll give a survey of the different research efforts within the MacDiarmid Institute for Advanced Materials and Nanotechnology (a NZ Centre of Research Excellence) that rely on HPC resources. I’ll also reflect on how these technologies intertwine on different levels; a central research theme of the institute being ‘Tomorrow’s Electronic Devices’, for example, while our ‘Energy’ and ‘Functional Nanostructures’ themes provide complementary research skills and questions. Finally, I’ll give some idea of the future prospects for these types of materials based technologies in the New Zealand context.
— — — —
Wed 7) 5:00 – 5:30 – Talk
“A Brave New Parallel World”
Drs. Ruud van der Pas, Distinguished Engineer
Oracle, Santa Clara, CA, USA
All current parallel programming models fail to deliver extreme scalability, in particular if ease of use is considered as well.
In this short talk we propose a new parallel programming paradigm that unifies both worlds. Based upon an innovative and breakthrough concept, this model offers flexible, yet powerful, control for a wide range of applications and developers.
This presentation is meant to be interactive and encourages the audience to provide feedback.
— — — — — — — — — — — —
Thursday 8 February 2018
Thu 8) 8:45 – 9:30 – Keynote
“HPC is here, tools are not: how long do we have to wait”
Dr. Mohamad S El-Zein, John Deere Fellow
Manager, Advanced Materials & Mechanics, John Deere Inc., Moline, IL, USA
In corporate culture, time is money. The faster the product is driven into the market, the more cost opportunities become available. Tools available to engineers in the areas of structural analysis, CFD, acoustics, Discrete Element Methods, manufacturing process simulations, data analytics, and many other domains are used heavily during product design, manufacturing, and operations. However, many of these tools were written before MPP, GPUs, and huge supercomputers became available around the world. Corporations have to balance speed versus cost, and this is where things get interesting: the act of balancing dollars versus tools versus cores. A run through what tools exist and their capabilities, versus what is delivered to enterprises, will be presented.
— — — —
Thu 8) 10:45 – 11:25 – Talk
“Te Ao Tūroa: Harnessing the value and power of data”
Dr. Vicki Compton, Principal Advisor, Policy & Trade Branch
Ministry for Primary Industries, Wellington, New Zealand
The Ministry for Primary Industries released Te Ao Tūroa – the Primary Sector Science Roadmap, in June 2017. The Roadmap aims to provide a long-term view (10-20 years) of primary sector science and technology needs. It provides guidance on key priorities for all those investing in research and development related to the primary industries in New Zealand. It covers all of New Zealand’s primary industries across the whole of the value chain, including food and fibre, and land and water-based production systems. The Roadmap opens up many future opportunities that will be reliant on digital technologies and the collection, manipulation, integration and interpretation of data collected and connected in ways never before imagined. In this presentation I will outline some of the key needs of the primary sector in New Zealand and internationally as they relate to food supply and security, biosecurity, environmental sustainability and climate change management and adaptation. I would then like to open up the floor for discussion as to how we need to move forward to best harness the value, and realise the potential power, of data for a preferred future.
— — — —
Thu 8) 1:40 – 2:15 – Talk
“Computing challenges of next decade’s largest mega-science project: the Square Kilometre Array (SKA)”
Dr. Andrew Ensor, Director, New Zealand SKA Alliance (SKA)
AUT University, Auckland, New Zealand
The Square Kilometre Array is the largest mega-science project of the next decade and presents some unprecedented computing challenges. As New Zealand’s first venture in a mega-project with lead roles, it also represents numerous firsts for the country. This talk will outline the project, New Zealand’s key involvements in its computing design over the past five years, and updates on its progress as the construction phase draws closer.
— — — —
Thu 8) 2:20 – 3:00 – Talk
“The Discovery of the Higgs boson and the Big Data problem”
National Contact Physicist of South Africa at the ATLAS experiment (CERN).
Chairman of the South Africa group at ATLAS.
Director of the High Throughput Electronics Laboratory at the University of the Witwatersrand, Johannesburg, South Africa
The discovery of the Higgs boson by experiments at the Large Hadron Collider is strongly connected with the Big Data problem. The Large Hadron Collider collides clouds of protons at a rate of 40 MHz. Every time two clouds of protons go through each other, tens of proton-proton collisions take place. This leads to data flows of the order of petabytes per second per experiment. Fast real-time decisions are made to strongly reduce the data flows so that they can be shipped to long-term storage for off-line processing. Data in storage has reached the exabyte scale. Implemented solutions for these challenges in real-time data processing and in distributed and cloud computing will be reviewed. Prospects for the upgrade of the Large Hadron Collider in the 2020s will also be discussed.
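The arithmetic behind these rates can be sketched in a few lines of Python. Only the 40 MHz crossing rate comes from the abstract. The 1 PB/s figure is our reading of "of the order of petabytes per second", and the storage bandwidth in the usage line is a hypothetical placeholder, not an LHC figure.

```python
# Crossing rate quoted in the abstract.
CROSSING_RATE_HZ = 40e6

# Assumption: "of the order of petabytes per second" taken as exactly 1 PB/s.
RAW_RATE_BYTES_PER_S = 1e15

def bytes_per_crossing(raw_rate, crossing_rate):
    """Average data volume produced each time two proton clouds cross."""
    return raw_rate / crossing_rate

def required_reduction(raw_rate, storage_rate):
    """Factor by which real-time selection must shrink the raw data flow."""
    return raw_rate / storage_rate

per_crossing = bytes_per_crossing(RAW_RATE_BYTES_PER_S, CROSSING_RATE_HZ)
print(per_crossing / 1e6, "MB per crossing")  # 25.0 MB per crossing

# Hypothetical storage bandwidth of 1 GB/s, for illustration only.
print(required_reduction(RAW_RATE_BYTES_PER_S, 1e9))  # 1000000.0
```

Under these assumptions each crossing yields tens of megabytes, which is why the decision to keep or discard data has to be made in real time.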
— — — —
Thu 8) 4:30 – 5:15 – Keynote
“Gravitational waves and HPC: the role of computing and collaboration in opening a new field of astronomy”
Dr. Dan Stanzione, Executive Director of the Texas Advanced Computing Centre (TACC)
The University of Texas at Austin, USA
On September 14, 2015 the two detectors of the Laser Interferometer Gravitational-Wave Observatory (LIGO) made the first direct observation of gravitational waves, from two merging black holes. On August 17, 2017 the LIGO and Virgo observatories detected gravitational waves from the merging of two neutron stars, an event seen as both a short gamma-ray burst and subsequent kilonova by space and ground-based observatories. These and other discoveries mark the beginning of gravitational wave astronomy. Long before these discoveries took place, the computing framework had to be put in place, and doing that effectively and efficiently required close collaborations between the science teams and an HPC center to build an optimized software stack. In this talk we will highlight what we have learned through this collaboration, pointing out many of the ways in which high-throughput and high-performance computing have been essential to its progress — and of course, we will look at some black hole collisions.
— — — —
Friday 9 February 2018
Fri 9) 8:45 – 9:30 – Keynote
“Shared Memory Parallel Performance To The Extreme”
Drs. Ruud van der Pas, Distinguished Engineer
Oracle, Santa Clara, CA, USA
In this talk, we explore the boundaries of shared memory parallelization.
Through several real-world case studies we show what it takes to make terabyte-sized memory problems scale when using up to 2,048 threads in a single SPARC-based system.
As will be explained and demonstrated, to achieve this kind of scalability, memory driven optimizations are crucial, but processor pipeline intensive operations cannot be ignored either. Both need to be considered for truly scalable performance.
— — — —
Fri 9) 9:40 – 10:10 – Talk
“Why Multicore Needs Software Engineering and What it can Get.”
Dr. Richard O’Keefe, Research Lead
Open Parallel Ltd, Dunedin, New Zealand
You can buy a 64-bit computer for NZ$ 70.00 that is a thousand times faster and has a thousand times more memory than the computers used by major banks 40 years ago. For another NZ$ 100.00 you can give it twenty thousand times as much persistent storage. This is literally a supercomputer (Cray-3) in your pocket. An application that demands substantially more computing power than this must be one of economic value, so that wrong answers come at significant moral or financial cost. This means that it makes sense to spend at least as much effort on software design as on hardware.
The first part of the talk gives examples from existing codes showing poor software engineering practice.
The second part discusses what software engineering can currently offer to multicore programmers and what’s on the horizon.
— — — —
Fri 9) 10:45 – 11:15 – Talk
“Solving the world’s problems with good abstractions: How’s that going?”
Dr. Mark Moir, Architect
Oracle Labs, USA – New Zealand.
In my Multicore World 2015 talk, I asked: Can the world’s problems be solved using good abstractions? How about without? In this talk, I will revisit this question, touching on topics including blockchain technology, smart contracts, privacy/confidentiality, governance, accountability and regulation.
— — — —
Fri 9) 1:35 – 2:10 – Talk
“System administration and security for TACC’s high performance IO cluster: Wrangler”
Nicholas Thorne, Research Engineer
Texas Advanced Computing Centre (TACC), The University of Texas at Austin, USA
TACC’s Wrangler cluster was commissioned in 2014/15 with the intention of achieving 1TB/s bandwidth to the storage back-end and dedicating the cluster to workloads like MapReduce and other IO-bound applications. The hardware supporting the storage back-end consists of DSSD D5 devices, each containing 36 2TB flash modules. These are split into two 0.25PB high-speed shared filesystems, one running Hadoop FS and the other running GPFS. This gives users the flexibility to attempt workflows without incurring extreme performance penalties when part of the workflow includes small read accesses to many files.
This talk will outline the system configuration and discuss some notable use cases, some of the early adoption challenges and some of the security related concerns and mitigations. These consist of areas where Wrangler deviates from the “regular” TACC clusters:
1) Portal-based Hadoop cluster instantiation translates to much longer run-times, which influences maintenance.
2) Including data archive space along with traditional scratch
3) Kernel-tied software components delay the patching cycles
4) Site wide security and access changes implemented on Wrangler
Wrangler is now beyond half of its expected lifespan, so changes have slowed and we are researching whether a refresh is desirable and what new options exist to support high performance IO clustering.
— — — —
Fri 9) 2:15 – 2:50 – Talk
“HPC platform efficiency and the challenges for a system builder”
Drs. Martin Hilgeman, EMEA Technical Director, High Performance Computing (HPC)
Dell EMC, Amsterdam, Netherlands
With all the advances in massively parallel and multi-core computing with CPUs and accelerators, it is often overlooked whether the computational work is being done in an efficient manner. This efficiency is largely determined at the application level and therefore puts the responsibility of sustaining a certain performance trajectory into the hands of the user. It is observed that the adoption rate of new hardware capabilities is decreasing, leading to a feeling of diminishing returns. This presentation shows the well-known laws of parallel performance from the perspective of a system builder. It also covers, through real case studies, examples of how to program for energy-efficient parallel application performance.
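The abstract does not name the laws it refers to. Assuming they include Amdahl's law (fixed problem size) and Gustafson's law (problem size grows with the machine), a minimal Python sketch of both looks like this. The function names and the 95% parallel fraction are illustrative choices, not figures from the talk.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Speedup for a fixed-size problem: the serial part never shrinks."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)

def gustafson_speedup(parallel_fraction, workers):
    """Scaled speedup when the parallel part grows with the worker count."""
    serial_fraction = 1.0 - parallel_fraction
    return serial_fraction + parallel_fraction * workers

def parallel_efficiency(speedup, workers):
    """Fraction of the ideal linear speedup that is actually achieved."""
    return speedup / workers

# Illustrative case: 95% of the run time parallelises, 1,024 workers.
fixed = amdahl_speedup(0.95, 1024)
scaled = gustafson_speedup(0.95, 1024)
print(f"Amdahl:    {fixed:.2f}x, efficiency {parallel_efficiency(fixed, 1024):.1%}")
print(f"Gustafson: {scaled:.2f}x, efficiency {parallel_efficiency(scaled, 1024):.1%}")
```

With a fixed problem size the speedup stays below 1 / 0.05 = 20 however many workers are added, which is one way to read the "diminishing returns" the abstract mentions.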
— — — —
Fri 9) 4:30 – 5:15 – Keynote
“TSUBAME 3 and ABCI: HPC meets Big Data/AI and further advances into the Post-Moore Era”
Professor, Tokyo Institute of Technology
Fellow, National Institute of Advanced Industrial Science and Technology (AIST), Japan
Director, Joint AIST-Tokyo Tech Open Innovation Lab on Real World Big Data Computing
With the rapid rise of Big Data and AI as a new breed of high-performance workloads on supercomputers, we need to accommodate them at scale, and traditional simulation-based HPC and BD/AI will converge.
Our TSUBAME 3 supercomputer at Tokyo Institute of Technology came online in Aug. 2017 and became the greenest supercomputer in the world on the Green 500 ranking at 14.11 GFlops/W. The other aspect of TSUBAME 3 is that it embodies various data-oriented or “BYTES-oriented” features to allow for HPC to BD/AI convergence at scale, including significant scalable horizontal bandwidth as well as support for deep memory hierarchy and capacity, along with high flops in low-precision arithmetic for deep learning.
Furthermore, TSUBAME 3’s technologies will be commoditized to construct one of the world’s largest BD/AI-focused and “open-source” cloud infrastructures, called ABCI (AI-Based Bridging Cloud Infrastructure), hosted by AIST-AIRC (AI Research Center), the largest publicly funded AI research center in Japan. The performance of the machine is slated to be several hundred AI-Petaflops for machine learning; the true nature of the machine, however, is its BYTES-oriented optimization and acceleration in the memory hierarchy, I/O, the interconnect, etc., for high-performance BD/AI. ABCI will be online in spring 2018, and its architecture, software, and datacenter infrastructure design will be made open to drive rapid adoption and improvement by the community, unlike the concealed cloud infrastructures of today.
Finally, moving from a FLOPS-centric mindset to a BYTES-oriented one will be one of the key solutions to the upcoming “end of Moore’s law” in the mid-2020s, after which FLOPS increases will cease and BYTES-oriented advances will be the new source of performance increases over time for computing in general.
— — — —