Big Data System Environments:
What are they perceived to do and what do they do?
Professor Geoffrey C. Fox, Director, Digital Science Center.
Associate Dean for Research at IU School of Informatics and Computing
Professor of Informatics, Computing and Physics
Indiana University, Bloomington, IN, USA
We consider Big Data Systems such as Hadoop, Spark and TensorFlow and identify what they do well (which is a lot) and where they have omissions. We consider a programming model where “every call” is wrapped by a learning framework that configures execution (auto-tuning) and learns results. We describe our big data framework Twister2 and explain where it can offer improved capabilities over current systems.
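The idea of wrapping "every call" in a learning layer can be sketched in a few lines. This is a hypothetical illustration, not Twister2's actual API: a decorator that auto-tunes by timing candidate execution configurations (here, placeholder values such as thread counts) and "learns results" by caching outputs of previous calls. The names `learning_wrapper` and `block_sum` are invented for the example.

```python
import functools
import time

def learning_wrapper(configs):
    """Hypothetical sketch: wrap every call so the framework
    (a) auto-tunes an execution configuration from observed runtimes and
    (b) remembers ("learns") results of previous calls."""
    def decorate(func):
        timings = {c: [] for c in configs}   # observed runtimes per config
        learned = {}                         # cache of previously computed results
        @functools.wraps(func)
        def wrapper(*args):
            if args in learned:              # reuse a learned result
                return learned[args]
            # explore: try each config once; then exploit the fastest on average
            untried = [c for c in configs if not timings[c]]
            if untried:
                config = untried[0]
            else:
                config = min(configs,
                             key=lambda c: sum(timings[c]) / len(timings[c]))
            start = time.perf_counter()
            result = func(*args, config=config)
            timings[config].append(time.perf_counter() - start)
            learned[args] = result
            return result
        return wrapper
    return decorate

@learning_wrapper(configs=[1, 2, 4])   # e.g. candidate thread counts (illustrative)
def block_sum(n, config=1):
    # toy "kernel"; in a real system config would change how work is executed
    return sum(range(n))
```

A repeated call such as `block_sum(10)` is served from the learned-results cache, while distinct calls feed the runtime statistics that steer future configuration choices.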
Wednesday 13th February 2019, 10:45 am – 11:30 am
- M.A. at Cambridge, 1968
- Ph.D. in Theoretical Physics at Cambridge, 1967
- B.A. in Mathematics at Cambridge, 1964
Fox received a Ph.D. in Theoretical Physics from Cambridge University and is now Distinguished Professor of Informatics and Computing, and Physics at Indiana University, where he is Director of the Digital Science Center, Chair of the Department of Intelligent Systems Engineering, and Director of the Data Science program at the School of Informatics, Computing, and Engineering.
He previously held positions at Caltech, Syracuse University and Florida State University, after postdoctoral appointments at the Institute for Advanced Study in Princeton, Lawrence Berkeley Laboratory, and Peterhouse, Cambridge.
He has supervised the Ph.D. theses of 68 students and published around 1,200 papers in physics and computer science, with an h-index of 70 and over 26,000 citations.
He currently works in applying computer science from infrastructure to analytics in Biology, Pathology, Sensor Clouds, Earthquake and Ice-sheet Science, Image processing, Deep Learning, Manufacturing, Network Science and Particle Physics. The infrastructure work is built around Software Defined Systems on Clouds and Clusters. The analytics focuses on scalable parallelism.
He is involved in several projects to enhance the capabilities of Minority Serving Institutions. He has experience in online education and its use in MOOCs for areas like Data and Computational Science.
He is a Fellow of APS (Physics) and ACM (Computing).
- Data Analytics and Parallel Computing
- Cyberinfrastructure and e-Science
- Deep Learning and Intelligent Systems
- Software Defined Computer Systems
- Cyber-physical Systems
- Data Science
- High Performance Computing
- Parallel and Distributed Computing
Keynote – Multicore World 2016
“Big Data HPC Convergence”
Prof Geoffrey C. Fox (Indiana University) – USA
Two major trends in computing systems are the growth in high performance computing (HPC), with an international exascale initiative, and the big data phenomenon, with an accompanying cloud infrastructure of well-publicized, dramatic and increasing size and sophistication. In studying and linking these trends one needs to consider multiple aspects: hardware, software, applications/algorithms, and even broader issues like business models and education. In this talk we study in detail a convergence approach for software and applications/algorithms and show what hardware architectures it suggests. We start by dividing applications into data plus model components and classifying each component (whether from Big Data or Big Simulations) in the same way. This leads to 64 properties divided into 4 views: Problem Architecture (macro patterns); Execution Features (micro patterns); Data Source and Style; and finally the Processing (runtime) view. We discuss convergence software built around HPC-ABDS (High Performance Computing enhanced Apache Big Data Stack) http://hpc-abds.org/kaleidoscope/ and show how one can merge Big Data and HPC (Big Simulation) concepts into a single stack. We give examples of data analytics running on HPC systems, including details on persuading Java to run fast. Some details can be found at
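The data-plus-model classification described above can be made concrete with a small sketch. This is an illustrative structure only: the four view names come from the abstract, but the facet lists below are invented placeholders, not the actual 64 properties.

```python
from dataclasses import dataclass, field

# The four classification views named in the abstract; the facets listed
# under each are illustrative placeholders, not the real 64 properties.
VIEWS = {
    "Problem Architecture": ["pleasingly parallel", "map-collective", "map-streaming"],
    "Execution Features": ["iterative", "dynamic", "flops-dominated"],
    "Data Source and Style": ["HDFS", "streaming", "archived"],
    "Processing (runtime)": ["micro-batch", "BSP", "dataflow"],
}

@dataclass
class ApplicationFacets:
    """Tag the data and model components of one application with facets
    drawn from the four views, so Big Data and Big Simulation codes can
    be classified in the same way."""
    name: str
    data: dict = field(default_factory=dict)    # view -> chosen facets
    model: dict = field(default_factory=dict)

    def tag(self, component, view, facet):
        if view not in VIEWS or facet not in VIEWS[view]:
            raise ValueError(f"unknown facet {facet!r} for view {view!r}")
        getattr(self, component).setdefault(view, []).append(facet)

# Example: classify a k-means run; data and model components get separate tags.
app = ApplicationFacets("k-means clustering")
app.tag("data", "Data Source and Style", "HDFS")
app.tag("model", "Execution Features", "iterative")
```

Classifying the data and model components separately, as here, is what lets a Big Data application and a Big Simulation land in the same taxonomy even when their data styles differ.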