Talk: Scheduling Matters

Yves Robert

ENS Lyon, France and Univ. Tenn. Knoxville, USA

https://graal.ens-lyon.fr/~yrobert/

This talk will review a few scheduling algorithms for solving simple computational problems on large-scale platforms. Faults, energy/power shortage, I/O contention: the constraints are numerous and challenging. The talk will provide a few answers and discuss open research directions.

Short Bio: Yves Robert received the PhD degree from Institut National Polytechnique de Grenoble. He is currently a full professor in the Computer Science Laboratory LIP at ENS Lyon. He is the author of 7 books, 150 papers published in international journals, and 240 papers published in international conferences. He is the editor of 11 book proceedings and 13 journal special issues. He has advised 30 PhD students. His main research interests are scheduling techniques and resilient algorithms for large-scale platforms. Yves Robert has served on many editorial boards, including IEEE TPDS, JPDC and ACM TOPC. He is a Fellow of the IEEE. He was elected a Senior Member of the Institut Universitaire de France in 2007, and his membership was renewed in 2012. He was awarded the 2014 IEEE TCSC Award for Excellence in Scalable Computing and the 2016 IEEE TCPP Outstanding Service Award. He has held a Visiting Scientist position at the University of Tennessee Knoxville since 2011.

 

Talk: Big Data at Extreme-Scales: Addressing Computational Challenges in the 21st Century

Manish Parashar

Rutgers University, USA

http://parashar.rutgers.edu/

Data-related challenges are quickly dominating computational and data-enabled sciences and are limiting the potential impact of scientific application workflows enabled by current and emerging extreme-scale, high-performance, distributed computing environments. These data-intensive application workflows involve dynamic coordination, interactions, and data coupling between multiple application processes that run at scale on different resources, as well as with services for monitoring, analysis, visualization, and archiving. They present challenges due to increasing data volumes, complex data-coupling patterns, system energy constraints, increasing failure rates, and so on. In this talk I will explore some of these challenges and investigate how solutions based on data-sharing abstractions, managed data pipelines, data-staging services, and in-situ / in-transit data placement and processing can be used to help address them. This research is part of the DataSpaces project at the Rutgers Discovery Informatics Institute.

Short Bio: Manish Parashar is Distinguished Professor of Computer Science at Rutgers University. He is also the founding Director of the Rutgers Discovery Informatics Institute (RDI2). His research interests are in the broad areas of Parallel and Distributed Computing and Computational and Data-Enabled Science and Engineering. Manish is the founding chair of the IEEE Technical Consortium on High Performance Computing (TCHPC) and Editor-in-Chief of the IEEE Transactions on Parallel and Distributed Systems, and he serves on the editorial boards and organizing committees of a large number of journals and international conferences and workshops. He has over 350 publications, has deployed several software systems that are widely used, and has received a number of awards for his research and leadership. Manish is a Fellow of AAAS, a Fellow of IEEE/IEEE Computer Society, and an ACM Distinguished Scientist.

 

 

Talk: Extreme-Scale Earthquake Simulation on Sunway TaihuLight

Haohuan Fu

Tsinghua University, China

http://thuhpgc.org/index.php/Haohuan_Fu

This talk will first introduce and discuss the design philosophy of the Sunway TaihuLight system, and then describe our recent efforts on performing earthquake simulations on such a large-scale system. Our work in 2017 accomplished a complete redesign of AWP-ODC for Sunway architectures, achieving over 15% of the system's peak. This is better than the 11.8% achieved by similar software running on Titan, whose byte-to-flop ratio is 5 times better than that of TaihuLight. The extreme cases demonstrate a sustained performance of over 18.9 Pflops, enabling the simulation of the Tangshan earthquake as an 18-Hz scenario with an 8-meter resolution. Our recent work further improves the simulation framework with capabilities to describe complex surface topography and to drive building damage prediction and landslide simulation. These capabilities are demonstrated with a case study of the Wenchuan earthquake with accurate surface topography and improved coda wave effects.

Short Bio: Haohuan Fu is the deputy director of the National Supercomputing Center in Wuxi, leading the research and development division. He is also an associate professor in the Ministry of Education Key Laboratory for Earth System Modeling and the Department of Earth System Science at Tsinghua University, where he leads the High Performance Geo-Computing (HPGC) research group. Fu has a PhD in computing from Imperial College London. Since joining Tsinghua in 2011, Dr. Fu has been working towards the goal of providing both the most efficient simulation platforms and the most intelligent data management and analysis platforms for geoscience applications. His research has, for example, led to efficient designs of atmospheric dynamic solvers for the Tianhe-1A, Tianhe-2, and Sunway TaihuLight supercomputers, as well as for reconfigurable computing platforms. The work based on the Sunway TaihuLight supercomputer scaled a fully-implicit solver to over 10 million cores and won the Gordon Bell Prize at SC16.