Sustainable Ultra Scale compuTing, dAta and energy Management
Ultra-scale systems are envisioned to comprise parallel and distributed computing infrastructure two to three orders of magnitude larger than today’s systems, which will require cross-fertilization among technologies including HPC, large-scale distributed systems, and big data management. In designing exascale infrastructure and managing big data, researchers and industry are realizing that, in addition to performance, they also need to consider other optimization criteria, such as sustainability, quality of service, fault tolerance, quality of usage, quality of experience, and energy efficiency. This poses several research challenges associated with ultra-scale systems, which the “Sustainable Ultra Scale compuTing, dAta and energy Management” (SUSTAM) associate team will address:
- How to efficiently manage federations of large-scale heterogeneous resources?
- How to trace, trust and manage big data?
- How to profile energy usage and design energy-aware runtime systems?
SUSTAM aims to design a multi-criteria orchestration framework that manages resources, data and energy consumption in an efficient manner. The SUSTAM associate team will enable a long-term collaboration between the Inria Avalon team and the RDI² team (Rutgers University). It will allow the teams to coordinate efforts and pursue common research activities in topics such as sustainable software solutions, resource and big-data management, elasticity of stream and batch applications, and energy efficiency. The involved members will contribute to the design of a common architecture and framework with components and algorithms adapted to various contexts.
The collaboration proposed by the SUSTAM associate team focuses on aspects of sustainability in ultra-scale systems. This focus reflects the mutual research interests of RDI² and Avalon. The main goal of SUSTAM is to build a multi-criteria orchestration framework able to support the design and provisioning of sustainable ultra-scale systems. The main research directions explore three specific components: Large Scale Resource Management, Big Data Management, and Energy Management. These components, detailed next, are the foundations for designing a common architecture and umbrella framework for sustainable ultra-scale systems. All three components benefit from the expertise of both research teams.
Large Scale Resource Management
By focusing on schedulers and resource managers, SUSTAM explores the interfaces of middleware able to handle the heterogeneity of computing elements (GPUs, low-level processors, high-performance accelerators).
Big Data Management
This component improves the understanding of the data lifecycle, enables traceability of operations, and devises elastic and energy-efficient solutions for data stream processing.
Energy Management
Both Avalon and RDI² have experts on energy efficiency. The collaboration pushes this topic to a new level by including energy proportionality capabilities and large-scale, precise energy profiling, combined with autonomic management of energy capabilities and levers. Under this topic, SUSTAM also explores how an ultra-scale system can efficiently provision resources considering multiple energy sources (brown and renewable energy).