Research Subject and Motivations
The exponential growth of processing power has fueled decades of continuous improvement in scientific applications and has made it possible to address new societal challenges. The consequences are manifold and affect many research fields, from climate modeling and brain analysis to nuclear fusion. Over the years, applications have become able to process larger data sets more accurately, no longer thanks to rising clock frequencies but thanks to the increasing number of cores. Next-generation exascale supercomputers are expected to provide thousands of cores per compute node, with deep and complex cache hierarchies and non-uniform memory accesses, using many-core processors (e.g. Intel Xeon Phi) and dedicated accelerators (e.g. GPUs).
However, harnessing such architectures is challenging, which raises the need for suitable programming models.
HPC task-based scheduling systems have been designed to make it easier to reach high performance on complex hardware and to achieve performance portability. Applications are described as graphs of tasks with ordering constraints, called dependencies. The runtime is responsible for efficiently scheduling tasks on the available resources (e.g. CPU/GPU cores). This enables an efficient use of those resources, especially for irregular computations. However, the approach does not address software engineering needs such as the reusability, maintainability, and extensibility of HPC codes. Without special attention, development costs can explode because of duplicated effort, many concerns mixed in the same code, and so on. Alternatively, the code can become obsolete, in the worst case even before it becomes feature-rich enough to be useful.
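To make the notion of a task graph concrete, the following is a minimal sketch of dependency-driven scheduling. It is not the API of any particular HPC runtime: the function run_task_graph, the task names, and the use of a thread pool are all assumptions chosen for illustration. Each task is submitted only once all the tasks it depends on have completed.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_task_graph(tasks, deps, workers=4):
    """Run each task once all of its dependencies have completed.

    tasks: dict mapping a task name to a function that receives the
           dict of results produced so far.
    deps:  dict mapping a task name to the set of names it depends on.
    (Both structures are hypothetical, chosen for this sketch.)
    """
    sorter = TopologicalSorter({name: deps.get(name, ()) for name in tasks})
    sorter.prepare()  # raises graphlib.CycleError on cyclic dependencies
    results = {}
    running = {}  # future -> task name
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while sorter.is_active():
            # Submit every task whose dependencies are all satisfied.
            for name in sorter.get_ready():
                running[pool.submit(tasks[name], results)] = name
            # Block until at least one running task finishes.
            done, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for future in done:
                name = running.pop(future)
                results[name] = future.result()
                sorter.done(name)  # may unlock dependent tasks
    return results


# "square" and "double" both depend on "load" and may run concurrently;
# "combine" waits for both of them.
tasks = {
    "load": lambda r: [1, 2, 3, 4],
    "square": lambda r: [x * x for x in r["load"]],
    "double": lambda r: [2 * x for x in r["load"]],
    "combine": lambda r: sum(r["square"]) + sum(r["double"]),
}
deps = {
    "square": {"load"},
    "double": {"load"},
    "combine": {"square", "double"},
}
results = run_task_graph(tasks, deps)
```

In this sketch the application code only declares tasks and dependencies; the order of execution and the mapping to workers are left to the scheduler, which is the separation that task-based runtimes rely on.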
Component-based software engineering (CBSE) is a branch of software engineering that proposes to build applications by assembling independent software building blocks (components) with well-defined interfaces. Components are instantiated and their interfaces are connected to form an assembly. This enables easy reuse of components and architecture-level modifications of applications through their assembly. CBSE thus greatly improves application maintainability and adaptability. Component models with low overhead have been proposed for high-performance computing, such as CCA and L²C. However, current HPC component models offer limited ways to deal with (fine-grained) parallel composition.
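The idea of an assembly can be illustrated with a small sketch. It does not reproduce the interfaces of CCA or L²C: the classes Component, Mapper, Squarer and Doubler, the port name "op", and the dictionary-based assembly description are all hypothetical. Components are instantiated, then their ports are connected according to a description kept separate from the component code.

```python
class Component:
    """A building block that declares the services it uses through named ports."""

    def __init__(self):
        self.uses = {}  # port name -> connected provider instance


class Squarer(Component):
    """Provides an 'apply' service."""

    def apply(self, x):
        return x * x


class Doubler(Component):
    """Provides the same 'apply' service with another implementation."""

    def apply(self, x):
        return 2 * x


class Mapper(Component):
    """Uses an 'op' port without knowing which component is behind it."""

    def run(self, values):
        op = self.uses["op"]
        return [op.apply(v) for v in values]


def build(assembly):
    """Instantiate the components, then connect their ports."""
    instances = {name: cls() for name, cls in assembly["instances"].items()}
    for user, port, provider in assembly["connections"]:
        instances[user].uses[port] = instances[provider]
    return instances


assembly = {
    "instances": {"mapper": Mapper, "kernel": Squarer},
    "connections": [("mapper", "op", "kernel")],
}
squared = build(assembly)["mapper"].run([1, 2, 3])

# Architecture-level change: only the assembly is edited, no component code.
assembly["instances"]["kernel"] = Doubler
doubled = build(assembly)["mapper"].run([1, 2, 3])
```

Swapping Squarer for Doubler changes the behavior of the application without modifying Mapper, which is the kind of reuse and adaptability that an assembly is meant to provide.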
This thesis proposes Comet, a programming model based on both software component models and task-based scheduling systems. It aims to support writing fully independent codes that are efficiently coupled at runtime. The model focuses on shared-memory platforms but supports MPI for inter-node communication (thanks to the MPI connectors inherited from L²C). The model and its implementation are evaluated in terms of both performance and software engineering aspects, on a set of synthetic use cases as well as on a sub-part of the production application Gysela5D.