DIET: The Grid and Cloud Middleware

Empowering high-performance computing since 2000

DIET (Distributed Interactive Engineering Toolbox) is a middleware designed for high-performance computing in a heterogeneous and distributed environment (workstations, clusters, grids, clouds).

DIET was created and is still actively improved and updated by an open-source community led by the AVALON research team.

Introduction to DIET

Among the existing approaches to grid middleware, a simple, powerful and flexible one is to use the servers available in different administrative domains through the classic client-server or Remote Procedure Call (RPC) paradigm. Network-Enabled Servers (NES) implement this model, also called Grid-RPC. Clients submit computation requests to a scheduler whose goal is to find a suitable server among the available resources.

The aim of the DIET project is to develop a set of tools to build computational servers. Very large problems can now be computed over the Internet thanks to grid computing environments such as Globus or Legion, or through cloud solutions such as Amazon EC2. Because most current applications are numerical, the use of libraries like BLAS, LAPACK, ScaLAPACK or PETSc is mandatory. Integrating such libraries into high-level applications written in languages like Fortran or C is far from easy. Moreover, the computational power and memory that such applications need may not be available on every workstation. RPC therefore seems to be a good candidate for building Problem Solving Environments on the Grid.

Context of DIET

Large problems ranging from numerical simulation to life science can now be solved through the Internet using grid middleware. Several approaches exist for porting applications to grid platforms; examples include classic message-passing, batch processing, web portals, and Grid-RPC systems. This last approach implements a grid version of the classic Remote Procedure Call (RPC) model. Clients submit computation requests to a scheduler that locates one or more servers available on the grid. Scheduling is frequently applied to balance the work among the servers, and a list of available servers is sent back to the client; the client can then send the data and the request to one of the suggested servers to solve its problem. Thanks to the growth of network bandwidth and the reduction of network latency, relatively small computation requests can now be sent to servers available on the grid. To make effective use of today’s scalable resource platforms, it is important to ensure scalability in the middleware layers.
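The request flow described above can be sketched in a few lines of Python. This is only an illustrative toy model, not the DIET API: the names Server, Scheduler, find_servers and call, and the use of a pending-request counter as the load metric, are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Server:
    """Hypothetical compute server offering named services."""
    name: str
    services: Dict[str, Callable[..., object]]
    pending: int = 0  # toy load metric: requests currently being served

    def solve(self, service: str, *args):
        # The client sends its data (args) and the request to this server.
        self.pending += 1
        try:
            return self.services[service](*args)
        finally:
            self.pending -= 1


@dataclass
class Scheduler:
    """Hypothetical scheduler that knows every server on the grid."""
    servers: List[Server] = field(default_factory=list)

    def find_servers(self, service: str) -> List[Server]:
        # Keep the servers able to solve the problem, least loaded first,
        # so that work is balanced among them.
        candidates = [s for s in self.servers if service in s.services]
        return sorted(candidates, key=lambda s: s.pending)


def call(scheduler: Scheduler, service: str, *args):
    """Client side: ask the scheduler, then contact a suggested server."""
    candidates = scheduler.find_servers(service)
    if not candidates:
        raise LookupError(f"no server offers {service!r}")
    return candidates[0].solve(service, *args)


def dot(x, y):
    """Example service: dot product of two vectors."""
    return sum(a * b for a, b in zip(x, y))


busy = Server("busy", {"dot": dot}, pending=3)
idle = Server("idle", {"dot": dot})
scheduler = Scheduler([busy, idle])
result = call(scheduler, "dot", [1, 2, 3], [4, 5, 6])
```

In this sketch the scheduler only returns a ranked list; the data travels directly from the client to the chosen server, which mirrors the two-step exchange described in the text.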

The DIET project focuses on developing scalable middleware, with initial efforts concentrated on distributing the scheduling problem across multiple agents. DIET consists of a set of elements that can be used together to build applications using the Grid-RPC paradigm. This middleware is able to find an appropriate server according to the information given in the client’s request (e.g. problem to be solved, size of the data involved), the performance of the target platform (e.g. server load, available memory, communication performance) and the local availability of data stored during previous computations. The scheduler is distributed using several collaborating hierarchies connected either statically or dynamically (in a peer-to-peer fashion). Data management is provided to allow persistent data to stay within the system for future re-use. This feature avoids unnecessary communication when dependencies exist between different requests.
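To make the idea of a distributed scheduler concrete, here is a minimal Python sketch of a single agent hierarchy in which each agent forwards a request to its children and returns the best estimate from its subtree. It does not reproduce DIET's actual components or cost model: the names ServerInfo, Agent, estimate and best, and the cost formula (compute time scaled by load, plus a transfer time that drops to zero when the data is already stored on the server) are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple, Union


@dataclass
class ServerInfo:
    """Hypothetical description of one server at a leaf of the hierarchy."""
    name: str
    services: Set[str]
    load: float        # 0.0 means idle; larger means busier
    bandwidth: float   # data units per second towards this server
    cached: Set[str] = field(default_factory=set)  # ids of persistent data

    def estimate(self, service: str, data_id: str, size: float) -> Optional[float]:
        # Return an assumed cost for the request, or None if the server
        # cannot solve this problem at all.
        if service not in self.services:
            return None
        compute = size * (1.0 + self.load)
        # Data kept from a previous computation needs no transfer.
        transfer = 0.0 if data_id in self.cached else size / self.bandwidth
        return compute + transfer


@dataclass
class Agent:
    """Hypothetical scheduling agent; children are agents or servers."""
    name: str
    children: List[Union["Agent", ServerInfo]] = field(default_factory=list)

    def best(self, service: str, data_id: str, size: float) -> Optional[Tuple[float, str]]:
        # Collect the best answer of each child, then keep the cheapest.
        answers = []
        for child in self.children:
            if isinstance(child, Agent):
                answer = child.best(service, data_id, size)
            else:
                cost = child.estimate(service, data_id, size)
                answer = None if cost is None else (cost, child.name)
            if answer is not None:
                answers.append(answer)
        return min(answers) if answers else None


a = ServerInfo("a", {"solve"}, load=0.0, bandwidth=1.0)
b = ServerInfo("b", {"solve"}, load=0.5, bandwidth=1.0, cached={"m1"})
c = ServerInfo("c", {"other"}, load=0.0, bandwidth=1.0)
root = Agent("root", [Agent("left", [a]), Agent("right", [b, c])])

with_cached_data = root.best("solve", "m1", 10)
with_new_data = root.best("solve", "m2", 10)
```

With these assumed numbers, the busier server b wins when the data item m1 is already stored on it, while the idle server a wins for a new data item m2, which illustrates why data persistence can change the scheduling decision.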