Overview

Workflow applications consist of multiple components (tasks) related by precedence constraints. These constraints usually follow the data flow between tasks, i.e., data files generated by one task are needed to start another task. Although this is the most common situation, precedence constraints may exist for other reasons and may be defined arbitrarily by the user.

This kind of application can be modeled as a DAG (Directed Acyclic Graph) where each vertex is a task with given input data and a service name, and each edge is either a data link between two tasks or a basic precedence constraint. The DIET workflow engine handles this kind of workflow by assigning each of those tasks to a SeD using one DIET service call. The assignment is made internally and dynamically when the task is ready to be executed (i.e., when all its predecessors are done), depending on the service performance properties and on the resources available on the grid.
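The "ready when all predecessors are done" rule can be illustrated with a minimal Python sketch. It is not the DIET engine: the function name, the task names, and the edge representation are assumptions made for illustration, and the step where a real engine would select a SeD is reduced to appending the task to a list.

```python
from collections import deque


def execution_order(tasks, edges):
    """Return the tasks in an order that respects the precedence constraints.

    tasks: iterable of task names (hypothetical identifiers).
    edges: iterable of (predecessor, successor) pairs; a pair may stand
           for a data link or for a plain precedence constraint.
    """
    tasks = list(tasks)
    successors = {t: [] for t in tasks}
    pending = {t: 0 for t in tasks}  # number of unfinished predecessors
    for pred, succ in edges:
        successors[pred].append(succ)
        pending[succ] += 1

    # A task is ready as soon as it has no unfinished predecessor.
    ready = deque(t for t in tasks if pending[t] == 0)
    order = []
    while ready:
        task = ready.popleft()
        order.append(task)  # a real engine would choose a SeD and run the task here
        for succ in successors[task]:
            pending[succ] -= 1
            if pending[succ] == 0:
                ready.append(succ)

    if len(order) != len(tasks):
        raise ValueError("precedence constraints contain a cycle")
    return order
```

For a diamond-shaped workflow with tasks `a`, `b`, `c`, `d` and edges `a→b`, `a→c`, `b→d`, `c→d`, task `d` only becomes ready once both `b` and `c` have finished.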

A specific agent called the Master Agent DAG (MA DAG) provides DAG workflow scheduling. This agent serves as the entry point to the DIET hierarchy for a client that wants to submit a workflow. The language supported by the MA DAG is based on XML and is described in section 15.4.1.

Because of the large amounts of computation and data involved in some workflow applications, the number of tasks in a DAG can grow very fast. The need for a more abstract way of representing a workflow, one that separates the data instances from the data flow, has led to the definition of a functional workflow language called the Gwendia language. A complex application can be defined using this language, which provides data operators and control structures (if/then/else, loops, etc.). To execute the application, the user needs to provide both the workflow description (see 15.4.2) and a file describing the input data set. The DIET workflow engine then instantiates the workflow as one or several task DAGs, which are sent to the Master Agent DAG to be executed on the DIET platform.
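The separation between data flow and data instances can be sketched in Python as follows. This is an illustration only, not Gwendia syntax and not the DIET instantiation algorithm: the `instantiate` function, the linear pipeline model, and the `service#index` task naming are all assumptions. The abstract description lists the services once; the input data set determines how many concrete tasks are created.

```python
def instantiate(pipeline, inputs):
    """Expand an abstract linear pipeline into concrete tasks and edges.

    pipeline: ordered list of service names (hypothetical); each stage
              consumes the output of the previous one.
    inputs:   list of input data items; one chain of tasks is created
              per item.
    Returns (tasks, edges), where tasks maps a task id to a
    (service, data) pair and edges is a list of (predecessor, successor).
    """
    tasks, edges = {}, []
    for index, item in enumerate(inputs):
        previous = None
        for service in pipeline:
            task_id = f"{service}#{index}"
            # Only the first stage reads the input item directly; later
            # stages receive their data from the predecessor task.
            tasks[task_id] = (service, item if previous is None else None)
            if previous is not None:
                edges.append((previous, task_id))
            previous = task_id
    return tasks, edges
```

In this simplified model a two-stage description applied to 1,000 input items yields 2,000 tasks, which shows why writing the DAG by hand quickly becomes impractical.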

The DIET Team - Wed 29 Nov 2017 15:13:36 EST