Plug-in Scheduler

Applications targeted for the DIET platform can exert a degree of control over the scheduling subsystem via plug-in schedulers. Because the applications deployed on the grid vary greatly in their performance demands, the DIET plug-in scheduler facility lets the application designer express application needs and features so that they are taken into account when application tasks are scheduled. These features are invoked after a user has submitted a service request to the MA (Master Agent), which broadcasts the request to its agent hierarchy.

When an application service request arrives at a SeD, the SeD creates a performance estimation vector: a collection of performance estimation values that are pertinent to the scheduling process for that application. The values stored in this structure can be either values provided by CoRI (Collectors of Resource Information) or custom values generated by the SeD itself. The design of the estimation vector subsystem is modular, so future performance measurement systems can be integrated with the DIET platform in a fairly straightforward manner.

CoRI generates a basic set of performance estimation values, which are stored in the estimation vector and identified by system-defined tags. The following table lists the tags that may be generated by a standard CoRI installation.

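Each information tag starts with EST_ (for example, EST_TCOMP). An x in the second column marks a multi-valued tag.

| Information tag | Multi-value | Description                                 |
|-----------------|-------------|---------------------------------------------|
| TCOMP           |             | the predicted time to solve a problem       |
| TIMESINCELASTSOLVE |          | time since the last solve was made (sec)    |
| FREECPU         |             | amount of free CPU between 0 and 1          |
| LOADAVG         |             | CPU load average                            |
| FREEMEM         |             | amount of free memory (Mb)                  |
| NBCPU           |             | number of available processors              |
| CPUSPEED        | x           | frequency of CPUs (MHz)                     |
| TOTALMEM        |             | total memory size (Mb)                      |
| BOGOMIPS        | x           | the BogoMips                                |
| CACHECPU        | x           | CPU cache size (Kb)                         |
| TOTALSIZEDISK   |             | size of the partition (Mb)                  |
| FREESIZEDISK    |             | amount of free space on the partition (Mb)  |
| DISKACCESREAD   |             | average time to read from disk (Mb/sec)     |
| DISKACCESWRITE  |             | average time to write to disk (Mb/sec)      |
| ALLINFOS        | x           | [empty] fill all possible fields            |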
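
To make the idea of a tagged estimation vector concrete, the following is a minimal Python sketch of such a structure. It is not the DIET API: the class `EstimationVector`, its methods, and the sample values are hypothetical and only illustrate how single-valued tags (such as EST_FREECPU) and multi-valued tags (such as EST_CPUSPEED, one entry per CPU) might be stored side by side.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class EstimationVector:
    """Hypothetical stand-in for an estimation vector (not the DIET API)."""

    # Every tag maps to a list; single-valued tags hold exactly one entry.
    values: Dict[str, List[float]] = field(default_factory=dict)

    def set(self, tag: str, value: float) -> None:
        """Store a single-valued estimate, replacing any previous value."""
        self.values[tag] = [value]

    def set_indexed(self, tag: str, index: int, value: float) -> None:
        """Store one entry of a multi-valued estimate (e.g. one per CPU)."""
        entries = self.values.setdefault(tag, [])
        # Pad with NaN so that entries can be filled in any order.
        while len(entries) <= index:
            entries.append(float("nan"))
        entries[index] = value

    def get(self, tag: str, index: int = 0,
            default: Optional[float] = None) -> Optional[float]:
        """Return the requested entry, or `default` if it was never set."""
        entries = self.values.get(tag)
        if entries is None or index >= len(entries):
            return default
        return entries[index]


# Illustrative values only; a real SeD would obtain them from CoRI.
vec = EstimationVector()
vec.set("EST_FREECPU", 0.75)
vec.set("EST_NBCPU", 2)
vec.set_indexed("EST_CPUSPEED", 0, 2400.0)  # first CPU
vec.set_indexed("EST_CPUSPEED", 1, 2400.0)  # second CPU
```

Storing every tag as a list keeps lookups uniform: a scalar tag is simply a multi-valued tag with one entry, and a tag that was never set returns the caller's default.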

Application developers may also define performance values to be included in a SeD response to a client request. For example, a DIET SeD that provides a service to query particular databases may need to include information about which databases are currently resident in its disk cache, so that an appropriate server can be identified for each client request. By default, when a user request arrives at a DIET SeD, an estimation vector is created via a default estimation function; typically, this function populates the vector with standard CoRI values. If the application developer includes a custom performance estimation function in the implementation of the SeD, the DIET framework associates that function with the registered service. Each time a SeD receives a user request for a service that has such an estimation function, that function is called instead of the default estimation procedure to generate the performance estimation values.
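
The dispatch between the default and a custom estimation function can be sketched as follows. This is an illustration in Python under stated assumptions, not DIET code: `ServiceTable`, `default_estimation`, `db_estimation`, the tag `USER_DB_IN_CACHE`, and all numeric values are hypothetical names and data chosen to mirror the database-cache example above.

```python
from typing import Callable, Dict, Optional

# An estimation function fills the vector in place for a given request.
EstimationFn = Callable[[dict, Dict[str, float]], None]


def default_estimation(request: dict, vector: Dict[str, float]) -> None:
    """Stand-in for the default procedure; the values are made up."""
    vector["EST_FREECPU"] = 0.5
    vector["EST_FREEMEM"] = 1024.0


# Hypothetical state of this server's disk cache.
CACHED_DATABASES = {"genbank", "swissprot"}


def db_estimation(request: dict, vector: Dict[str, float]) -> None:
    """Custom estimate: is the requested database already cached here?"""
    in_cache = request["database"] in CACHED_DATABASES
    vector["USER_DB_IN_CACHE"] = 1.0 if in_cache else 0.0


class ServiceTable:
    """Maps each registered service to its estimation function, if any."""

    def __init__(self) -> None:
        self._custom: Dict[str, Optional[EstimationFn]] = {}

    def register(self, service: str,
                 estimation_fn: Optional[EstimationFn] = None) -> None:
        self._custom[service] = estimation_fn

    def estimate(self, service: str, request: dict) -> Dict[str, float]:
        vector: Dict[str, float] = {}
        # The custom function, when present, replaces the default one.
        fn = self._custom.get(service) or default_estimation
        fn(request, vector)
        return vector


table = ServiceTable()
table.register("matrix_multiply")          # uses the default estimation
table.register("db_query", db_estimation)  # uses the custom estimation
```

Because the custom function replaces the default rather than extending it, the vector for `db_query` in this sketch contains only the custom tag; a developer who also wants standard values would have to store them from within the custom function.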

In the performance estimation routine, the SeD developer should store in the provided estimation vector any performance data needed by the agents to evaluate server responses. These vectors are the basis on which the suitability of different SeDs for a particular application service request is evaluated. Specifically, a local agent gathers the responses generated by the SeDs that are its descendants, sorts those responses according to application-specific comparison metrics, and transmits the sorted list to its parent. From the point of view of an agent, this aggregation phase is essentially a sorting of the server responses from its children, and the aggregation method is simply the logical process by which those responses are ordered. If application-specific data are supplied (i.e., a custom estimation function has been specified), an alternative aggregation method is needed. Currently, a basic priority scheduler has been implemented, which lets an application developer specify a series of performance values to be optimized in succession. The priority scheduler uses this series of user-specified tags to perform the pairwise server comparisons needed to construct the sorted list of server responses.
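
The priority-based aggregation described above can be sketched as a pairwise comparison that walks a list of tags in order and stops at the first tag on which two responses differ. The Python below is an illustration only, not the DIET implementation: the function names, the `(tag, direction)` encoding, the tag `USER_DB_IN_CACHE`, the server names, and the rule that a missing value ranks last are all assumptions.

```python
from functools import cmp_to_key
from typing import Dict, List, Tuple

# Each priority is (tag, direction): "max" prefers larger values,
# "min" prefers smaller ones. Earlier entries take precedence.
Priority = Tuple[str, str]

PRIORITIES: List[Priority] = [
    ("USER_DB_IN_CACHE", "max"),  # hypothetical custom tag
    ("EST_FREECPU", "max"),
    ("EST_TCOMP", "min"),
]


def compare(a: dict, b: dict, priorities: List[Priority]) -> int:
    """Return a negative number if response `a` should rank before `b`."""
    for tag, direction in priorities:
        va, vb = a["est"].get(tag), b["est"].get(tag)
        if va is None and vb is None:
            continue          # neither server reported this tag
        if va is None:
            return 1          # assumption: a missing value ranks last
        if vb is None:
            return -1
        if va == vb:
            continue          # tie: fall through to the next priority
        a_is_better = va > vb if direction == "max" else va < vb
        return -1 if a_is_better else 1
    return 0                  # indistinguishable on every priority


def aggregate(responses: List[dict],
              priorities: List[Priority] = PRIORITIES) -> List[dict]:
    """Sort server responses best-first using the pairwise comparison."""
    key = cmp_to_key(lambda a, b: compare(a, b, priorities))
    return sorted(responses, key=key)


# Made-up responses from three SeDs below one agent.
responses: List[Dict] = [
    {"server": "sed-a", "est": {"USER_DB_IN_CACHE": 0.0,
                                "EST_FREECPU": 0.9, "EST_TCOMP": 12.0}},
    {"server": "sed-b", "est": {"USER_DB_IN_CACHE": 1.0,
                                "EST_FREECPU": 0.4, "EST_TCOMP": 20.0}},
    {"server": "sed-c", "est": {"USER_DB_IN_CACHE": 1.0,
                                "EST_FREECPU": 0.4, "EST_TCOMP": 15.0}},
]
ranked = [r["server"] for r in aggregate(responses)]
```

With these inputs, `sed-b` and `sed-c` rank ahead of `sed-a` because they hold the database in cache; they tie on free CPU, so the lower predicted computation time puts `sed-c` first. An agent in the hierarchy could apply the same `aggregate` call to the already-sorted lists received from its children before passing the result to its parent.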