Batch system

Generally, a parallel resource is managed by a batch system, and jobs are submitted to a site queue. The batch system is responsible for managing parallel jobs: it schedules each job, and determines and allocates the resources needed for the execution of the job.

There are many batch systems, among them Torque (a fork of PBS), LoadLeveler (developed by IBM), Oracle Grid Engine (formerly Sun Grid Engine, SGE, developed by Sun), and OAR (developed at the IMAG lab). Each one implements its own syntax (with its own mnemonics) as well as its own scheduler. During execution, a job can generally learn the identity of its reserved nodes through a file, and is guaranteed exclusive use of those nodes.
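As an illustration of reading the reserved nodes from a file, here is a minimal Python sketch. It assumes the batch system exports the path of the node file in an environment variable: PBS_NODEFILE (Torque/PBS), OAR_NODEFILE (OAR), PE_HOSTFILE (Grid Engine) and LOADL_HOSTFILE (LoadLeveler) are the names commonly used, but which one is set, and the exact file layout, depend on the site configuration. The function name reserved_nodes is hypothetical and is not part of DIET or of any batch system.

```python
import os
from pathlib import Path

# Environment variables that commonly hold the path of the node file.
# Assumption: the site's batch system sets one of them for each job.
NODEFILE_VARS = ("PBS_NODEFILE", "OAR_NODEFILE", "PE_HOSTFILE", "LOADL_HOSTFILE")


def reserved_nodes(env=None):
    """Return the distinct host names reserved for the job, in file order.

    Returns an empty list when no known variable points to a readable file.
    """
    env = os.environ if env is None else env
    for var in NODEFILE_VARS:
        path = env.get(var)
        if not path or not Path(path).is_file():
            continue
        nodes = []
        for line in Path(path).read_text().splitlines():
            fields = line.split()
            # The host name is taken from the first column; some formats
            # (e.g. Grid Engine) append slot counts or queue names after it.
            if fields and fields[0] not in nodes:
                nodes.append(fields[0])
        return nodes
    return []


if __name__ == "__main__":
    for node in reserved_nodes():
        print(node)
```

Run inside a job, the script prints one reserved host name per line. Outside a batch job none of the variables is normally set, so it prints nothing.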

The DIET Team - Wed 29 Nov 2017 15:13:36 EST