Arnold Rosenberg
IC-Optimal Schedules that Accommodate Heterogeneous Clients
Abstract:
Initial results are presented concerning the problem of clustering the tasks of
computation-dags so as to accommodate the heterogeneity of remote clients in
Internet-based computing. Earlier work has developed the underpinnings of
IC-Scheduling theory, an algorithmic framework for scheduling computations having
intertask dependencies for Internet-based computing (IC, for short). The goal of
the schedules produced by the Theory is to render tasks eligible for execution at
the maximum possible rate, with the dual aim of: (a) utilizing remote clients'
computational resources well, by enhancing the likelihood of having work to
allocate to an available client; (b) lessening the likelihood of a computation's
stalling for lack of tasks that are eligible for execution. The current paper
addresses the problem of enhancing IC-Scheduling theory so that it can accommodate
the varying
computational resources of different remote clients. The techniques developed here
accomplish this goal by clustering a computation's tasks so as to render the
computation multi-granular. After a few ad hoc examples, several task-clustering
strategies are developed which exploit the structure of the computation being
performed. These strategies range from one that works for any dag, to ones that
build increasingly on the explicit structure of the dag being scheduled.
slides (pdf)
Jon Weinberg
Symbiotic Space-Sharing: Mitigating Resource Contention on SMP Systems
Abstract:
Symbiotic space-sharing is a technique that can improve system
throughput by executing parallel applications in combinations and
configurations that alleviate pressure on shared resources. In this
talk, we explore the opportunity for symbiotic schedulers and discuss
initial findings and prototype implementations. We show that such
schedulers can improve throughput by 20% over conventional
space-sharing when resource bottlenecks are known. We then discuss
approaches for exposing these bottlenecks to the scheduler, including
through user-provided job script hints as well as fine-grained memory
signatures extracted from applications. Lastly, we discuss additional
studies and components necessary to realize symbiotic scheduling in
HPC environments.
slides (ppt) slides (pdf)
Cynthia Lee
Studies of the User-Scheduler Relationship.
Abstract:
An important prerequisite to effective resource management is productive interaction between the user and scheduler. Unfortunately, relatively little scheduling research directly studies the users themselves. This talk will cover user survey-based studies we have conducted, which probe several aspects of the user-scheduler relationship, and propose solutions to three of the most vexing barriers between the two. First, users' monetary valuation of compute time and schedule turnaround time is examined in terms of a utility function. Second, responsiveness of the scheduler to users' varied valuations is optimized via a genetic algorithm heuristic, creating a controlled market for computation. Finally, the chronic problem of inaccurate user runtime requests, and its implications for scheduler performance, is examined, along with mitigation techniques.
slides (ppt)
Emmanuel Jeannot
Handling collusion in desktop grids
Abstract:
By exploiting idle time on volunteers' machines, desktop grids provide a
way to execute large sets of computational tasks while requiring
negligible maintenance and extra cost. Although they constitute a
valuable resource for low-budget projects, relying on external resources
can be hazardous, and ensuring the validity of the obtained results is
mandatory. We will present a taxonomy of existing and possible risks by
considering several aspects of workers' faultiness (maliciousness,
collusion, dynamicity). Then, we propose a general framework for
scheduling tasks in order to obtain reliable results. Our final solution
contains mechanisms for the scheduling and answer-selection processes,
both of which use a trust-based system that characterizes workers'
previous behavior. Joint work with Louis-Claude Canon and Jon Weissman.
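The talk abstract does not detail the answer-selection mechanism; as a rough sketch of trust-weighted voting over replicated results (the function, trust scores, and threshold below are hypothetical, not the authors' actual scheme):

    from collections import defaultdict

    def select_answer(replies, trust, threshold=0.7):
        """Pick a result for one task from replicated worker replies,
        weighting each vote by the worker's estimated trust in [0, 1].

        replies: list of (worker_id, result) pairs for the same task
        trust:   dict mapping worker_id to a trust score
        Returns the winning result, or None if no result gathers the
        required share of the trust mass (more replicas are then needed).
        """
        votes, total = defaultdict(float), 0.0
        for worker, result in replies:
            w = trust.get(worker, 0.5)   # neutral score for unknown workers
            votes[result] += w
            total += w
        if total == 0.0:
            return None
        best, weight = max(votes.items(), key=lambda kv: kv[1])
        return best if weight / total >= threshold else None

    # Hypothetical usage: two trusted workers agree, one dubious worker dissents.
    print(select_answer([("w1", 42), ("w2", 42), ("w3", 17)],
                        {"w1": 0.9, "w2": 0.8, "w3": 0.2}))   # -> 42

Note that such a per-worker score alone does not capture collusion between workers, which is the harder case the talk addresses.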
slides (pdf)
Jack Dongarra
Multicore Scheduling and Programming
slides (pdf)
Denis Trystram
Multi-users, multi-organizations, multi-objectives: a single perspective.
Abstract:
The new computing platforms, like grids composed of multiple clusters,
are characterized by new features that force us to revisit the scheduling
problem. Classical scheduling algorithms are mostly centralized. This
centralized character makes it possible to design more efficient
algorithms (consider for instance the PTAS of Shmoys for the basic
$P||C_{\max}$ problem). However, the platforms are intrinsically
distributed. Each cluster is under the administration of a separate
"organization" that has its own rules. Moreover, the resources
(processors) are shared among several users who also have their own
selfish goals.
In this talk, we will first discuss the limitations of the existing
scheduling algorithms. Then, the concept of cooperation is introduced,
both between organizations and between users. On the one hand, we propose
an approximation algorithm for the global makespan that does not disturb
the local objectives. On the other hand, we study the problem of
optimizing simultaneously the objectives of many users within the same
organization. Inapproximability results show that we cannot expect
too much in such a case. Finally, the mixed case of multiple users
competing for resources across several organizations is addressed.
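For reference, the $P||C_{\max}$ problem mentioned above is the classical makespan-minimization problem on identical machines; in standard notation (not taken from the talk):

    \[
      \text{given } n \text{ jobs with processing times } p_1,\dots,p_n
      \text{ and } m \text{ identical machines, find }
      \sigma : \{1,\dots,n\} \to \{1,\dots,m\}
      \text{ minimizing } \;
      C_{\max} = \max_{1 \le i \le m} \sum_{j :\, \sigma(j) = i} p_j .
    \]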
slides (ppt) slides (pdf)
Kunal Agrawal
Adaptive Scheduling with Parallelism Feedback
Abstract:
Multiprocessor scheduling in a shared multiprogramming environment is
often structured as two-level scheduling, where a kernel-level job
scheduler allots processors to jobs and a user-level task scheduler
schedules the work of a job on the allotted processors. In this
context, the number of processors allotted to a particular job may
vary during the job's execution, and the task scheduler must adapt to
these changes in processor resources. For overall system efficiency,
the task scheduler should also provide parallelism feedback to the job
scheduler to avoid the situation where a job is allotted processors
that it cannot use productively.
We present an adaptive task scheduler for multitasked jobs with
dependencies that provides continual parallelism feedback to the job
scheduler in the form of requests for processors. Our scheduler
guarantees that a job completes near optimally while utilizing at
least a constant fraction of the allotted processor cycles. Our
scheduler can be applied to schedule data-parallel programs, such as
those written in High Performance Fortran (HPF), *Lisp, C*, NESL, and
ZPL.
Our analysis models the job scheduler as the task scheduler's
adversary, challenging the task scheduler to be robust to the system
environment and the job scheduler's administrative policies. For
example, the job scheduler can make available a huge number of
processors exactly when the job has little use for them. To analyze
the performance of our adaptive task scheduler under this stringent
adversarial assumption, we introduce a new technique called "trim
analysis," which allows us to prove that our task scheduler performs
poorly on at most a small number of time steps, exhibiting
near-optimal behavior on the vast majority.
To be precise, suppose that a job has work $T_1$ and critical-path
length $T_\infty$ and is running on a machine with $P$ processors.
Using trim analysis, we prove that our scheduler completes the job in
$O(T_1/P_{\mathrm{adj}} + T_\infty + L \lg P)$ time
steps, where $L$ is the length of a scheduling quantum and
$P_{\mathrm{adj}}$ denotes the $O(T_\infty + L \lg P)$-trimmed availability.
This quantity is the average of the
processor availability over all time steps excluding the $O(T_\infty +
L \lg P)$ time steps with the highest processor availability.
When $T_1/T_\infty \gg P_{\mathrm{adj}}$ (the job's
parallelism dominates the $O(T_\infty + L \lg P)$-trimmed
availability), the job achieves nearly perfect linear speedup.
Conversely, when $T_1/T_\infty \ll P_{\mathrm{adj}}$, the asymptotic running time
of the job is nearly the length of its critical path.
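Spelling out the trimmed availability as an equation may help (the symbols $p_t$, $T$, and $R$ are mine, not the abstract's): if $p_t$ is the processor availability at time step $t$, $T$ the total number of steps, and $R = O(T_\infty + L \lg P)$ the number of trimmed steps, then

    \[
      P_{\mathrm{adj}} \;=\; \frac{1}{T - R} \sum_{t \in S} p_t ,
      \qquad
      S = \{1,\dots,T\} \setminus \{\text{the } R \text{ steps with largest } p_t\},
    \]

and the completion bound above reads $O(T_1 / P_{\mathrm{adj}} + T_\infty + L \lg P)$.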
slides (pdf)
Olivier Beaumont
Scheduling Divisible Workloads on Heterogeneous Platforms under Bounded Multi-Port Model
Abstract:
In this talk, we discuss complexity issues for scheduling divisible
workloads on heterogeneous systems under the bounded multi-port
model. To the best of our knowledge, this paper is the first attempt to
consider divisible load scheduling under a realistic communication
model, where the master node can communicate simultaneously with several
slaves, provided that bandwidth constraints are not exceeded. We
concentrate on one-round distribution schemes, where a given node
starts its processing only once all data has been received. Our main
contributions are (i) the proof that processors start working
immediately after receiving their work, (ii) the study of the optimal
schedule in the case of 2 processors, and (iii) the proof that
scheduling divisible load under the bounded multi-port model is
NP-complete. This last result strongly differs from the divisible load
literature and represents the first NP-completeness result when
latencies are not taken into account.
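As a sketch of the bounded multi-port constraints described above (the symbols are mine): if the master sends data to slave $i$ at rate $b_i(t)$, then

    \[
      0 \le b_i(t) \le B_i \ \text{ for each slave } i,
      \qquad
      \sum_i b_i(t) \;\le\; B_{\mathrm{master}} ,
    \]

i.e. the master may serve several slaves at once as long as neither its outgoing bandwidth nor any slave's incoming bandwidth is exceeded.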
slides (pdf)
Paolo Detti
Scheduling unreliable jobs on parallel machines
Abstract:
An allocation and sequencing problem is considered in which
a set of independent jobs has to be processed by a set of
machines. Each job is characterized by a certain success probability
and a reward, which is obtained if the job is successfully carried
out. When a job fails during processing, the machine is blocked and
the jobs subsequently scheduled on that machine are blocked until the
end of the unsupervised period. The problem is to assign and sequence
the jobs on the machines so that the expected total reward is
maximized. Connections with other problems, complexity and algorithmic
results are presented. In particular, we propose a polyhedral
characterization for the single-machine case, a connection to other
scheduling problems implying that the problem we consider is strongly
NP-hard even with 2 machines, approximation results, and an effective
upper bound.
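Under the model described above, the single-machine objective can be written as follows (a sketch in my own notation, assuming a job is reached only if all jobs sequenced before it on the machine succeeded): for a sequence $\pi(1),\dots,\pi(n)$ with success probabilities $p_j$ and rewards $r_j$,

    \[
      \mathbb{E}[\text{reward}(\pi)]
      \;=\;
      \sum_{k=1}^{n} r_{\pi(k)} \prod_{i=1}^{k} p_{\pi(i)} ,
    \]

and the parallel-machine problem asks for a partition of the jobs among the machines and a sequence on each machine maximizing the sum of these expected rewards.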
slides (pdf)
Loris Marchal
Offline and online scheduling of concurrent bags-of-tasks on heterogeneous platforms
Abstract:
Scheduling problems are already difficult on traditional parallel machines. They become extremely challenging on heterogeneous clusters, even when embarrassingly parallel applications are considered. In this paper we deal with the problem of scheduling
multiple applications, made of collections of independent and identical tasks, on a
heterogeneous master-worker platform. The applications are submitted online, which
means that there is no a priori (static) knowledge of the workload distribution at the
beginning of the execution. The objective is to minimize the maximum stretch, i.e.
the maximum ratio between the actual time an application has spent in the system
and the time this application would have spent if executed alone.
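In symbols (notation mine), if application $k$ is released at time $r_k$, completes at time $C_k$, and would need time $t_k^{\mathrm{alone}}$ if it ran alone on the platform, the objective is

    \[
      \text{minimize } \max_{k} \mathcal{S}_k ,
      \qquad
      \mathcal{S}_k \;=\; \frac{C_k - r_k}{t_k^{\mathrm{alone}}} .
    \]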
On the theoretical side, we design an optimal algorithm for the offline version of
the problem (when all release dates and application characteristics are known beforehand). We also introduce several heuristics for the general case of online applications.
On the practical side, we have conducted extensive simulations and MPI experiments,
showing that we are able to deal with very large problem instances in a few seconds.
Also, the solution that we compute clearly outperforms classical heuristics from the
literature, thereby demonstrating the usefulness of our approach.
slides (pdf)
Bruno Gaujal
From Sturmian words to Balanced trees
Abstract:
Many scheduling problems with frequency constraints
are solved optimally by Sturmian words.
Sturmian words are binary infinite words with many equivalent
definitions: they are aperiodic sequences with complexity n+1,
balanced sequences and mechanical words.
For problems with a recursive nature, one needs instead to consider Sturmian trees.
The first natural definition of Sturmian trees is to consider planar trees of complexity n+1, but in
doing so, one loses the other two properties.
In this talk, we will consider non-ordered trees. In this case,
we show that strongly balanced trees
are exactly mechanical trees. Moreover, we will see that they are also
aperiodic trees of minimal complexity, so that the three
equivalent definitions of Sturmian words are almost preserved in that case.
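For reference, the mechanical words mentioned above admit a standard arithmetic definition: the (lower) mechanical word of slope $\alpha \in [0,1]$ and intercept $\rho$ is the binary sequence

    \[
      s_{\alpha,\rho}(n) \;=\; \lfloor (n+1)\alpha + \rho \rfloor - \lfloor n\alpha + \rho \rfloor ,
      \qquad n \ge 0 ,
    \]

which is Sturmian exactly when $\alpha$ is irrational.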
slides (pdf)
Arnaud Legrand
Toward a Fully Decentralized Algorithm for Multiple Bag-of-tasks Application Scheduling on Grids
Abstract:
In this talk, we present a fully decentralized algorithm for fair
resource sharing between multiple bag-of-tasks applications in a
grid environment. This algorithm is inspired by related work on
multi-path routing in communication networks. An interesting feature
of this algorithm is that it allows the choice of a wide variety of
fairness criteria and achieves both optimal path selection and flow
control. In addition, this algorithm only requires local information
at each slave computing tasks and at each buffer of the network
links, while minimal computation is done by the schedulers. A naive
adaptation is unstable and inefficient though. Fortunately, a simple
and effective scaling mechanism is sufficient to circumvent this
issue. This scaling mechanism is motivated by a careful study of
the subtle differences from the classical multi-path routing
problem. We demonstrate its efficiency through a detailed analysis of a
simple simulation.
slides (pdf)
Jean-Marc Nicod
Batch Scheduling for Identical Multi-Task Jobs on Heterogeneous Platforms
Abstract:
We consider the scheduling of one batch of identical jobs on a grid of
heterogeneous computers. Each job of the batch is an application
described by a Directed Acyclic Graph (DAG) of tasks without
forks (in-tree) and executed independently from the other jobs. Each
job is an instance of the same DAG and the tasks of the DAG are typed,
i.e. there may be two identical tasks in one job and a task of one type
can only be executed on the computers where the corresponding program
is installed. The execution resources are distributed and each
resource can carry out a set of task types, the installed programs, so
several hosts may be able to carry out the same task. The objective
function of our problem is to minimize the makespan of the batch
execution. An example that matches these definitions is a workflow to
process a set of data in several steps on a grid where the software
installed on the hosts differs: each step could be a filter applied to
one or more images, and only some filters are available on each host.
In this context, we consider the optimization of the schedule depending
on the batch size, from small to large batches, on the characteristics
of the batches, etc. Different scheduling techniques could be used to
schedule such applications: off-line or on-line, flow-oriented,
etc. Classical off-line scheduling techniques for DAGs are heuristics,
as there is no direct optimal solution to this problem. They take only
one job into account, so that multiple jobs will be scheduled as if they
were just one application, without benefiting from the knowledge that
there are several identical jobs or from the structure of the
jobs. Moreover, if the size of the batch is large, some heuristics may
become very costly. On-line techniques are usually simple, but they do
not take advantage of the identical structure of the jobs either.
Steady-state scheduling techniques provide an optimal solution in
this case, but only for an infinite number of jobs.
We compare three scheduling techniques: a simple on-line scheduler,
a steady-state scheduling technique, and a genetic-heuristic-based
algorithm. The on-line algorithm simply schedules the tasks as soon
as they are free of their dependencies. The genetic-based algorithm
starts from a list-scheduling solution and improves it by selecting
the best individuals according to the makespan of their schedules. The
steady-state technique generates a periodic schedule from the result of
a linear program, built with the constraints that characterize the
problem.
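Ignoring communications, a minimal sketch of such a steady-state linear program could be the following (the variables and constraints are mine, not necessarily those used in the talk): let $\alpha_{m,t}$ be the number of tasks of type $t$ that host $m$ executes per time unit, $w_{m,t}$ their execution time on $m$ (defined only where the corresponding program is installed), and $n_t$ the number of tasks of type $t$ in one job; then

    \[
      \text{maximize } \rho
      \quad \text{subject to} \quad
      \sum_{t} \alpha_{m,t}\, w_{m,t} \le 1 \ \ \forall m,
      \qquad
      \sum_{m} \alpha_{m,t} = \rho\, n_t \ \ \forall t,
      \qquad
      \alpha_{m,t} \ge 0 ,
    \]

where $\rho$ is the throughput in jobs per time unit, and a periodic schedule achieving this throughput is then reconstructed from the $\alpha_{m,t}$.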
By using an experimental evaluation based on simulation, we show
that the platform architecture and the DAG structure affect
the performance of the schedule differently depending on the
algorithm used. Then, we propose some adaptations to these
scheduling algorithms to improve their performance on the execution of
the batch.
slides (pdf)
Anthony T. Chronopoulos
Game theory based load balanced job allocation in distributed systems
Abstract:
The main goal of our work is to provide fairness to the users and the
users' jobs, i.e., all the users and their jobs should experience
approximately equal expected response time (expected queuing delay +
processing time + any communication time) or be charged an approximately
equal price for their execution, independent of the computers
allocated. We will show that game theory provides a suitable
framework for characterizing such schemes.
which all the jobs belong to a single user (single-class), we use a
cooperative game to model the load balancing problem which takes the
average system information into account (static load balancing). Our
solution is based on the Nash Bargaining Solution which provides a
Pareto optimal solution for the distributed system and is also a fair
solution. We then extend the system model to include jobs from various
users (multi-user/multi-class job distributed system) and include
pricing to model a Grid system. We use the concept of Nash
equilibrium as the solution of our non-cooperative games and derive
distributed algorithms for computing it. We also derive two dynamic
schemes: one tries to minimize the expected response time of the entire
system, and the other tries to minimize the expected response time of
the individual users in order to provide a fair solution.
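In symbols (notation mine), the fairness goal for users $1,\dots,K$ can be summarized as

    \[
      \mathbb{E}[R_k] = \mathbb{E}[\text{queuing delay}_k]
                        + \text{processing time}_k
                        + \text{communication time}_k ,
      \qquad
      \mathbb{E}[R_1] \approx \mathbb{E}[R_2] \approx \dots \approx \mathbb{E}[R_K] ,
    \]

independent of which computers the users' jobs are allocated to (or, alternatively, approximately equal prices charged for execution).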
slides (pdf)
Henri Casanova
Scheduling Mixed-Parallel Applications with Advance Reservations
Abstract:
In this talk we discuss the scheduling of mixed-parallel
applications, which exhibit both task and data parallelism, in advance
reservation settings. Both the problem of minimizing application
turn-around time and that of meeting a deadline are studied. For each,
several scheduling algorithms are proposed, some of which borrow ideas
from previously published work in non-reservation settings. Algorithms
are compared in simulation over a wide range of application and
reservation scenarios. The main finding is that schedules
computed using the previously published CPA algorithm can be adapted to
advance reservation settings, notably resulting in low resource
consumption and thus high efficiency.
slides (ppt)
Florina M. Ciorba
Enhancing self-scheduling algorithms via synchronization and weighting
Abstract:
Existing dynamic self-scheduling algorithms, used to schedule
independent tasks on heterogeneous clusters, cannot handle tasks with
dependencies because they lack the support for internode
communication. In this talk, a synchronization mechanism that provides
inter-processor communication is presented to compensate for the
above-mentioned deficiency. This mechanism enables self-scheduling
algorithms to efficiently handle nested loops with dependencies. A
weighting mechanism is also presented that significantly improves the
performance of dynamic self-scheduling algorithms. Traditionally, the
self-scheduling algorithms divide the total number of tasks into
chunks and assign them to processors. The weighting mechanism adapts
the chunk sizes to the computing power and current run-queue state of
the processors. The synchronization and weighting mechanisms are
orthogonal, in the sense that they can simultaneously be applied to
loops with dependencies. Thus, they broaden the application spectrum
of dynamic self-scheduling algorithms and improve their
performance. Extensive tests confirm the efficiency of the
synchronization and weighting mechanisms and the significant
improvement of the synchronized-weighted versions of the algorithms
over the synchronized-only versions.
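The abstract does not give the weighting formula; a minimal sketch of the idea, scaling each worker's next chunk by its relative computing power and discounting for its current run-queue length (the formula and names are illustrative, not the authors'), could look like:

    def weighted_chunk(base_chunk, worker, workers):
        """Scale the chunk computed by an underlying self-scheduling
        scheme (e.g. factoring or trapezoid) for a given worker,
        using its relative speed and its current run-queue length.
        """
        mean_speed = sum(w["speed"] for w in workers) / len(workers)
        weight = (worker["speed"] / mean_speed) / (1.0 + worker["runqueue"])
        return max(1, round(base_chunk * weight))

    # Hypothetical example: a fast idle node vs. a slower node with one queued job.
    workers = [{"speed": 2.0, "runqueue": 0}, {"speed": 1.0, "runqueue": 1}]
    print(weighted_chunk(100, workers[0], workers))   # 133
    print(weighted_chunk(100, workers[1], workers))   # 33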
slides (pdf)
Kiminori Matsuzaki
An Efficient Implementation of Data-Parallel Skeletons on Multicore Processors
Abstract:
Recently, multicore processors have become widely available.
Developing programs that run efficiently on multicore processors
is a hard task, since the programs must run in parallel.
Furthermore, cache memory shared among CPU cores makes the development
much more difficult.
slides (pdf)
Anne Benoit
Mapping skeleton workflows onto heterogeneous platforms to optimize throughput, latency and reliability
Abstract:
Mapping workflow applications onto parallel platforms is a challenging
problem, even for simple application patterns such as pipeline graphs.
Several antagonistic criteria should be optimized, such as throughput,
latency, and reliability (or a combination thereof). Typical applications include
digital image processing, where images are processed in steady-state
mode. In this talk, we study the complexity of various multi-criteria
mapping problems for pipeline graphs. For some NP-hard instances of
the problem, polynomial bi-criteria heuristics are presented and their
relative performance is evaluated through extensive simulations.
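For a pipeline of $n$ stages with weights $w_1,\dots,w_n$ mapped by intervals $I_u$ onto processors of speeds $s_u$, and ignoring communication costs, the two antagonistic criteria can be written as (notation mine, not necessarily the talk's exact model)

    \[
      \text{period} = \max_{u} \frac{\sum_{i \in I_u} w_i}{s_u}
      \quad (\text{throughput} = 1/\text{period}),
      \qquad
      \text{latency} = \sum_{u} \frac{\sum_{i \in I_u} w_i}{s_u} ,
    \]

and minimizing one does not in general minimize the other, hence the multi-criteria formulation.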
slides (pdf)
Umit V. Catalyurek
Multi-Objective Scheduling of Streaming Workflows
Abstract:
Scheduling, in many application domains, involves the
optimization of multiple performance metrics. In this talk,
I will present a novel algorithm framework for the multi-objective scheduling of
workflows that act on a stream of input data. Our framework focuses
on two performance metrics, latency and throughput, and minimizes
the latency of workflows while satisfying strict throughput
requirements. We leverage pipelined, task and data parallelism in a
coordinated manner to meet these objectives and investigate
the benefit of task duplication in alleviating communication overheads
in the pipelined schedule for different workflow characteristics. Evaluation
using synthetic benchmarks as well as those derived from real applications
shows that our algorithm consistently produces lower-latency schedules
and meets throughput requirements, even when previously proposed
schemes fail.
slides (ppt)
Larry Carter
Scheduling: Today and Tomorrow
slides (ppt) slides (pdf)