Optimizing Distributed Application Performance Using Dynamic Grid Topology-Aware Load Balancing
IEEE International Parallel and Distributed Processing Symposium (IPDPS) 2007
Publication Type: Paper
Abstract
Grid computing offers a model for solving large-scale scientific
problems by uniting computational resources owned by multiple
organizations to form a single cohesive resource for the duration
of individual jobs. Despite the appeal of using Grid computing to
solve large problems, its use has been hindered by the challenges
involved in developing applications that can run efficiently in
Grid environments. One substantial obstacle to deploying Grid
applications across geographically distributed resources is
cross-site latency. While certain classes of applications, such as
master-slave style or functional decomposition type applications,
lend themselves well to running in Grid environments due to
inherent latency tolerance, other classes of applications, such as
tightly-coupled applications in which each processor regularly
communicates with its neighboring processors, represent a
significant challenge to deployment on Grids.
In this paper, we present a dynamic load balancing technique for Grid applications based on graph partitioning. This technique exploits knowledge of the topology of the Grid environment to partition the computation's communication graph in such a way as to reduce the volume of cross-site communication, thus improving the performance of tightly-coupled applications that are co-allocated across distributed resources. Our technique is particularly well suited to codes from disciplines like molecular dynamics or cosmology due to the non-uniform structure of communication in these types of applications. We evaluate the effectiveness of our technique when used to optimize the execution of a tightly-coupled classical molecular dynamics code called LeanMD deployed in a Grid environment.
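The idea of partitioning a communication graph to reduce cross-site traffic can be illustrated with a small sketch. The abstract does not specify the partitioning algorithm, so the greedy refinement below is a stand-in chosen for brevity, not the authors' method. The names `cross_site_volume`, `greedy_refine`, and `max_per_site`, as well as the toy graph, are hypothetical and introduced only for illustration.

```python
from collections import defaultdict


def cross_site_volume(edges, site_of):
    """Sum the weight of edges whose endpoints are placed on different sites."""
    return sum(w for (a, b), w in edges.items() if site_of[a] != site_of[b])


def greedy_refine(edges, site_of, max_per_site):
    """Illustrative heuristic (an assumption, not the paper's algorithm):
    move an object to the site that holds more of its traffic, provided the
    destination site stays within max_per_site objects."""
    site_of = dict(site_of)
    counts = defaultdict(int)
    for site in site_of.values():
        counts[site] += 1

    # Adjacency lists: object -> [(neighbour, communication weight), ...]
    nbrs = defaultdict(list)
    for (a, b), w in edges.items():
        nbrs[a].append((b, w))
        nbrs[b].append((a, w))

    improved = True
    while improved:
        improved = False
        for obj in sorted(site_of):
            # Total traffic between this object and each site.
            traffic = defaultdict(float)
            for other, w in nbrs[obj]:
                traffic[site_of[other]] += w
            here = site_of[obj]
            best = max(traffic, key=traffic.get, default=here)
            # Each accepted move strictly lowers the cross-site volume,
            # so the loop terminates for positive weights.
            if (best != here and traffic[best] > traffic[here]
                    and counts[best] < max_per_site):
                site_of[obj] = best
                counts[here] -= 1
                counts[best] += 1
                improved = True
    return site_of


# Toy communication graph: two tightly coupled clusters joined by one light edge.
edges = {(0, 1): 10, (1, 2): 10, (2, 3): 1, (3, 4): 10, (4, 5): 10}
# Topology-unaware placement that interleaves objects across sites "A" and "B".
initial = {0: "A", 1: "B", 2: "A", 3: "B", 4: "A", 5: "B"}
refined = greedy_refine(edges, initial, max_per_site=4)

print("cross-site volume before:", cross_site_volume(edges, initial))
print("cross-site volume after: ", cross_site_volume(edges, refined))
```

The `max_per_site` cap is a crude stand-in for the load-balance constraint: without it, the heuristic could eliminate cross-site traffic by placing every object on one site, which would defeat the purpose of co-allocation.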
TextRef
Gregory A. Koenig and Laxmikant V. Kale, "Optimizing Distributed Application
Performance Using Dynamic Grid Topology-Aware Load Balancing," 21st IEEE
International Parallel and Distributed Processing Symposium, March 2007.