Inc-Part: Incremental Partitioning for Load Balancing in Large-Scale Behavioral Simulations
Publication date
2015
Keyword
Behavioral simulation
Load balancing
Incremental partitioning
Multi-agent system
Group migration
Transportation simulations
Fish schools
Dynamics
Colonies
Cluster
Peer-Reviewed
Yes
Open Access status
closedAccess
Abstract
Large-scale behavioral simulations are widely used to study real-world multi-agent systems. Such programs normally run in discrete time-steps or ticks, with simulated space decomposed into domains that are distributed over a set of workers to achieve parallelism. A distinguishing feature of behavioral simulations is their frequent and high-volume group migration, the phenomenon in which simulated objects traverse domains in groups at massive scale in each tick. This results in continual and significant load imbalance among domains. To tackle this problem, traditional load balancing approaches either require excessive load re-profiling and redistribution, which lead to high computation/communication costs, or perform poorly because their statically partitioned data domains cannot reflect load changes brought by group migration. In this paper, we propose an effective and low-cost load balancing scheme, named Inc-part, based on a key observation that an object is unlikely to move a long distance (across many domains) within a single tick. This localized mobility property allows one to efficiently estimate the load of a dynamic domain incrementally, based on merely the load changes occurring in its neighborhood. The domains experiencing significant load changes are then partitioned or merged, and redistributed to redress load imbalance among the workers. Experiments on a 64-node (1,024-core) platform show that Inc-part can attain excellent load balance with dramatically lowered costs compared to state-of-the-art solutions.
Version
No full-text in the repository
Citation
Zhang Y, Liao XF, Jin H et al (2015) Inc-Part: Incremental Partitioning for Load Balancing in Large-Scale Behavioral Simulations. IEEE Transactions on Parallel and Distributed Systems. 26(7): 1900-1909.
Link to Version of Record
https://doi.org/10.1109/Tpds.2014.2333511
Type
Article
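Illustrative note on the abstract: the idea of estimating a domain's load incrementally from the load changes in its neighborhood, then flagging domains for partitioning or merging, can be sketched roughly as follows. This is a minimal sketch and not the authors' implementation. The function names (`update_loads`, `classify`), the migration tuple layout `(src, dst, count)`, the use of object counts as "load", and the two thresholds are all assumptions made here for illustration.

```python
from collections import defaultdict


def update_loads(loads, migrations):
    """Apply one tick of migrations to per-domain loads, in place.

    loads:      dict mapping domain id -> current load (object count, assumed)
    migrations: iterable of (src, dst, count) tuples (hypothetical layout)
    Returns the set of domains whose load actually changed, so that only
    those domains need to be re-examined instead of re-profiling everything.
    """
    delta = defaultdict(int)
    for src, dst, count in migrations:
        delta[src] -= count  # objects leaving the source domain
        delta[dst] += count  # objects arriving in the destination domain
    changed = {d for d, v in delta.items() if v != 0}
    for d in changed:
        loads[d] = loads.get(d, 0) + delta[d]
    return changed


def classify(loads, changed, split_above, merge_below):
    """Flag changed domains as candidates for splitting or merging.

    split_above and merge_below are hypothetical thresholds.
    """
    to_split = sorted(d for d in changed if loads[d] > split_above)
    to_merge = sorted(d for d in changed if loads[d] < merge_below)
    return to_split, to_merge


# Four neighboring domains, each starting with 100 objects.
loads = {0: 100, 1: 100, 2: 100, 3: 100}
# Objects only move between adjacent domains within one tick.
migrations = [(0, 1, 60), (2, 1, 30), (3, 2, 5)]

changed = update_loads(loads, migrations)
to_split, to_merge = classify(loads, changed, split_above=150, merge_below=50)
print(loads, to_split, to_merge)
```

In this toy run, domain 1 gains objects from both neighbors and is flagged for splitting, while domain 0 is flagged as a merge candidate. How the flagged domains are actually partitioned, merged, and redistributed among workers is described in the paper and is not modeled here.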