Enhanced parallel loop self-scheduling for heterogeneous multi-core cluster systems

Chao Chin Wu, Liang Tsung Huang, Lien Fu Lai, Ming Lung Chen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

4 Citations (Scopus)

Abstract

A growing number of studies address heterogeneity in cluster systems built from multi-core computing nodes. We previously proposed a hybrid MPI and OpenMP loop self-scheduling approach for such systems, in which the allocation functions of several well-known schemes were modified for better performance. Although that approach improves system performance significantly, this paper shows how to increase the speedup further. First, we exploit thread-level parallelism on the multi-core master node. Second, we investigate how to design a loop self-scheduling scheme that assigns each node a chunk size suited to its performance. At the beginning of dispatching, slow slave nodes are prevented from receiving too many tasks; toward the end, the master avoids handing out too many small chunks. Experimental results show that our approach achieves a best speedup of 1.35.
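
The abstract does not state the allocation function, so the following is only an illustrative sketch of the two ideas it names, not the authors' scheme. It assumes a guided-self-scheduling-style rule in which an idle worker receives a share of the remaining iterations proportional to its relative speed, plus a floor on chunk size so that the tail of the loop is not split into many tiny chunks. The names `next_chunk`, `simulate`, `speeds`, and `min_chunk` are hypothetical.

```python
import heapq
import math


def next_chunk(remaining, speed, total_speed, min_chunk):
    """Chunk size for one idle worker (hypothetical allocation rule)."""
    if remaining <= 0:
        return 0
    # A slow worker gets a proportionally small share of what is left.
    share = math.ceil(remaining * speed / total_speed)
    # The floor keeps late chunks from shrinking below min_chunk;
    # the final chunk simply takes whatever remains.
    return min(remaining, max(min_chunk, share))


def simulate(total_iters, speeds, min_chunk=8):
    """Simulate master-side dispatch; return (worker, start, size) tuples."""
    total_speed = sum(speeds)
    # Min-heap of (time the worker becomes idle, worker id).
    idle = [(0.0, w) for w in range(len(speeds))]
    heapq.heapify(idle)
    remaining = total_iters
    start = 0
    log = []
    while remaining > 0:
        now, w = heapq.heappop(idle)
        size = next_chunk(remaining, speeds[w], total_speed, min_chunk)
        log.append((w, start, size))
        start += size
        remaining -= size
        # The worker is busy for size / speed simulated time units.
        heapq.heappush(idle, (now + size / speeds[w], w))
    return log


if __name__ == "__main__":
    for worker, first, size in simulate(1000, [4, 1]):
        print(f"worker {worker}: iterations {first}..{first + size - 1}")
```

In a real hybrid MPI and OpenMP setting, the master would compute a chunk this way on each work request and the receiving node would run it with its own threads; the relative speeds would have to come from some measurement, which the abstract does not describe.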

Original language: English
Title of host publication: I-SPAN 2009 - The 10th International Symposium on Pervasive Systems, Algorithms, and Networks
Pages: 568-573
Number of pages: 6
DOIs: https://doi.org/10.1109/I-SPAN.2009.38
Publication status: Published - 2009 Dec 1
Event: 10th International Symposium on Pervasive Systems, Algorithms, and Networks, I-SPAN 2009 - Kaohsiung, Taiwan
Duration: 2009 Dec 14 → 2009 Dec 16

Publication series

Name: I-SPAN 2009 - The 10th International Symposium on Pervasive Systems, Algorithms, and Networks

Other

Other: 10th International Symposium on Pervasive Systems, Algorithms, and Networks, I-SPAN 2009
Country: Taiwan
City: Kaohsiung
Period: 09-12-14 → 09-12-16

Fingerprint

Scheduling

All Science Journal Classification (ASJC) codes

  • Computational Theory and Mathematics
  • Computer Networks and Communications
  • Software

Cite this

Wu, C. C., Huang, L. T., Lai, L. F., & Chen, M. L. (2009). Enhanced parallel loop self-scheduling for heterogeneous multi-core cluster systems. In I-SPAN 2009 - The 10th International Symposium on Pervasive Systems, Algorithms, and Networks (pp. 568-573). [5381658] (I-SPAN 2009 - The 10th International Symposium on Pervasive Systems, Algorithms, and Networks). https://doi.org/10.1109/I-SPAN.2009.38
@inproceedings{4be41af615cc4c7dac1ee6d726696b17,
title = "Enhanced parallel loop self-scheduling for heterogeneous multi-core cluster systems",
author = "Wu, {Chao Chin} and Huang, {Liang Tsung} and Lai, {Lien Fu} and Chen, {Ming Lung}",
year = "2009",
month = "12",
day = "1",
doi = "10.1109/I-SPAN.2009.38",
language = "English",
isbn = "9780769539089",
series = "I-SPAN 2009 - The 10th International Symposium on Pervasive Systems, Algorithms, and Networks",
pages = "568--573",
booktitle = "I-SPAN 2009 - The 10th International Symposium on Pervasive Systems, Algorithms, and Networks",

}

TY - GEN
T1 - Enhanced parallel loop self-scheduling for heterogeneous multi-core cluster systems
AU - Wu, Chao Chin
AU - Huang, Liang Tsung
AU - Lai, Lien Fu
AU - Chen, Ming Lung
PY - 2009/12/1
Y1 - 2009/12/1
UR - http://www.scopus.com/inward/record.url?scp=77949782929&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77949782929&partnerID=8YFLogxK
U2 - 10.1109/I-SPAN.2009.38
DO - 10.1109/I-SPAN.2009.38
M3 - Conference contribution
AN - SCOPUS:77949782929
SN - 9780769539089
T3 - I-SPAN 2009 - The 10th International Symposium on Pervasive Systems, Algorithms, and Networks
SP - 568
EP - 573
BT - I-SPAN 2009 - The 10th International Symposium on Pervasive Systems, Algorithms, and Networks
ER -
