Adjusting thread parallelism dynamically to accelerate dynamic programming with irregular workload distribution on GPGPUs

Chao-Chin Wu, Jenn Yang Ke, Syun Sheng Jhan, Heshan Lin

Research output: Contribution to journal › Article

8 Citations (Scopus)

Abstract

Dynamic Programming (DP) is an important and popular method for solving a wide variety of discrete optimization problems such as scheduling, string editing, packaging, and inventory management. DP breaks a problem into simpler subproblems and combines their solutions to obtain solutions to the original problem. This paper focuses on one type of dynamic programming called Nonserial Polyadic Dynamic Programming (NPDP). To run NPDP applications efficiently on an emerging General-Purpose Graphics Processing Unit (GPGPU), the authors have to exploit more parallelism to fully utilize the computing power of its hundreds of processing units. However, the degree of parallelism varies significantly across the different phases of an NPDP application. To address this problem, the authors propose a method that adjusts the thread-level parallelism to provide a sufficient and steadier degree of parallelism for different phases. If a phase has insufficient parallelism, threads are split into subthreads; conversely, the total number of threads in a phase can be limited by merging threads. The authors also examine the difference between the conventional problem of finding a single minimum on a GPU and the NPDP-specific problem of finding the minimums of many independent sets on a GPU. Finally, the authors examine how to design an appropriate data structure for applying the memory-coalescing optimization technique. The experimental results demonstrate that the proposed method achieves a best speedup of 13.40 over the previously published algorithm.
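The irregular parallelism the abstract describes can be seen in a minimal sketch of matrix-chain multiplication, a classic example of an NPDP recurrence (this example is illustrative only and is not taken from the paper). Each "phase" fills one diagonal of the DP table; every cell in a phase is an independent "minimum over a set" computation, and the number of such cells shrinks phase by phase, which is exactly the kind of shrinking parallelism that motivates splitting or merging threads on a GPU:

```python
def matrix_chain_phases(dims):
    """NPDP sketch: dims[i], dims[i+1] are the dimensions of matrix i.

    Returns the minimal multiplication cost and the number of
    independent cells in each phase (one phase per diagonal).
    """
    n = len(dims) - 1                    # number of matrices in the chain
    m = [[0] * n for _ in range(n)]      # m[i][j]: min cost for subchain i..j
    phases = []                          # cells per phase -> available parallelism
    for length in range(2, n + 1):       # phase = subchain length (one diagonal)
        cells = [(i, i + length - 1) for i in range(n - length + 1)]
        phases.append(len(cells))        # independent subproblems in this phase
        for i, j in cells:               # each cell: a minimum over its own set
            m[i][j] = min(
                m[i][k] + m[k + 1][j] + dims[i] * dims[k + 1] * dims[j + 1]
                for k in range(i, j)
            )
    return m[0][n - 1], phases

cost, phases = matrix_chain_phases([30, 35, 15, 5, 10, 20, 25])
print(cost)    # 15125
print(phases)  # [5, 4, 3, 2, 1]: parallelism shrinks phase by phase
```

On a GPU, early phases here offer many independent cells (one per thread), while late phases offer very few; a scheme like the paper's would split the work of a late-phase cell across subthreads so the hardware stays occupied.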

Original language: English
Pages (from-to): 1-20
Number of pages: 20
Journal: International Journal of Grid and High Performance Computing
Volume: 6
Issue number: 1
DOIs: 10.4018/ijghpc.2014010101
Publication status: Published - 2014 Jan 1

Fingerprint

  • Dynamic programming
  • Merging
  • Data structures
  • Packaging
  • Scheduling
  • Data storage equipment
  • Processing
  • Graphics processing unit

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications

Cite this

@article{aeaa669a8b074d588f8c41eb8bf6ed97,
title = "Adjusting thread parallelism dynamically to accelerate dynamic programming with irregular workload distribution on GPGPUs",
abstract = "Dynamic Programming (DP) is an important and popular method for solving a wide variety of discrete optimization problems such as scheduling, string-editing, packaging, and inventory management. DP breaks problems into simpler subproblems and combines their solutions into solutions to original ones. This paper focuses on one type of dynamic programming called Nonserial Polyadic Dynamic Programming (NPDP). To run NPDP applications efficiently on an emerging General-Purpose Graphic Processing Unit (GPGPU), the authors have to exploit more parallelism to fully utilize the computing power of the hundreds of processing units in it. However, the parallelism degree varies significantly in different phases of the NPDP applications. To address the problem, the authors propose a method that can adjust the thread-level parallelism to provide a sufficient and steadier parallelism degree for different phases. If a phase has insufficient parallelism, the authors split threads into subthreads. On the other hand, the authors can limit the total number of threads in a phase by merging threads. The authors also examine the difference between the conventional problem of finding the minimum on a GPU and the NPDP-featured problem of finding the minimums of many independent sets on a GPU. Finally, the authors examine how to design an appropriate data structure to apply the memory coalescing optimization technique. The experimental results demonstrate our method can obtain the best speedup of 13.40 over the algorithm published previously.",
author = "Chao-Chin Wu and Ke, {Jenn Yang} and Jhan, {Syun Sheng} and Heshan Lin",
year = "2014",
month = jan,
day = "1",
doi = "10.4018/ijghpc.2014010101",
language = "English",
volume = "6",
pages = "1--20",
journal = "International Journal of Grid and High Performance Computing",
issn = "1938-0259",
publisher = "IGI Global Publishing",
number = "1",

}

Adjusting thread parallelism dynamically to accelerate dynamic programming with irregular workload distribution on GPGPUs. / Wu, Chao-Chin; Ke, Jenn Yang; Jhan, Syun Sheng; Lin, Heshan.

In: International Journal of Grid and High Performance Computing, Vol. 6, No. 1, 01.01.2014, p. 1-20.

Research output: Contribution to journal › Article

TY - JOUR

T1 - Adjusting thread parallelism dynamically to accelerate dynamic programming with irregular workload distribution on GPGPUs

AU - Wu, Chao-Chin

AU - Ke, Jenn Yang

AU - Jhan, Syun Sheng

AU - Lin, Heshan

PY - 2014/1/1

Y1 - 2014/1/1

AB - Dynamic Programming (DP) is an important and popular method for solving a wide variety of discrete optimization problems such as scheduling, string-editing, packaging, and inventory management. DP breaks problems into simpler subproblems and combines their solutions into solutions to original ones. This paper focuses on one type of dynamic programming called Nonserial Polyadic Dynamic Programming (NPDP). To run NPDP applications efficiently on an emerging General-Purpose Graphic Processing Unit (GPGPU), the authors have to exploit more parallelism to fully utilize the computing power of the hundreds of processing units in it. However, the parallelism degree varies significantly in different phases of the NPDP applications. To address the problem, the authors propose a method that can adjust the thread-level parallelism to provide a sufficient and steadier parallelism degree for different phases. If a phase has insufficient parallelism, the authors split threads into subthreads. On the other hand, the authors can limit the total number of threads in a phase by merging threads. The authors also examine the difference between the conventional problem of finding the minimum on a GPU and the NPDP-featured problem of finding the minimums of many independent sets on a GPU. Finally, the authors examine how to design an appropriate data structure to apply the memory coalescing optimization technique. The experimental results demonstrate our method can obtain the best speedup of 13.40 over the algorithm published previously.

UR - http://www.scopus.com/inward/record.url?scp=84928168401&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84928168401&partnerID=8YFLogxK

U2 - 10.4018/ijghpc.2014010101

DO - 10.4018/ijghpc.2014010101

M3 - Article

AN - SCOPUS:84928168401

VL - 6

SP - 1

EP - 20

JO - International Journal of Grid and High Performance Computing

JF - International Journal of Grid and High Performance Computing

SN - 1938-0259

IS - 1

ER -