A lightweight BLASTP and its implementation on CUDA GPUs

Liang Tsung Huang, Kai Cheng Wei, Chao Chin Wu, Chao Yu Chen, Jian An Wang

Research output: Contribution to journal › Article › peer-review


The BLAST server at the National Center for Biotechnology Information in the USA receives tens of thousands of queries per day on average. However, every query receives the same service even though query lengths vary significantly. In fact, a large portion of protein sequences are shorter than 500 residues. Moreover, the hit detection process consumes most of the execution time of BLAST, and its core data structure is a lookup table. For these reasons, we propose a lightweight BLASTP for servicing not-too-long queries, built around a hybrid query-index table. Each table entry consists of four bytes that can store up to three query positions, so a sequence word usually requires only one memory fetch to retrieve its hit information. Furthermore, additional dummy entries are embedded into the table and interleaved with the original entries. Both the dummy entries and the entries without any hits can be used to buffer spilled query positions. These features result in a much smaller lookup table with a higher utilization rate and a lower cache miss ratio. Experimental results show that the lightweight BLASTP outperforms CUDA-BLASTP, with speedups ranging from 1.82 to 3.37 on the first two critical phases.
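
To illustrate how a four-byte entry could hold up to three query positions, here is a minimal Python sketch. The bit layout is an assumption made for illustration only (a 2-bit count followed by three 10-bit positions, which fills 32 bits exactly); the abstract does not specify the actual layout, and the names `pack_entry` and `unpack_entry` are hypothetical. Dummy entries and the buffering of spilled positions are not modeled.

```python
# Hypothetical entry layout (an assumption, not taken from the paper):
#   bits 0-1   : number of stored query positions (0..3)
#   bits 2-11  : first position
#   bits 12-21 : second position
#   bits 22-31 : third position
COUNT_BITS = 2
POS_BITS = 10
MAX_POS = (1 << POS_BITS) - 1   # largest position that fits in 10 bits
MAX_SLOTS = 3                   # positions per four-byte entry


def pack_entry(positions):
    """Pack up to three query positions into one 32-bit integer."""
    if len(positions) > MAX_SLOTS:
        raise ValueError("more positions than one entry can hold")
    entry = len(positions)
    for slot, pos in enumerate(positions):
        if not 0 <= pos <= MAX_POS:
            raise ValueError("position does not fit in 10 bits")
        entry |= pos << (COUNT_BITS + slot * POS_BITS)
    return entry


def unpack_entry(entry):
    """Recover the stored query positions from a packed entry."""
    count = entry & ((1 << COUNT_BITS) - 1)
    return [
        (entry >> (COUNT_BITS + slot * POS_BITS)) & MAX_POS
        for slot in range(count)
    ]
```

Under this assumed layout, a word with at most three hits is resolved by reading a single 32-bit value, which is the property the abstract attributes to the hybrid query-index table. A word with more hits would need the spill mechanism, which this sketch leaves out.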

Original language: English
Pages (from-to): 322-342
Number of pages: 21
Journal: Journal of Supercomputing
Issue number: 1
Publication status: Published - 2021 Jan

All Science Journal Classification (ASJC) codes

  • Software
  • Theoretical Computer Science
  • Information Systems
  • Hardware and Architecture
