Using the longest significance run to estimate region-specific p-values in genetic association mapping studies

Iebin Lian, Yi Hsien Lin, Ying Chao Lin, Hsin Chou Yang, Chee Jang Chang, Cathy S.J. Fann

Research output: Contribution to journal › Article

2 Citations (Scopus)

Abstract

Background: Association testing is a powerful tool for identifying disease susceptibility genes underlying complex diseases. Technological advances have dramatically increased the density of available genetic markers, and with it the number of association tests required to analyze disease susceptibility genes. Multiple-testing correction has therefore become a critical issue. However, conventional statistical corrections for locus-specific multiple tests usually lose power as the number of markers increases. As an alternative, we propose applying the longest significant run (LSR) method to estimate a region-specific p-value, which provides an index for the most likely candidate region.

Results: An advantage of the LSR method over procedures based on genotypic data is that it requires only p-value data and can therefore be applied widely across different study designs. In this study, the proposed LSR method was compared with commonly used methods such as Bonferroni's method and the FDR-controlling method. We found that while all methods provide good control of the false positive rate, LSR has much better power and a lower false discovery rate. In analyses of real psoriasis and asthma data, the LSR method successfully identified important candidate regions and replicated the results of previous association studies.

Conclusion: The proposed LSR method provides an efficient exploratory tool for the analysis of sequences of dense genetic markers. Our results show that the LSR method has better power and a lower false discovery rate than locus-specific multiple tests.
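The abstract does not spell out how the region-specific p-value is computed, so the following is only a minimal illustrative sketch of the general idea, not the authors' estimator. It assumes that the marker tests are independent and that each is significant with probability `alpha` under the null hypothesis; this is a simplification, since neighbouring markers are usually correlated. The function names and the example p-values are hypothetical.

```python
from typing import Sequence


def longest_significant_run(p_values: Sequence[float], alpha: float = 0.05) -> int:
    """Length of the longest run of consecutive p-values below alpha."""
    longest = current = 0
    for p in p_values:
        current = current + 1 if p < alpha else 0
        longest = max(longest, current)
    return longest


def prob_run_at_least(n: int, k: int, alpha: float) -> float:
    """P(longest run >= k) for n independent tests, each significant w.p. alpha.

    ASSUMPTION: independence between markers; this is a textbook recursion,
    not the estimator used in the paper.
    """
    if k <= 0:
        return 1.0
    if k > n:
        return 0.0
    # state[j]: probability that no run of length k has occurred so far and
    # the current trailing run of significant tests has length j (0 <= j < k).
    state = [0.0] * k
    state[0] = 1.0
    for _ in range(n):
        nxt = [0.0] * k
        nxt[0] = sum(state) * (1.0 - alpha)  # a non-significant test resets the run
        for j in range(k - 1):
            nxt[j + 1] = state[j] * alpha    # a significant test extends the run
        # Mass leaving state[k - 1] via a significant test completes a run of
        # length k and is dropped from the "no run yet" states.
        state = nxt
    return 1.0 - sum(state)


def region_p_value(p_values: Sequence[float], alpha: float = 0.05) -> float:
    """Null probability of a run at least as long as the one observed."""
    observed = longest_significant_run(p_values, alpha)
    return prob_run_at_least(len(p_values), observed, alpha)


if __name__ == "__main__":
    example = [0.20, 0.03, 0.01, 0.04, 0.60, 0.02, 0.70, 0.45]  # made-up values
    print(longest_significant_run(example), region_p_value(example))
```

Because the sketch needs only a sequence of p-values, it reflects the property highlighted in the Results paragraph: no genotype data are required. With correlated markers, a permutation or simulation-based null would be a more defensible choice than the independence recursion used here.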

Original language: English
Article number: 246
Journal: BMC Bioinformatics
Volume: 9
DOI: 10.1186/1471-2105-9-246
Publication status: Published - 2008 May 27

Fingerprint

Genetic Association
Genetic Association Studies
p-Value
Estimate
Multiple Tests
Genes
Disease Susceptibility
Susceptibility
Genetic Markers
Locus
Gene
Testing
Asthma
Bonferroni
False Positive
Psoriasis
Sequence Analysis
Likely

All Science Journal Classification (ASJC) codes

  • Structural Biology
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Applied Mathematics

Cite this

Lian, Iebin ; Lin, Yi Hsien ; Lin, Ying Chao ; Yang, Hsin Chou ; Chang, Chee Jang ; Fann, Cathy S.J. / Using the longest significance run to estimate region-specific p-values in genetic association mapping studies. In: BMC Bioinformatics. 2008 ; Vol. 9.
@article{cdd93bb4eb954abcb90293059cfff9fe,
title = "Using the longest significance run to estimate region-specific p-values in genetic association mapping studies",
author = "Lian, Iebin and Lin, {Yi Hsien} and Lin, {Ying Chao} and Yang, {Hsin Chou} and Chang, {Chee Jang} and Fann, {Cathy S.J.}",
year = "2008",
month = "5",
day = "27",
doi = "10.1186/1471-2105-9-246",
language = "English",
volume = "9",
journal = "BMC Bioinformatics",
issn = "1471-2105",
publisher = "BioMed Central",

}

TY - JOUR

T1 - Using the longest significance run to estimate region-specific p-values in genetic association mapping studies

AU - Lian, Iebin

AU - Lin, Yi Hsien

AU - Lin, Ying Chao

AU - Yang, Hsin Chou

AU - Chang, Chee Jang

AU - Fann, Cathy S.J.

PY - 2008/5/27

Y1 - 2008/5/27

UR - http://www.scopus.com/inward/record.url?scp=45749116906&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=45749116906&partnerID=8YFLogxK

U2 - 10.1186/1471-2105-9-246

DO - 10.1186/1471-2105-9-246

M3 - Article

C2 - 18503718

AN - SCOPUS:45749116906

VL - 9

JO - BMC Bioinformatics

JF - BMC Bioinformatics

SN - 1471-2105

M1 - 246

ER -