Using propensity score adjustment method in genetic association studies

Amrita Sengupta Chattopadhyay, Ying Chao Lin, Ai Ru Hsieh, Chien Ching Chang, Ie Bin Lian, Cathy S.J. Fann

Research output: Contribution to journal: Article

Abstract

Background: Statistical tests for single-locus disease association are mostly underpowered. If a disease-associated causal single nucleotide polymorphism (SNP) operates through a complex mechanism that involves multiple SNPs or possible environmental factors, its effect might be missed when the causal SNP is studied in isolation, without accounting for these unknown genetic influences. In this study, we attempt to address the reduced power inherent in single-point association studies by accounting for genetic influences that hinder the detection of a causal variant in single-point association analysis. Our method uses a propensity score (PS) to adjust for the effect of SNPs that influence the marginal association of a candidate marker. These SNPs might be in linkage disequilibrium (LD) and/or epistatic with the target SNP and have a joint interactive influence on the disease under study. We therefore propose a propensity score adjustment method (PSAM) as a dimension-reduction tool that improves the power of single-locus studies: an estimated PS adjusts for the influence of these SNPs while disease status is regressed on the target genetic locus. The test therefore always has 1 degree of freedom.

Results: We assess PSAM under the null hypothesis of no disease association to confirm that it correctly controls the type I error rate (<0.05). PSAM displays reasonable power (>70%) and shows an average 15% improvement in power compared with the commonly used logistic regression method and PLINK under most simulated scenarios. Using the open-access multifactor dimensionality reduction dataset, PSAM displays improved significance for all disease loci. Through a whole-genome study, PSAM identified 21 SNPs from the GAW16 NARAC dataset by reducing their original trend-test p-values from between 0.001 and 0.05 to less than 0.0009; 6 of these SNPs were further found to be associated with immunity and inflammation.

Conclusions: PSAM improves the significance of single-locus association for causal SNPs that have only marginal single-point association by adjusting for the influence of other SNPs in a dataset. This would explain part of the missing heritability without the added model complexity that comes with huge multiple-testing scenarios. The newly reported SNPs from the GAW16 data would provide evidence for further research to elucidate the etiology of rheumatoid arthritis. PSAM is proposed as an exploratory tool that would be complementary to existing methods. A downloadable, user-friendly program, PSAM, written in SAS, is available for public use.
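
The two-stage idea described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the PS is estimated or how the target SNP is coded, so everything below is an assumption made for illustration: the target SNP is coded as a binary carrier indicator, the PS is taken to be the fitted probability of carrying the target allele given the other SNPs (from a logistic model), and the 1-degree-of-freedom test is a Wald test on the target coefficient. The function names (`fit_logistic`, `psam_test`, `simulate`) and the simulated data are hypothetical, and this is not the authors' SAS program.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def fit_logistic(rows, y, max_iter=50, tol=1e-8):
    """Fit logistic regression by Newton-Raphson; an intercept is prepended.

    Returns (beta, info), where info is the information matrix evaluated
    at the last iterate before the final (sub-tolerance) update.
    """
    x = [[1.0] + list(r) for r in rows]
    p = len(x[0])
    beta = [0.0] * p
    for _ in range(max_iter):
        grad = [0.0] * p
        info = [[0.0] * p for _ in range(p)]
        for xi, yi in zip(x, y):
            mu = sigmoid(sum(b * v for b, v in zip(beta, xi)))
            w = mu * (1.0 - mu)
            for j in range(p):
                grad[j] += (yi - mu) * xi[j]
                for k in range(p):
                    info[j][k] += w * xi[j] * xi[k]
        step = solve(info, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta, info


def psam_test(target, others, disease):
    """Assumed two-stage sketch: estimate a PS, then run a PS-adjusted 1-df test."""
    # Stage 1 (assumption): PS = P(target carrier | other SNPs).
    ps_beta, _ = fit_logistic(others, target)
    ps = [sigmoid(ps_beta[0] + sum(b * v for b, v in zip(ps_beta[1:], o)))
          for o in others]
    # Stage 2: regress disease status on the target SNP plus the scalar PS.
    beta, info = fit_logistic([[t, s] for t, s in zip(target, ps)], disease)
    var_target = solve(info, [0.0, 1.0, 0.0])[1]  # (info^-1)[1][1]
    z = beta[1] / math.sqrt(var_target)
    # Two-sided normal p-value, identical to a 1-df chi-square test on z**2.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return beta[1], p_value


def simulate(n=500, seed=1):
    """Toy data, entirely made up: two other SNPs, one correlated with the target."""
    rng = random.Random(seed)
    others = [[rng.choice([0, 1, 2]), rng.choice([0, 1, 2])] for _ in range(n)]
    target = [int(rng.random() < sigmoid(-0.5 + 0.6 * o[0])) for o in others]
    disease = [int(rng.random() < sigmoid(-1.0 + 0.7 * t + 0.5 * o[0]))
               for t, o in zip(target, others)]
    return target, others, disease


if __name__ == "__main__":
    log_or, p = psam_test(*simulate())
    print(f"adjusted log odds ratio = {log_or:.3f}, p = {p:.4g}")
```

However many other SNPs feed the first stage, they reach the disease model only through the single PS covariate, so the test on the target coefficient keeps 1 degree of freedom. This is the dimension-reduction point the abstract makes.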

Original language: English
Pages (from-to): 1-11
Number of pages: 11
Journal: Computational Biology and Chemistry
Volume: 62
DOI: 10.1016/j.compbiolchem.2016.02.017
Publication status: Published - 2016 Jun 1

Fingerprint

Genetic Association
Propensity Score
Genetic Association Studies
Single Nucleotide Polymorphism
Adjustment
Association reactions
Nucleotides
Locus
Polymorphism
p-Value
Statistical tests
Trend Test
Heritability
Multifactor Dimensionality Reduction
Linkage Disequilibrium
Scenarios
Logistics
Inflammation
Target

All Science Journal Classification (ASJC) codes

  • Structural Biology
  • Biochemistry
  • Organic Chemistry
  • Computational Mathematics

Cite this

Sengupta Chattopadhyay, Amrita; Lin, Ying Chao; Hsieh, Ai Ru; Chang, Chien Ching; Lian, Ie Bin; Fann, Cathy S.J. / Using propensity score adjustment method in genetic association studies. In: Computational Biology and Chemistry. 2016; Vol. 62. pp. 1-11.
@article{cc74c1d968774d6aaedc9c6b8d7591a2,
title = "Using propensity score adjustment method in genetic association studies",
author = "{Sengupta Chattopadhyay}, Amrita and Lin, {Ying Chao} and Hsieh, {Ai Ru} and Chang, {Chien Ching} and Lian, {Ie Bin} and Fann, {Cathy S.J.}",
year = "2016",
month = "6",
day = "1",
doi = "10.1016/j.compbiolchem.2016.02.017",
language = "English",
volume = "62",
pages = "1--11",
journal = "Computational Biology and Chemistry",
issn = "1476-9271",
publisher = "Elsevier Limited",

}

TY  - JOUR
T1  - Using propensity score adjustment method in genetic association studies
AU  - Sengupta Chattopadhyay, Amrita
AU  - Lin, Ying Chao
AU  - Hsieh, Ai Ru
AU  - Chang, Chien Ching
AU  - Lian, Ie Bin
AU  - Fann, Cathy S.J.
PY  - 2016/6/1
Y1  - 2016/6/1
UR  - http://www.scopus.com/inward/record.url?scp=84960848629&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84960848629&partnerID=8YFLogxK
U2  - 10.1016/j.compbiolchem.2016.02.017
DO  - 10.1016/j.compbiolchem.2016.02.017
M3  - Article
C2  - 26991546
AN  - SCOPUS:84960848629
VL  - 62
SP  - 1
EP  - 11
JO  - Computational Biology and Chemistry
JF  - Computational Biology and Chemistry
SN  - 1476-9271
ER  -