Background: Statistical tests for single-locus disease association are mostly under-powered. If a disease-associated causal single nucleotide polymorphism (SNP) operates through a complex mechanism that involves multiple SNPs or environmental factors, its effect may be missed when the causal SNP is studied in isolation, without accounting for these unknown genetic influences. In this study, we address the reduced power inherent in single-point association studies by accounting for genetic influences that hinder detection of a causal variant in single-point association analysis. Our method uses a propensity score (PS) to adjust for the effect of SNPs that influence the marginal association of a candidate marker. These SNPs may be in linkage disequilibrium (LD) and/or epistatic with the target SNP and have a joint interactive influence on the disease under study. We therefore propose a propensity score adjustment method (PSAM) as a dimension-reduction tool that improves the power of single-locus studies: an estimated PS adjusts for the influence of these SNPs while disease status is regressed on the target genetic locus. Such a test therefore always has 1 degree of freedom.

Results: We assess PSAM under the null hypothesis of no disease association to confirm that it correctly controls the type I error rate (<0.05). PSAM displays reasonable power (>70%) and improves power by an average of 15% over the commonly used logistic regression method and PLINK under most simulated scenarios. On the open-access multifactor dimensionality reduction dataset, PSAM displays improved significance for all disease loci. In a whole-genome study of the GAW16 NARAC dataset, PSAM identified 21 SNPs by reducing their original trend-test p-values from between 0.001 and 0.05 to less than 0.0009; 6 of these SNPs were further found to be associated with immunity and inflammation.

Conclusions: PSAM improves the significance of single-locus association for causal SNPs that show only marginal single-point association, by adjusting for the influence of other SNPs in the dataset. This could explain part of the missing heritability without the added model complexity that comes with a heavy multiple-testing burden. The newly reported SNPs from the GAW16 data provide evidence for further research into the etiology of rheumatoid arthritis. PSAM is proposed as an exploratory tool that is complementary to existing methods. A downloadable, user-friendly program, PSAM, written in SAS, is available for public use.
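
The two-stage idea described in the abstract (estimate a propensity score from the other SNPs, then regress disease status on the target locus plus that single score) can be illustrated with a minimal Python sketch. This is not the authors' SAS program, and several choices here are assumptions made only for illustration: the target SNP is coded as a binary carrier indicator, the PS is taken to be the fitted probability of carrying the target allele given the other SNPs from a plain logistic model, the 1-degree-of-freedom test is a Wald test, and the data are simulated. All function names (`solve`, `fit_logistic`, `ps_adjusted_test`, `simulate`) are hypothetical.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        # Partial pivoting: bring the largest entry of column c onto the diagonal.
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[pivot] = m[pivot], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [x - f * y for x, y in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]


def sigmoid(eta):
    # Clamp the linear predictor to avoid overflow in exp().
    return 1.0 / (1.0 + math.exp(-max(-30.0, min(30.0, eta))))


def fit_logistic(X, y, iters=25):
    """Newton-Raphson logistic regression; an intercept is prepended to each row.

    Returns (coefficients, standard errors), intercept first.
    """
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(iters):
        grad = [0.0] * k
        info = [[0.0] * k for _ in range(k)]
        for r, yi in zip(rows, y):
            p = sigmoid(sum(b * x for b, x in zip(beta, r)))
            for i in range(k):
                grad[i] += (yi - p) * r[i]
                for j in range(k):
                    info[i][j] += p * (1.0 - p) * r[i] * r[j]
        beta = [b + s for b, s in zip(beta, solve(info, grad))]
    # Standard errors from the diagonal of the inverse information matrix,
    # using the matrix built in the final Newton iteration.
    se = []
    for j in range(k):
        unit = [1.0 if i == j else 0.0 for i in range(k)]
        se.append(math.sqrt(solve(info, unit)[j]))
    return beta, se


def ps_adjusted_test(target, others, disease):
    """1-df Wald test for the target SNP, adjusted for an estimated PS."""
    # Stage 1 (assumed PS definition): P(target carrier | other SNPs).
    ps_beta, _ = fit_logistic(others, target)
    ps = [sigmoid(ps_beta[0] + sum(b * g for b, g in zip(ps_beta[1:], row)))
          for row in others]
    # Stage 2: regress disease status on the target SNP plus the scalar PS.
    beta, se = fit_logistic([[t, s] for t, s in zip(target, ps)], disease)
    z = beta[1] / se[1]
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # chi-square tail, 1 df
    return beta[1], se[1], p_value, ps


def simulate(n=1000, seed=1):
    """Toy data: two neighbouring SNPs, a correlated target, a binary trait."""
    rng = random.Random(seed)

    def genotype():
        # Minor-allele count (0, 1 or 2) with allele frequency 0.3.
        return int(rng.random() < 0.3) + int(rng.random() < 0.3)

    others = [[genotype(), genotype()] for _ in range(n)]
    target = [int(rng.random() < sigmoid(-1.0 + 0.8 * g1)) for g1, _ in others]
    disease = [int(rng.random() < sigmoid(-1.0 + 0.5 * t + 0.4 * g1 - 0.4 * g2))
               for t, (g1, g2) in zip(target, others)]
    return target, others, disease


target, others, disease = simulate()
effect, se, p_value, ps = ps_adjusted_test(target, others, disease)
print(f"target log-odds ratio = {effect:.3f} (SE {se:.3f}), 1-df p = {p_value:.3g}")
```

However many SNPs enter the first stage, the second-stage model contains only the target locus and one score, which is why the test on the target coefficient stays at 1 degree of freedom. A real analysis would more likely rely on an established statistics package than on the hand-written Newton-Raphson fit used here to keep the sketch dependency-free.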
|Number of pages|11|
|Journal|Computational Biology and Chemistry|
|Publication status|Published - 2016 Jun 1|
All Science Journal Classification (ASJC) codes
- Structural Biology
- Organic Chemistry
- Computational Mathematics