### Abstract

Assume y is a response variable, x is a risk factor of interest, and the z's are covariates, sometimes called "confounders of x" if they are correlated with both x and y. When the covariates are numerous, model selection procedures are applied to the z's, while x is usually forced into the model before or after the selection. In this situation, over-dispersion occurs and biases inference on the relation between x and y. In a linear model, the over-dispersion comes from two sources: an underestimation of the mean-squared error, and a dependency between the estimator of the x-effect and its standard error. The author proposes a method that incorporates the ideas of Ye's generalized degrees of freedom and Rosenbaum and Rubin's propensity score. The method reduces the bias and the over-dispersion effect to acceptable levels. Data from the Georgia capital charging and sentencing study, which included 1077 observations and 295 covariates, are analyzed as an illustration.
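
The first source of over-dispersion named in the abstract, the underestimated mean-squared error, can be illustrated with a small simulation. The sketch below is not the author's method: it assumes a simple correlation-screening rule as a stand-in for model selection, simulated data with no true effects and error variance 1, and hypothetical names (`simulate_naive_fit`, `n_reps`, `p`, `k`). It forces x into the model, selects covariates using y, and then computes the naive MSE and t-statistic as if no selection had taken place.

```python
import numpy as np


def simulate_naive_fit(n_reps=500, n=100, p=50, k=5, seed=0):
    """Naive OLS inference on x after data-driven selection of covariates.

    Assumptions (illustrative only): y, x and all z's are independent
    standard normal draws, so the true x-effect is 0 and sigma^2 is 1.
    Selection keeps the k covariates most correlated with y.
    """
    rng = np.random.default_rng(seed)
    t_stats = np.empty(n_reps)
    mses = np.empty(n_reps)
    for r in range(n_reps):
        z = rng.standard_normal((n, p))
        x = rng.standard_normal(n)
        y = rng.standard_normal(n)

        # Screening step: rank covariates by absolute correlation with y.
        zc = z - z.mean(axis=0)
        yc = y - y.mean()
        corr = np.abs(zc.T @ yc) / (
            np.linalg.norm(zc, axis=0) * np.linalg.norm(yc)
        )
        keep = np.argsort(corr)[-k:]

        # x is forced into the model alongside the selected covariates.
        design = np.column_stack([np.ones(n), x, z[:, keep]])
        beta, *_ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ beta

        # Naive degrees of freedom ignore the cost of the selection step.
        df = n - design.shape[1]
        mse = resid @ resid / df
        se_x = np.sqrt(mse * np.linalg.inv(design.T @ design)[1, 1])

        t_stats[r] = beta[1] / se_x
        mses[r] = mse
    return t_stats, mses


if __name__ == "__main__":
    t_stats, mses = simulate_naive_fit()
    print("mean naive MSE (true value is 1):", mses.mean())
    print("variance of naive t-statistics:", t_stats.var(ddof=1))
```

Under these assumptions, the average naive MSE tends to fall below the true value of 1, because the selected covariates were chosen for their fit to y and the naive degrees of freedom do not account for that search. The second source, the dependency between the estimated x-effect and its standard error, is not isolated by this sketch.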

Field | Value |
---|---|
Original language | English |
Pages (from-to) | 197-214 |
Number of pages | 18 |
Journal | Computational Statistics and Data Analysis |
Volume | 43 |
Issue number | 2 |
Publication status | Published - 2003 Jun 28 |

### All Science Journal Classification (ASJC) codes

- Statistics and Probability
- Computational Mathematics
- Computational Theory and Mathematics
- Applied Mathematics