Estimating the Probability of Rare Events Occurring Using a Local Model Averaging

Jin Hua Chen, Chun Shu Chen, Meng Fan Huang, Hung Chih Lin

Research output: Contribution to journal, Article

2 Citations (Scopus)

Abstract

In statistical applications, logistic regression is a popular method for analyzing binary data accompanied by explanatory variables. However, when one of the two outcomes is rare, the estimates of the model parameters have been shown to be severely biased, so the probability of a rare event estimated from a logistic regression model can be inaccurate. In this article, we focus on estimating the probability of rare events based on logistic regression models. Instead of selecting a single best model, we propose a local model averaging procedure, based on a data perturbation technique applied to different information criteria, that yields different probability estimates for the rare event. An approximately unbiased estimator of the Kullback-Leibler loss is then used to choose the best among them. We design comprehensive simulations to show the effectiveness of our approach. For illustration, a necrotizing enterocolitis (NEC) data set is analyzed.
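As a rough illustration of the general idea of averaging probability estimates across candidate models instead of picking one, the sketch below uses generic Akaike-style weights and a Bernoulli Kullback-Leibler divergence. It is not the authors' local model averaging procedure and involves no data perturbation; the function names and all numbers are hypothetical, and the true probability passed to the KL function would only be known in a simulation setting.

```python
import math


def akaike_weights(criteria):
    """Convert information-criterion values (smaller is better) into
    weights that sum to 1, using exp(-0.5 * difference from the best)."""
    best = min(criteria)
    raw = [math.exp(-0.5 * (c - best)) for c in criteria]
    total = sum(raw)
    return [r / total for r in raw]


def averaged_probability(probs, weights):
    """Weighted average of the per-model event probabilities."""
    return sum(w * p for w, p in zip(weights, probs))


def bernoulli_kl(p_true, p_hat):
    """KL divergence from Bernoulli(p_true) to Bernoulli(p_hat), in nats.
    p_hat is clipped away from 0 and 1 to keep the logarithms finite."""
    eps = 1e-12
    p_hat = min(max(p_hat, eps), 1.0 - eps)
    kl = 0.0
    if p_true > 0.0:
        kl += p_true * math.log(p_true / p_hat)
    if p_true < 1.0:
        kl += (1.0 - p_true) * math.log((1.0 - p_true) / (1.0 - p_hat))
    return kl


# Hypothetical inputs: criterion values and rare-event probability
# estimates from three candidate logistic regression models.
aic_values = [102.3, 101.1, 104.8]
event_probs = [0.012, 0.020, 0.035]

weights = akaike_weights(aic_values)
p_avg = averaged_probability(event_probs, weights)

# In a simulation the true probability is known, so the loss of each
# estimate (single-model or averaged) can be compared directly.
p_true = 0.02  # assumed value, for illustration only
losses = [bernoulli_kl(p_true, p) for p in event_probs + [p_avg]]
```

With real data the true probability is unknown, which is why the abstract refers to an approximately unbiased estimator of the Kullback-Leibler loss; the direct comparison above stands in for that step only in a simulated setting.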

Original language: English
Pages (from-to): 1855-1870
Number of pages: 16
Journal: Risk Analysis
Volume: 36
Issue number: 10
DOI: 10.1111/risa.12558
Publication status: Published - 2016 Oct 1

Fingerprint

Logistic Models
Logistics
Necrotizing Enterocolitis
Perturbation techniques

All Science Journal Classification (ASJC) codes

  • Safety, Risk, Reliability and Quality
  • Physiology (medical)

Cite this

Chen, Jin Hua ; Chen, Chun Shu ; Huang, Meng Fan ; Lin, Hung Chih. / Estimating the Probability of Rare Events Occurring Using a Local Model Averaging. In: Risk Analysis. 2016 ; Vol. 36, No. 10. pp. 1855-1870.
@article{0260bbf780174881beddb12dddd2c1f4,
title = "Estimating the Probability of Rare Events Occurring Using a Local Model Averaging",
author = "Chen, {Jin Hua} and Chen, {Chun Shu} and Huang, {Meng Fan} and Lin, {Hung Chih}",
year = "2016",
month = "10",
day = "1",
doi = "10.1111/risa.12558",
language = "English",
volume = "36",
pages = "1855--1870",
journal = "Risk Analysis",
issn = "0272-4332",
publisher = "Wiley-Blackwell",
number = "10",

}

TY - JOUR

T1 - Estimating the Probability of Rare Events Occurring Using a Local Model Averaging

AU - Chen, Jin Hua

AU - Chen, Chun Shu

AU - Huang, Meng Fan

AU - Lin, Hung Chih

PY - 2016/10/1

Y1 - 2016/10/1

UR - http://www.scopus.com/inward/record.url?scp=84996536982&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84996536982&partnerID=8YFLogxK

U2 - 10.1111/risa.12558

DO - 10.1111/risa.12558

M3 - Article

AN - SCOPUS:84996536982

VL - 36

SP - 1855

EP - 1870

JO - Risk Analysis

JF - Risk Analysis

SN - 0272-4332

IS - 10

ER -