Optimal geostatistical model selection

Hsin Cheng Huang, Chun-Shu Chen

Research output: Contribution to journal › Article

34 Citations (Scopus)

Abstract

In many fields of science, predicting variables of interest over a study region based on noisy data observed at some locations is an important problem. Two popular methods for the problem are kriging and smoothing splines. The former assumes that the underlying process is stochastic, whereas the latter assumes it is purely deterministic. Kriging performs better than smoothing splines in some situations, but is outperformed by smoothing splines in others. However, little is known regarding selecting between kriging and smoothing splines. In addition, how to perform variable selection in a geostatistical model has not been well studied. In this article we propose a general methodology for selecting among arbitrary spatial prediction methods based on (approximately) unbiased estimation of mean squared prediction errors using a data perturbation technique. The proposed method accounts for estimation uncertainty in both kriging and smoothing spline predictors, and is shown to be optimal in terms of two mean squared prediction error criteria. A simulation experiment is performed to demonstrate the effectiveness of the proposed methodology. The proposed method is also applied to a water acidity data set by selecting important variables responsible for water acidity based on a spatial regression model. Moreover, a new method is proposed for estimating the noise variance that is robust and performs better than some well-known methods.
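The selection idea in the abstract (estimate each candidate predictor's mean squared prediction error from perturbed copies of the data, then pick the smallest) can be illustrated with a minimal sketch. This is not the authors' algorithm. It assumes a one-dimensional toy setting, a known noise variance `sigma2`, and a standard covariance-penalty estimate in which the sensitivity of the fitted values to the data is approximated by Monte Carlo data perturbation. The function names (`gdf_by_perturbation`, `estimated_mspe`, `poly_predictor`, `kernel_predictor`), the perturbation scale `tau`, and the two candidate smoothers are all hypothetical choices made for illustration; kriging and smoothing splines would be plugged in as further black-box `predict` functions.

```python
import numpy as np


def gdf_by_perturbation(predict, y, tau, n_rep=2000, seed=0):
    """Monte Carlo estimate of sum_i d(mu_hat_i)/d(y_i) by data perturbation.

    `predict` is treated as a black box mapping data y to fitted values.
    """
    rng = np.random.default_rng(seed)
    base = predict(y)  # fitted values on the unperturbed data
    total = 0.0
    for _ in range(n_rep):
        delta = rng.normal(0.0, tau, size=y.shape)
        # Subtracting `base` does not change the expectation but lowers variance.
        total += delta @ (predict(y + delta) - base)
    return total / (n_rep * tau**2)


def estimated_mspe(predict, y, sigma2, tau, **kwargs):
    """Covariance-penalty estimate of the mean squared error of the fitted
    values for the noise-free signal at the observed locations."""
    n = y.size
    rss = np.sum((y - predict(y)) ** 2)
    gdf = gdf_by_perturbation(predict, y, tau, **kwargs)
    return (rss + 2.0 * sigma2 * gdf) / n - sigma2


def poly_predictor(x, degree):
    """Least-squares polynomial fit, returned as a function of y only."""
    X = np.vander(x, degree + 1)
    H = X @ np.linalg.pinv(X)  # hat matrix of the linear fit
    return lambda y: H @ y


def kernel_predictor(x, bandwidth):
    """Gaussian kernel smoother with a fixed bandwidth (assumed, not tuned)."""
    W = np.exp(-0.5 * ((x[:, None] - x[None, :]) / bandwidth) ** 2)
    S = W / W.sum(axis=1, keepdims=True)  # rows sum to one
    return lambda y: S @ y


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = np.linspace(0.0, 1.0, 50)
    sigma = 0.3
    y = np.sin(2.0 * np.pi * x) + sigma * rng.normal(size=x.size)

    candidates = {
        "quadratic": poly_predictor(x, 2),
        "kernel(h=0.05)": kernel_predictor(x, 0.05),
    }
    scores = {
        name: estimated_mspe(f, y, sigma2=sigma**2, tau=0.5 * sigma)
        for name, f in candidates.items()
    }
    for name, score in scores.items():
        print(f"{name}: estimated MSPE = {score:.4f}")
    print("selected:", min(scores, key=scores.get))
```

Because the perturbation step only calls `predict`, the same code applies to predictors whose tuning parameters are re-estimated from each perturbed data set, which is how estimation uncertainty enters the criterion. In this sketch the noise variance is supplied by hand; the robust noise variance estimator mentioned in the abstract is not reproduced here.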

Original language: English
Pages (from-to): 1009-1024
Number of pages: 16
Journal: Journal of the American Statistical Association
Volume: 102
Issue number: 479
DOI: 10.1198/016214507000000491
Publication status: Published - 2007 Sep 1

Fingerprint

Smoothing Splines
Model Selection
Kriging
Prediction Error
Mean Squared Error
Data Perturbation
Spatial Prediction
Uncertainty Estimation
Water
Unbiased Estimation
Methodology
Perturbation Technique
Spatial Model
Noisy Data
Variable Selection
Simulation Experiment
Predictors
Regression Model

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Statistics, Probability and Uncertainty

Cite this

Huang, Hsin Cheng ; Chen, Chun-Shu. / Optimal geostatistical model selection. In: Journal of the American Statistical Association. 2007 ; Vol. 102, No. 479. pp. 1009-1024.
@article{4a72999275824ca695cc5adbf645e1ca,
title = "Optimal geostatistical model selection",
author = "Huang, {Hsin Cheng} and Chun-Shu Chen",
year = "2007",
month = "9",
day = "1",
doi = "10.1198/016214507000000491",
language = "English",
volume = "102",
pages = "1009--1024",
journal = "Journal of the American Statistical Association",
issn = "0162-1459",
publisher = "Taylor and Francis Ltd.",
number = "479",

}

TY - JOUR

T1 - Optimal geostatistical model selection

AU - Huang, Hsin Cheng

AU - Chen, Chun-Shu

PY - 2007/9/1

Y1 - 2007/9/1

UR - http://www.scopus.com/inward/record.url?scp=35348819628&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=35348819628&partnerID=8YFLogxK

U2 - 10.1198/016214507000000491

DO - 10.1198/016214507000000491

M3 - Article

AN - SCOPUS:35348819628

VL - 102

SP - 1009

EP - 1024

JO - Journal of the American Statistical Association

JF - Journal of the American Statistical Association

SN - 0162-1459

IS - 479

ER -