Likelihood-based cross-validation score for selecting the smoothing parameter in maximum penalized likelihood estimation

Wataru Sakamoto, Shingo Shirahata

Research output: Contribution to journal › Article

Abstract

Maximum penalized likelihood estimation is applied to non- and semi-parametric regression problems, and it enables exploratory identification and diagnosis of nonlinear regression relationships. The smoothing parameter λ controls the trade-off between the smoothness of a function and its goodness of fit. Cross-validation is commonly used to select λ, but generalized cross-validation, which is based on a squared-error criterion, behaves poorly for non-normal distributions and often fails to select a reasonable λ. The purpose of this study is to propose a method that gives a more suitable λ and to evaluate its performance. A simple calculation for the delete-one estimates in the likelihood-based cross-validation (LCV) score is described, and a score similar in form to the Akaike information criterion (AIC) is also derived. The proposed scores are compared with those of standard procedures using data sets from the literature. Simulations are performed to compare the patterns of selected λ and the overall goodness of fit, and to evaluate the effects of several factors. The LCV scores obtained by the simple calculation approximate the exact score well unless λ is extremely small. Moreover, they carry little risk of choosing an extremely small λ and make it possible to select λ adaptively. They reduce the bias of the estimates and give better performance in terms of overall goodness of fit. These scores are especially useful for small sample sizes and for binary logistic regression.
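As a reading aid, the criteria named in the abstract can be sketched in a generic smoothing-spline setting. The notation below (the penalized fit ĝ_λ, its delete-one version ĝ_λ^(-i), and the influence or hat matrix H_λ) is assumed for illustration only and is not necessarily the authors' exact formulation.

% Generic sketch: penalized log-likelihood fit, leave-one-out likelihood
% cross-validation score, and an AIC-type score with tr(H_lambda) acting
% as effective degrees of freedom (illustrative notation, not the paper's).
\[
  \hat g_\lambda
    = \arg\max_{g}\;\Bigl\{\sum_{i=1}^{n}\log f\bigl(y_i \mid g(x_i)\bigr)
      - \tfrac{\lambda}{2}\int \{g''(t)\}^{2}\,dt\Bigr\},
\]
\[
  \mathrm{LCV}(\lambda)
    = \sum_{i=1}^{n}\log f\bigl(y_i \mid \hat g_\lambda^{(-i)}(x_i)\bigr),
  \qquad
  \mathrm{AIC}(\lambda)
    = -2\sum_{i=1}^{n}\log f\bigl(y_i \mid \hat g_\lambda(x_i)\bigr)
      + 2\,\mathrm{tr}(H_\lambda),
\]

where ĝ_λ^(-i) denotes the fit obtained with the i-th observation deleted and tr(H_λ) plays the role of effective degrees of freedom; λ is then chosen to maximize LCV(λ) or minimize the AIC-type score. The "simple calculation" mentioned in the abstract presumably approximates the delete-one fits without refitting the model n times.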

Original language: English
Pages (from-to): 1671-1698
Number of pages: 28
Journal: Communications in Statistics - Theory and Methods
Volume: 28
Issue number: 7
Publication status: Published - 1999
Externally published: Yes

Keywords

  • Akaike information criterion
  • Logistic regression
  • Nonparametric generalized linear models
  • Poisson regression
  • Smoothing spline

ASJC Scopus subject areas

  • Statistics and Probability
  • Safety, Risk, Reliability and Quality

Cite this

@article{e80e856440694d7c8c045c4854f6b7ef,
title = "Likelihood-based cross-validation score for selecting the smoothing parameter in maximum penalized likelihood estimation",
abstract = "Maximum penalized likelihood estimation is applied to non- and semi-parametric regression problems, and it enables exploratory identification and diagnosis of nonlinear regression relationships. The smoothing parameter λ controls the trade-off between the smoothness of a function and its goodness of fit. Cross-validation is commonly used to select λ, but generalized cross-validation, which is based on a squared-error criterion, behaves poorly for non-normal distributions and often fails to select a reasonable λ. The purpose of this study is to propose a method that gives a more suitable λ and to evaluate its performance. A simple calculation for the delete-one estimates in the likelihood-based cross-validation (LCV) score is described, and a score similar in form to the Akaike information criterion (AIC) is also derived. The proposed scores are compared with those of standard procedures using data sets from the literature. Simulations are performed to compare the patterns of selected λ and the overall goodness of fit, and to evaluate the effects of several factors. The LCV scores obtained by the simple calculation approximate the exact score well unless λ is extremely small. Moreover, they carry little risk of choosing an extremely small λ and make it possible to select λ adaptively. They reduce the bias of the estimates and give better performance in terms of overall goodness of fit. These scores are especially useful for small sample sizes and for binary logistic regression.",
keywords = "Akaike information criterion, Logistic regression, Nonparametric generalized linear models, Poisson regression, Smoothing spline",
author = "Wataru Sakamoto and Shingo Shirahata",
year = "1999",
language = "English",
volume = "28",
pages = "1671--1698",
journal = "Communications in Statistics - Theory and Methods",
issn = "0361-0926",
publisher = "Taylor and Francis Ltd.",
number = "7",
}
