Award Date

12-1-2022

Degree Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Mathematical Sciences

First Committee Member

Kaushik Ghosh

Second Committee Member

Amei Amei

Third Committee Member

Malwane Ananda

Fourth Committee Member

Lung-Chang Chien

Abstract

The prediction of future insurance claims frequency and severity is one of the most important problems in actuarial science. Such predictions help the actuary set insurance premiums based on observed risk factors, or covariates. The accuracy of these predictions matters to both the insurance company and the insured customer. Typically, actuaries use parametric regression models to predict claims from the covariate information. Such models assume the same functional form tying the response to the covariates for every data point. As a result, they are not flexible enough to capture, at the individual level, the relationship between the covariates and claims frequency and severity, whose distributions are often multimodal, highly skewed, and heavy-tailed.
In this dissertation, we explore the use of Bayesian nonparametric (BNP) regression models, such as the Dirichlet process mixture model (DPMM) and the Pitman-Yor process mixture model (PYMM), to model and predict insurance claims frequency and severity based on covariates. In particular, we model claims frequency as a mixture of Poisson regressions and log(claims severity) as a mixture of normal regressions, and use the Dirichlet process (DP) and the Pitman-Yor process (PY) as priors for the mixing distribution over the regression parameters. Unlike parametric regression, such models allow each data point to have its own parameters, which makes them highly flexible and improves prediction accuracy. We calculate the posterior predictive distribution for claims frequency and severity using the Polya urn predictive rules of the Dirichlet process and the Pitman-Yor process. Markov chain Monte Carlo (MCMC) methods, such as Algorithm 8 of Neal (2000), are used to sample from the posterior distributions in the DPMM and PYMM. One important by-product of these models is the clustering information, which can be used to ascertain the number of mixture components.
We use simulation studies to demonstrate the accuracy of the proposed models. In addition, we use the French motor insurance claims data to demonstrate their accuracy and applicability in real data situations.
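
The Polya urn predictive rule mentioned above can be illustrated with a minimal Python sketch. It is not the dissertation's implementation: the function names `urn_probabilities` and `sample_partition` and their parameters are hypothetical, and the sketch covers only the prior allocation of data points to clusters, not the Poisson or normal regression likelihoods or the MCMC sampler. It assumes the standard form of the rule, in which an existing cluster of size n_j is chosen with probability (n_j - d) / (n + theta) and a new cluster with probability (theta + k d) / (n + theta), where theta is the concentration, d the discount, n the number of points already allocated, and k the current number of clusters; setting d = 0 gives the Dirichlet process case.

```python
import random


def urn_probabilities(cluster_sizes, concentration, discount=0.0):
    """Allocation probabilities for the next data point (illustrative sketch).

    Returns one probability per existing cluster, followed by the
    probability of opening a new cluster. discount=0 is the Dirichlet
    process; 0 < discount < 1 is the Pitman-Yor process.
    """
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must lie in [0, 1)")
    if concentration <= -discount:
        raise ValueError("concentration must exceed -discount")
    n = sum(cluster_sizes)   # points allocated so far
    k = len(cluster_sizes)   # clusters opened so far
    denom = n + concentration
    probs = [(size - discount) / denom for size in cluster_sizes]
    probs.append((concentration + k * discount) / denom)
    return probs


def sample_partition(n_points, concentration, discount=0.0, seed=None):
    """Allocate n_points sequentially by the urn rule; return cluster sizes."""
    rng = random.Random(seed)
    sizes = []
    for _ in range(n_points):
        probs = urn_probabilities(sizes, concentration, discount)
        j = rng.choices(range(len(probs)), weights=probs)[0]
        if j == len(sizes):
            sizes.append(1)   # the point opens a new cluster
        else:
            sizes[j] += 1     # the point joins existing cluster j
    return sizes
```

For example, with two clusters of sizes 3 and 1 and concentration 1, `urn_probabilities([3, 1], 1.0)` gives 3/5 and 1/5 for the existing clusters and 1/5 for a new one. With a discount of 0.5 the same call gives 0.5, 0.1, and 0.4, showing how the Pitman-Yor discount moves probability mass toward new clusters.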

Keywords

Bayesian nonparametric regression; Dirichlet process mixture model; Insurance claims modeling; Pitman-Yor process mixture model

Disciplines

Statistics and Probability

File Format

pdf

File Size

10000 KB

Degree Grantor

University of Nevada, Las Vegas

Language

English

Rights

IN COPYRIGHT. For more information about this rights statement, please visit http://rightsstatements.org/vocab/InC/1.0/

Available for download on Monday, December 15, 2025

