A Clusterwise Regression Approach for the Estimation of Crash Frequencies

Publication Title

Journal of Transportation Safety and Security

In the current literature, data are aggregated to estimate functions that explain or predict crash patterns using clustering analysis, regression analysis, or stage-wise models. Typically, analysis sites are grouped into site subtypes based on predefined characteristics. The assumption is that sites within each subtype experience similar crash patterns as a function of prespecified explanatory characteristics. To develop functions to estimate crashes, all data points are clustered only as a function of associated site characteristics. As a consequence, estimated parameters may be based on different crash patterns that represent various trends, which could be better captured by using multiple functions. To address this limitation, this study proposes a mathematical program that uses clusterwise regression to assign sites to clusters and simultaneously seek sets of parameter values for the corresponding estimation functions, so as to maximize the probability of observing the available data. A simulated annealing algorithm, coupled with maximum likelihood estimation, was used to solve the mathematical program. Results were analyzed for two site subtypes with fatal and all injury crashes: (1) urban multilane divided roadway segments and (2) urban four-leg signalized intersections. By using multiple estimation functions within the same site subtype, clusterwise regression improved the predicted number of crashes.
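
The idea of assigning sites to clusters while fitting one estimation function per cluster can be illustrated with a minimal sketch. This is not the study's implementation. It assumes a Poisson log-linear likelihood in place of the negative multinomial named in the keywords, a single-site reassignment move, and a geometric cooling schedule. The function names (`fit_poisson`, `total_loglik`, `clusterwise_sa`), the tuning values, and the synthetic data are all hypothetical.

```python
import numpy as np

def fit_poisson(X, y, iters=10):
    """Poisson log-linear MLE by Newton-Raphson (assumed stand-in likelihood).
    Returns (beta, log-likelihood without the constant log(y!) term)."""
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean() + 1e-8)            # start from an intercept-only fit
    for _ in range(iters):
        mu = np.exp(np.clip(X @ beta, -20, 20))
        grad = X.T @ (y - mu)                     # score vector
        hess = X.T @ (X * mu[:, None]) + 1e-6 * np.eye(X.shape[1])  # information + tiny ridge
        beta = beta + np.linalg.solve(hess, grad)
    eta = np.clip(X @ beta, -20, 20)
    return beta, float(np.sum(y * eta - np.exp(eta)))

def total_loglik(X, y, labels, k):
    """Sum of per-cluster maximized log-likelihoods for a given assignment."""
    return sum(fit_poisson(X[labels == c], y[labels == c])[1] for c in range(k))

def clusterwise_sa(X, y, k=2, steps=800, t0=2.0, cooling=0.995, min_size=10, seed=0):
    """Simulated annealing over site-to-cluster assignments (hypothetical settings)."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, k, size=len(y))      # random initial assignment
    current = total_loglik(X, y, labels, k)
    best_labels, best = labels.copy(), current
    temp = t0
    for _ in range(steps):
        i = rng.integers(len(y))                  # pick one site
        old = labels[i]
        if np.sum(labels == old) > min_size:      # keep every cluster estimable
            labels[i] = (old + rng.integers(1, k)) % k   # move it to another cluster
            cand = total_loglik(X, y, labels, k)
            # Metropolis rule: always accept improvements, sometimes accept worse moves
            if cand >= current or rng.random() < np.exp((cand - current) / temp):
                current = cand
                if current > best:
                    best_labels, best = labels.copy(), current
            else:
                labels[i] = old                   # reject: undo the move
        temp *= cooling                           # geometric cooling
    return best_labels, best

# Synthetic sites drawn from two different crash-frequency functions (made-up values).
rng = np.random.default_rng(1)
n = 200
X = np.column_stack([np.ones(n), rng.uniform(0, 1, n)])
true = rng.integers(0, 2, n)
betas = np.array([[0.2, 0.5], [2.0, -0.5]])
y = rng.poisson(np.exp(np.sum(X * betas[true], axis=1))).astype(float)

pooled = fit_poisson(X, y)[1]                     # single function for all sites
labels, best = clusterwise_sa(X, y, k=2)          # two functions, assignments searched
print(f"pooled log-likelihood: {pooled:.1f}, clusterwise: {best:.1f}")
```

Comparing `pooled` with `best` shows how much the likelihood gains when sites within one subtype are allowed to follow more than one estimation function. Refitting every cluster at each step keeps the sketch short; a practical version would refit only the two clusters affected by a move.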


Clusterwise regression; Negative multinomial; Log-likelihood; Traffic safety; Network screening; Crash frequency; Accident prediction model


Numerical Analysis and Computation


