Award Date

May 2018

Degree Type


Degree Name

Master of Science in Computer Science


Computer Science

First Committee Member

Fatma Nasoz

Second Committee Member

Justin Zhan

Third Committee Member

Yoohwan Kim

Fourth Committee Member

Magdalena Martinez

Number of Pages



The first-year retention rate of a four-year institution is the percentage of first-time, full-time students from the previous fall who return to the same institution the following fall. It is an important indicator of both student satisfaction and the performance of the university. Moreover, universities with low retention rates may face a decline in admissions of talented students, along with a notable loss of tuition revenue and alumni contributions. It is therefore important for universities to identify students who are at risk of not being retained and to take the measures needed to retain them. Many universities have tried to develop intervention programs to help students improve their performance, but identifying and prioritizing the students who need early intervention remains a challenging task. The retention rate at the University of Nevada, Las Vegas (UNLV) from Fall 2016 to Fall 2017 was 74.4%, which indicates the need for specific intervention programs to retain students who are at risk of dropping out after their first year. In this thesis, we propose the use of predictive modeling to identify such at-risk students at an early stage, so that interventions can be offered in a timely manner. To this end, we implemented several machine learning classification algorithms, including Logistic Regression, Decision Trees, Random Forest, and Support Vector Machines, and compared their ability to identify at-risk students using standard machine learning evaluation metrics. The models were trained and tested on a set of features extracted from datasets housed in UNLV’s data warehouse that capture student information such as pre-college academics, family background, financial situation, and academic performance during the first year at UNLV.
The experimental results showed that the Logistic Regression and Random Forest classifiers performed better than the other models at predicting at-risk students at UNLV. Furthermore, students were ranked by their risk of dropping out, which would enable educators to focus their intervention resources effectively.


First-Year; Machine Learning; Rates; Retention; UNLV


Computer Sciences

File Format


Degree Grantor

University of Nevada, Las Vegas




IN COPYRIGHT. For more information about this rights statement, please visit