Date of Award

5-2026

Document Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Mathematical Sciences

First Advisor

Munevver Mine Subasi

Second Advisor

Maria Pozo de Fernandez

Third Advisor

Jian Du

Fourth Advisor

Ryan White

Abstract

Student retention and degree completion remain central challenges for higher-education institutions, with significant implications for student success, institutional effectiveness, and public accountability. While advances in predictive analytics have enabled earlier identification of students at risk of withdrawal, many commonly used machine learning approaches suffer from limited interpretability, constraining their practical usefulness for advising, intervention, and policy decision-making. This dissertation addresses the problem of predicting student persistence by developing and evaluating optimization-based, interpretable classification models within the Logical Analysis of Data (LAD) framework. Building on existing LAD formulations, this research introduces two novel pattern generation models, the Best Term Generation Model (BTGM) and the Enhanced Best Term Generation Model (EBTGM), designed to improve classification performance while preserving model transparency. In addition, the dissertation proposes a linear-programming-based binarization approach that endogenously selects discretization thresholds aligned with classification objectives, addressing limitations of conventional ad hoc binarization techniques. The proposed methods are evaluated using both single-pattern and multiple-pattern strategies, including iterative and simultaneous pattern generation frameworks. The empirical analysis applies the enhanced LAD methodology to a publicly available student retention dataset and benchmarks performance against widely used classification algorithms, including logistic regression, decision trees, support vector machines, random forests, and neural networks. Results from a rigorous cross-validation framework demonstrate that the proposed EBTGM models achieve competitive, and in some cases superior, predictive accuracy relative to state-of-the-art methods.
Importantly, the iterative EBTGM approach yields a set of low-degree, interpretable patterns that explicitly characterize heterogeneous pathways to student persistence and withdrawal. Overall, this dissertation demonstrates that integrating integer linear programming techniques into the LAD framework provides an effective balance between predictive performance and interpretability. The resulting pattern-based models offer transparent, auditable, and actionable insights that are well suited to early-warning systems and data-driven decision-making in higher-education contexts.
