Date of Award
5-2026
Document Type
Dissertation
Degree Name
Doctor of Philosophy (PhD)
Department
Mathematical Sciences
First Advisor
Munevver Mine Subasi
Second Advisor
Maria Pozo de Fernandez
Third Advisor
Jian Du
Fourth Advisor
Ryan White
Abstract
Student retention and degree completion remain central challenges for higher-education institutions, with significant implications for student success, institutional effectiveness, and public accountability. While advances in predictive analytics have enabled earlier identification of students at risk of withdrawal, many commonly used machine learning approaches suffer from limited interpretability, constraining their practical usefulness for advising, intervention, and policy decision making. This dissertation addresses the problem of predicting student persistence by developing and evaluating optimization-based, interpretable classification models within the Logical Analysis of Data (LAD) framework. Building on existing LAD formulations, this research introduces two novel pattern generation models, the Best Term Generation Model (BTGM) and the Enhanced Best Term Generation Model (EBTGM), designed to improve classification performance while preserving model transparency. In addition, the dissertation proposes a linear-programming-based binarization approach that endogenously selects discretization thresholds aligned with classification objectives, addressing limitations of conventional ad hoc binarization techniques. The proposed methods are evaluated using both single-pattern and multiple-pattern strategies, including iterative and simultaneous pattern generation frameworks. The empirical analysis applies the enhanced LAD methodology to a publicly available student retention dataset and benchmarks performance against widely used classification algorithms, including logistic regression, decision trees, support vector machines, random forests, and neural networks. Results from a rigorous cross-validation framework demonstrate that the proposed EBTGM models achieve competitive, and in some cases superior, predictive accuracy relative to state-of-the-art methods. 
Importantly, the iterative EBTGM approach yields a set of low-degree, interpretable patterns that explicitly characterize heterogeneous pathways to student persistence and withdrawal. Overall, this dissertation demonstrates that integrating integer linear programming techniques into the LAD framework provides an effective balance between predictive performance and interpretability. The resulting pattern-based models offer transparent, auditable, and actionable insights that are well suited to early-warning systems and data-driven decision making in higher-education contexts.
Recommended Citation
Jaafari, Salihah Ahmed E., "A New Approach to Generate Combinatorial Patterns in Logical Analysis of Data and Its Application to Predict College Retention" (2026). Theses and Dissertations. 1614.
https://repository.fit.edu/etd/1614
Included in
Analysis Commons, Applied Statistics Commons, Data Science Commons, Discrete Mathematics and Combinatorics Commons, Higher Education Commons, Logic and Foundations Commons, Operations Research, Systems Engineering and Industrial Engineering Commons, Other Applied Mathematics Commons, Other Mathematics Commons, Other Physical Sciences and Mathematics Commons, Other Statistics and Probability Commons, Probability Commons, Statistical Models Commons, Survival Analysis Commons