Implementing Machine Learning Algorithms to Predict Donor Status: Preliminary Work with Data from an Institution of Higher Learning

Cecilia Coulter; Paula Baingana; Pascaline Mukakamari

Open Access

Author(s) - Cecilia Coulter, Paula Baingana, Pascaline Mukakamari

Publication year - 2020

Publication title - Actas del Congreso Internacional de Ingeniería de Sistemas

Language(s) - English

Resource type - Conference proceedings

ISSN - 2810-806X

DOI - 10.26439/ciis2019.5527

Subject(s) - machine learning, undersampling, computer science, artificial intelligence, metric (unit), resampling, statistical classification, random forest, receiver operating characteristic, algorithm, predictive power, data mining, engineering, philosophy, operations management, epistemology

Identifying potential donors allows institutions of higher learning to conduct more effective fundraising campaigns. Machine learning classification algorithms can be useful for building models that predict donor status. However, when the data contain imbalanced classes, as the data used for this project did, models tend to favor the majority class, which in this case was non-donors. This bias has significant implications for institutions, because they may fail to pursue entities that could, in fact, become donors. To improve the usefulness of our model, we balanced the data with a resampling technique called random undersampling (RUS) and evaluated performance with the area under the receiver operating characteristic curve (AUC-ROC) metric. Our final model improved its predictive power from 67% to 76%. Institutions of higher learning can use this machine learning model to target the pool of potential donors more efficiently, saving money and time. Future research will focus on improving the predictive accuracy of our model by exploring other data manipulation techniques that minimize the effect of imbalanced data, changing the thresholds of the classification algorithms, and using genetic programming and feature engineering.
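
The two techniques named in the abstract, random undersampling and AUC-ROC evaluation, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `random_undersample` and `auc_roc`, the 90/10 class split, and the toy scores are all assumptions made for illustration, and the code uses only the Python standard library.

```python
import random
from collections import defaultdict


def random_undersample(rows, labels, seed=0):
    """Randomly discard rows from the larger classes until every class
    has as many rows as the smallest class (hypothetical helper)."""
    rng = random.Random(seed)
    indices_by_class = defaultdict(list)
    for i, label in enumerate(labels):
        indices_by_class[label].append(i)
    n_min = min(len(ix) for ix in indices_by_class.values())
    keep = []
    for ix in indices_by_class.values():
        keep.extend(rng.sample(ix, n_min))  # sample without replacement
    keep.sort()  # preserve the original row order
    return [rows[i] for i in keep], [labels[i] for i in keep]


def auc_roc(y_true, scores):
    """AUC-ROC computed as the probability that a randomly chosen positive
    receives a higher score than a randomly chosen negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Assumed toy data: 90 non-donors (label 0) and 10 donors (label 1).
labels = [0] * 90 + [1] * 10
rows = [[i] for i in range(100)]
bal_rows, bal_labels = random_undersample(rows, labels, seed=42)
print(bal_labels.count(0), bal_labels.count(1))

# Assumed toy scores from some classifier, evaluated with AUC-ROC.
toy_auc = auc_roc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(toy_auc)
```

In practice, undersampling would typically be applied to the training split only, so that the evaluation data keep the original class imbalance. AUC-ROC is a natural companion metric here because it does not depend on a single classification threshold.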
