Customer churn prediction in telecoms industry using random forest
Researcher: Fortune Mhlanga, University of the Witwatersrand, Johannesburg
Supervisor: Dr Wilbert Chagwiza, University of the Witwatersrand, Johannesburg
Telecommunications companies around the world lose customers to their competitors every day, and this customer churn leads to reduced revenue. Predictive models that identify customers who are likely to churn would allow operators to act before those customers leave, and could significantly increase the industry's revenue.
This study seeks to develop a customer churn prediction model using the random forest algorithm. The performance of the model is measured using accuracy, precision, recall, F-measure, Cohen's kappa, the Gini coefficient and the Matthews correlation coefficient. To find features that can be used as inputs to the model, the study explores four feature selection approaches: the scikit-learn default filter method, step forward feature selection, step backward feature selection, and exhaustive feature selection. The study finds that the random forest algorithm is not the best at predicting customer churn for the telecommunications industry, since the best-performing model achieves a recall of only 52.2%.
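As an illustration of the workflow described above, the following is a minimal sketch rather than the study's actual implementation. It runs step forward feature selection with a random forest and then reports the listed performance measures. Several things here are assumptions: the data is synthetic (generated with scikit-learn's `make_classification`, not the study's dataset), the function name `evaluate_churn_model` and all hyperparameters are hypothetical, and the Gini coefficient is taken to be `2 * AUC - 1`, which is a common convention but may not be the definition used in the study.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.metrics import (
    accuracy_score,
    cohen_kappa_score,
    f1_score,
    matthews_corrcoef,
    precision_score,
    recall_score,
    roc_auc_score,
)
from sklearn.model_selection import train_test_split


def evaluate_churn_model(n_features_to_select=4, random_state=0):
    """Hypothetical helper: select features, fit a random forest, score it."""
    # Synthetic, imbalanced stand-in for a churn dataset (assumption:
    # roughly 15% of customers churn; label 1 means "churned").
    X, y = make_classification(
        n_samples=600,
        n_features=8,
        n_informative=4,
        weights=[0.85, 0.15],
        random_state=random_state,
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=random_state
    )

    # Step forward feature selection: start with no features and greedily
    # add the one that most improves cross-validated recall.
    selector = SequentialFeatureSelector(
        RandomForestClassifier(n_estimators=20, random_state=random_state),
        n_features_to_select=n_features_to_select,
        direction="forward",  # "backward" gives step backward selection
        scoring="recall",
        cv=3,
    )
    selector.fit(X_train, y_train)
    selected = selector.get_support(indices=True).tolist()

    # Fit the final model on the selected features only.
    model = RandomForestClassifier(n_estimators=100, random_state=random_state)
    model.fit(selector.transform(X_train), y_train)

    X_test_sel = selector.transform(X_test)
    y_pred = model.predict(X_test_sel)
    y_score = model.predict_proba(X_test_sel)[:, 1]  # P(churn)

    auc = roc_auc_score(y_test, y_score)
    metrics = {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred, zero_division=0),
        "recall": recall_score(y_test, y_pred, zero_division=0),
        "f_measure": f1_score(y_test, y_pred, zero_division=0),
        "cohen_kappa": cohen_kappa_score(y_test, y_pred),
        "auc": auc,
        "gini": 2 * auc - 1,  # assumed definition of the Gini coefficient
        "mcc": matthews_corrcoef(y_test, y_pred),
    }
    return metrics, selected


if __name__ == "__main__":
    metrics, selected = evaluate_churn_model()
    print("Selected feature indices:", selected)
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}")
```

Recall is used as the selection criterion here because it measures the share of actual churners the model catches, which is the figure the study reports for its best-performing model. On imbalanced churn data, accuracy can look high even when recall is low, which is why the other measures are reported alongside it.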