Prediction of Customer Churn in Telecommunication using Machine Learning Algorithms

Researcher: Perlate Diala, University of the Witwatersrand, Johannesburg
Supervisor: Dr. Hairong Wang

Customer loss has become a major concern in the telecommunication industry in recent years. Competition among providers is intense and customer acquisition costs are high, so retaining existing customers is of great value. It is therefore important to prevent churn by implementing prediction models that are effective and accurate. However, building such models for telecommunication faces three major problems: large volumes of data, an enormous feature space and the Class Imbalance Problem (CIP). This study compares the performance of various machine learning classifiers for predicting customer churn in telecommunication. In particular, we explore pre-processing of the dataset, namely dimensionality reduction and seven oversampling techniques to reduce CIP, in order to improve the performance of the machine learning models. Model performance was evaluated using the Receiver Operating Characteristic (ROC) curve and the Area Under the Curve (AUC). The experimental results showed that the Logistic Regression classifier coupled with Random Oversampling (ROS) and dimensionality reduction based on a linear autoencoder performed better than all other classifiers.
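
As an illustration of two ideas named in this abstract, random oversampling and ROC-AUC, the following is a minimal sketch in plain Python. It is not the study's code: the function names `random_oversample` and `roc_auc`, the seed, and the toy data are all assumptions made for this example.

```python
import random


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        # Sample with replacement from the existing rows of this class.
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


def roc_auc(scores, labels):
    """AUC as the chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy, made-up data: 8 non-churners (label 0) and 2 churners (label 1), one feature each.
rows = [[0.1], [0.2], [0.3], [0.25], [0.15], [0.35], [0.4], [0.05], [0.9], [0.8]]
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

balanced_rows, balanced_labels = random_oversample(rows, labels)
print(balanced_labels.count(0), balanced_labels.count(1))

# Use the single feature directly as a churn score, purely to exercise roc_auc.
print(roc_auc([r[0] for r in rows], labels))
```

In practice, oversampling is normally applied to the training split only, so that duplicated rows cannot leak into the data used for evaluation.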

Predicting Particle Fineness in a Cement Mill

Researcher: Rowan Lange, University of the Witwatersrand, Johannesburg
Supervisors: Prof. Anton van Wyk, Dr. Terence van Zyl

Cement production is a multi-billion dollar industry, and one of its main subprocesses, cement milling, is complex and non-linear. There is a need to model the fineness of particles exiting the milling circuit in order to better control the cement plant. This paper explores the relationship between the particle size of the cement produced and various sensor readings from the cement mill circuit. Its aim is to provide a model for predicting the fineness of particles exiting the milling circuit using data on the current and past states of the plant. A comprehensive literature review of the problem and a discussion of potential modelling solutions are provided. Blaine (particle fineness) is modelled using many different linear and non-linear models on five months of data from a large cement plant. On a holdout test set, a multi-layer perceptron achieved an MAE of 8.799 and a linear regression achieved an R² of 0.481. A discussion of the significance of various features for predicting Blaine is also presented. The results show some success from non-linear data-driven models and highlight the unique difficulties in modelling the cement mill, presenting recommendations for future research.
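
For readers unfamiliar with the two metrics quoted above, the sketch below shows how mean absolute error (MAE) and the coefficient of determination (R²) are conventionally computed. The helper names and the Blaine-like numbers are invented for illustration and are not taken from the paper's data.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between measured and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """1 minus the ratio of residual variation to total variation around the mean."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up holdout values standing in for measured and predicted Blaine.
y_true = [300.0, 310.0, 320.0, 330.0]
y_pred = [305.0, 305.0, 325.0, 325.0]

print(round(mean_absolute_error(y_true, y_pred), 3))
print(round(r_squared(y_true, y_pred), 3))
```

MAE is expressed in the units of the target, whereas R² is unitless, so the two figures describe different aspects of fit and are not directly comparable.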

Discrimination of Signal-Background Events with Supervised and Weakly Supervised Learning in the Search for New Bosons Decaying to Z + γ Final State

Researcher: Nkateko Baloyi, University of the Witwatersrand, Johannesburg
Supervisors: Prof. Bruce Mellado and Dr Xifeng Ruan, University of the Witwatersrand, Johannesburg

The Standard Model (SM) has successfully driven experiments and predictions since it was developed in 1975, and it has never been contradicted by experimental results. In 2012, the SM led to the discovery of the Higgs boson (h) at the Large Hadron Collider (LHC), which completed the particle spectrum of the SM. The discovery of h inspired experimental studies to further understand the scalar properties of h, opening the search for Beyond the Standard Model (BSM) physics. BSM physics searches for new particles that can help explain phenomena that the SM cannot. The LHC collides protons at high luminosity and high energy in an attempt to recreate particles that existed moments after the Big Bang, as well as BSM particles. The data produced during the collisions requires advanced techniques that can search it for relevant information so that signal (S) events can be identified. S events are produced alongside a huge number of background (B) events, from which they cannot easily be distinguished. Advanced machine learning (ML) and statistical techniques can be used to isolate the S events from the B events. ML is a subset of artificial intelligence that gives systems the ability to learn and improve automatically from experience. This research focuses on the application of boosted decision trees (BDT) and deep neural networks (DNN) to Monte Carlo simulated data. Supervised learning and weakly supervised learning (WSL) approaches are implemented to discriminate the S events from the B events. The supervised learning approach is used as a benchmark to measure the performance of the WSL approach. Pre-selection cuts are applied to the data, and four different models are applied to classify the S and B events for both BDT and DNN using the supervised learning and WSL approaches. The WSL approach uses two samples to train the model: one containing only B events and the other containing a mixture of S and B events. The DNN model is trained on these samples and applied to classify the S and B events in the same test data used in the supervised learning approach. The performance of the WSL models is compared with that of the supervised learning models. The results show a strong bias in the WSL approach.
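
The two-sample training setup described above can be sketched as follows. This is only an illustration under loud assumptions: a one-dimensional Gaussian toy feature replaces the Monte Carlo simulated events, a hand-written logistic regression stands in for the BDT and DNN models, and the sample sizes, signal fraction and function names are invented.

```python
import math
import random

rng = random.Random(42)


def draw_background(n):
    # Assumed toy background: one feature, Gaussian centred at 0.
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def draw_signal(n):
    # Assumed toy signal: same feature, Gaussian centred at 2.
    return [rng.gauss(2.0, 1.0) for _ in range(n)]


# Sample A holds background only; sample B mixes background with unlabelled signal.
sample_a = draw_background(2000)
sample_b = draw_background(1400) + draw_signal(600)

# Weak labels record which sample an event came from, not whether it is signal.
xs = sample_a + sample_b
weak_labels = [0] * len(sample_a) + [1] * len(sample_b)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.1, epochs=200):
    """Batch gradient descent on the logistic loss for a single feature."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


w, b = fit_logistic(xs, weak_labels)

# Evaluate on fresh events whose true class is known, as in a supervised benchmark.
test_bkg = draw_background(500)
test_sig = draw_signal(500)
mean_bkg_score = sum(sigmoid(w * x + b) for x in test_bkg) / len(test_bkg)
mean_sig_score = sum(sigmoid(w * x + b) for x in test_sig) / len(test_sig)
print(mean_sig_score > mean_bkg_score)
```

Because the classifier only ever sees sample membership, its scores are calibrated to "mixed versus background-only" rather than to "signal versus background", which is one reason a weakly supervised model is usually checked against a fully supervised one on the same test data.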

Analysing and Tracking the Evolution of South Africa’s Foreign Policy in the Democratic Era Through Parliamentary Debate

Researcher: Katherine Bebington, University of the Witwatersrand, Johannesburg
Supervisor: Prof. Rod Alence, University of the Witwatersrand, Johannesburg

The year 1994 marked the beginning of a new era in South Africa as the country became a democracy, stepping out of the international isolation that dogged the Apartheid government. The question must therefore be asked: how has South Africa’s foreign policy evolved over the first 25 years of the country’s democracy, and what changes have occurred? Text mining will be applied to the budget vote speeches delivered to the South African parliament by the ministers of foreign affairs/international relations, and to the subsequent contributions of political parties to the budget vote debate. The analysis will combine a geo-political study of the data, sentiment analysis and thematic analysis in order to track the evolution of South Africa’s foreign policy over the period 1994-2018.
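
One simple form such a geo-political text-mining step could take is counting how often selected place or bloc names appear in each year's speech. The sketch below assumes this approach; the abstract does not specify the method. The sentences are invented placeholders rather than quotations, and the term list and the `term_counts` helper are hypothetical.

```python
import re
from collections import Counter

# Invented placeholder text, not taken from any real budget vote speech.
speeches = {
    1994: "We rejoin Africa and the Commonwealth. Africa is our home.",
    2018: "Our partners in BRICS and Africa matter. BRICS trade is growing.",
}

# Hypothetical watch-list of geo-political terms, in lower case.
terms = ["africa", "brics", "commonwealth"]


def term_counts(text, terms):
    """Count whole-word, case-insensitive occurrences of each watched term."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(tokens)
    return {term: counts[term] for term in terms}


by_year = {year: term_counts(text, terms) for year, text in sorted(speeches.items())}
for year, counts in by_year.items():
    print(year, counts)
```

Comparing these counts across years gives a rough view of shifting geographic emphasis, which sentiment and thematic analysis could then refine.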
