Predicting Particle Fineness in a Cement Mill

Researcher: Rowan Lange, University of the Witwatersrand, Johannesburg
Supervisors: Prof. Anton van Wyk, Dr. Terence van Zyl

Cement production is a multi-billion-dollar industry, and one of its main subprocesses, cement milling, is complex and non-linear. There is a need to model the fineness of particles exiting the milling circuit in order to better control the cement plant. This paper explores the relationship between the particle size of the cement produced and various sensor readings from the cement mill circuit. The aim is to provide a model for predicting the fineness of particles exiting the milling circuit using data on the current and past states of the plant. A comprehensive literature review of the problem and a discussion of potential modelling solutions are provided. Blaine (particle fineness) is modelled using a range of linear and non-linear models on five months of data from a large cement plant. On a holdout test set, a multilayer perceptron achieved an MAE of 8.799 and a linear regression achieved an R² of 0.481. A discussion of the significance of various features for predicting Blaine is also presented. The results show some success from non-linear data-driven models and highlight the unique difficulties in modelling the cement mill, leading to recommendations for future research.
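For readers unfamiliar with the two metrics quoted above, the following is a minimal sketch of how mean absolute error (MAE) and the coefficient of determination (R²) can be computed on a holdout set. The function names and the Blaine values are made up for illustration; they are not taken from the report or its data.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between measured and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - (residual SS / total SS)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical holdout measurements and model predictions (illustrative only).
blaine_true = [350.0, 360.0, 340.0, 355.0]
blaine_pred = [345.0, 365.0, 350.0, 355.0]

mae = mean_absolute_error(blaine_true, blaine_pred)
r2 = r2_score(blaine_true, blaine_pred)
print(f"MAE = {mae:.3f}, R2 = {r2:.3f}")
```

MAE is expressed in the units of the target, whereas R² is unitless, which is one reason the two metrics can favour different models on the same test set.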


Indigenous Women Moving From Physical to Digital Fires: The Evolution of Methods of Transmission of Indigenous Knowledge

Researcher: Khanyisile Yolanda Ntsenge, University of the Witwatersrand, Johannesburg
Supervisors: Dr. Constance Khupe and Prof. Rod Alence, University of the Witwatersrand, Johannesburg

In response to the threat of extinction of indigenous knowledge, a growing number of people interested in preserving indigenous knowledge systems, a significant number of whom are women, have begun to use social media platforms such as Twitter and YouTube, together with the indigenous method of storytelling, to share indigenous knowledge.
The aim of the study is to understand how the introduction of the social media platforms Twitter and YouTube has changed the community structures for sharing indigenous knowledge in physical versus social media communities.

The research is informed by a postcolonial indigenous and indigenous feminist approach and employs transformative participatory research in its methodology. Indigenous women in physical communities participated in the research while accounts owned by indigenous women on Twitter and YouTube were analysed. A social network analysis was conducted on both the physical communities data and social media data. Sentiment analysis was conducted on the social media data.

The results show that the two types of community network, while both anchored by indigenous women, have different structures. The physical communities were very tight-knit, with members of the networks learning and sharing indigenous knowledge amongst each other, thereby potentially reinforcing their knowledge. The social media communities were mainly connected only to the main account, and members rarely engaged with each other. The sentiment analysis found conversations in the social media networks to be significantly positive, with the highest-scoring emotion being trust.
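The structural contrast described above can be illustrated with two toy networks. This is only a sketch under stated assumptions: the member names, the network sizes, and the choice of density and hub share as measures are hypothetical and are not taken from the study's data or method.

```python
from itertools import combinations


def density(nodes, edges):
    """Share of all possible undirected ties that are actually present."""
    n = len(nodes)
    return 2 * len(edges) / (n * (n - 1))


def degrees(nodes, edges):
    """Number of ties attached to each node."""
    deg = {node: 0 for node in nodes}
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg


# Hypothetical tight-knit physical community: every member shares with every other.
physical_nodes = ["A", "B", "C", "D", "E"]
physical_edges = list(combinations(physical_nodes, 2))

# Hypothetical social media community: followers connect only to the main account.
social_nodes = ["main", "f1", "f2", "f3", "f4"]
social_edges = [("main", follower) for follower in social_nodes[1:]]

physical_density = density(physical_nodes, physical_edges)
social_density = density(social_nodes, social_edges)

# Share of all ties that involve the single most connected node.
social_hub_share = max(degrees(social_nodes, social_edges).values()) / len(social_edges)
physical_hub_share = max(degrees(physical_nodes, physical_edges).values()) / len(physical_edges)

print(physical_density, social_density)
print(physical_hub_share, social_hub_share)
```

In the toy star network every tie passes through the main account, while in the fully connected network no single member dominates, which mirrors the qualitative difference reported between the social media and physical communities.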

The research has shown that although women play an important role in the sharing of indigenous knowledge in both physical and online communities, the community network structures differ. It also showed that there is space and appetite for conversations on indigenous knowledge on social media. Furthermore, as in physical communities, women continue to be important custodians of indigenous knowledge online and are trusted to share credible indigenous knowledge. This presents opportunities for further exploration of how to leverage social media platforms to mainstream indigenous knowledge while amplifying the voices of indigenous women as its custodians.


The use of machine learning in the search for di-photons in association with missing energy

Researcher: Theodore Cwere Gaelejwe, University of the Witwatersrand, Johannesburg
Supervisor: Prof. Bruce Mellado

The Large Hadron Collider (LHC) generates petabytes of data per second during each data-taking period and has long-term data storage on the order of exabytes. Sophisticated machine learning (ML) techniques are used at the trigger and final-state level to analyse this data. Boosted Decision Trees (BDTs), in particular, have been the default ML tool for this task. However, in the recent past, more modern techniques such as Deep Learning have emerged, and there has been growing justification for their use in High Energy Physics (HEP). We conduct a comparative study between BDTs and Deep Neural Networks (DNNs) in classifying signal and background events in the H → γγ + Χ decay channel. A comparison between a fully supervised and a weakly supervised model is also conducted. Results suggest that DNNs outperform BDTs, and that the fully supervised model is outperformed by the weakly supervised model, though it is more robust.


Prediction of Customer Churn in Telecommunication using Machine Learning Algorithms

Researcher: Perlate Diala, University of the Witwatersrand, Johannesburg
Supervisor: Dr. Hairong Wang

Loss of customers in the telecommunication industry has become a major concern in recent years. This is due to the very high level of competition in the industry and to customer acquisition costs, so it is of great value to keep existing customers. For that purpose, it is of great significance to prevent churn by implementing prediction models that are effective and accurate. However, the major problems with building models for telecommunication are large volumes of data, an enormous feature space and the Class Imbalance Problem (CIP). This study aims to compare the performance of various machine learning classifiers for the prediction of customer churn in telecommunication. In particular, we explore pre-processing of the dataset, such as dimensionality reduction and seven oversampling techniques, to reduce CIP and hence improve the performance of the machine learning models concerned. To evaluate the performance of the selected models, the Receiver Operating Characteristic curve and the Area Under the Curve (ROC-AUC) were adopted. The experimental results showed that the Logistic Regression classifier coupled with Random Oversampling (ROS) and dimensionality reduction based on a linear autoencoder performs better than all other classifiers.
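As a rough illustration of two ideas named above, the sketch below shows one common form of random oversampling (duplicating minority-class rows at random until the classes are balanced) and a pairwise computation of the area under the ROC curve. The data, scores, and function names are hypothetical, and this is not the study's pipeline, which also involved dimensionality reduction and a trained classifier.

```python
import random
from collections import Counter


def random_oversample(features, labels, seed=0):
    """Duplicate randomly chosen minority rows until every class matches the largest one."""
    rng = random.Random(seed)
    counts = Counter(labels)
    majority_size = max(counts.values())
    out_x, out_y = list(features), list(labels)
    for cls, count in counts.items():
        pool = [x for x, y in zip(features, labels) if y == cls]
        for _ in range(majority_size - count):
            out_x.append(rng.choice(pool))
            out_y.append(cls)
    return out_x, out_y


def roc_auc(labels, scores):
    """Probability that a random positive is scored above a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical imbalanced churn data: 1 = churned, 0 = stayed.
churn_labels = [0, 0, 0, 0, 0, 0, 1, 1]
churn_features = [[0.2], [0.1], [0.4], [0.3], [0.5], [0.6], [0.9], [0.8]]

balanced_x, balanced_y = random_oversample(churn_features, churn_labels)
class_counts = Counter(balanced_y)

# Hypothetical churn scores from some classifier, used only to show the metric.
churn_scores = [0.1, 0.2, 0.3, 0.4, 0.35, 0.6, 0.5, 0.9]
auc = roc_auc(churn_labels, churn_scores)
print(class_counts, round(auc, 3))
```

Oversampling is normally applied to the training split only, so that duplicated rows do not leak into the data used to compute ROC-AUC.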
