About Us
The DSTI-NICIS National e-Science Postgraduate Teaching and Training Platform (NEPTTP) was launched in 2017 after the Department of Science, Technology, and Innovation (DSTI) identified the need to establish a multi-institutional consortium in data science in South Africa. The platform exists to develop human capital with the knowledge and skills to conduct cutting-edge research in the field of e-Science, in line with the South African Research Infrastructure Roadmap (SARIR), the Square Kilometre Array (SKA) and others. The platform offers both MSc and MA degrees.
The DSTI, through the National Integrated Cyber Infrastructure System (NICIS), funds the platform, which comprises six universities. The University of the Witwatersrand is the administrative hub of the consortium, which also includes the Universities of Limpopo, Pretoria and Venda as well as North-West University and Sol Plaatje University.
Sol Plaatje University (SPU), the University of Limpopo (UL), the University of Pretoria (UP) and the University of Venda (UniVen) all offer the Master of Science degree. The University of the Witwatersrand, Johannesburg (Wits) offers both the Master of Science and the Master of Arts, and North-West University (NWU) offers the Master of Arts.
These programmes are unique because they traverse faculties – students come from disciplines including Computer Science, Mathematics, Applied Mathematics, Actuarial Sciences, Physics, Statistics, Engineering, Social Sciences, and Public Health.
“We foresee the demand for such degrees increasing exponentially over the next few years,” says Benjamin Rosman, NEPTTP Director and Professor in the School of Computer Science and Applied Mathematics at Wits.
Our graduates are employed at well-known organisations all over Africa, including Absa Group, Amend.org, CSIR, Deloitte, IBM, Rand Merchant Bank and Standard Bank.
Entry Requirements
Applicants are required to have a Bachelor with Honours degree (NQF level 8 qualification) from a relevant discipline in Science or Engineering (Computer Science, Mathematics, Physics, and Statistics) OR a relevant NQF level 8 qualification or a relevant Professional Engineering Degree with demonstrable knowledge of basic principles of Computing, Calculus, Linear Algebra, Probability and Statistics. Applicants require a minimum of 65 percent in their NQF level 8 qualification, must fulfil any additional application requirements of the institution through which they are applying, and must be co-approved by the Consortium.
Applicants will also be required to complete a number of prerequisite online courses.
This Masters programme aims to train postgraduate students in computational, mathematical and statistical methods to solve data-driven problems. The programme will create opportunities for students in Computer Science, Statistics, Physics, Electrical Engineering or related fields to gain an interdisciplinary perspective on the emerging field of Data Science.
This programme forms part of the DSTI-funded National e-Science Postgraduate Teaching and Training Platform (NEPTTP). Students will register with their Home Institution but will attend coursework at Wits University in Johannesburg, Gauteng, in the first year. On completion of the coursework modules, students will move back to their Home Institutions for their second year of study.
Degree Information
The Masters programme extends over eighteen to twenty-four months of full-time study. The programme comprises compulsory and elective modules. Cross-disciplinary data-driven projects are offered both within the University and from a wide range of industry partners. A candidate must undertake modules to the value of 180 credits and must successfully complete the following courses to obtain a Master of Science by Coursework and Research Report in the field of e-Science.
Coursework Modules (Year 1 at Wits University)
2 Compulsory Courses
- Research Methods and Capstone Project in Data Science (15 credits)
This course gives students the theoretical and practical skills to plan, conduct, analyse and present a scientific assignment (Capstone Project) in the area of Data Science by introducing them to research methodology, ethics and sustainability. The course comprises three parts: 1) scientific writing; 2) research methodology; and 3) scientific assignment. These three parts are integrated in a capstone project.
- Data Privacy and Ethics (15 credits)
This course introduces the students to the ethical and legal foundations of data science governance. The topics covered include technical processes of data collection, storage, exchange and access; ethical aspects of data management; legal and regulatory frameworks in South Africa and in relevant jurisdictions; data policy; data privacy; data ownership; legal liabilities of analytical decisions, and discrimination; algorithms and technical approaches to enhance data privacy; and relevant case studies.
Any 4 Elective Courses on Offer
- Adaptive Computation and Machine Learning (15 credits)
This course provides the candidate with an in-depth understanding of adaptive computing and machine learning. The course consists of machine learning, pattern recognition and computational learning theory in artificial intelligence. Machine learning explores the study and construction of algorithms that can learn from and make predictions using data. Such algorithms overcome the limitation of following strictly static program instructions by making data-driven predictions or decisions, through building a model from sample inputs.
- Data Visualisation and Exploration (15 credits)
This course introduces the field of data visualisation, which seeks to determine and present underlying correlated structures and relationships in data sets from a wide variety of application areas. The prime objective of the presentation is to communicate the information in a dataset so as to enhance understanding. The course covers the following subjects: Data and image models; Visualisation attributes (colour) and design (layout); Exploratory data analysis; Interactive data visualisation; Multidimensional data; Graphical perception; Visualisation software (Python & R); and Types of visualisation (Animation, Networks & Text).
- Large Scale Computing Systems and Scientific Programming (15 credits)
Conducting e-research/e-science requires a good understanding of the computing principles, methods and tools that have been developed to support the analysis of large-scale and complex data. The course focuses on the software stack but addresses hardware issues as necessary. The course covers a selection of the following topics: Introduction to programming environments for scientific computing (e.g. pandas, NumPy, matplotlib); Principles of distributed systems, and overview of parallel architectures and environments (e.g. FPGA, GPU, multi-core, cluster, grid); Large-scale data transfer and storage; Frameworks for large-scale data analysis (relational databases, map-reduce, streaming); Scientific workflow management: provenance and replication; Introduction to cloud computing and virtualisation; and Project (e.g. programming large-data applications on open-source infrastructures for data processing and storage systems).
- Large Scale Optimisation for Data Science (15 credits)
Advanced areas of data science require a deeper understanding of the large scale discrete optimisation methods pertaining to the field. In order to bridge this mathematical gap and provide a foundation for further learning, this course will place more emphasis on topics such as convex optimisation, sub-gradient methods, localisation methods, decomposition and distributed optimisation, proximal and operator splitting methods, conjugate gradients, and nonconvex problems.
- Mathematical Foundations of Data Science (15 credits)
Advanced areas of data science require a deeper understanding of the fundamental mathematics pertaining to the field. In order to bridge this mathematical gap and provide a foundation for further learning, this course will place more emphasis on topics such as high-dimensional space, best-fit subspaces and singular value decomposition, random walks and Markov chains, statistical machine learning, clustering, random graphs, topic models, non-negative matrix factorisation, hidden Markov models, graphical models, wavelets, and sparse representations.
- Special Topics in Data Science (15 credits)
This module deals with specialised and applied concepts and trends in domain-specific areas of data science such as finance, health sciences, bioinformatics, natural sciences, social sciences, smart cities, education, and energy.
- Statistical Foundations of Data Science (15 credits)
This course provides an understanding of multivariate statistical methods, including hypothesis testing and confidence intervals. Students develop the ability to model data using well-known statistical distributions, to handle data that is both continuous and categorical, and to perform statistical modelling, including multivariate regression and adjusting for multiple hypotheses. The course also covers forecasting, extrapolation, prediction and modelling using statistical methods; Bayesian statistics; and bootstrapping and Monte Carlo simulation.
** Not all electives are offered every year.
Research Report (Year 2 at Home Institution)
- Research Report: Data Science (90 credits)
The ability to do research is an essential skill for an individual pursuing a career in Data Science, and forms the basis for further postgraduate study. This module provides practical training for the development of research skills and bridges the gap between theory and practice, and between established work and novel research. By working within established research structures in the Institution under the guidance of an expert, students will receive exposure to the methods, philosophy and ethos of research in the field of Data Science.