{"id":4648,"date":"2025-12-24T11:24:10","date_gmt":"2025-12-24T09:24:10","guid":{"rendered":"https:\/\/www.escience.ac.za\/?p=4648"},"modified":"2025-12-24T11:24:15","modified_gmt":"2025-12-24T09:24:15","slug":"uncovering-hidden-tax-fraud-patterns-using-graph-attention-networks-supplemented-with-xgboost-and-kmeans-clustering","status":"publish","type":"post","link":"https:\/\/www.escience.ac.za\/index.php\/2025\/12\/24\/uncovering-hidden-tax-fraud-patterns-using-graph-attention-networks-supplemented-with-xgboost-and-kmeans-clustering\/","title":{"rendered":"Uncovering hidden tax fraud patterns using graph attention networks supplemented with XGBoost and KMeans clustering"},"content":{"rendered":"<p><strong>Researcher<\/strong>: Zolani Xulu, University of the Witwatersrand, Johannesburg<br><strong>Supervisor<\/strong>: Dr. Martins Arasomwan, University of the Witwatersrand, Johannesburg<\/p><p>This study introduces a hybrid machine learning framework that combines a Graph Attention Network (GAT), k-means clustering, and XGBoost enhanced with the Synthetic Minority Oversampling Technique (SMOTE) to uncover hidden tax fraud patterns. The GAT is employed to learn node embeddings that capture both the individual features of taxpayers and their interconnections within a tax network. These embeddings are then clustered using k-means to reveal unusual behavioral patterns, while XGBoost performs final classification between fraudulent and legitimate entities. By integrating graph-based learning with clustering and ensemble methods, this approach enhances fraud detection accuracy and interpretability. 
The results demonstrate that hybrid graph-driven models outperform traditional systems in identifying complex and previously unseen fraud behaviors, offering tax authorities a powerful data-driven tool for improving compliance and reducing revenue losses.<\/p><div class=\"wp-block-image\"><figure class=\"aligncenter size-large\"><img fetchpriority=\"high\" decoding=\"async\" width=\"1024\" height=\"724\" src=\"https:\/\/www.escience.ac.za\/wp-content\/uploads\/2025\/12\/186045_Zolani_Xulu_Capestone_project_poster_1646_383584306-1024x724.jpg\" alt=\"\" class=\"wp-image-4649\" srcset=\"https:\/\/www.escience.ac.za\/wp-content\/uploads\/2025\/12\/186045_Zolani_Xulu_Capestone_project_poster_1646_383584306-1024x724.jpg 1024w, https:\/\/www.escience.ac.za\/wp-content\/uploads\/2025\/12\/186045_Zolani_Xulu_Capestone_project_poster_1646_383584306-300x212.jpg 300w, https:\/\/www.escience.ac.za\/wp-content\/uploads\/2025\/12\/186045_Zolani_Xulu_Capestone_project_poster_1646_383584306-768x543.jpg 768w, https:\/\/www.escience.ac.za\/wp-content\/uploads\/2025\/12\/186045_Zolani_Xulu_Capestone_project_poster_1646_383584306-1536x1086.jpg 1536w, https:\/\/www.escience.ac.za\/wp-content\/uploads\/2025\/12\/186045_Zolani_Xulu_Capestone_project_poster_1646_383584306-2048x1448.jpg 2048w, https:\/\/www.escience.ac.za\/wp-content\/uploads\/2025\/12\/186045_Zolani_Xulu_Capestone_project_poster_1646_383584306-600x424.jpg 600w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure><\/div>","protected":false},"excerpt":{"rendered":"<p>Researcher: Zolani Xulu, University of the Witwatersrand, Johannesburg Supervisor: Dr. 
Martins Arasomwan, University of the Witwatersrand, Johannesburg This study introduces a hybrid machine learning framework that combines a Graph Attention Network (GAT), k-means clustering, and XGBoost enhanced with the Synthetic Minority Oversampling Technique (SMOTE) to uncover hidden tax<\/p>\n","protected":false},"author":3,"featured_media":4493,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[10],"tags":[],"class_list":["post-4648","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-capstone-projects"],"_links":{"self":[{"href":"https:\/\/www.escience.ac.za\/index.php\/wp-json\/wp\/v2\/posts\/4648","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.escience.ac.za\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.escience.ac.za\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.escience.ac.za\/index.php\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/www.escience.ac.za\/index.php\/wp-json\/wp\/v2\/comments?post=4648"}],"version-history":[{"count":1,"href":"https:\/\/www.escience.ac.za\/index.php\/wp-json\/wp\/v2\/posts\/4648\/revisions"}],"predecessor-version":[{"id":4650,"href":"https:\/\/www.escience.ac.za\/index.php\/wp-json\/wp\/v2\/posts\/4648\/revisions\/4650"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.escience.ac.za\/index.php\/wp-json\/wp\/v2\/media\/4493"}],"wp:attachment":[{"href":"https:\/\/www.escience.ac.za\/index.php\/wp-json\/wp\/v2\/media?parent=4648"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.escience.ac.za\/index.php\/wp-json\/wp\/v2\/categories?post=4648"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.escience.ac.za\/index.php\/wp-json\/wp\/v2\/tags?post=4648"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}