Illustration from Khaled Al Sabbagh's PhD thesis
Photo: Khaled Al Sabbagh

Improving the Performance of Machine Learning-based Methods for Continuous Integration by Handling Noise

Science and Information Technology

Khaled Al Sabbagh is defending his doctoral thesis "Improving the Performance of Machine Learning-based Methods for Continuous Integration by Handling Noise" for the Degree of Doctor of Philosophy in the subject Computer Science and Engineering.

18 Sep 2023
13:00 - 16:00
Room Tesla, Lindholmen Science Park, Lindholmspiren 5, Gothenburg

About the thesis:

The availability of large amounts of data in Continuous Integration (CI) systems allows companies to utilize machine learning (ML) methods to optimize CI processes. However, the predictive performance of these methods can be hindered by noise in code change data. Using design science research and controlled experiments, this thesis examines the impact of noise-handling techniques in CI. Two ML-based methods, MeBoTS and HiTTs, are developed for regression testing. A taxonomy and a class noise-handling approach (DB) are created to reduce class noise. Controlled experiments are conducted to examine the effect of class noise-handling on MeBoTS’ performance. The results show that handling class noise using DB improves test case selection and code change request predictions. Further, code changes related to memory management and complexity should be tested with performance-related tests. The “majority filter” algorithm is the most effective in improving the prediction of build outcomes and code change requests.

This thesis highlights the importance of handling class noise in code change data to improve test case selection, build outcomes, and change request predictions. It also shows that using code-to-test dependencies offers an effective way to perform regression testing. Finally, it shows that software engineers do not necessarily need to remove attribute noise to gain improvements in test selection.
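The abstract names a "majority filter" for class noise but does not describe its configuration. As a rough illustration only, the sketch below follows the commonly described majority-vote filtering idea: several base learners are trained on the folds that exclude an instance, and the instance is flagged as likely mislabelled when more than half of them disagree with its label. The base learners (1-NN, 3-NN, nearest centroid), the fold count, the function names, and the toy "pass"/"fail" data are all assumptions made for this example and are not taken from the thesis.

```python
from collections import Counter
from math import dist


def knn_predict(train, x, k):
    """Predict the label of x by majority vote among its k nearest training points."""
    nearest = sorted(train, key=lambda item: dist(item[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def centroid_predict(train, x):
    """Predict the label whose class centroid lies closest to x."""
    groups = {}
    for features, label in train:
        groups.setdefault(label, []).append(features)
    centroids = {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in groups.items()
    }
    return min(centroids, key=lambda label: dist(centroids[label], x))


def majority_filter(data, n_folds=3):
    """Return indices of instances that a majority of base learners misclassify.

    Each instance is judged by learners trained only on the other folds,
    so its own (possibly wrong) label never influences the vote.
    """
    learners = [
        lambda train, x: knn_predict(train, x, 1),
        lambda train, x: knn_predict(train, x, 3),
        centroid_predict,
    ]
    flagged = []
    for i, (x, y) in enumerate(data):
        fold = i % n_folds
        train = [item for j, item in enumerate(data) if j % n_folds != fold]
        wrong = sum(1 for predict in learners if predict(train, x) != y)
        if wrong > len(learners) / 2:
            flagged.append(i)
    return flagged


# Hypothetical toy data: two well-separated clusters of code-change features.
# Index 6 sits inside the "pass" cluster but carries the label "fail".
toy_data = [
    ((0.0, 0.0), "pass"), ((0.1, 0.2), "pass"), ((0.2, 0.1), "pass"),
    ((0.3, 0.3), "pass"), ((0.1, 0.0), "pass"), ((0.0, 0.3), "pass"),
    ((0.2, 0.2), "fail"),
    ((5.0, 5.0), "fail"), ((5.1, 5.2), "fail"), ((5.2, 5.1), "fail"),
    ((5.3, 5.3), "fail"), ((5.1, 5.0), "fail"), ((5.0, 5.3), "fail"),
]

noisy = majority_filter(toy_data)
cleaned = [item for i, item in enumerate(toy_data) if i not in noisy]
print(noisy)  # [6]
```

In this sketch the flagged instances are simply dropped before training; relabelling them instead would be an equally plausible design choice, and the announcement does not say which one the thesis uses.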

To full text version of the thesis

Faculty opponent:

Professor Burak Turhan, Faculty of Information Technology and Electrical Engineering, University of Oulu, Finland

Grading committee:

  • Professor Natalia Juristo, Facultad de Informática, Universidad Politécnica de Madrid, Spain
  • Professor Darja Smite, Department of Software Engineering, Blekinge Institute of Technology, Sweden
  • Senior lecturer Markus Borg, Department of Computer Science, Lund University, Sweden
