Advanced Mutation Testing with Zero and Few‑Shot Evaluation Using GPT‑v4
Published in 29th CSICC (under review), 2025
Mutation testing assesses the quality of a software test suite by introducing small changes (mutants) into the code and checking whether the tests detect them. This paper investigates zero‑ and few‑shot approaches that use GPT‑v4 to generate and evaluate mutants automatically, without extensive retraining. The authors propose a pipeline that prompts GPT‑v4 with original and mutated code snippets, evaluates the model's ability to identify behavioural differences, and analyses the correlation between model confidence and mutation‑detection success. The results suggest that large language models can assist software quality assurance by prioritising test cases and identifying weak spots in test suites.