Issue Report Classification

Synthetic Issue Report Data Generation

Introduction

The automatic classification of issue reports is a crucial task for the efficient management of software projects. Machine learning models, such as BERT and its variants, have proven highly effective at this task. However, training these models requires large amounts of labeled data, which are often expensive to obtain, especially in low-resource settings. In such cases, synthetic data generation emerges as a promising alternative for enlarging the training dataset and improving classifier performance. The use of large language models (LLMs), such as GPT, for generating synthetic text has demonstrated great potential across various domains.

Research Objectives

This thesis aims to investigate the use of LLMs for generating synthetic issue reports to train classifiers in low-resource settings. The specific objectives of the research are:

1. Developing an approach for generating synthetic issue reports using few-shot prompts with LLMs. An initial labeled set of issue reports will be used to construct few-shot prompts that guide an LLM in generating new synthetic issue reports, and various prompting techniques will be explored to optimize the quality and diversity of the generated data (see the sketch after this list).

2. Training and evaluating issue report classifiers on a synthetic (or hybrid) dataset. Several classifiers will be trained on the generated synthetic dataset or on a hybrid dataset combining real and synthetic data, and their performance will be evaluated in terms of accuracy, precision, recall, and F1-score against classifiers trained only on the initial dataset.

3. Analyzing the impact of the size and quality of the synthetic dataset on classifier performance. The study will investigate how the size and quality of the synthetic dataset influence classifier performance, exploring techniques to improve data quality such as filtering or manually selecting generated examples.

The results of this research will contribute to a better understanding of the use of LLMs for generating synthetic data in software engineering and will provide practical guidelines for improving the accuracy of issue report classifiers in low-resource settings.
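As a concrete illustration of objective 1, the following is a minimal sketch of few-shot synthetic issue generation, assuming an OpenAI-style chat API; the model name, label set, seed examples, and prompt wording are illustrative placeholders, not choices prescribed by the proposal.

```python
# Minimal sketch: build a few-shot prompt from a small labeled seed set and ask an
# LLM to generate a new synthetic issue report for a target label. The OpenAI client,
# the model name, and the label set are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def build_few_shot_prompt(seed_examples, target_label, k=3):
    """Compose a prompt from up to k seed issue reports with the target label."""
    shots = [ex for ex in seed_examples if ex["label"] == target_label][:k]
    lines = [f"Label: {ex['label']}\nIssue: {ex['text']}" for ex in shots]
    lines.append(f"Label: {target_label}\nIssue:")  # the LLM completes this slot
    return "\n\n".join(lines)


def generate_synthetic_report(seed_examples, target_label):
    prompt = build_few_shot_prompt(seed_examples, target_label)
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system",
             "content": "You write realistic software issue reports matching the given label."},
            {"role": "user", "content": prompt},
        ],
        temperature=0.9,  # higher temperature favors diversity in the synthetic data
    )
    return {"text": response.choices[0].message.content.strip(),
            "label": target_label}


seed = [
    {"text": "App crashes when saving a file larger than 2 GB.", "label": "bug"},
    {"text": "Please add a dark theme to the settings page.", "label": "feature"},
    {"text": "Crash on startup after upgrading to v3.1.", "label": "bug"},
]
print(generate_synthetic_report(seed, "bug"))
```

The sampling temperature is one simple lever for the diversity sought in objective 1; generated examples could then be filtered or manually screened (objective 3) before being mixed with real data into a hybrid training set.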

Dec 6, 2024

Investigating Prompt Variations for Issue Report Classification with Large Language Models

Introduction

Classifying issue reports is a crucial task in software development, as it helps teams manage and resolve problems more efficiently. With the advent of Large Language Models (LLMs) such as BERT and GPT, this task can be automated, improving both the accuracy and the speed of classification. While BERT-like models require fine-tuning on labeled datasets to achieve optimal performance, GPT-like models can operate in zero-shot scenarios, leveraging their ability to understand tasks through descriptions provided in the prompt. However, the effectiveness of these models depends heavily on the structure of the prompt, including task descriptions, labels, and examples. This thesis aims to explore the impact of prompt variations on GPT-like models for issue report classification, comparing their performance to that of BERT-like models fine-tuned on datasets sourced from GitHub and Jira.

Objectives

1. Explore prompting techniques for GPT-like models, analyzing how prompt design affects accuracy and performance (a zero-shot versus few-shot sketch follows this list).
2. Identify the best prompting strategies based on model type, availability of labeled data, and computational resources.
3. Evaluate performance on issue report datasets from GitHub and Jira, analyzing differences across zero-shot, few-shot, and fine-tuning scenarios.
4. Compare GPT-like and BERT-like models, contrasting prompt engineering with fine-tuning for issue report classification.
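To illustrate the kind of prompt variation studied in objective 1, here is a minimal sketch of two prompt variants, zero-shot and few-shot, for classifying a single issue report, again assuming an OpenAI-style chat API; the label set, prompt wording, and model name are illustrative assumptions rather than the prompts the thesis will actually evaluate.

```python
# Minimal sketch: zero-shot vs. few-shot prompts for classifying one issue report
# into {bug, feature, question}. Labels, wording, and model are assumptions.
from openai import OpenAI

client = OpenAI()
LABELS = ["bug", "feature", "question"]


def zero_shot_prompt(issue_text):
    # The task is described only in natural language; no labeled examples are given.
    return (
        f"Classify the following issue report as one of: {', '.join(LABELS)}. "
        f"Answer with the label only.\n\nIssue: {issue_text}\nLabel:"
    )


def few_shot_prompt(issue_text, examples):
    # The same task description, preceded by a handful of labeled demonstrations.
    shots = "\n\n".join(f"Issue: {ex['text']}\nLabel: {ex['label']}" for ex in examples)
    return (
        f"Classify each issue report as one of: {', '.join(LABELS)}. "
        f"Answer with the label only.\n\n{shots}\n\nIssue: {issue_text}\nLabel:"
    )


def classify(prompt):
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
        temperature=0.0,  # deterministic output suits classification
    )
    return response.choices[0].message.content.strip().lower()


issue = "The export button does nothing when the table is empty."
shots = [
    {"text": "Add CSV export for the report view.", "label": "feature"},
    {"text": "App freezes when opening a corrupted project file.", "label": "bug"},
]
print(classify(zero_shot_prompt(issue)))
print(classify(few_shot_prompt(issue, shots)))
```

Holding the task description fixed while varying only the number and choice of demonstrations, as above, is one way to isolate the effect of a single prompt component on classification accuracy.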

Dec 6, 2024

Fine-tuning LLMs for Issue Report Classification

Introduction

The classification of issue reports is a critical activity in software project management, as it enables distinguishing bugs, feature requests, enhancements, and other types of reports. Large Language Models (LLMs), such as those from the GPT and BERT families, have shown promising results in text classification tasks thanks to their ability to capture context and semantics. This thesis aims to investigate the effectiveness of fine-tuning LLMs for the classification of issue reports, with a focus on optimizing model performance on small-scale datasets. Through a comparative analysis of GPT-like and BERT-like models, the thesis will explore the advantages and limitations of different architectures in the context of automatic classification.

Objectives

1. Analysis of the state of the art: study the application of LLMs to automatic issue report classification and the most advanced fine-tuning techniques.
2. Experimentation with GPT-like and BERT-like models: train and compare various LLMs through fine-tuning on specific issue report datasets, evaluating performance in terms of accuracy, precision, recall, and F1-score (a fine-tuning sketch follows this list).
3. Contribution to automated issue report management: propose guidelines for implementing LLMs in automated issue report management, with particular attention to the practical implications of deployment in real-world scenarios.
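To make the fine-tuning step concrete, here is a minimal sketch using Hugging Face Transformers to fine-tune a BERT-like model on a three-way issue taxonomy and report the four metrics named above. The toy dataset, label set, and hyperparameters are illustrative assumptions; a real experiment would load labeled GitHub or Jira issues instead.

```python
# Minimal sketch: fine-tuning bert-base-uncased for 3-way issue classification.
# The tiny in-memory corpus and hyperparameters are placeholders.
import numpy as np
from datasets import Dataset
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

LABELS = ["bug", "feature", "question"]
texts = ["Crash when saving large files.", "Please add dark mode.",
         "How do I configure the proxy?"] * 8  # toy corpus, repeated for volume
labels = [0, 1, 2] * 8

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(LABELS))

# Tokenize once with fixed-length padding so the default collator can batch.
dataset = Dataset.from_dict({"text": texts, "label": labels})
dataset = dataset.map(lambda b: tokenizer(b["text"], truncation=True,
                                          padding="max_length", max_length=64),
                      batched=True)
split = dataset.train_test_split(test_size=0.25, seed=42)


def compute_metrics(eval_pred):
    # Report the four metrics named in the objectives.
    logits, y_true = eval_pred
    y_pred = np.argmax(logits, axis=-1)
    p, r, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average="macro", zero_division=0)
    return {"accuracy": accuracy_score(y_true, y_pred),
            "precision": p, "recall": r, "f1": f1}


trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="issue-clf", num_train_epochs=3,
                           per_device_train_batch_size=8),
    train_dataset=split["train"],
    eval_dataset=split["test"],
    compute_metrics=compute_metrics,
)
trainer.train()
print(trainer.evaluate())
```

On the small-scale datasets targeted by the thesis, choices such as the number of epochs, learning rate, and the use of macro-averaged metrics (which weight rare classes equally) become the main experimental levers.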

Dec 6, 2024