Fine-tuning LLMs for Issue Report Classification
Introduction
The classification of issue reports is a critical activity in software project management, as it distinguishes bug reports from feature requests, enhancements, and other report types. Large Language Models (LLMs), such as those of the GPT and BERT families, have shown promising results on text classification tasks thanks to their ability to capture context and semantics.
This thesis aims to investigate the effectiveness of fine-tuning LLMs for the classification of issue reports, with a focus on optimizing model performance on small-scale datasets. Through a comparative analysis of GPT-like and BERT-like models, the thesis will explore the advantages and limitations of different architectures in the context of automatic classification.
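To make the architectural contrast concrete, the sketch below juxtaposes the two inference styles under comparison: a BERT-like encoder with a classification head versus a GPT-like decoder prompted to generate a label. The checkpoints (bert-base-uncased, gpt2), the example issue, and the label set are illustrative assumptions; the encoder's head is randomly initialized here and would produce meaningful scores only after fine-tuning.

    from transformers import pipeline

    ISSUE = "App crashes when opening the settings page on Android 14."

    # BERT-like: an encoder whose classification head maps the pooled
    # representation to one score per label. The head is untrained here,
    # so its predictions are placeholders until fine-tuning.
    encoder_clf = pipeline("text-classification", model="bert-base-uncased")
    print(encoder_clf(ISSUE))  # e.g. [{'label': 'LABEL_0', 'score': ...}]

    # GPT-like: a decoder that emits the label as generated text in
    # response to an instruction-style prompt.
    generator = pipeline("text-generation", model="gpt2")
    prompt = ("Classify the following issue as bug, feature, or question.\n"
              f"Issue: {ISSUE}\nLabel:")
    print(generator(prompt, max_new_tokens=3)[0]["generated_text"])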
Objectives
Analysis of the State of the Art
Study the application of LLMs to automatic issue report classification and the most advanced fine-tuning techniques.
Experimentation with GPT-like and BERT-like Models
Train and compare various LLMs through fine-tuning on issue report datasets, evaluating performance in terms of accuracy, precision, recall, and F1-score.
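As a concrete illustration of this step, here is a minimal fine-tuning and evaluation sketch for a BERT-like encoder using the Hugging Face Trainer and scikit-learn metrics. The checkpoint, CSV file names, label set, and hyperparameters are placeholder assumptions rather than the thesis's actual experimental setup.

    import numpy as np
    from datasets import load_dataset
    from sklearn.metrics import accuracy_score, precision_recall_fscore_support
    from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                              DataCollatorWithPadding, Trainer, TrainingArguments)

    MODEL = "bert-base-uncased"              # assumed checkpoint
    LABELS = ["bug", "feature", "question"]  # assumed label set

    tokenizer = AutoTokenizer.from_pretrained(MODEL)
    model = AutoModelForSequenceClassification.from_pretrained(
        MODEL, num_labels=len(LABELS))

    # Hypothetical CSV files with a "text" column (issue title + body)
    # and an integer "label" column indexing into LABELS.
    data = load_dataset("csv", data_files={"train": "issues_train.csv",
                                           "test": "issues_test.csv"})
    data = data.map(lambda b: tokenizer(b["text"], truncation=True,
                                        max_length=256), batched=True)

    def compute_metrics(eval_pred):
        # Compute accuracy plus macro-averaged precision, recall, and F1.
        logits, labels = eval_pred
        preds = np.argmax(logits, axis=-1)
        prec, rec, f1, _ = precision_recall_fscore_support(
            labels, preds, average="macro", zero_division=0)
        return {"accuracy": accuracy_score(labels, preds),
                "precision": prec, "recall": rec, "f1": f1}

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="ft-out", num_train_epochs=3,
                               per_device_train_batch_size=16,
                               learning_rate=2e-5),
        train_dataset=data["train"],
        eval_dataset=data["test"],
        data_collator=DataCollatorWithPadding(tokenizer),
        compute_metrics=compute_metrics,
    )
    trainer.train()
    print(trainer.evaluate())  # reports accuracy and macro precision/recall/F1

Decoder-only checkpoints can in principle reuse the same loop via AutoModelForSequenceClassification, though they typically require a padding token to be set first, one of the practical differences the comparison is intended to surface.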
Contribution to Automated Issue Report Management
Propose guidelines for adopting LLMs in automated issue report management, with particular attention to the practical implications of deployment in real-world scenarios.