Publication

From online user feedback to requirements: evaluating large language models for classification and specification tasks

Date
2026-03-01
Abstract
[Context and Motivation] Online user feedback provides valuable information to support requirements engineering (RE). However, analysing online user feedback is challenging due to its large volume and noise. Large language models (LLMs) show strong potential to automate this process and outperform previous techniques. They can also enable new tasks, such as generating requirements specifications. However, existing work largely focuses on large proprietary models; consequently, lightweight open-source LLMs remain underexplored for feedback-driven RE. [Question/Problem] In particular, existing studies offer limited empirical evidence, lack thorough evaluation, and rarely provide replication packages, undermining validity and reproducibility. [Principal Idea/Results] We evaluate five lightweight open-source LLMs on three RE tasks: NFR classification, user request classification, and requirements specification generation. Classification performance was measured on two feedback datasets, and specification quality via human evaluation. The LLMs achieved moderate-to-high classification performance (F1 ≈ 0.47–0.68) and moderately high specification quality (mean ≈ 3/5). [Contributions] We present a novel exploration of lightweight open-source LLMs for feedback-driven requirements development. Our contributions are: (i) an empirical evaluation of lightweight LLMs on three RE tasks, (ii) a replication package, and (iii) insights into their capabilities and limitations for RE.
Description
This is a preprint and may be subject to change.
Citation
32nd International Working Conference on Requirements Engineering: Foundation for Software Quality (REFSQ 2026)
License
Attribution-NonCommercial-ShareAlike 4.0 International