Posted on 2022-12-22, 14:16, authored by ABDUL RAZZAQ
Feature location (FL) is the task of finding the source code that implements
a specific, user-observable functionality in a software system. Given its key
role in many software maintenance tasks, it is an area of much research, and
researchers have proposed a wide variety of Feature Location Techniques (FLTs)
that rely on source code structure, dynamic analysis, or textual analysis.
As FLTs evolve and more novel FLTs are introduced, it is important to
perform comparison studies to investigate which FLTs perform relatively better.
However, this thesis shows through a systematic survey of the FL literature
that performing such comparisons would be an arduous process, because of
the large number of techniques to be compared, the heterogeneous nature of
the empirical designs employed to evaluate those FLTs, the lack of openly
available, executable FLTs for re-evaluation, and the existing, contradictory
performance results.
This thesis builds on this Systematic Literature Review (SLR) to present
an empirical design that is cognisant of FL goals and based on best empirical
practice and common empirical design elements. Then, to facilitate the
cross-comparison of FLTs going forward, this thesis employs the resultant
empirical design to cross-compare replicable FLTs and relate their performance.
The results suggest that the Vector Space Model (VSM) with the Lucene
implementation is frequently the best-performing openly available, Information
Retrieval (IR)-based FLT, but that the performance of specific FLTs is
(partially) driven by feature-set differences.
Towards understanding the impact of feature-set differences, this thesis
defines a feature-metric suite that is assessed in terms of its effect on FLTs'
performance, both holistically across FLTs and for individual FLTs.
As contributions, this thesis presents empirical guidelines and an empirical
framework that allows better goal-cognisant, performance-based ranking of
FLTs and also helps to explain the performance of FLTs in relation to the
employed feature-set. It is intended that these advances will, ultimately, allow
a standard selection of systems and benchmarks during FLT evaluation,
which will not only facilitate increased reliability across FLT evaluations but
will also greatly improve the generality of knowledge used to recommend FLTs
to practitioners given a specific software system. This work is seen as a step
towards standardizing evaluation in the field, thus facilitating comparison
across FLTs.