Security bug reports can describe security-critical vulnerabilities in software products. Bug tracking systems may contain
thousands of bug reports, of which relatively few are security related. Therefore, finding unlabelled security bugs among them can
be challenging. To help security engineers identify these reports quickly and accurately, text-based prediction models have been
proposed. However, these models often mislabel security bug reports for several reasons, such as class imbalance, where the ratio of
non-security to security bug reports is very high. More critically, we have observed that the presence of security-related keywords in
both security and non-security bug reports can lead to the mislabelling of security bug reports. This paper proposes FARSEC, a
framework that filters and ranks bug reports to reduce the presence of security-related keywords. Before building prediction
models, our framework identifies and removes non-security bug reports that contain security-related keywords. We demonstrate that FARSEC
improves the performance of text-based prediction models for security bug reports in 90% of cases. Specifically, we evaluate it with
45,940 bug reports from Chromium and four Apache projects. With our framework, we mitigate the class imbalance issue and reduce
the number of mislabelled security bug reports by 38%.
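
The filtering step described above can be illustrated with a minimal Python sketch. It is not FARSEC's actual scoring method, which this abstract does not specify. The keyword selection by document frequency, the stop-word list, the `threshold` value, the sample reports and all function names are assumptions made purely for illustration.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"a", "in", "of", "on", "the", "to"}


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens minus stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def security_keywords(reports, top_k=2):
    """Return the top_k terms by document frequency in security-labelled reports.

    Assumption: document frequency stands in for whatever keyword
    identification the framework really uses.
    """
    counts = Counter()
    for text, is_security in reports:
        if is_security:
            counts.update(set(tokenize(text)))
    # Sort by descending count, then alphabetically, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return {term for term, _ in ranked[:top_k]}


def keyword_score(text, keywords):
    """Fraction of a report's tokens that are security-related keywords."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return sum(t in keywords for t in tokens) / len(tokens)


def filter_training_set(reports, keywords, threshold=0.3):
    """Drop non-security reports whose keyword score reaches the threshold.

    Security-labelled reports are always kept.
    """
    kept = []
    for text, is_security in reports:
        if not is_security and keyword_score(text, keywords) >= threshold:
            continue  # non-security report that looks like a security report
        kept.append((text, is_security))
    return kept


# Made-up example reports: (text, is_security_label)
reports = [
    ("buffer overflow crash in parser", True),
    ("heap overflow crash on malformed input", True),
    ("overflow menu crash on resize", False),
    ("typo in settings dialog label", False),
]
keywords = security_keywords(reports)
kept = filter_training_set(reports, keywords)
```

In this toy data set the third report is non-security but shares keywords with the security reports, so it is removed before any prediction model would be trained. Removing such reports also shrinks the non-security class, which is how filtering can ease class imbalance.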