Keele Research Repository
Brereton, OP, Kitchenham, BA, Madeyski, L, Budgen, D, Keung, J, Charters, S, Gibbs, S and Pohthong, A (2016) Robust Statistical Methods for Empirical Software Engineering. Empirical Software Engineering: an international journal, 22 (2). 579-630. ISSN 1573-7616
This is the latest version of this item.
B Kitchenham - Robust Statistical Methods for Empirical Software Engineering.pdf - Published Version
Available under License Creative Commons Attribution.
Abstract
There have been many changes in statistical theory in the past 30 years, including increased evidence that non-robust methods may fail to detect important results. The statistical advice available to software engineering researchers needs to be updated to address these issues. This paper aims both to explain the new results in the area of robust analysis methods and to provide a large-scale worked example of the new methods. We summarise the results of analyses of the Type 1 error efficiency and power of standard parametric and non-parametric statistical tests when applied to non-normal data sets. We identify parametric and non-parametric methods that are robust to non-normality. We present an analysis of a large-scale software engineering experiment to illustrate their use. We illustrate the use of kernel density plots, and parametric and non-parametric methods using four different software engineering data sets. We explain why the methods are necessary and the rationale for selecting a specific analysis. We suggest using kernel density plots rather than box plots to visualise data distributions. For parametric analysis, we recommend trimmed means, which can support reliable tests of the differences between the central location of two or more samples. When the distribution of the data differs among groups, or we have ordinal scale data, we recommend non-parametric methods such as Cliff’s δ or a robust rank-based ANOVA-like method.
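The two estimators the abstract recommends can be illustrated with a minimal sketch. This is not the paper's own code: the function names, the sample data, and the 20% trimming proportion are assumptions chosen for illustration, and the sketch computes point estimates only, not the associated significance tests or confidence intervals.

```python
from itertools import product


def trimmed_mean(values, proportion=0.2):
    """Mean after dropping the lowest and highest `proportion` of observations.

    The 20% default is an assumption for this sketch; `proportion` must be
    below 0.5 so that at least one observation remains.
    """
    ordered = sorted(values)
    k = int(proportion * len(ordered))  # observations trimmed from each tail
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


def cliffs_delta(xs, ys):
    """Cliff's delta: P(x > y) - P(x < y), estimated over all cross-group pairs."""
    greater = sum(1 for x, y in product(xs, ys) if x > y)
    less = sum(1 for x, y in product(xs, ys) if x < y)
    return (greater - less) / (len(xs) * len(ys))


# Hypothetical samples; group_a contains one extreme outlier (95).
group_a = [12, 14, 15, 15, 16, 18, 19, 21, 22, 95]
group_b = [10, 11, 12, 13, 13, 14, 15, 16, 17, 18]

print("mean A:", sum(group_a) / len(group_a))
print("20% trimmed mean A:", trimmed_mean(group_a))
print("Cliff's delta (A vs B):", cliffs_delta(group_a, group_b))
```

In this made-up sample the ordinary mean of `group_a` is pulled upward by the single outlier, while the trimmed mean stays near the bulk of the data. Cliff's δ ranges from -1 to 1 and depends only on the ordering of observations, which is why it suits ordinal data.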
Item Type: | Article |
---|---|
Uncontrolled Keywords: | empirical software engineering; statistical methods; robust methods; robust statistical methods |
Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science; Q Science > QA Mathematics > QA76 Computer software |
Divisions: | Faculty of Natural Sciences > School of Computing and Mathematics |
Depositing User: | Symplectic |
Date Deposited: | 31 May 2017 10:58 |
Last Modified: | 11 Apr 2019 09:04 |
URI: | https://eprints.keele.ac.uk/id/eprint/3517 |
Available Versions of this Item
- Robust Statistical Methods for Empirical Software Engineering. (deposited 20 Jul 2016 13:07)
- Robust Statistical Methods for Empirical Software Engineering. (deposited 31 May 2017 10:58) [Currently Displayed]