"What Programmers Do with Inheritance in Java" Replicated on Source Code
Inheritance is an important mechanism in object-oriented languages, and it has received considerable research attention. Most of this research, if not all, concerns the inheritance relationships between types (namely, classes and interfaces). There is also an ongoing debate about whether inheritance is good or bad, and about how much inheritance is useful. Tempero et al. raised another important question: given the inheritance relationships defined in a system, how many of them are actually used? To answer it, they developed a model of inheritance usage in Java and analysed the byte code of 93 open source Java projects from the Qualitas Corpus, a curated collection of open source Java projects. They concluded that inheritance was used heavily in these projects: in about two thirds of the cases for subtyping and in about 22 % of the cases for what they call external reuse. They report about 2 % internal reuse. They also found that down-call (late-bound self-reference) was quite frequent: about a third of the inheritance relationships included down-call usage. They report that other usages of inheritance exist, but that these are not significant. In this study, we replicate the study of Tempero et al. with one major difference: we analyse the source code of the projects rather than the byte code. We use the inheritance model of the original study as is. We analyse the same projects, but we obtain them from a different source, namely the compiled version of the Qualitas Corpus: the Qualitas.class Corpus. Our analysis program is written in the Rascal meta-programming language. In some cases we obtain results similar to those of the original study: we conclude that at least 60 % of the cases involve subtyping, and our results for total reuse are similar. In other cases, our results differ.
We report a lower median for external reuse than the original study (only 3 %). In contrast, our internal reuse median is higher than theirs (20 %). Our down-call median is also lower (27 % vs. 34 %). For the other possible uses of inheritance, most of our results are similar to those of the original study. We discuss the possible reasons for the differences in detail and identify four major causes. First, the content of the analysed projects differs in our case, which may affect the results for all usages of inheritance. For one third of the projects in the Corpus, we analysed different versions. Moreover, the source code distribution often contains a different set of types than the byte code packages. This does not mean reporting fewer or more cases, but simply a different percentage. We believe that, among all the differences in our study set-up, this one has the largest impact on the results. Second, byte code analysis can be misleading when detecting some particular cases of down-call and external reuse, and we suspect that the original study reports more cases of down-call and external reuse than it should. Third, a technical limitation of our analysis of non-system methods may lead to fewer reported cases for the subtype and external reuse attributes. Finally, we suspect differences in the interpretation of some inheritance definitions. Although we communicated with the authors for clarification and received satisfactory answers in many cases, we cannot be completely certain that we understand each definition correctly without conducting an extensive case study together with the authors.
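The four usage categories named above can be illustrated with a minimal Java sketch. It reflects our reading of the categories, not the formal definitions of the original study, and all class and method names (`Shape`, `Square`, `describe`, `area`, `id`, `demo`) are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class InheritanceUsageSketch {

    // Hypothetical parent type; not taken from any analysed project.
    static class Shape {
        protected String label = "shape";

        String describe() {
            // Down-call (late-bound self-reference): area() is resolved at
            // run time, so for a Square receiver this calls Square.area().
            return label + " with area " + area();
        }

        double area() {
            return 0.0;
        }

        String id() {
            return "id-" + label;
        }
    }

    // Hypothetical child type.
    static class Square extends Shape {
        private final double side;

        Square(double side) {
            this.side = side;
            // Internal reuse: the child itself accesses a member
            // (here the field label) that it inherits from the parent.
            this.label = "square";
        }

        @Override
        double area() {
            return side * side;
        }
    }

    // Client code exercising the remaining two categories.
    static List<String> demo() {
        List<String> out = new ArrayList<>();
        Square sq = new Square(2.0);

        // External reuse: a client calls id(), which is declared in Shape
        // and not overridden, through a reference of the child type.
        out.add(sq.id());

        // Subtype usage: a Square is supplied where a Shape is expected.
        Shape s = sq;
        out.add(s.describe());
        return out;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

In this sketch a single `Square`/`Shape` relationship exhibits all four usages at once, which is why the reported percentages for the categories are not expected to sum to 100 %.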