Reverse engineering source code : empirical studies of limitations and opportunities

Landman, Davy

The goal of software renovation is to modernize software. One way to achieve this is to first reverse engineer the essential concepts and abstractions used in the software from source code and then use these during renovation. Scaling reverse engineering to large software systems requires automated analysis. Automation often comes at the cost of over-approximation or under-approximation. This thesis explores the limits of and opportunities for these approximations via three research questions.

First, we have explored the limits of domain model recovery by manually recovering domain models. Comparing these models to a manually constructed reference domain model, we found that most domain information could be recovered, with high quality, from the source code. Second, we have explored using both Cyclomatic Complexity (CC) and Source Lines of Code (SLOC) for automating reverse engineering. Almost all of the literature claims a strong linear correlation between these two metrics, which is often interpreted as an indication that CC and SLOC are redundant. In two large corpora we did not observe a strong correlation. We interpret this as a lack of evidence for CC being redundant to SLOC.
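As an illustration of what a "linear correlation" between the two metrics means, the sketch below computes Pearson's r over per-method SLOC and CC values. It is not taken from the thesis: the class name `CcSlocCorrelation`, the method `pearson`, and the sample numbers are hypothetical, and the abstract does not state which correlation measure or data transformation the study uses.

```java
public class CcSlocCorrelation {

    /** Pearson's r for two samples of equal length (at least two points each). */
    public static double pearson(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("samples must have equal length >= 2");
        }
        int n = x.length;

        // Sample means of both metrics.
        double meanX = 0, meanY = 0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        // Co-deviation and squared deviations around the means.
        double cov = 0, varX = 0, varY = 0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            cov += dx * dy;
            varX += dx * dx;
            varY += dy * dy;
        }
        // Result is NaN if either sample is constant (zero variance).
        return cov / Math.sqrt(varX * varY);
    }

    public static void main(String[] args) {
        // Made-up per-method measurements, for illustration only.
        double[] sloc = {3, 5, 8, 12, 20, 40};
        double[] cc = {1, 1, 4, 2, 3, 2};
        System.out.printf("Pearson r = %.2f%n", pearson(sloc, cc));
    }
}
```

A value of r close to 1 would support treating one metric as a proxy for the other; a weak r, as reported for the two corpora, would not.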

Finally, we have explored the limits of statically analyzing Java’s Reflection API. Analyzing a representative corpus revealed that 78% of all projects use Reflection. After identifying the common assumptions and limitations of relevant static analysis tools, we found these assumptions widely challenged in the corpus. We propose new opportunities for static analysis tools.
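To illustrate why the Reflection API is difficult for static analysis, here is a minimal sketch that is not taken from the thesis. The names `ReflectionDemo` and `callByName` are hypothetical. Because the method name is an ordinary string that may come from program arguments, a configuration file, or the network, an analysis that only inspects the source cannot in general tell which method is invoked.

```java
import java.lang.reflect.Method;

public class ReflectionDemo {

    /**
     * Invokes a public, no-argument method on the target, selected by name at run time.
     * Reflective failures are rethrown as an unchecked exception to keep callers simple.
     */
    public static Object callByName(Object target, String methodName) {
        try {
            // The call target is resolved from a string, not from a static call site.
            Method method = target.getClass().getMethod(methodName);
            return method.invoke(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("cannot call " + methodName, e);
        }
    }

    public static void main(String[] args) {
        // The method name may be supplied by the user; "length" is only a default.
        String methodName = args.length > 0 ? args[0] : "length";
        System.out.println(callByName("hello", methodName));
    }
}
```

A static analysis tool has to either over-approximate (assume any public no-argument method of `String` may be called) or under-approximate (ignore the call), which is the trade-off the abstract describes.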

Additional Metadata
Promotor	P. Klint (Paul), J.J. Vinju (Jurgen)
Degree Grantor	Universiteit van Amsterdam
Persistent URL	hdl.handle.net/11245.1/d7139e2b-7581-4ef8-af89-0a1df03d492e
Series	IPA dissertation series
Organisation	Software Analysis and Transformation
Citation	Landman, D. (2017, October 5). Reverse engineering source code : empirical studies of limitations and opportunities. IPA dissertation series. Retrieved from http://hdl.handle.net/11245.1/d7139e2b-7581-4ef8-af89-0a1df03d492e
