Many approaches that support (semi-automatic) identification of objects in legacy code take the data structures as the starting point for candidate classes. Unfortunately, legacy data structures tend to grow over time, and may contain many unrelated fields at the time of migration. We propose a method for identifying objects by semi-automatically restructuring the legacy data structures. Issues involved include the selection of record fields of interest, the identification of procedures actually dealing with such fields, and the grouping of fields and procedures into coherent candidate classes. We explore the use of cluster analysis and concept analysis for the purpose of object identification, and we illustrate their effect on a 100,000 LOC COBOL system. Furthermore, we use these results to contrast clustering with concept analysis techniques.
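
As an illustration of the grouping step, the following is a minimal sketch of concept analysis applied to a table recording which procedure uses which record field. It is not the implementation used in the report: the procedure names, field names, and the `USES` table are hypothetical, and the brute-force enumeration over all subsets of procedures is chosen for brevity and is only practical for very small inputs.

```python
from itertools import combinations

# Hypothetical usage table (an assumption for illustration only):
# each procedure is mapped to the record fields it actually uses.
USES = {
    "P1": {"NAME", "STREET"},
    "P2": {"NAME", "STREET", "CITY"},
    "P3": {"ACCOUNT", "BALANCE"},
    "P4": {"ACCOUNT", "BALANCE", "NAME"},
}


def common_fields(procs):
    """Return the fields used by every procedure in procs.

    For an empty set of procedures this is the set of all fields.
    """
    result = set().union(*USES.values())
    for proc in procs:
        result &= USES[proc]
    return result


def procs_using(fields):
    """Return the procedures that use all of the given fields."""
    return {proc for proc, used in USES.items() if fields <= used}


def concepts():
    """Enumerate all formal concepts as (procedures, fields) pairs.

    A pair is a concept when the fields are exactly those shared by the
    procedures, and the procedures are exactly those using all the fields.
    """
    found = set()
    names = sorted(USES)
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            intent = common_fields(subset)
            extent = procs_using(intent)
            found.add((frozenset(extent), frozenset(intent)))
    return found


if __name__ == "__main__":
    # Print concepts from the largest group of procedures to the smallest.
    for extent, intent in sorted(concepts(), key=lambda c: -len(c[0])):
        print(sorted(extent), sorted(intent))
```

In this toy table, the concept pairing `P3` and `P4` with `ACCOUNT` and `BALANCE` suggests one candidate class, and the concept pairing `P1` and `P2` with `NAME` and `STREET` suggests another. `NAME` is also used by `P4`, which is the kind of overlap that has to be resolved when choosing which concepts become classes.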

Software Engineering [SEN]
Software Analysis and Transformation

van Deursen, A., & Kuipers, T. (1998). Identifying objects using cluster and concept analysis. Software Engineering [SEN]. CWI.