Large-scale semi-automated migration of legacy C/C++ test code

This is an industrial experience report on a large semi-automated migration of legacy test code in C and C++. The migration was enabled by automating most of the maintenance steps. Without automation, this large-scale migration would not have been conducted, due to the risks involved in manual maintenance (risk of introducing errors, risk of unexpected rework, and loss of productivity). We describe and evaluate the method of automation we used on this real-world case. The benefits were that by automating analysis, we could make sure that we understood all the relevant details for the envisioned maintenance, without having to manually read the code to check our theories. Furthermore, by automating transformations, we could reiterate and improve complex and large-scale source code updates until they were "just right." The drawbacks were that, first, we had to learn new metaprogramming skills. Second, our automation scripts are not readily reusable in other contexts; they were necessarily developed for this ad hoc maintenance task. Our analysis shows that automated software maintenance, as compared to the (hypothetical) manual alternative, seems to be preferable.

Although a successful product family such as described above will live for a long time, its components may live shorter lives. Components may become obsolete (unused), their maintainability may have deteriorated over the years (accumulated technical debt) such that replacement or rejuvenation 3,4 is required, or a newer, better component may have arrived that can replace the "legacy" component. From the perspective of the entire product family, replacing the old component with the new component is called a "refactoring." A refactoring in general is a semantics-preserving code change: the system changes, but its observable behavior from the outside does not.

A software maintenance paradox
The question we address in this article is how to replace an existing component with a new one, given a complex product family ("legacy system") that depends on the old component in unforeseen ways. Performing such maintenance on a legacy system is very labor-intensive. The internal details of a system can only be read from the code, as can the complex interactions between subsystems, demanding a significant learning effort from a maintenance engineer. Furthermore, manual large-scale maintenance tends to be error-prone: due to the sheer size of the task at hand, it is likely that some cases are missed, and due to the frequently occurring repetitiveness, accidental errors are easily introduced. For software maintainers, it is extremely difficult to guarantee that their changes are indeed a "refactoring." For these reasons, among others, responsible stakeholders are often hesitant to approve preventive maintenance that replaces legacy components. Since significant development effort is required to perform such a replacement, and no new features are introduced in the system, the return on investment seems low, while the risk of failure seems high. In other words, necessary maintenance on the product family is not done in order to avoid risks, while not doing the maintenance also entails big risks. We are stuck.

A case of API migration
In this article, we take an "industry-as-lab" approach.5 We describe a real case of a component replacement refactoring carried out at Philips. A legacy library component for unit testing was replaced with a modern library for unit testing. The quality of the testing code that uses the legacy library is absolutely critical to the business; if tests were lost or their effectiveness in detecting errors reduced, this would entail a commercial risk to all of our stakeholders. In our organization, passing all the automated tests is one of the (many) required quality gates for every code change to the product family, including changes to the tests themselves.
Thus, it is our task to correctly and completely migrate the code that uses the API of the legacy library to the API of the modern library. Furthermore, all references to said legacy library in build automation scripts and other software artifacts must be removed or replaced by references to the new library. To avoid the inherent risks of manual maintenance, all changes to existing code were automated using metaprogramming scripts. The current article reports on our experiences with automating this API migration in C and C++ code and the surrounding build artifacts.

The process of API migration
Figure 1 describes an iterative process for an API migration. On the left, a hypothetical manual process is depicted; on the right, our semi-automated process. Note that our starting point is a version (snapshot) of the source code of a system that is passing (and has passed) all existing quality assurance gates.
Step 0 In this (optional) step, the codebase can be partitioned into smaller chunks at the metaprogrammer's discretion. In principle, there is no limit to the size of the object software system for either process variant. However, our mandatory formal code review policy (cf. Step 5) requires that an independent colleague manually check all code changes; the task of having to do this for the complete codebase at once would simply be too large.
Step 1 The system under study comprises millions of lines of code, and in principle any part of the system could depend on the legacy library. Next to code dependencies, references to the library also appear in other software artifacts, such as makefiles and configuration scripts. So, every iteration starts with an analysis that answers the questions: (a) where are the dependencies on the old library, and (b) how would we update these dependencies to the new library in principle? Is there a common change pattern that we can apply?
Step 2 Based on the knowledge acquired in Step 1, we start changing code. In the manual process this is done file by file, while in the automated process we first write a script that captures the acquired knowledge as an executable metaprogram. This difference has two effects. First, running the automated script scales much more easily to hundreds or even thousands of files without additional manual effort, which is a great time saver. Second, we can rerun improved versions of the maintenance scripts again and again based on improved insights, without damage to our efficiency. This leads to improved consistency, and thus correctness and understandability, as compared to the manual process.
Step 3 We now (automatically) compile the entire refactored system and run all of the test suites of the product family using the existing build and test infrastructure.
Step 4 Failing tests are triaged and diagnosed manually. In the manual case, we would try to run the new tests on a few of the files first, before having refactored the entire system. However, the system is probably not organized in a way that allows such individual tests to run, so this comes at the cost of changing makefiles and isolating the tests of (parts of) components. The automated refactoring is run on the entire system, and we simply continue our analysis where the first errors start to appear. Note that any failing tests can be assumed to be due to our latest changes, per our starting assumption mentioned above.
Step 5 The next quality gate to pass is a pre-delivery manual code review by an independent colleague. If this fails, we go back to the drawing board; if it succeeds, the code is accepted into the main branch of the version control system.
The new code will eventually undergo manual and automated integration tests and pass many other quality gates until it reaches (different) product deployments. This is outside our scope and influence. The entire process only stops when all tests succeed and all traces of the old library have disappeared from the source code and from the other software artifacts.
For the sake of clarity, we repeat here: the manual process was not actually conducted, and without an automated alternative this particular API would not have been migrated.

Roadmap
The remainder of this article is organized as follows. In Section 2, we introduce the details of the case study. The metaprogramming skills we needed to automate the maintenance steps are described in Section 3, which can be read as a quick introduction to the Rascal metaprogramming language. Then, in Section 4, we detail all of the maintenance scripts we developed for the case study. In Section 5, we reflect on our experiences, zoom out of this specific case, and discuss related work, before concluding in Section 6.
We hope that others who are stuck in a similar situation, where high-risk maintenance on a product family is necessary, can learn from the experiences we describe here and become unstuck like we did.

PROBLEM ANALYSIS: MIGRATING A LEGACY C++ TEST API TO GoogleTest
The software components we analyze and manipulate for this case control a Philips high-tech interventional X-ray system for the diagnosis and treatment of cardiovascular diseases. The software controls the collaborating machines in an operating theater, where X-rays are used both for live imaging features and for physical interventions. Typically, there are several human operators at work at the same time.
The software controlling these devices consists of millions of lines of C and C++ code. For this case, we focus on the subsystem (a collection of components) that is responsible for the positioning of the X-ray beam with respect to the patient. It controls the motors that move the robot arms as well as the patient support table. This "Positioning Subsystem" comprises well over half a million source lines of code (SLOC, i.e., non-empty and non-commented lines of source code).
Before the migration, the codebase contained two separate testing frameworks: GoogleTest is used for the younger components, while the older components are tested using a much older proprietary framework called Simple Test eXecutor (STX). The frameworks have slightly different APIs and run-time semantics, which may lead to future confusion, and STX itself also requires maintenance, which seems redundant now that an open-source alternative exists. Migrating away from STX in favor of GoogleTest should improve the quality of the tests and allow us to focus on maintaining code other than STX itself. The fact that GoogleTest is a lively open-source project is also considered beneficial, since it allows us to surf on improvements and extensions without having to maintain our own testing framework. That GoogleTest is open source is also essential for our exit strategy: should new releases of the GoogleTest project become unstable or unreliable, we would be able to fork an earlier stable snapshot and fall back to our previous strategy of maintaining our own testing framework.
However, the task of migrating away from STX is substantial: the codebase contains over 150 STX test suites, each ranging from a few hundred to several thousand lines of code. Not surprisingly, the application programming interfaces (APIs) of STX and GoogleTest do not match. This impacts client code, the code that is written against the test API. It must be rewritten in order to even use the GoogleTest API correctly, and also to benefit from testing and reporting features that are present in GoogleTest but not in STX.
Referring to Figure 1, the current section describes the results of the Problem Analysis (Step 1). We describe how to make an STX test case in Section 2.1, and show the corresponding intended counterpart in GoogleTest in Section 2.2. In Section 2.3, we summarize the required editing steps to migrate STX test suites to GoogleTest.
The Positioning Subsystem uses CMake to automate the build process. CMake is a successor of the Unix Make command which offers "build rules" to enable the conversion of input files (i.e., source code and libraries) to output files (binary executables) using compilation and linking steps. As part of the migration, CMake files with references to the STX library have to be changed as well. We show a CMake file for STX in Section 2.1, followed by the corresponding CMake file for GoogleTest in Section 2.2. Finally, in Section 2.3, we summarize how to migrate CMake files from STX to GoogleTest.
FIGURE 1 Two API migration processes, one manual and one automated, as embedded into the general quality assurance/code review process
FIGURE 2 UML sequence diagram depicting the lifetime of a single STX test

The legacy STX testing framework
The STX test framework is several decades old and is used at Philips to test software units written in the C and C++ programming languages. A single compiled test suite, colloquially referred to as "an STX," is a Windows executable file whose name ends with _stx. Such an executable contains production code, statically linked as libraries to one or more compiled test cases. STX can be used to test both C and C++ code. Figure 2 provides an overview of the execution lifetime of a single STX, according to the pattern depicted there. The StartTestServer function creates a process to execute the test case (cf. step 1). The test case performs some initializations (line 15), after which it stores the current memory usage (line 16), executes the test function (line 17), and stores the memory usage a second time (line 18). The GEN_m_assert assertion macro checks that the memory usage has not changed by executing the test steps (line 19). If this is indeed the case, the test process informs the main process that it has succeeded (line 20).
When an assertion is violated, the file name, line number, and a stack dump are written to a log file.In addition, the test framework is informed of the assertion failure, and the executable exits with a nonzero exit status.The call to exit on line 10 is only reached after successful termination of all test cases.
Test cases report their successful or unsuccessful outcome to the main process through the NotifyTestPassed and NotifyTestFailed functions.
To compile an STX test suite, there are specific CMake rules written for each STX. Listing 2 shows a CMake file for a hypothetical example_stx file. In this script, the following variables are set:
• The target name of the build is set to example_stx.
• The folder in which the project is accessible in the Microsoft Visual Studio IDE is stored in IDE_FOLDER.
• The DIRS variable is assigned additional include directories, in particular the STXServerLib headers directory.
• The SOURCES variable gets a list of source files that are to be compiled. In this example, there is a single C file, example_stx.c.
• Production libraries are specified in the DEPS variable.
Finally, the user-defined configureTestExecutable macro adds test-specific dependencies to the DEPS variable and sets the output location of the STX executable.
Listing 2: Example CMake file for STX, containing build information for a hypothetical example_stx test file
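Since Listing 2 is not reproduced here, the following is a minimal sketch of what such a CMake file could look like, assembled from the bullets above. All values except example_stx, the variable names, and configureTestExecutable are assumptions (paths, folder names, and library names are hypothetical), and the exact syntax of the in-house macros is not known to us.

```cmake
# Hypothetical sketch of an STX CMake file (names and paths assumed)
set(TARGET_NAME example_stx)                       # target name of the build
set(IDE_FOLDER "Tests/STX")                        # Visual Studio IDE folder
set(DIRS ${DIRS} ${STX_ROOT}/STXServerLib/inc)     # extra include directories
set(SOURCES example_stx.c)                         # a single C source file
set(DEPS positioning_lib)                          # production libraries
configureTestExecutable(${TARGET_NAME})            # test deps + output location
```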

GoogleTest framework
GoogleTest is a unit testing framework developed by Google. Figure 3 depicts an overview of the GoogleTest approach. A test suite runs one or more test cases, according to the following pattern.
1. The main process sequentially executes the test cases. For each test case, the test steps are executed. A test case can either succeed or fail.
2. The main process reports the test results and exits.
Note that GoogleTest always executes all registered test cases, regardless of the outcome of previous test cases. Because our starting assumption is that all STX tests succeed, this does not entail an observable change in the semantics of the test system. However, when we introduce a change in the system that impacts several tests, the GoogleTest framework is able to report all failing tests at once, while STX could have hidden failing tests behind a failing test. This is considered another advantage of introducing the GoogleTest framework, since it makes diagnosing new failures easier by providing more complete information sooner.
FIGURE 3 UML sequence diagram depicting the lifetime of a single GoogleTest test
To compile a GoogleTest test suite, the CMake file is also slightly different. Listing 4 shows the GoogleTest equivalent of the STX CMake file in Listing 2. The scripts are very similar, differing in the DIRS variable: GoogleTest directories are included instead of STX directories. Also, the SOURCES variable now contains a C++ file. Finally, the last line has a different user-defined macro, configureUnitTestExecutable, that, next to adding test-specific dependencies, sets the output location for the GoogleTest executable.

Refactoring test suites from STX to GoogleTest
In this section, the required changes to refactor test suites from STX to GoogleTest are introduced.

Change 1: C to C++
The GoogleTest framework only supports C++. Hence, test suites written in C must be converted into C++. Since these new C++ files may still import C headers containing production code, include directives for such headers must be encapsulated in an extern "C" block to avoid name mangling issues. Since most of our test suites do not have a header file, we define the test class in the source file instead (cf. Listing 3, line 4). Finally, we change the extension of C files from .c to .cpp. The C++ compiler is stricter than its C counterpart. Test code that previously compiled successfully with the C compiler may therefore trigger warnings at compile time with the C++ compiler.* Being in a safety-critical domain, our build pipeline instructs the compiler to treat warnings as errors. Because of this, all new warnings and errors that are introduced by the change from C to C++ need to be resolved.
Change 2: Replace asserts
STX uses the GEN_m_assert macro to compare the actual value of some variable to an expected value. Occurrences of these macros in the test code need to be replaced with their GoogleTest counterparts.
In this refactoring, we also want to improve the reporting of failing assertions. In STX, the equality of two values is checked with the == operator (Listing 1, line 19). When the comparison expression evaluates to false, it is only reported that the values were unequal. To obtain the actual values, a developer has to attach a debugger to the test executable. In GoogleTest, the expected and actual values are passed to the framework (Listing 3, line 17). This way, GoogleTest can report the values in case an assertion fails, removing the need to attach a debugger to the executable. We have to detect the use of the == and != operators under the GEN_m_assert macro and replace them with calls to ASSERT_EQ and ASSERT_NE, respectively.

Change 3: Replace header inclusion
Our test suites include a specific header file, depending on the testing framework being used. The inclusion of the "STXServer.h" header file needs to be replaced by an inclusion of the "SetupTestDependencies.h" header.

Change 4: Rewrite test function
In an STX test suite, functions containing test code are defined using a regular function prototype. GoogleTest comes with the TEST_F macro that, besides expanding to a function prototype, implicitly registers this function in the GoogleTest runtime. The macro takes as its two arguments the name of the test class and the name of the test function.

Change 5: Refactor entry function
In STX, the entry function contains a print statement indicating that test functions are being registered, followed by calls to the registration function (Listing 1, lines 6 and 7). These lines are to be removed, since test function registration is implicit in GoogleTest. In addition, the STX call to StartTestServer is to be replaced by a call to GoogleTest (Listing 3, line 7). Finally, the result of this call is passed as the exit status (Listing 3, line 8).

Change 6: Rewrite reporting of test verdict
An STX test case signals its outcome by calling NotifyTestPassed or NotifyTestFailed for success or failure, respectively. Calls to the latter are to be replaced by GoogleTest's FAIL macro. In GoogleTest, test cases succeed implicitly when no assertion violations occur; calls to NotifyTestPassed therefore have to be removed.

Change 7: Rewrite main function
The main function needs an if statement with a check for the --gtest_list_tests flag. In the true branch, a call to the GoogleTest entry function is emitted; in the false branch, calls to the TOS_p_STP_start_continue_set and TOS_p_STP_boot functions are emitted.

Change 8: Rewrite CMake files
To refactor CMake files from STX to GoogleTest, the following three changes are required. First, as GoogleTest only supports C++, all C source files are changed to C++; to reflect this change in our CMake files, the extension of C files has to be changed to .cpp. Second, the STX include directories have to be replaced by the GoogleTest include directories. Finally, the user-defined configureTestExecutable macro, belonging to STX, needs to be substituted with configureUnitTestExecutable for GoogleTest.

Conclusion
This concludes a rough, informal analysis of what needs to be done to the source code of the Positioning Subsystem to migrate from STX to GoogleTest. The next step in our process (Step 2 in Figure 1) is to automate these eight changes. The automation scripts are detailed in Section 4, but first we introduce the necessary features of the metaprogramming language we used, Rascal, in Section 3.

KEY FEATURES OF THE Rascal METAPROGRAMMING LANGUAGE
The code transformations in this article were implemented in Rascal, a metaprogramming language and language workbench.6 In this section, we introduce the language features of Rascal that are needed to understand the remainder of this article. In particular, this introduction will enable the reader to assess the code fragments shown in Section 4. For a more complete description of the language, we kindly refer to Rascal's documentation.7† At its core, Rascal is a type-safe programming and scripting language with immutable data, higher-order functions, structured control flow, and built-in pattern matching, search, and relational calculus. Rascal is not an object-oriented programming language; it is a functional, procedural, and logical programming language with a Java-like syntax. The Rascal code fragments in Listing 5 will be explained in the remainder of this section.
29
30 /at/ := "match"                    // true: "at" is a substring of "match"
31 /at$/ := "match"                   // false: "match" does not end with "at"
32 /AT/i := "match"                   // true: case insensitive match
33 /<as:a*><bs:b*>/ := "aabbb"        // true: as = "aa", bs = "bbb"
34
35 add(literal(0), x)                 // an add node with a literal node nested as its
36                                    //   first field and a variable x as its second
37 add(literal(0), _)                 // use of a wildcard pattern nested under a node pattern
38 a:add(x, x)                        // "non-linear" patterns test for equality:
                                      //   the second x should be equal to the first
...
43 for (/literal(n) := e)             // loops over all literal nodes anywhere in e
44   println(n);                      //   and prints their literal value
45 visit (e) {                        // visit is like switch, but finds the patterns
46   case literal(n): println(n);     //   anywhere, deeply nested
47 }
...
49 [n * n | int n <- [1..5]]          // [1, 4, 9, 16]
50
51 for (int i <- [1..5]) println(i);
52
53 str s = "This is
54         'a single string literal.";
55 str w = "world";
56 println("Hello, <w>!");            // prints "Hello, world!"
Listing 5: Extensive Rascal code fragment, containing code examples referred to throughout Section 3

Primitive types and locations
Rascal features: (a) int, real, and rat as numerical types, (b) bool for booleans, (c) polymorphic lists, sets, and maps for collections, (d) datetime for absolute time values, and finally (e) loc for location constants. Line 1 shows a constant that points to a file with the file scheme and selects the part of line 2 of that file between the left margin and the 20th column. Line 2 contains a logical reference to a C++ class name as it could be produced by a C++ name analysis stage in its compiler. Locations are used to refer to source files and are frequently kept with the information that was extracted from source files, to help refer back to the source and also to uniquely identify names to avoid confusion.

Algebraic data types
Rascal supports the definition of user-defined algebraic data types (ADTs), with constructor functions to create the values that inhabit these types. A many-sorted algebraic data type can be used to define the shape of abstract syntax trees or the shapes of other structured symbolic values, such as types or constraints. In Rascal, ADT definitions are modularly extensible; an additional definition of an existing algebraic data type extends the type rather than overriding it. The fragment on lines 4-6 shows a declaration of a data type for a representation of Boolean expressions on the first line, using three constructors. The next line contains a declaration for the same data type, effectively adding an alternative to the existing declaration. Rascal's reserved keywords are not permitted as names of constructors. Since if is a reserved keyword in Rascal, it must be escaped when used as the name of an algebraic constructor: \if.

Context-free grammars
Where ADTs are used for abstract syntax trees, context-free grammars are used for concrete syntax trees. Rascal's built-in context-free grammar notation corresponds in many ways to EBNF. From a grammar definition, Rascal automatically generates a parser, and it supports construction of, and pattern matching on, parse trees and abstract syntax trees over this grammar.
Lines 8-13 show a grammar for a simple expression language. First, any number of newline characters and spaces are declared as whitespace. Then, the IntLit nonterminal is defined as any non-empty sequence of digits. Finally, the Expr nonterminal is defined with three productions. As the Expr nonterminal is declared as syntax, rather than lexical, the productions are internally augmented to accept layout between symbols. For example, the add production is changed to add: Expr lhs Whitespace "+" Whitespace Expr rhs. Similarly, the Expr nonterminal is declared as a start nonterminal, which adapts the nonterminal to accept layout before the first symbol and after the last.
Concrete syntax expressions are used in Rascal to construct syntax trees from embedded input strings (lines 15-17). Each expression starts with the nonterminal to use when parsing the following string between backquotes. Rascal's parser then produces a syntax tree, at compile time, of which the top type is equal to the given nonterminal. All syntax (sub)trees have a .src field, which defines exactly the file, and the part of the file, that the current tree encompasses, using a location value (see line 1).
In the example syntax definition on lines 8-13, all productions and symbols are labeled. Rascal uses these labels to generate implicit abstract syntax ADTs. Such an implicit ADT abstracts away from the terminal symbols in the corresponding productions, uses production labels for constructor names, and uses symbol labels for constructor argument names. For example, the ADT defined on lines 19-22 is automatically generated.

Pattern matching
For a large part, metaprogramming is about analyzing syntax trees. Pattern matching is a high-level language feature in Rascal that helps avoid writing the repetitive nested conditionals and loops that are otherwise necessary to detect patterns in syntax trees. In Rascal, pattern matching surfaces in different parts of the language: switch case distinction, dynamic function dispatch, generators (with the := and <- operators), and the visit statement for recursive traversal.
So-called "open" patterns bind variables when they match. Some patterns can match in multiple ways; Rascal uses these multiple solutions as generators that the programmer can use to search (with backtracking) or loop through. For example, the for loop iterates over all bindings of a pattern, while the if statement finds the first match that satisfies all conditions.
Rascal supports pattern matching on all values, including ADTs, parse trees, strings, sets, and lists. Finally, descendant patterns (patterns preceded by a /) can be used to match a pattern at arbitrary depth in a value. We will show and discuss some code fragments containing examples of pattern matching in Rascal.

List and set patterns
The fragment on lines 24-28 illustrates list matching. Set matching works similarly, but the order and multiplicity of elements are irrelevant. On a successful match, the (fresh) variables X and Y are bound to the appropriate subvalues. An asterisk * indicates a multi-variable in the context of set and list matching; a variable *X thus represents a list or set of values, depending on the context. Nonlinear matching (e.g., on the second, fourth, and fifth lines) is supported. The third line is an example of a pattern match with multiple solutions.

Regular expression patterns
Regular expressions are used to match against string values. Rascal's regular expression language is largely equivalent to Java's regex language. Regular expression patterns are delimited by slashes /. As usual, the ^ and $ characters denote the beginning and end of a line, respectively. Lines 30-33 illustrate matching using regular expressions. A trailing i, as on the third line, sets the matching mode to case-insensitive. Rascal allows variables to capture groups of characters, as on the fourth line.

Node patterns
Rascal performs pattern matching on nodes by comparing the constructor name, then testing the number of children, and then matching the arguments recursively. It is important to know that all pattern types can be nested arbitrarily.
Patterns may be labeled with a (fresh) name; on a successful match, the appropriate (sub)tree is bound to the variable.The fragment on lines 35-37 shows several (nested) patterns.

Concrete syntax matching
Concrete syntax patterns may be used to match against parse trees. Such a pattern uses the concrete syntax of the object language, augmented with syntax for (typed) metavariables, for example, <Type id>. When matching with concrete syntax patterns, the layout (whitespace and comments), inside the pattern as well as inside the subject value, is ignored. Lines 39-41 contain the concrete counterparts of the (abstract) node patterns referred to in the previous paragraph.

Deep matching and traversal
A descendant pattern, a pattern preceded by a /, finds a match by traversing all sub-values of the subject value and succeeds when a match is found anywhere (line 43). Descendant patterns usually produce more than one solution; therefore, they often occur in for loops (line 44), visit constructs (lines 45-47), or comprehensions, such that the programmer can collect, filter, or otherwise process all instances.

Comprehensions and generators
Comprehensions, as illustrated on lines 49-51, are a means of generating new values by enumerating over existing values. An expression [a .. b] generates the list of integers from a up to, but not including, b. We show a list comprehension: square numbers are generated (n * n) by enumerating the provided list on the right (using <-). Generators can also be used in control structures, such as for, to iterate over a value. Next to list comprehensions, Rascal also supports set and map comprehensions, and value reducers that produce a single value in comprehension style instead of a collection.

Multi-line strings and string interpolation
String literals are not delimited by line endings in Rascal; instead, they may span multiple lines. Single quote characters ' may be used to allow such strings to be indented nicely: these characters, and the preceding whitespace characters, are not part of the string literal. Lines 53 and 54 show an example of a multi-line string.
The actual value of a variable can be directly spliced into a string by using string interpolation, as illustrated on lines 55 and 56. This is not limited to string-typed variables; all Rascal values may be interpolated. Since interpolation of if, for, and while constructs is also allowed, Rascal has full-blown string template programming capabilities.
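The combination of multi-line strings, interpolation, and embedded control flow amounts to template programming; a rough Python analogue (illustrative only, with made-up content) of such a template:

```python
names = ["init", "teardown"]
# Interpolating values and a loop into a multi-line template,
# comparable to Rascal's string templates with embedded for-constructs.
report = "suite contains:\n" + "".join(f"  - test {n}\n" for n in names)
```

In Rascal, the loop would be written directly inside the string literal, which keeps generated code and surrounding text in one place.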

ClaiR
The Rascal standard library does not contain a concrete grammar for C++. ClaiR ‡ (C++ language analysis in Rascal) is a separate plug-in project, providing an abstract syntax definition for C++, based on the C++ parser in Eclipse CDT. 8 ClaiR was primarily developed to parse C++; while C is not a strict subset of C++ § , ClaiR does parse C files that are also valid C++. Throughout the rest of this article, we use the abstract syntax ADTs as defined in ClaiR.

F I G U R E 4
The key components and five steps of the automated refactoring process with their implementation languages and their inputs and outputs

CASE STUDY: A SEMI-AUTOMATED APPROACH TO API MIGRATION
In Section 2, we introduced our case study to migrate away from STX in favor of GoogleTest. In this section, we explain all Rascal code we used to automate these changes.
For each test suite, the refactoring is structured as depicted in Figure 4. A CMake file serves as seed input for our automated refactorings; from these files we learn which other C or C++ files to process. In Section 4.2, we describe how such a file is parsed. Next, in Section 4.3, we show how CMake files are refactored. In Section 4.4, we show how we carried out the refactoring subtasks (cf. Section 2.3). A batch file is generated that renames C files to C++ and generates a command to check in the changes in the version control system; the creation of this batch file is shown in Section 4.5. Finally, Section 4.6 displays the generation of another batch file that executes the refactored test suites. If the tests execute successfully, the changes can be delivered to the version control system.

Writing changes to file
As discussed in Section 3, Rascal/ClaiR does not contain a grammar for C++, but provides an AST abstraction. Finding refactoring candidates is done by matching abstract syntax patterns against ASTs. Once we have found out where to change the files, all we need to do is generate the changes. Because ClaiR does not (yet) come with a mapping back from C++ AST nodes to source code, we had to devise a creative solution. Every AST node has a source location attribute src, containing the filename and the offset and length of the source code fragment it represents. We use this source location information to identify the exact fragments that are to be replaced, and associate each location with a new code fragment that has to replace it. We store a series of these "edits" in a Rascal relation with three columns: <int startIndex, int endIndex, str changeWithString>. Such a relation is called an "edit script," since it can be executed automatically by locating each position in the file, replacing the selected portion with the changeWithString value, and keeping tabs on the shifting of the indexes due to each replacement.
Listing 6 shows how changes are applied to a source file.
• The changeSource function takes the original source code and a set of changes as its parameters. The offset of a change is based on the unchanged source code. Since the replacement source code and the deleted substring are typically not of the same length, we keep track of this discrepancy in the offset variable, initialized at 0.
• Before applying the changes, the unsorted collection is sorted on startIndex. The changes are then applied one by one, by concatenating the prefix, the replacement source code, and the postfix.¶ After every replacement, the offset variable is updated with the difference in length of the deleted and inserted strings.
• Finally, after application of all changes, the function returns a string containing the refactored source code.
The changeSource function appears in multiple (disjoint) parts of the transformation, and is used at most once per file. The preconditions for applying this function correctly are that the changesList is based on the current state of the input source and that none of the intervals in the changesList overlap.
Listing 6: Function that applies an edit script (changesList) to source code
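The mechanics of applying such an edit script can be sketched as follows. This Python version is a simplified analogue of the changeSource function in Listing 6 (the example edits and offsets are hypothetical); it shows the offset bookkeeping described above:

```python
def change_source(source, changes):
    """Apply an edit script: (startIndex, endIndex, replacement) triples whose
    indexes refer to the unchanged source; intervals must not overlap."""
    offset = 0
    for start, end, replacement in sorted(changes):
        # Shift the original indexes by the net growth of earlier edits.
        start, end = start + offset, end + offset
        source = source[:start] + replacement + source[end:]
        offset += len(replacement) - (end - start)
    return source

# Hypothetical edits: rename the assert macro and turn '==' into a comma.
refactored = change_source("GEN_m_assert(a == b);",
                           {(0, 12, "ASSERT_EQ"), (15, 17, ",")})
```

Because each edit's indexes refer to the original text, sorting on startIndex and accumulating the length difference is sufficient; overlapping intervals would invalidate this bookkeeping, hence the precondition.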

Parsing CMake
Recall from Figure 4 that CMake files serve as input for all our refactorings. In Section 2.1, we introduced the structure of the files that are used at Philips. To be able to parse these files, we defined a grammar that accepts CMake files, including our company-specific macros. Listing 7 contains a grammar for these files in Rascal's grammar formalism. First, we define the syntax of whitespace and that # is the comment prefix (lines 1-4). On the following lines, the syntactic categories are declared that define general CMake concepts as well as our company-specific uses of CMake. This syntax definition is complete for us, in the sense that if we had forgotten about one of the macros that we use, the generated parser would have produced a parse error. However, it is not a syntax definition for any CMake file in the world, for the same reason.
   "set" "(" "$" "{" Id targetMacro "}" TT targetType SourceList+ sourceList ")";
12 syntax SourceList = sourceList: "\\"? "$"? "{"? Id sourceFile "}"? "\\"?;
13 syntax CompileFlags = compileFlags: "set" "(" "$" "{" Id targetMacro "}" "_COMPILE_FLAGS" "\"" Id compileFlags "\"" ")";
Listing 7: Concrete grammar of CMake as used at our company, in Rascal's built-in grammar notation

Refactoring CMake
As described in Section 2.3, refactoring our CMake files consists of three actions: changing the language (if applicable), replacing include directives, and replacing the STX macro. Listing 8 shows the functions we used. The modifyCMakeLists function first matches out sections containing an STX dependency (line 4). The traversal in the modifyStxTarget function has a case for each of the actions.
• Changing the language is only necessary for C files; references to C++ files remain unchanged. Recall that all our test suite file names end with _stx. The modifyCTarget function in Listing 8 first matches out filenames containing "_stx.c" from SourceList productions (line 30). For each file, the file extension is then altered when applicable (lines 31-34).
• To replace the stxserverlib include with the GTEST_INCLUDE_DIR and GTESTSETUP_INCLUDE_DIR includes, we first search for a DIRS directive containing "stxserverlib", and replace this line with the two new includes (lines 15-20).
• Finally, the configureTestExecutable macro needs to be replaced with configureUnitTestExecutable. We search for configureTestExecutable and overwrite it with configureUnitTestExecutable, with the same arguments (lines 21-23).

Refactor STX API client code to GoogleTest API client code
In this section, we show the scripts we used to automate the refactorings described in Section 2.3.

Change 1: C to C++
Because the C++ compiler is stricter than the C compiler, we have to make certain changes to our existing code to satisfy it. In particular, there is a TOS_p_SEG_create function in our OS abstraction layer that acquires some memory from the heap. The return type of this function is pointer-to-void (void*), whereas the declared type of a variable to which such a value is assigned is typically a pointer to an actual type. Such an assignment of a void* value to a more specific pointer type is flagged by the C++ compiler. To overcome this, we try to find such a variable's type, and add a typecast to this type.
• In Listing 9, the voidPointerAssignments function traverses the AST of a source file, searching for locations where the result of a call to TOS_p_SEG_create is assigned to some variable. For all such assignments, it calls the findTypeCast function, which traverses the AST to find the declaration site of the variable, and tries to extract its type from it. Notably, only the local name of the variable is used here (as opposed to a fully qualified name), and the first declaration AST for this local name is used. So this works under the following assumption: the first declaration of that name is indeed of the expected type for the call to TOS_p_SEG_create. The way the test setup code is structured guarantees this assumption.
• When successful, syntax for a cast to this type is returned.
• If it fails to find the type, it returns "(FIXME)", which leads to a compilation error later in the pipeline when we try to compile our new test code. The FIXME identifier does not occur elsewhere in our code. In this way, we make sure we can never accidentally accept ill-understood test code, which would otherwise lead to a deterioration of the quality of our test suites.
The second part of this change handles potential name mangling issues introduced by changing C headers to C++. The inclusion of a header file compiled in C in a C++ unit results in linking errors, which can be resolved by wrapping the inclusion in an extern "C" block. Our C header files end with li or pi (from local or public include), which we can use to identify relevant include directives for refactoring. Since include directives are preprocessor statements, they do not appear in the AST of a file. Therefore, for this refactoring, we operate on source files in string representation.
Listing 10 shows the functionNameMangling function. It iterates over a source file line by line, wrapping groups of consecutive C header inclusions in an extern "C" block.
Here we arrive at a more interesting transformation. This refactoring actually changes the semantics of the test code a bit, in order to get the benefits of the more precise GoogleTest asserts. The modifyAsserts function in Listing 11 changes STX asserts to GoogleTest asserts.
• It traverses the provided AST, matching all function calls to STX's GEN_m_assert, binding the function name and the argument to the func and arg variables. To determine which GoogleTest assertion should be inserted, it performs a case distinction on the argument.
• In case the argument is an equality (a == b), the STX macro is replaced with ASSERT_EQ (line 7). Since this macro expects two arguments, the equality operator, located between the two operands, is changed into a comma (line 8). Similarly, for an inequality (a != b), the macro is replaced with ASSERT_NE, and the inequality operator is replaced by a comma (lines 11 and 12).
• In case the asserted value was a negation (!a), the macro is replaced with ASSERT_FALSE (line 15), and the negation operator is removed (line 16).
• In any other case, the asserted value is left untouched, and the macro is replaced with ASSERT_TRUE (line 19).
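The case distinction above can be illustrated with a small Python sketch. Note the important caveat: the actual metaprogram matches AST nodes, so operator nesting is handled correctly; this textual approximation uses regular expressions purely for illustration and would mishandle nested expressions.

```python
import re

def to_gtest_assert(arg):
    """Pick the GoogleTest assertion for one GEN_m_assert argument (sketch)."""
    if m := re.fullmatch(r"(.+?)\s*==\s*(.+)", arg):
        return f"ASSERT_EQ({m.group(1)}, {m.group(2)})"   # equality
    if m := re.fullmatch(r"(.+?)\s*!=\s*(.+)", arg):
        return f"ASSERT_NE({m.group(1)}, {m.group(2)})"   # inequality
    if m := re.fullmatch(r"!\s*(.+)", arg):
        return f"ASSERT_FALSE({m.group(1)})"              # negation
    return f"ASSERT_TRUE({arg})"                          # fallback case
```

For example, `to_gtest_assert("a == b")` yields `ASSERT_EQ(a, b)`, while an argument that matches none of the recognized shapes falls through to ASSERT_TRUE, mirroring the last bullet.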
In Section 4.9, we will show that this subtask has indeed improved our error reporting.
The STX header files have to be removed, and a new SetupTestDependencies header file is to be added as the final inclusion. Since the script from Section 4.3 modified the file before, this step is performed after writing the changes of the other scripts to disk. Again, as include directives are not present in ASTs, we perform this refactoring by handling files as strings.
• The modifyHeaders function in Listing 12 first reads in the file line by line. If a line contains an include directive, the line number is stored in the last variable. Furthermore, if the included file is an STX header, the include directive is removed (line 8).
• After iterating over all lines, the last variable holds the line number of the final include directive. To ensure the new header file is not included inside an extern "C" block, the script checks whether there is a closing curly brace on the line trailing the final include directive, and increases the last variable if needed.
• Finally, an include directive for the new GoogleTest header is appended to the appropriate line (line 15).
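The three bullets can be condensed into a short sketch. This Python analogue of the modifyHeaders logic uses hypothetical header names (stx.h stands in for the real set of STX headers, and the new include path is illustrative):

```python
def modify_headers(lines,
                   stx_headers=("stx.h",),
                   new_include='#include "SetupTestDependencies.h"'):
    """Drop STX includes, then add the new include after the last remaining
    include directive, skipping the closing '}' of an extern "C" block."""
    kept, last = [], -1
    for line in lines:
        if line.lstrip().startswith("#include"):
            if any(h in line for h in stx_headers):
                continue                # remove STX header inclusion
            last = len(kept)            # position of the final kept include
        kept.append(line)
    # Do not insert inside an extern "C" block: step over a trailing '}'.
    if 0 <= last + 1 < len(kept) and kept[last + 1].strip() == "}":
        last += 1
    kept.insert(last + 1, new_include)
    return kept

src = ['extern "C" {', '#include "foo_li.h"', '#include "stx.h"', '}',
       'int main() {}']
out = modify_headers(src)
```

Here the STX include is dropped and the new include lands after the closing brace, so it is compiled as C++ rather than inside the extern "C" block.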
We realize that the above transformation is risky in the sense that we have a number of assumptions on the source code we want to transform. In particular, the } check could go wrong for a macro or class definition. However, since our coding standards do not allow this, it does not occur in the codebase. It is interesting how knowing our context helps with simplifying assumptions that enable us to make rapid progress on the automated refactoring, rather than having to consider how our refactoring would perform on arbitrary code in the wild.
• First, all test case names are collected using the findTestCases function, which searches for the test registration function and stores the first argument, containing the test case name. Listing 16 provides the metaprogram that was used to rewrite our main functions.
• In the STX source files, we encounter main and _tmain that can serve as main functions. The modifyMain function locates functions with either of these names in the provided AST. The function body is then passed to the changeMainBody function.
• Calls to TOS_p_STP_boot are replaced by the if statement and accompanying statements (lines 14-23).
• The return value is changed from 0 to testResult (lines 25 and 26).
• The refactorTestSuite function is called with the STX source file it needs to refactor. First, it generates the GoogleTest class name (line 2). The file is then passed to ClaiR to generate an AST (line 3), which is passed on to the refactoring functions (lines 7-13). Optionally, if the provided file is a C file, the additional refactoring function is called (lines 15-17).
• The accumulated changes are then applied to the file (lines 19-21).
We did not provide the code of the satisfyCppCompiler function (line 16), which is simply a wrapper for the functionNameMangling and voidPointerAssignments functions. The addTestClass function is also not provided; this function trivially generates a single line of code containing a forward declaration of the new GoogleTest test class (cf. Listing 3, line 4).
7 changes += visitTestCases(ast, testCases, testClassName);
}
rVal += " STX test cases.\"";
return rVal;
}
void generateBatchScript(Build ast, loc f) {
  rVal = "";
  name = |file:///| + getPath(f) + "rtc.bat";
  rVal += generateRtcRenameLines(ast);
  rVal += generateRtcChangeSetLines(ast);
  writeFile(name, rVal);
}
Listing 18: Metaprogram generating the repository batch scripts

Test changes
To verify that the changes generated by the automated refactoring did not break any tests, we use the generateTestScript function from Listing 19 to generate a batch file that executes the refactored test suites. This batch file is used to run the converted test suites in the output directory of the repository, run the test suites from the repository location, copy the test suites to the deployment location, and run them from there as well. The results of both runs are stored in a logfile.txt file.
The generateTestScript function gets the AST of a CMake file as input parameter. It generates a batch script by iterating over all C and C++ files, changing their extensions to .exe to get the executable names, and adding several batch commands to the intermediate result. Finally, the full batch script is written to a test.bat file.
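The generation step can be pictured with a small sketch. The directory names and batch commands below are illustrative, not the exact commands produced at Philips:

```python
import re

def generate_test_script(source_files, out_dir="output", deploy_dir="deploy"):
    """Derive executable names from test suite sources and emit batch-style
    commands: run from the repository output, copy, then run after deployment."""
    lines = []
    for src in source_files:
        # foo_stx.c / foo_stx.cpp -> foo_stx.exe
        exe = re.sub(r"\.(cpp|c)$", ".exe", src)
        lines.append(rf"{out_dir}\{exe} >> logfile.txt")
        lines.append(rf"copy {out_dir}\{exe} {deploy_dir}")
        lines.append(rf"{deploy_dir}\{exe} >> logfile.txt")
    return "\n".join(lines)

script = generate_test_script(["alpha_stx.c", "beta_stx.cpp"])
```

Both runs append their results to the same log file, so a single inspection suffices to check all refactored suites.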

Putting it all together
We now describe how the previously described Rascal scripts are called. The main function has a list of CMake files; these are the input for all our refactorings. The function iterates over all input files:
• it parses the input file;
• it refactors the test suites;
• it refactors the CMake input file;
• it generates an rtc.bat batch file;
• it generates a test.bat batch file.
Listing 20 shows the entry function of the full refactoring.

Application of the Rascal metaprograms on an older product line
As described in Section 1, we applied the automated refactorings on our new architecture, in which the legacy STX test framework was reused from an older version of the system. In the same department, we also perform maintenance on an older version of the system that is still in operation at customer sites. This line does not have GoogleTest and uses STX for all unit tests. Since we want engineers to be able to easily switch between the two product lines, we decided to also refactor the test suites from the old product line from STX to GoogleTest. The new product line uses CMake files from which we generate a Visual Studio solution and project files. However, the old product line uses a Visual Studio solution and project files directly; they are not generated. Similarly to the CMake files in the newest product line, these files contain references to STX include directories that need to be removed, and GoogleTest dependencies that need to be added. Using a 26-line Rascal function, we read in all solution and project files and changed them accordingly (code not shown). All other scripts related to the refactoring of STX source files and the creation of the repository and test batch scripts could be fully reused.
This experience is an indication that there is opportunity for reuse of such bespoke refactoring scripts, even when we know how they depend on specific assumptions regarding C/C++ coding conventions and style within the company.

Quantifying the automated refactoring
We applied our semi-automated refactoring strategy to the current product line and an older version of the system. Table 1 provides an overview of the number of affected files and the number of modifications. In total, we used 24 change sets to commit all changes to the version control system. On the new product line, 75 STX test suites were changed, while on the old product line, 77 STX test suites were adapted. The majority of STX test suites were changed from C to C++; only a quarter was already in C++ before the refactoring. A grand total of 5840 edits were made to the source code files.
In Table 2, we zoom in on these modifications. We distinguish three types of modifications: removal of a source line of code (SLOC), addition of a new SLOC, and changing of a SLOC. We count both alterations and complete line replacements as a changed SLOC.
Table 3 provides an overview of the number of generated lines of batch script, for both the repository and the test scripts. The length of the repository script depends on the number of change sets and the number of source files that were converted from C to C++ (cf. Listing 18). The length of the test script is linear in the number of test suites (cf. Listing 19). In total, 872 lines of batch script were generated.
On the new product line, a total of 192 lines of CMake were changed. We do not use CMake on the old product line, but a Visual Studio solution and project files directly; in total, our metaprograms changed 893 lines in these build configuration files.
The GoogleTest ASSERT_EQ and ASSERT_NE macros are the preferred assertions, since they provide detailed information in case of a violation. Based on the operator that was used in the argument of the preexisting STX assertion macro, our scripting automatically deduced which GoogleTest macro to replace it with. Table 4 shows the distribution of the assertion macros generated by the automated refactoring. As the vast majority of STX asserts could be automatically transformed to an ASSERT_EQ or ASSERT_NE assert, we were able to greatly improve error reporting. Only a small portion of asserts was transformed to ASSERT_TRUE or ASSERT_FALSE, for which error reporting remains similar to the STX situation.
One of the requirements for Step 5 to be successful is that all references to STX are removed. To verify that this is indeed the case, we removed the STX library itself from the codebase and performed a full build. As this was successful, and all tests succeeded, we conclude that our refactoring metaprograms did not miss anything.

EVALUATION AND DISCUSSION
We described the process and the code we wrote to automate a large-scale refactoring of C++ test code. To evaluate the approach, informally, we are interested in answering the following evaluation questions:
• What is the quality of the resulting code? Does it compile, run, and test correctly? What is the maintainability of the output code as compared to the original code? (Section 5.1)
• Was the effort of (a) learning how to automate refactorings and (b) scripting the refactoring worth it, as compared to the hypothetical manual approach? (Section 5.2)
• Which parts of the current approach can be reused in future large-scale C++ maintenance scenarios, either (a) at our company, or (b) for others? (Section 5.3)
• What are possible alternative implementation technologies comparable to Rascal that could have been used, and what are their drawbacks and benefits? (Section 5.5)

Quality of the refactored code
The following observations pertain to the quality of the result of our automated refactoring, in terms of functionality and maintainability of the output code:
• Since only test framework API client code was changed, the code quality (readability) of the bodies of the test code is similar to (not worse than) the old situation. The resulting code as a whole after the transformation was acceptable by company standards, which was confirmed by a code review by an independent colleague.
• The quality of test reporting has increased significantly, as an effect of the use of specific assert methods of the new framework API. If a test fails, the report contains more detailed, easier-to-diagnose information than before.
• By design, after completion of the semi-automated refactoring process, the whole codebase compiles and runs; all tests also succeed.
• The maintainability of the codebase as a whole has improved: we can stop maintaining the old proprietary STX framework, and we now consistently use the same framework across the entire codebase.

The effort of automating a refactoring and learning how to do it
Designing and implementing this automated refactoring came with an upfront investment in (a) learning metaprogramming in Rascal and (b) writing the bespoke refactoring code. Was this worth the effort or not? To the best of our knowledge, there exist only few publications on A/B testing such a situation. The use of DSL tools was A/B tested, comparing the implementation of domain-specific language processors using DSL tools against implementing DSLs in a GPL; 10 their conclusion does not apply to the current situation, since we are analyzing C++ by reusing an existing front-end. Their method is also not applicable; they started from a clearly delineated set of features to implement, while our situation requires discovery and backtracking based on emerging insights.
In the current section, we therefore propose a thought experiment: by observing the (complete) work done by the automated refactoring, we imagine what a hypothetical and optimal manual process would have been and what it might have cost. We first describe the experience of the first author while creating the Rascal refactoring metaprograms. Then, we reflect on the key characteristics of Rascal that have enabled our refactoring in Section 5.2.3. In Section 5.2.4, we compare the two workflows of Figure 1 in terms of the (estimated) effort required.

5.2.1 Experiences with and evolution of the Rascal metaprograms
The Rascal scripts described in Section 4 were written by the first author of this article. His first steps with Rascal were made two years before this project, during a course on Software Evolution at the Open University.# He started with the creation of the refactorTestSuites function, and applied it on two example test suites. The use of a source code repository meant that it was trivial to undo changes made by the scripts on source files. After deeming the automated refactoring sufficient, he decided that CMake files would be the perfect input for the refactoring. Because these files themselves needed to be adapted as well, he wrote a CMake grammar. The next dry test was on a CMake file that defined the build targets for approximately 20 STX test suites. The metaprograms were slightly adapted until the CMake file and the STX source files were all refactored correctly. During this process, the first author never needed external guidance other than the Rascal documentation.
After the dry runs, he followed the process described in Section 1, and the refactoring metaprograms were repeatedly applied to a partition of the Positioning Subsystem. For each partition, this consisted of the following steps:
• The Rascal scripts received a number of CMake files as input, and changes to the source files were generated accordingly.
• The changes were informally reviewed by the same person. When required, minor improvements were made (cf. Section 4.4, Change Language).
• The resulting code was checked into the source code repository, and a change set was created using a repository script.
• The tests that were changed were built (compiled and linked).
• The test script was executed to check whether the refactored tests still succeed. It is important to note that all tests succeeded before the refactoring.
• According to company policy, all change sets were formally reviewed by senior software engineers. Review comments were processed, and the reviewers formally signed off for approval.
• Also according to company policy, the changes were built and tested on an offload system. When successful, the change sets were accepted into the source code repository.

Issues detected in Rascal and ClaiR
During the creation and application of the Rascal scripts we encountered some issues with the C++ front-end ClaiR.
In some test suites, assertions (cf. Section 2.3) were placed in a macro definition at the top of a source file. AST nodes belonging to expansions of such macros receive the source location of the macro occurrence, which gave unexpected results in our scripting. ClaiR was fixed to add an attribute to AST nodes, indicating whether a given node is the result of a macro expansion. We then changed our Rascal scripts to not apply any changes to macro expansions, but to report the macros in need of manual intervention to the metaprogrammer instead.
It would sometimes occur that our metaprograms identified refactoring targets that were located in included files, rather than in the source file under analysis. This is due to the semantics of ClaiR, which runs the C preprocessor before parsing the input file. In order to keep processing files one by one, we eventually added a check to verify that only changes in the actual file under analysis were propagated. This was easy, since every AST node produced by ClaiR has a field to indicate its source location.
Rascal does not come with a feature or library to model edits on a source file. Most Rascal programs either transform parse trees (from which source code can be derived by unparsing) or they transform abstract syntax trees, which require a pretty-printing function that produces source code again. The first author designed a model consisting of a relation between slices of a file and the text to substitute there, and a function that applies these edits to an existing file. A slightly generalized version of this feature would fit well in Rascal's standard library, since it circumvents the use of a (bespoke) pretty-printer and also enables high-fidelity transformations (no unnecessary loss of indentation or source code comments).
# https://www.ou.nl/en/-/IM0202_Software-Evolution
If ClaiR had provided a pretty-printer that maps ASTs back to source code, we could have written our transformations as simple AST-to-AST transformations, and derived the (same) edit scripts by diffing the AST nodes and pretty-printing only the new parts. In our current scripts, the syntactical change code is tangled with collecting edit scripts; the code would be a little simpler without this tangling. However, a pretty-printer is a nontrivial piece of work for an elaborate language such as C++.
Another way of circumventing this tangling of code changes with their implementation as edit scripts would have been to use Rascal's experimental concrete syntax features for abstract syntax trees. 11 This would allow us to write both the patterns and their substitutions in the C++ source language, while the same ASTs would be used under the hood. This would be a readable solution, and the edit scripts would be derivable without the need for a pretty-printer; in fact, this would even allow us to directly unparse modified syntax trees. However, this feature is not released with Rascal yet.

Technological enablers in Rascal
In Section 3, we introduced concepts of the Rascal metaprogramming system that were used to carry out the refactoring from Section 4. In particular, we made heavy use of deep matching, that is, traversing a tree and matching patterns at arbitrary depth. Throughout Section 4, we used abstract patterns, which required knowledge of the abstract syntax of the object language. The use of concrete syntax patterns, that is, patterns expressed in the actual surface syntax of the object language, would remove this requirement, making the conceptual entry barrier lower.
As we defined a concrete grammar for CMake (cf. Section 4.2), Rascal immediately supports concrete syntax for CMake. For C++, however, there is no concrete grammar available in Rascal. Augmenting the available functionality for C++ with concrete syntax patterns, for instance by implementing the Concretely framework, 12 would certainly make our patterns more concise and more readable. Furthermore, this would fully eliminate the bookkeeping of applied code changes, including file offset corrections, as (changed) parse trees could simply be unparsed. 11
One of our key observations is that when conducting a significant refactoring task such as the one described in Section 2, it is imperative to be flexible. In this particular case, for instance, the initial assumption was that only C++ files would be affected, and hence, being able to parse and analyze C++ files would suffice. During the process, however, it became obvious that CMake files would serve as a better starting point for analysis. Rascal facilitated seamless integration of the newly created CMake grammar (cf. Section 4.2) into the C++ analysis code.

Comparison of the two workflows
In Section 1, we described our semi-automated refactoring process and a hypothetical manual alternative.In this section, we compare the required effort for each step in both processes.
Step 0 As we perform partitioning solely to make the formal reviewing effort tractable, this step does not depend on the chosen refactoring process.
Step 1 The effort for problem analysis is identical for both processes; after all, while the two approaches differ, the task at hand is identical. This applies both to the initial problem analysis and to further analyses to improve previous solutions that caused errors in Step 4.
Step 2 In the manual process, each time this step is reached, one or a few files are selected and altered, according to the strategy devised in Step 1. In the scripted process, the strategy from Step 1 is turned into a metaprogram, which is applied to all files.
Step 3 In both processes, all test suites are executed. Note that in the manual process, the tests are run after each batch of changed files; in the scripted process, the tests are run once after all files have been changed automatically.
Step 4 If the test infrastructure signals that errors have been introduced, the applied code changes are reverted and each approach loops back to Step 1. Otherwise, for the automated process, all tests succeeding directly implies that the process is finished. The manual process, however, loops back to Step 2 for the next batch of target files, until the complete codebase has been taken care of.
Step 5 For both approaches, the process ends when all references to STX have disappeared, all tests are succeeding, and an independent colleague gives consent.
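The scripted side of this workflow can be summarized as a simple loop. The following is a hypothetical Python sketch, not part of our Rascal tooling; all callbacks (devise_strategy, run_tests, revert) are illustrative placeholders:

```python
def scripted_refactoring(files, devise_strategy, run_tests, revert):
    """Steps 1-4 of the scripted process: refine a metaprogram until tests pass."""
    while True:
        metaprogram = devise_strategy()               # Step 1: analysis yields a script
        changed = [metaprogram(f) for f in files]     # Step 2: apply to ALL files at once
        if run_tests(changed):                        # Step 3: run all test suites once
            return changed                            # Step 5: done (plus independent review)
        revert(files)                                 # Step 4: revert and loop back to Step 1

# Toy run: the first strategy breaks the tests, the refined one succeeds.
attempts = iter([lambda f: f + "!", lambda f: f.replace("stx", "gtest")])
result = scripted_refactoring(
    ["a_stx.cpp", "b_stx.cpp"],
    devise_strategy=lambda: next(attempts),
    run_tests=lambda changed: all(c.endswith(".cpp") and "stx" not in c for c in changed),
    revert=lambda files: None,
)
print(result)  # ['a_gtest.cpp', 'b_gtest.cpp']
```

The key structural point is that the per-file loop sits inside a single application of the metaprogram, not inside the test/revert cycle as in the manual process.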
The main difference in effort lies in Step 2. In the manual process, after problem analysis, a prospective solution is applied to one (or several) files, after which the tests are run. If successful, more files are treated in the same way. In case the tests eventually signal a flaw in the refactoring strategy, all target files have to be reverted to their original state, and a new solution has to be proposed. In the scripted process, a metaprogram is created, which is automatically applied to the complete codebase. If the tests identify an error, only the metaprogram needs to be adjusted. It is worth noting that in the manual process, Steps 2 through 4 are encountered multiple times, and that the test environment is invoked in each iteration.
Comparing these ways of working, we identify several differences. The automatic process eliminates the possibility of human error and is more consistent. Furthermore, metaprograms can be applied to all target files simultaneously, facilitating quick detection of faults. In the manual case, encountering a flaw in the current solution late in the process would mean that all previous manual refactoring effort was in vain.
A notable effort that applies only to the scripted process is that the software engineer undertaking the refactoring needs to get acquainted with a metaprogramming language, if not already proficient. While we will not attempt to quantify the amount of time this takes, as it may differ from person to person, we do note that this need not be very hard or time-consuming.
The number of files to be refactored is an important factor in the total effort of the manual process: as each file requires manual intervention, the total effort for Step 2 is linear in the number of files to edit. For the scripted process this is not the case: since a script can be applied to all files directly, this is a constant effort. The effort to create or adapt a refactoring script is also constant, so the total effort of a manual process will always exceed that of a scripted process once the codebase is sufficiently large. In general, the return on investment for projects such as ours is largest when there are few patterns with many matches.
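To make the linear-versus-constant argument concrete, here is a toy cost model in Python; the effort constants are illustrative assumptions, not measurements from our project:

```python
def manual_effort(n_files, minutes_per_file=15):
    # Manual editing: every file needs hands-on work, so effort grows linearly.
    return n_files * minutes_per_file

def scripted_effort(script_minutes=600):
    # Scripted editing: a one-off cost to write the metaprogram,
    # after which applying it to any number of files is essentially free.
    return script_minutes

# Under these (assumed) constants, scripting pays off from the 41st file onward.
break_even = next(n for n in range(1, 10_000) if manual_effort(n) > scripted_effort())
print(break_even)  # 41
```

Any real break-even point depends on the actual per-file and script-writing costs; the model only illustrates why a sufficiently large codebase always favors the scripted process.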
We now evaluate the efficiency of the semi-automated refactoring in terms of the number of lines of Rascal code written relative to the number of lines of C++ and CMake code that were affected. In total, 6904 SLOC were affected by the semi-automated refactoring. For these changes, we required only 371 lines of Rascal code, including the CMake grammar. In terms of lines of code, our semi-automatic refactoring is thus almost 19 times more efficient.

Analysis of generalizability
The STX framework is a proprietary test framework that is used only within Philips. Therefore, the specific patterns that are used to match in this refactoring are not reusable outside of the company. In Section 4.8, we showed that the Rascal code was nearly fully reusable on an older product line of the initial target system. Therefore, within Philips, it can be reused to refactor other software suites to phase out STX in favor of GoogleTest. The scripts we wrote to represent edits and apply them to files are, in principle, reusable for any other analysis script that can produce (a) the location of a change and (b) the text to substitute. This method of applying changes to source files seems new in the Rascal ecosystem, and it could be a valuable addition to the standard library.
The grammar we wrote for CMake is, in principle, reusable in other C and C++ transformation scripts that require information from CMake files. However, we developed the grammar only until we could parse the targeted set of files in the test suite. It is not unthinkable that more extensions are required to be able to parse arbitrary CMake files.
The personal C++ analysis and transformation skills and Rascal metaprogramming skills we used to create our scripts, however, remain reusable. If another large-scale renovation is motivated, we can jump-start the automated refactoring using our knowledge of ClaiR and the pattern matching, analysis, and templating features of Rascal.

5.4 Lessons learned

Adopting Rascal in industry
In this section, we describe several aspects that are relevant for the adoption of Rascal in industry. We start with the steps required to be able to use a new tool at Philips. We then describe how Rascal is distributed to the engineers. Finally, we elaborate on the effort that is required for the introduction of Rascal in an industrial setting.

Tool validation
At Philips, we recognize the challenges that come with legacy systems and the promise metaprogramming could bring. For this reason, we supported the work of the second author to create ClaiR and extend Rascal with a C/C++ front-end. The first author arranged formal approval for the usage of Rascal within Philips. The formal approval consists of a number of steps. Note that because Philips is a company that creates medical systems, it has to adhere to the rules created by regulators worldwide (e.g., the Food and Drug Administration [FDA] in the USA). Only tools that are validated and are part of our tool register may be used. The first step of the validation process was to convince all software managers of the business unit that Rascal is a useful addition to the tools that we use. To this end, the first author highlighted the benefits Rascal could bring and that there was no other tool in the tool register with similar capabilities. The second step of our tool validation process consists of creating a document according to a prescribed template, in which the name and version of the tool and its intended use need to be filled in. In addition, an initial risk assessment needs to be added. We reasoned that a failure in a Rascal script will always be caught in a later phase of the software development process, and classified Rascal as low risk. The document was then signed by all required managers, completing the validation process.

Tool distribution
Now that we were allowed to use Rascal, we included Rascal in our Eclipse distribution. Next to Rascal, this Eclipse distribution contains several domain-specific languages and supporting tools; it is distributed via the source code archive. We do it this way to make sure that all tools are in line with a particular snapshot of the archive. This is especially important because we use trunk-based development and create a branch for each product release. An engineer tasked with fixing a bug for an older product release checks out the branch and immediately has the correct tools at their disposal.

Tool introduction
We continue with an exploration of the magnitude of the required changes when introducing Rascal in a software department. Fowler and Levine 13 describe a model that, given the magnitude of technological change, predicts the learning and time that is required for adoption (Figure 5). Depending on the learning and time that is required for the technological change, the approximated time needed for an organization to adjust is either short or long. We reason that for the adoption of Rascal within an organization, engineers should acquire new skills: they should be able to use Rascal. The procedures do not need to change, as we only automate a process that would be similar if it were conducted manually (cf. Figure 1). As there is no procedural change, the structure, strategy, and culture of an organization do not need to change either. Looking at Figure 5, we see that for the adoption of Rascal, limited time is required. This matches the experience of the last author, who has a lot of experience in teaching Rascal to university students. He has taught Rascal to students enrolled in a Master's programme. For this course on Software Evolution, students learn Rascal in approximately five days. In our opinion, when applying Rascal, the main driver for success is knowledge about the object language (in our case C/C++). If we translate these observations to industry, we see one major difference. In Dutch industry, only a minority of software engineers has an academic background, while a majority has had a higher professional education. Software engineers with a higher professional education background may experience more difficulty in understanding mathematical concepts of programming languages.
FIGURE 5 Dimensions of change 13
As such, we think it might be more challenging for these professionals to learn Rascal. In practice, project teams typically consist of software engineers from various backgrounds. An engineer with excellent metaprogramming skills can team up with an engineer who is, for instance, a domain expert. Even with only a single engineer with metaprogramming skills per project team, we think that such a team can be very productive in performing automated maintenance activities. C/C++ are notoriously difficult languages for automated transformation tools. With the recent addition of ClaiR to Rascal, we think we have solved some of the challenges. Given the experiences described in this article, we think Rascal is ready for widespread adoption in industry. The existence of these kinds of tools is not widely known in industry. For this reason, we think the Rascal community can improve by advertising the possibilities to industry.

5.4.2 Improving the migration process
The migration process could be improved as follows.
In Step 2 of Figure 1, we run the scripts on the source code. The automated process is iterative; hence, Step 2 is performed multiple times. Rascal uses ClaiR to parse the source files using CDT and to create an AST. This tree is used by the metaprogram to create a list of changes that need to be applied to the source file. Depending on the contents of a source file, creating an AST may take up to 10 seconds. As the resulting AST will always be the same, we could improve the process by parsing each file only once and storing the ASTs as files on disk.
We have run the STX test suites sequentially on a single PC. Some of these STX test suites take up to 45 min to execute; running all STX test suites takes a full night. To shorten the feedback loop, we could run the refactored test suites in parallel on multiple offload systems.
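A minimal sketch of such a parallel run (Python threads standing in for the offload systems; run_suite is a placeholder for whatever command actually executes one suite):

```python
from concurrent.futures import ThreadPoolExecutor

def run_all_suites(suites, run_suite, max_workers=4):
    """Execute each test suite concurrently and collect per-suite results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(zip(suites, pool.map(run_suite, suites)))

# With real suites, run_suite would invoke the test executable on an offload system.
results = run_all_suites(["stx_a", "stx_b", "stx_c"], run_suite=lambda s: f"{s}: PASS")
print(results["stx_b"])  # stx_b: PASS
```

With suites of up to 45 min each, the wall-clock time of a full run then approaches the duration of the slowest suite rather than the sum of all suites.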
While working with the legacy code and performing the migrations, we have learned a lot about the legacy system. This is inherent to the process. New refactorings on the same legacy system are likely to take less time, because of what we have learned about the system during the reported activities.

Related work
In this section, we position alternatives to Rascal, the implementation language of our refactoring, by describing related work. We chose Rascal as the implementation language for our refactoring; in principle, we could have picked other metaprogramming systems that allow simultaneous analysis of multiple languages, such as ASF+SDF, 14 Stratego, 15 TXL, 16 DMS, 17 or, more geared to C++, Proteus 10 or CodeBoost. 18 The current section is not a feature comparison between all these systems, but rather an enumeration that documents the wealth of metaprogramming systems available, which may or may not be relevant to the reader. Already in 1995, Aigner and Hölzle 19 described a case in which they refactored C++ language instances to improve the performance of executables. They accomplished this by implementing a source-to-source transformation, rewriting virtual function calls using the inline keyword. The field started as a spin-off of compiler construction techniques: transformation of (annotated) abstract syntax trees. CodeBoost 18 is a source-to-source optimizer for C++ that shares many features with the technology described here.
Mass maintenance on legacy software took flight in earnest around the year 2000: the Y2K problem motivated a principled and automated approach to maintaining large legacy code bases. The focus was on COBOL systems. Due to the omnipresence of the Y2K problem in COBOL systems and the abundance of large COBOL systems at that time, Van den Brand promoted the use of "software renovation factories": systems taking context-free grammars as input, in which users can create bespoke code analyses and transformations. 20 After the initial Y2K fix, applications of the developed technology fanned out into more complex code enhancements. For example, Veerman 21 used sets of rewrite rules to transform and modernize legacy COBOL code, transforming spaghetti code into well-structured programs. He implemented the transformations in a predecessor of Rascal, called ASF+SDF. 22 Other systems from that era which are still applied to legacy software analysis and transformation in both industry and academia are TXL (Turing eXtender Language), 16 DMS, 17 and Stratego/XT. 15 All these systems provide pattern recognition and substitution on abstract or concrete syntax trees.
ClangMR is a tool based on the combination of the Clang compiler and the MapReduce parallel processor. 23 The transformations are, again, based on the AST structure. ClangMR has been applied to large C++ codebases at Google to refactor callers of deprecated APIs. Mooij et al. use small iterative code refactoring steps before extracting models. 4 The rationale is that the intermediate refactoring steps increasingly reduce noise in the code, making model extraction easier. They also used the Eclipse CDT parser, as we did in the current article.
"High-fidelity" source-to-source transformations were pioneered by Vinju 24,25 in the ASF+SDF system, by Waddington et al. with the Proteus C/C++ transformation tool, 10 and by Thompson with the HaRe system 26 for refactoring Haskell. The goal is to transform the source code without losing arbitrary indentation or source code comments. In the current article, by generating edit scripts that are applied to the unprocessed source files, we circumvent the loss of indentation and comments that is associated with a compilation pipeline that goes through different (abstract) representations.
The method of generating edit scripts is also the method used by the Eclipse Refactoring Framework. 27 A "refactoring" is a source-to-source transformation with the goal of improving internal code structure without changing the observable behavior. The input of a generic refactoring tool could be any code that passes the compiler. The complexity of generic automated refactoring tools therefore lies in checking the preconditions for such a code change and making sure that the applied change will not change observable behavior. This requires advanced theoretical models, such as, for example, "type constraints." 28 Our current work, however, does not require this, since we are writing transformations for a specific system with specific code patterns. Steimann 29 explores an interesting middle ground where a generic system of semantical constraints (on Java code) can be used to implement "ad-hoc" refactorings comparable to our use case. However, we did not require such deep analysis to know that our transformations were correct; also, the C++ language would be much harder to model semantically than Java.
Object-language-agnostic systems such as TXL, DMS, Stratego/XT, and ASF+SDF require the specification of a parser (usually in some form of BNF). Writing a parser for C++ is a daunting exercise, however. The language is inherently ambiguous, and its disambiguation requires (deep) semantic analysis. The C preprocessor also literally adds a level of complexity to the analysis of C++ input programs. The Proteus system provides parsing of C++ to the Stratego/XT environment by parsing C++ with an external parser and providing the ASTs to the rewriting engine. The current article uses ClaiR 8 in a similar fashion by reusing the Eclipse CDT parser in front of the rewriting features of Rascal. The C-Transformers project used a post-parse disambiguation stage to simplify C++ parse forests to single trees. 30 The srcML family of tools 31,32 reuses open compilers to mark up source code with abstract syntax tree nodes, for different programming languages including C and C++, but not CMake. This too enables high-fidelity source code analysis and transformation, using general XML tools such as XPath or XSLT.
To summarize the related work: all systems for large-scale source-to-source transformations lean heavily on parsers to produce (abstract) syntax trees. The parser does most of the "heavy lifting." After this, they provide pattern matching and traversal facilities, and sometimes tree substitution. The high-fidelity source-to-source systems either edit the input file using edit scripts, or retain all information during parsing in a concrete syntax tree. The transformations in this article also required simpler scripting tasks, such as file handling. Since Rascal is a programming language, the scripting could be done close to the code analysis and transformation code, in the same language. 33 Other systems for AST analysis, such as Stratego/XT and ASF+SDF, require scripts in an external scripting language such as Python or Bash.

Results
This article describes our approach to the semi-automated refactoring of a legacy software system. Compared to a hypothetical manual refactoring, we have argued that the benefits of our automatic approach outweigh its own particular challenges: the automated approach has granted us repeatability, (limited) generalizability, elimination of human error, and early detection of errors, at the cost of having to acquire experience in metaprogramming. By automating our refactoring, we have resolved the maintenance paradox of being stuck and allowed ourselves to perform a useful, necessary refactoring that would not have been carried out manually. In our particular refactoring case, our 371 lines of metaprogramming code identified and changed approximately 19 times as many lines of target code, indicating that even for one-off refactorings, automating the job makes sense.

Lessons learned
In this section, we describe our lessons learned. We created a parser for CMake. Parts of the CMake parser could be reused and extended for future refactorings. Our Rascal scripts for transforming the legacy code are tailored for this case. Since the refactorings are highly context-specific, they are not easily generalizable and cannot deal with other contexts. In fact, their specificity is part of what makes them so powerful, and is actually an enabler for performing these kinds of refactorings.
In this article, we presented a way to write changes to a file without the need to unparse or pretty-print the abstract syntax tree. This way of changing source files could be added to the standard library of Rascal.
In the remainder of this section, we split up the lessons learned for software engineers, software managers, and tool manufacturers.

Implications for software engineers
Legacy code is code that is typically written by someone who has retired, works at another department, or has left the company. This code is written with old and obsolete tools and technologies. Replacing or rejuvenating legacy code is labor-intensive and intellectually uninteresting. For software engineers, maintenance projects are not popular, because for most it is much more attractive to create something new. Software maintenance can become fun when it consists of creating state-of-the-art Rascal scripts that do the repetitive work in an automated fashion. In fact, the software engineer is then writing challenging new code in the form of Rascal metaprograms.

Implications for software managers
The fact that software maintenance projects can become more attractive for software engineers is a big plus for software managers. It could become easier to convince the best employees to work on maintenance projects.
Maintenance projects are important for existing customers and for reusing existing components in new products. 1 From the literature, we also know that a large portion of development time is spent on maintenance. The promised productivity gains from applying metaprogramming techniques directly result in a more efficient software development process.
Introducing Rascal in a department requires some change management. Our experience from the financial world, in which Rascal has been used for a longer period of time, is that it works best to pair domain experts with metaprogramming experts. Both can complement each other and be successful from the start. We think initial success is key to a successful adoption of Rascal in industry.
For software managers, it is important to have someone to go to when there are issues with a tool. The availability of commercial support is a requirement for the adoption of a tool. For Rascal, such support is available.

Implications for Rascal manufacturers
The ClaiR C/C++ front-end was recently added to Rascal. ClaiR is the enabler for adopting Rascal in industry. However, the awareness of Rascal in industry can be improved. Improved awareness can be achieved by not only publishing academic papers, but also publishing in nonacademic magazines read by industry experts.
Our experience is that questions about Rascal on StackOverflow || are answered in a few hours. The Rascal community should keep on doing this. However, we think that the need for asking questions could be reduced if error reporting were improved.
The documentation of ClaiR could be improved. For the core Rascal language, there is a well-written and well-maintained tutor available online.** For applying ClaiR, one has to look at the source code when creating metaprograms.†† For an improved user experience, ClaiR (and other newly created language front-ends) should be documented in a similar fashion as the core language.

Future work
The case described in this article involved the refactoring of test suites. All test cases of the test suites passed before refactoring and passed after refactoring. For legacy production code, the test coverage is typically unsatisfactory, hampering the refactoring process, since refactoring code in an industrial setting without good test coverage can be tricky. Writing a formal proof of the equality of the operational semantics before and after refactoring is not realistic for an industrial software engineer. To check whether new code behaves as intended, we would like to investigate the ability to automatically generate a test suite. The test suite should cover the code that needs to be transformed. When the generated test cases pass both before and after the transformation, this would provide some confidence that the transformation is indeed a refactoring.
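The essence of the proposed check can be sketched as follows (a hypothetical Python illustration; original and transformed stand for the code before and after the transformation, and generated_inputs for the automatically generated test inputs):

```python
def is_behavior_preserving(original, transformed, generated_inputs):
    """Record the original behavior, then demand the transformed code reproduces it."""
    expected = {x: original(x) for x in generated_inputs}   # baseline on the old code
    return all(transformed(x) == y for x, y in expected.items())

# A pure restructuring preserves behavior; an off-by-one change does not.
legacy = lambda n: n * 2
assert is_behavior_preserving(legacy, lambda n: n + n, range(100))
assert not is_behavior_preserving(legacy, lambda n: n * 2 + 1, range(100))
```

Passing such generated tests is evidence, not proof: the confidence obtained is bounded by how well the generated inputs cover the transformed code.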

# example_stx
set(TARGET_NAME example_stx)
set(${TARGET_NAME}_IDE_FOLDER "PosGen/bb/Test")
set(${TARGET_NAME}_DIRS
    ${POS_GEN_PATH}/bb/inc
    ${POS_TEST_PATH}/STX/STXServerLib/src
)
set(${TARGET_NAME}_SOURCES example_stx.c)
set(${TARGET_NAME}_DEPS ${COMMON_STX_DEPS})
configureTestExecutable(${TARGET_NAME} OBJ_TN)

// Variable x is bound on a successful match
(Expr) `0 + <Expr x>`
(Expr) `<Expr x> + <Expr x>`
/paren(paren(_)) := e  // finds any directly nested paren expressions in e

str changeSource(str source, rel[int, int, str] changesList) {
    offset = 0;
    for (<startIndex, endIndex, changeWithString> <- sort(changesList)) {
        source = source[..startIndex + offset] + changeWithString + source[endIndex + offset..];
        offset += size(changeWithString) - (endIndex - startIndex);
    }
    return source;
}
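For readers unfamiliar with Rascal, the same offset-correcting edit application can be sketched in Python (an informal translation of changeSource, assuming each change is a (start, end, replacement) triple located on the original file):

```python
def change_source(source: str, changes: list[tuple[int, int, str]]) -> str:
    """Apply (start, end, replacement) edits whose positions refer to the
    ORIGINAL text, tracking how earlier edits shift later offsets."""
    offset = 0
    for start, end, replacement in sorted(changes):
        source = source[:start + offset] + replacement + source[end + offset:]
        offset += len(replacement) - (end - start)
    return source

# Insert a prefix at offset 0 and delete original characters 4-7 ("STX ").
print(change_source("run STX tests", [(0, 0, "re-"), (4, 8, "")]))  # re-run tests
```

Because all analyses compute positions against the original file, sorting the edits and accumulating the running offset is what keeps later edits aligned after earlier ones have grown or shrunk the text.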

layout Layout = WhitespaceAndComment* !>> [\ \t\n\r#];
lexical WhitespaceAndComment
    = [\ \t\n\r]
    | "#" ![\n]* $;

start syntax Build = build: Section+ sections;
syntax Section = section: Target target Options options;
syntax Target = target: "set" "(" Id targetMacro Deps+ deps ")";
syntax Deps = deps: D dep;
syntax Options = options: Sources* sources CompileFlags* Configure;
syntax Sources = sources:

syntax Configure = configure: ConfigureLibrary* ConfigureTestExecutable*;
syntax ConfigureLibrary = configureLibrary: "configureLibrary" "(" "$" "{" Id targetMacro "}" Id buildTarget ")";
syntax ConfigureTestExecutable
    = configureTestExecutable: "configureTestExecutable" "(" "$" "{" Id targetMacro "}" Id buildTarget ")"
    | configureUnitTestExecutable: "configureUnitTestExecutable" "(" "$" "{" Id targetMacro "}" Id buildTarget ")"
    | configureExecutable: "configureExecutable" "(" "$" "{" Id targetMacro "}" Id buildTarget ")";

keyword Reserved
    = "set" | "$" | "{" | "}" | "_COMPILE_FLAGS" | "configureLibrary" | "configureTestExecutable"
    | "configureUnitTestExecutable" | "configureExecutable";

lexical Id = ([a-zA-Z/.\-][a-zA-Z0-9_/.]* !>> [a-zA-Z0-9_/.]) \ Reserved;
lexical TT = ([_][A-Z_]* !>> [A-Z_]) \ Reserved;
lexical D = ([a-zA-Z][a-zA-Z0-9${}\_.]* !>> [a-zA-Z${}\_.]) \ Reserved;

rel[int, int, str] modifyCMakeLists(Build build) {
    changes = {};
    visit (build) {
        case sec: section(target(_, [*_, deps(/_stx/i), *_]), _):
            changes += modifyStxTarget(sec);
    }
    return changes;
}

rel[int, int, str] modifyStxTarget(Section section) {
    changes = {};
    visit (section) {
        case sources(_, "_SOURCES", sources):
            changes += modifyCTarget(sources);
        case sources(_, "_DIRS", [*_, l: sourceList(_, _, _, /stxserverlib/i, _, _), *_]):
            changes += <l.src.offset, l.src.offset + l.src.length, /* … */>;
        case t: configureTestExecutable(a, b):
            changes += <t.src.offset, t.src.offset + t.src.length, "configureUnitTestExecutable(${<a>} <b>)">;
    }
    return changes;
}

rel[int, int, str] modifyCTarget(SourceList+ sources) {
    changes = {};
    // …

rel[int, int, str] voidPointerAssignments(Declaration ast) {
    changes = {};
    visit (ast) {
        case assign(idExpression(n: name(_)), f: functionCall(idExpression(name("TOS_p_SEG_create")), _)):
            changes += <f.src.offset, f.src.offset, findTypeCast(ast, n.\value)>;
    }
    return changes;
}

str findTypeCast(Declaration ast, Name n) {
    visit (ast) {
        case simpleDeclaration(namedTypeSpecifier(_, name(t)), [declarator([pointer(_)], name(n), _), *_]):
            return "(<t> *)";
    }
    return "(FIXME)";
}
Listing 9: Metaprogram transforming void* casts to casts to a more defined pointer type

        lines[i] = "extern \"C\" {'<lines[i]>";
        first = false;
    }
    if (!(contains(lines[i+1], "#include") && (contains(lines[i+1], "pi.h") || contains(lines[i+1], "li.h")))) {
        lines[i+1] = "}'<lines[i+1]>";
        first = true;
    }
    return lines;
}
Listing 10: Metaprogram wrapping C header inclusions in an extern "C" block to solve name mangling issues

Change 2: Replace asserts

void modifyHeaders(loc f) {
    lines = readFileLines(f);
    last = 0;
    for (i <- [0 .. size(lines) - 1]) {
        if (contains(lines[i], "#include")) {
            last = i;
            if (contains(lines[i], "STXServerLib.h") || contains(lines[i], "STXServer.h")) {
                lines[i] = "";
            }
        }
    }
    if (contains(lines[last + 1], "}")) {
        last += 1; // place outside extern "C" block
    }
    lines[last] += "\r\n\r\n#include \"SetupTestDependencies.h\"";
    writeFileChanges(f, lines);
}
Listing 12: Metaprogram removing STX header inclusions and adding a GoogleTest header inclusion

Change 4: Rewrite test function
Listing 13 describes the two functions that make up this refactoring step.

rel[int, int, str] replacePassFail(Declaration ast) {
    changes = {};
    visit (ast) {
        case f: expressionStatement(functionCall(idExpression(name("NotifyTestFailed")), _)):
            changes += <f.src.offset, f.src.offset + f.src.length, "FAIL();">;
        case f: expressionStatement(functionCall(idExpression(name("NotifyTestPassed")), _)):
            changes += <f.src.offset, f.src.offset + f.src.length, "">;
    }
    return changes;
}
Listing 15: Metaprogram removing STX's positive verdict notification and replacing STX's negative verdict notification by the GoogleTest counterpart

Change 7: Rewrite main function

void refactorTestSuite(loc f) {
    testClassName = createTestClassName(f);
    ast = parseCpp(f);
    testCases = findTestCases(ast);

    changes = {};
    // …

void generateTestScript(Build ast, loc f) {
    path = "\"C:\\path\\to\\deployment\\location\\\"";
    name = |file:///| + getPath(f) + "test.bat";
    batch = "cd \\path\\to\\build\\output\n";
    for (file <- StxCppFiles(ast) + StxCFiles(ast)) {
        file = changeExtension(file, ".exe");
        splitted = split("_", file);
        testName = "";
        for (sp <- splitted) {
            testName += replaceFirst(sp, sp[0], toUpperCase(sp[0]));
        }
        // …
    }
    writeFile(name, batch);
}
Listing 19: Metaprogram generating the test batch scripts

Listing 1: Example STX test suite, containing a single STX test case (excerpt; the opening of the test case is not shown)

        GEN_m_assert(coreleft_before_test == coreleft_after_test);
        NotifyTestPassed();
    }

    int main(int argc, char *argv[]) {
        argc_input = argc;
        argv_input = argv;
        TOS_p_STP_start_continue_set(GEN_p_cold_entry);
        TOS_p_STP_boot();
        return 0;
    }

Listing 1 depicts an example test suite. The main function registers the GEN_p_cold_entry function and boots the runtime environment. The functions on line 7 and line 9 are part of the STX framework. The GEN_p_cold_entry function registers one test case, called example_test_1 (line 7).

The GoogleTest counterpart of this test suite looks as follows (excerpt):

        coreleft_before_test = TOS_p_SEG_coreleft();
        Example_p_TM_test();
        coreleft_after_test = TOS_p_SEG_coreleft();
        ASSERT_EQ(coreleft_before_test, coreleft_after_test);
    }

    int main(int argc, char *argv[]) {
        argc_input = argc;
        argv_input = argv;
        if (SetupTestDependencies::IsListArgumentSpecified(argc_input, argv_input)) {
            ...

The scripting we use to execute the test suites on offload systems requires clean output from the --gtest_list_tests command-line argument. For this reason, the main function first checks on line 22 whether this flag is specified. If so, booting is aborted and the output this flag generates is instead printed to the screen, by evaluating the return argument on the next line. In the absence of this flag, the main function registers the GEN_p_cold_entry function (line 25) and boots the run-time environment (line 26). On line 4 the ExampleStx class is defined. It inherits from SetupTestDependencies, which takes care of starting and stopping test dependencies. The test cases are started by GoogleTest on line 7. The test case itself is defined on line 10, and line 17 contains an assert that checks whether coreleft_before_test equals coreleft_after_test; on failure, GoogleTest prints the expected and actual values.

Listing 20: Main function of our semi-automated refactoring

    void main(list[loc] cmakeFiles) {
        for (file <- cmakeFiles) {
            build = parse(file);
            refactorTestSuites(build, file);
            generateBatchScript(build, file);
            generateTestScript(build, file);
        }
    }

    void refactorTestSuites(Build build, loc file) {
        changes = {};
        processTestSuitesFromCMakeLists(build, file);
        changes += modifyCMakeLists(build);
        fc = readFile(file);
        fc = changeSource(fc, changes);
        writeFile(file, fc);
    }

This metaprogram refactors the CMake files belonging to STX to their GoogleTest counterparts. The processTestSuitesFromCMakeLists function, which is not depicted, is a wrapper for the functions listed in Listing 8.
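Throughout these metaprograms, edits are accumulated as tuples of begin offset, end offset, and replacement text in the changes relation, and are only written back once, via the changeSource call in Listing 20. A minimal Python sketch of that change-application step, assuming the edited regions never overlap (the name apply_changes and the sample offsets are ours, for illustration only):

```python
def apply_changes(source: str, changes: set[tuple[int, int, str]]) -> str:
    """Apply a set of (begin, end, replacement) edits to a source string.

    Edits are applied right to left, so the offsets of edits that have
    not been applied yet remain valid while the tail of the string is
    rewritten. Assumes the edited regions do not overlap.
    """
    result = source
    for begin, end, replacement in sorted(changes, reverse=True):
        result = result[:begin] + replacement + result[end:]
    return result


# Rewrite "GEN_m_assert(a == b);" into its GoogleTest form with two edits:
# replace the callee name, and replace " == " between the arguments by ", ".
src = "GEN_m_assert(a == b);"
print(apply_changes(src, {(0, 12, "ASSERT_EQ"), (14, 18, ", ")}))
# ASSERT_EQ(a, b);
```

Applying the edits in decreasing offset order is what makes it safe to collect them in any order during the AST traversal.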
    case equals(a, b): {
        changes += <func.src.offset, func.src.offset + func.src.length, "ASSERT_EQ">;
        changes += <a.src.offset + a.src.length, b.src.offset, ", ">;
    }
    case notEquals(a, b): {
        changes += <func.src.offset, func.src.offset + func.src.length, "ASSERT_NE">;
        changes += <a.src.offset + a.src.length, b.src.offset, ", ">;
    }
    case n:not(a): {
        changes += <func.src.offset, func.src.offset + func.src.length, "ASSERT_FALSE">;
        changes += <n.src.offset, a.src.offset, "">;
    }
    default:
        changes += <func.src.offset, func.src.offset + func.src.length, "ASSERT_TRUE">;

The removeRegisterPrintfs function in Listing 14 removes occurrences of the printing function PP_p_PF_printf whenever the string to be printed contains "register".
• Second, the visitTestCases function iterates over the found test cases. For each test name, the AST is traversed, matching function definitions and function declarations that have this name. The declarations are removed (lines 15 and 16), and the definitions are replaced by the GoogleTest macro (lines 13 and 14).

    for (testCase <- testCases) {
        visit (ast) {
            case functionDefinition(ret, f:functionDeclarator(_, _, name(testCase), _, _), _, body):
                changes += <ret.src.offset, f.src.offset + f.src.length,
                            "TEST_F(<testClassName>, <testCase>)">;
            case f:simpleDeclaration(_, [functionDeclarator(_, _, name(testCase), _, _), *_]):
                changes += <f.src.offset, f.src.offset + f.src.length, "">;
        }
    }

• The name of the entry function is extracted using the getColdEntry function. It searches for calls to the TOS_p_STP_start_continue_set function, which takes the entry function name as its first argument.
• This entry function name is subsequently used by the modifyColdEntry function to match the body of the entry function (line 22). In the body, calls to the STX function RegisterTestFunction are removed (lines 24 and 25), and calls to the StartTestServer function are replaced (lines 26-29).
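The definition/declaration rewrite above can be approximated in a few lines of Python with regular expressions (to_test_f is our hypothetical helper; the actual metaprogram matches AST nodes, which is robust against layout and comments in a way regexes are not):

```python
import re


def to_test_f(source: str, test_class: str, test_case: str) -> str:
    """Turn one STX test case into a GoogleTest TEST_F (regex sketch).

    Removes the forward declaration "void <test_case>(void);" and replaces
    the definition's signature with the TEST_F macro, mirroring the two
    visit cases of visitTestCases.
    """
    name = re.escape(test_case)
    # Drop the forward declaration, including its trailing newline.
    source = re.sub(rf"void\s+{name}\s*\(\s*void\s*\)\s*;\n?", "", source)
    # Replace the definition's signature with the GoogleTest macro.
    return re.sub(rf"void\s+{name}\s*\(\s*void\s*\)",
                  f"TEST_F({test_class}, {test_case})", source)


src = "void example_test_1(void);\nvoid example_test_1(void) { /* body */ }\n"
print(to_test_f(src, "ExampleStx", "example_test_1"))
# TEST_F(ExampleStx, example_test_1) { /* body */ }
```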
    1   rel[int, int, str] removeRegisterPrintfs(Declaration ast) {
    2       changes = {};
    3       visit (ast) {
    4           case f:expressionStatement(functionCall(idExpression(name("PP_p_PF_printf")), [stringLiteral(arg0), *_])):
    5               if (contains(arg0, "register")) {
    6                   changes += <f.src.offset, f.src.offset + f.src.length, "">;
        ...
    21      visit (ast) {
    22          case functionDefinition(_, functionDeclarator(_, _, name(coldEntry), _, _), _, body): {
    23              visit (body) {
    24                  case f:expressionStatement(functionCall(idExpression(name("RegisterTestFunction")), _)):
    25                      changes += <f.src.offset, f.src.offset + f.src.length, "">;
    26                  case f:expressionStatement(functionCall(idExpression(name("StartTestServer")), _)):
    27                      changes += <f.src.offset, f.src.offset + f.src.length,
    28                          "int testResult = SetupTestDependencies::GoogleTest(argc_input, argv_input);">;

In Listing 15, the replacePassFail function locates STX notification functions by traversing a provided AST:
• Calls to STX's NotifyTestFailed are replaced with GoogleTest's FAIL.
• Since tests pass implicitly, STX calls to NotifyTestPassed are simply removed.
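Both cleanups, removing the "register" printfs and rewriting the pass/fail notifications, can be sketched together in Python over a toy list of statement strings (clean_statements is our illustrative name; the paper performs the equivalent matches on AST nodes and emits offset-based changes):

```python
def clean_statements(statements: list[str]) -> list[str]:
    """Sketch of removeRegisterPrintfs plus replacePassFail.

    - drops PP_p_PF_printf calls whose literal argument contains "register"
    - drops NotifyTestPassed calls (GoogleTest tests pass implicitly)
    - replaces NotifyTestFailed calls with GoogleTest's FAIL()
    """
    result = []
    for stmt in statements:
        if stmt.startswith("PP_p_PF_printf") and "register" in stmt:
            continue  # removeRegisterPrintfs
        if stmt.startswith("NotifyTestPassed"):
            continue  # passing is implicit in GoogleTest
        if stmt.startswith("NotifyTestFailed"):
            result.append("FAIL();")  # explicit failure
            continue
        result.append(stmt)
    return result


print(clean_statements([
    'PP_p_PF_printf("register test 1\\n");',
    "NotifyTestPassed();",
    "NotifyTestFailed();",
    "coreleft_after_test = TOS_p_SEG_coreleft();",
]))
# ['FAIL();', 'coreleft_after_test = TOS_p_SEG_coreleft();']
```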
TABLE 1: Overview of the number of files and modifications affected by our semi-automated refactoring, with modifications quantified in source lines of code (SLOC).