This thesis consists of a replication study in which two algorithms to compute JavaScript call graphs have been implemented and evaluated. Existing IDE support for JavaScript is hampered due to the dynamic nature of the language. Previous studies partially solve call graph computation for JavaScript, but come with a disappointing performance. One of the two algorithms does not reason about interprocedural flow, except for immediately-invoked functions. The other algorithm reasons about interprocedural flow, where it considers all possible flow paths. The two call graph algorithms have been implemented in Rascal. A JavaScript parser for Rascal has been implemented as preliminary work. Furthermore flow graph algorithms were implemented in order to compute the call graphs. The call graph algorithms were ran with similar input scripts as the original study. Evaluation of the computed call graphs has been done by comparing computed call graphs to dynamic call graphs. A tool has been implemented to instrument the input scripts, so that dynamic call graph recording could take place. The evaluation has been done with a precision and recall formula on a per call site basis, where the precision formula indicates the ratio of correctly computed call targets to all computed call targets. The recall formula indicates the ratio of correctly computed call targets to the total call targets that were recorded in the dynamic call graph. Both these formulas were applied for each call site that was recorded in the dynamic call graph. The results were then averaged. The averaged results were compared to two different sources. One source was the data of the original study, whereas the other data came about by applying the same formulas on the call graphs that were used in the original study. The results of the latter source, turned out not to be corresponding to the data of the paper. The problem is confirmed by the authors, which is being further investigated bythem. The precision and recall of the computed call graphs turned out to result in lower values than the previous study, with a maximum precision deviation of ~ 25% and a maximum recall deviation of ~ 11%. The algorithm that reasons about interprocedural ow in the form of immediately-invoked functions, had an average precision of 74% and an average recall of of 89%. The other algorithm that reasons about all interprocedural ow, had an average precision of 75% and an average recall of 93%. Another form of evaluation has also been undertaken by calculating precision and recall of call relationships (edges from call site to call target) rather than averaging over call sites. This evaluation resulted in lower precision and recall values, where the algorithm for interprocedural immediatelyinvoked function resulted in an average of 64% in terms of precision and 87% in terms of recall. The fully interprocedural algorithm resulted in an average precision of 43% and an average recall of 91%. The call relationships evaluation indicates that the computed call graphs are quite complete, but more polluted than the other results and original study suggest. The largest contributor to lower precision values was found to be one of the design decisions of the algorithms. This design decision yields the merging of similarly named functions, which causes addition of spurious edges. Finally the thesis presents ideas on how to improve the call graph algorithms, on how to improve the JavaScript parser and on future research work for JavaScript call graph analysis.
Additional Metadata
Keywords static analysis, call-graph, Javascript
THEME Software (theme 1)
Thesis Advisor T. van der Storm (Tijs)
Citation
Dijkstra, J.-J. (2014, January). Evaluation of Static JavaScript Call Graph Algorithms.