Automatically Capturing Source Code Context
Emily Hill, Lori Pollock, K. Vijay-Shanker
Natural Language Program Analysis (NLPA) Group
Department of Computer & Information Sciences
University of Delaware
Overview
As software systems continue to grow and evolve, locating code for maintenance and reuse tasks becomes increasingly difficult. We have developed a novel approach that provides automated support to the developer both in formulating queries and discriminating between relevant and irrelevant search results. Our contextual search approach automatically captures the context of query words in source code by extracting and generating natural language phrases from method and field signatures. These phrases naturally form a hierarchy that allows the developer to quickly identify relevant program elements by reducing the number of relevance judgments, while the natural language phrases help the developer to formulate effective queries.
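To make the idea of turning signatures into phrases concrete, here is a minimal Python sketch. It is not the extraction algorithm from the paper: it only splits identifier names into words and groups the resulting phrases by their leading word, as a rough stand-in for one level of a phrase hierarchy. The function names (split_identifier, phrase_for, group_by_leading_word) and the sample method names are hypothetical and chosen for illustration only.

```python
import re
from collections import defaultdict


def split_identifier(name):
    """Split a camelCase or snake_case identifier into lowercase words.

    Assumption: plain word splitting; the paper's phrase generation is
    more involved and is not reproduced here.
    """
    words = []
    for part in re.split(r"[_\W]+", name):
        # Acronym runs (XML), capitalized or lowercase words, and digit runs.
        words.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", part))
    return [w.lower() for w in words]


def phrase_for(name):
    """Return a space-separated phrase for a method or field name."""
    return " ".join(split_identifier(name))


def group_by_leading_word(names):
    """Group phrases under their first word (hypothetical one-level hierarchy)."""
    groups = defaultdict(list)
    for name in names:
        words = split_identifier(name)
        if words:
            groups[words[0]].append(" ".join(words))
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical method names, not taken from the study's subject programs.
    sample = ["addItemToList", "addUser", "parseXMLFile"]
    for head, phrases in group_by_leading_word(sample).items():
        print(head, "->", phrases)
```

Grouping by a shared leading word lets a reader accept or discard a whole group of results with one judgment, which is the kind of saving the phrase hierarchy is meant to provide.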
We conducted an empirical evaluation with 22 developers that compared our contextual search approach to verb-direct object (V-DO) search, the most closely related search technique. Our results indicate that contextual search significantly outperforms V-DO in terms of effort and effectiveness.
"Automatically Capturing Source Code Context of NL-Queries for Software Maintenance and Reuse." Emily Hill, Lori Pollock, and K. Vijay-Shanker. International Conference on Software Engineering (ICSE '09), May 2009. [pdf]   [Presentation: Keynote]
Contextual Search Tool
You can try out our contextual search technique or our implementation of V-DO.
So far, the search tool has been used on 5 Java programs. If you would like to try our contextual search technique on another program, please e-mail Emily Hill a link to the source code and she will add it to the site.
Experiment Materials
In the experiment, we were interested in two comparisons: contextH with V-DO and contextH with contextL.
At the end of the experiment, each subject was asked to fill out an exit survey.
Subject Concerns
A tarball of the concerns used in this study is available: concerns.tar.gz