1. Technical Field
This disclosure concerns finding existing program logic and reusing it to rapidly build prototypes and develop new applications. In particular, this disclosure relates to a search, navigation and visualization tool that accepts high-level processing concepts as inputs that drive a multi-layered search to identify applications and application programming interface (API) calls for reuse.
2. Background Information
Software professionals widely recognize logic (e.g., source code) reuse as a technique that reduces the time, money, and other costs associated with creating a new application. Software professionals recognize API calls as forms of abstraction for high-level processing concepts, which drives the wide acceptance of API calls as reusable logic. For example, implementing an existing API call that produces a pull-down menu eliminates the need to write all the underlying logic necessary to deliver the functionality of a pull-down menu. However, current logic mining techniques and mining tools fail to retrieve highly relevant software components from application repositories that developers can use to prototype requirements in support of high-level processing concepts. Modern search engines do not ensure that applications identified by the search engines can serve as highly relevant application prototypes (HRAPs). Software professionals consider the mismatch between the high-level processing concepts (e.g., the intent reflected in the descriptions of applications) and low-level implementation details (e.g., API calls and actual run-time behavior) found in application logic a fundamental technical challenge to identifying highly relevant applications (HRAs). Software professionals intend to author meaningful descriptions of applications in the course of depositing applications into software repositories. The mismatch between the description of an application and the actual behavior of the application represents one example of the “vocabulary problem,” which states that no single word or phrase best describes a programming concept.
In the spiral model of software development, stakeholders describe high-level processing concepts to development teams, and together the stakeholders and development teams identify requirements in support of the high-level processing concepts. In addition, a development team builds a prototype based on the requirements, and the development team demonstrates the prototype to the stakeholders to receive feedback. Prototypes attempt to approximate the desired high-level processing concepts (e.g., features and capabilities) of the new application stakeholders desire development teams to build. The feedback from stakeholders often leads to changes to the prototype and the original requirements, as stakeholders iteratively refine their vision. In the event the stakeholders make a substantial number of changes to the requirements, the development team often discards the prototype and builds a new prototype, and another iteration of refinements repeats. Building prototypes repeatedly without reusing existing application logic costs organizations a great deal in the form of wasted project resources and time.
Development teams find it cost-effective to identify existing applications that approximate the high-level processing concepts and requirements of new software projects as the basis for prototypes. In the context of prototyping, software development professionals consider such existing applications as HRAs. Many application repositories (e.g., open source repositories and source control management systems maintained by stakeholders internally) contain hundreds of thousands of different existing applications (e.g., potential HRAs). Unfortunately, developers find it difficult to identify applications (e.g., HRAs) ideal for prototyping because of the time and expense involved in searching (e.g., querying) application repositories and source control management systems.
The amount of intellectual effort that a developer must expend to move a software system from one stage of development to another may be considered the “cognitive distance.” For example, using current search tools, developers expend significant intellectual effort to identify potentially relevant applications and to confirm HRAs from among the potentially relevant applications. Many developers employ search engines that identify exact matches between keywords and the words found in application repositories. The application repositories may include descriptions, application logic comments, program variable names, and variable types of existing applications. Such search engines actually increase the difficulty of identifying HRAs because of the poor quality of information contained in application repositories and the inability of the search engines to reduce the cognitive distance required to identify HRAs, as well as other factors. Additionally, many application repositories include incomplete, misleading, and inaccurate descriptions of applications identified in the application repositories. Consequently, even matching keywords with words in the application descriptions found in application repositories does not guarantee that the search engine will identify HRAs.
Effective software reuse techniques (e.g., prototyping using existing applications) reduce the cognitive distance between the initial concept of a system (e.g., high-level processing concepts that expressly and implicitly describe the features and capabilities of a new application), establishing discrete requirements, and the production implementation of the new system. Unfortunately, current search engines lack the ability to reduce the cognitive distance related to identifying HRAs.
For example, an application description may indicate that an application includes an encryption feature when in fact the application uses compression as a crude form of encryption. A developer entering “encryption” (e.g., as a high-level processing concept and specific requirement) as a keyword may waste precious time reviewing a search engine result containing the incorrectly described application, and ultimately discard the result, because the application fails to meet the encryption requirement. The developer must download the application identified in the search result, then locate and examine the fragments of the application logic that allegedly implement encryption, before determining that the application fails to meet the requirement. The developer may spend scarce project development budget resources and a significant amount of time analyzing the application before determining that the application is not relevant. The developer may even observe the runtime behavior of the application to ensure that the behavior matches the high-level processing concepts desired by the stakeholders, and meets the requirements in support of the high-level processing concepts, before establishing that the application qualifies as an HRA. Current search engines also lack the ability to help developers rapidly identify requirements in support of high-level processing concepts described by stakeholders.
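The encryption example may be illustrated with a minimal sketch. The sketch is not the disclosed system; it only contrasts matching a keyword against application descriptions with checking the API calls an application actually makes. The application names, the descriptions, the `CONCEPT_APIS` mapping, and both search functions are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class App:
    """A hypothetical repository entry: a description plus observed API calls."""
    name: str
    description: str
    api_calls: set = field(default_factory=set)


# Hypothetical mapping from a high-level concept to API calls that implement it.
CONCEPT_APIS = {
    "encryption": {"javax.crypto.Cipher.init", "javax.crypto.Cipher.doFinal"},
}

# Hypothetical repository: SafeStore claims encryption but only compresses data.
REPOSITORY = [
    App("SafeStore", "Stores files with encryption",
        {"java.util.zip.Deflater.deflate"}),
    App("VaultBox", "Archives user documents",
        {"javax.crypto.Cipher.init", "javax.crypto.Cipher.doFinal"}),
]


def keyword_search(keyword, repo):
    """Return names of apps whose description contains the keyword."""
    return [app.name for app in repo
            if keyword.lower() in app.description.lower()]


def api_backed_search(concept, repo):
    """Return names of apps that call at least one API tied to the concept."""
    wanted = CONCEPT_APIS.get(concept.lower(), set())
    return [app.name for app in repo if wanted & app.api_calls]


print(keyword_search("encryption", REPOSITORY))     # description match only
print(api_backed_search("encryption", REPOSITORY))  # match on actual API calls
```

In this sketch the description-based search returns only the mislabeled application, while the check against API calls returns only the application that actually invokes encryption APIs, which is the mismatch between descriptions and implementation details described above.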
Some search tools return code snippets (e.g., segments of application logic); however, code snippets do not give enough background or context to help developers create rapid prototypes, and such search tools require developers to invest significant intellectual effort (e.g., cognitive distance) to understand how to use the code snippets in broader scopes. Other existing approaches and tools retrieve snippets of code based on the context of the application logic that developers work on, but while these approaches and tools improve the productivity of developers, they do not return relevant applications from high-level processing concepts as inputs.
A need has long existed for a system and method that efficiently identifies HRAs usable to rapidly build prototypes and develop new applications.