Computers are increasingly a part of everyday life. Children as young as two years old are now exposed to computers at home, and computers are commonplace in schools, businesses, factories, and virtually every other setting.
One of the most common uses of computers is to create, store, and index data for later retrieval. As computer usage has grown, the number of data files available for searching has grown exponentially, leading to an “information overload” that can overwhelm a data searcher.
To help manage access to these massive numbers of files, also known as “assets”, a process called “data mining” has evolved. Data mining is defined in Newton's Telecom Dictionary (15th Edition, Miller Freeman Publishing, New York, N.Y.) as “[U]sing sophisticated data search capabilities that use statistical algorithms to discover patterns and correlations in data.” In essence, computers are used to “crawl” through masses of data files, analyze the information contained in the files according to criteria input by the user, and output results that the user can then use to study the information further.
To support the explosive growth of computer usage, software development has become a key part of any company engaged in high-technology business. Large companies such as IBM and Microsoft may have many software development groups located at numerous locations throughout the world, with each group employing hundreds or thousands of employees.
As used herein, complete programs (e.g., Microsoft Word) developed by the programmers are referred to as “software assets” and the various subroutines used to produce the software asset (e.g., C++ subroutines and programs used to create Microsoft Word) are referred to as “code assets.” These assets may number in the thousands or more for a single company and vary substantially in complexity, function, and size. For example, an asset may be a single program comprising hundreds of thousands of lines of computer code and designed to perform a multitude of tasks; at the other end of the spectrum, an asset may be a single subroutine comprising three lines of code.
With large numbers of employees focusing their work on the development of these assets, management becomes a critical task. With multiple groups within a company at different locations developing software for a variety of tasks, it is inevitable that duplication of effort will occur.
To avoid such duplication, it is desirable for all of the members of a design group, as well as all of the design groups within a company, to be able to share with each other the assets that they develop, and systems have been developed to assist in the management of such assets. In the software development field, the management, indexing, and retrieval of assets introduce an additional level of complexity not necessarily found in other asset management schemes. In particular, within a single group, assets may be developed in several different programming languages (e.g., Java, C/C++, COBOL, HTML, and/or XML) at the same time. Searching for code assets is more complex and difficult than searching ordinary documents, since programmers typically want to search for language-specific constructs and semantics, such as inheritance relationships in object-oriented languages, which cannot be captured using standard free-text searches. This makes it difficult for the users of the system to thoroughly search all of the assets.
Accordingly, it would be desirable to have an asset location system that supports free-text “search engine” style queries, attribute-specific queries, or a mixture of the two.
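By way of illustration only, the three query styles named above can be sketched in a few lines of Python. Everything in this sketch is an assumption made for the example and is not drawn from any particular system: the `Asset` record, the `search` function, the attribute names `language` and `extends`, and the sample assets are all hypothetical. The sketch treats a free-text query as a set of terms that must all appear in an asset's descriptive text, and an attribute-specific query as a set of exact matches on structured attributes such as an inheritance relationship.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    """Hypothetical asset record: descriptive text plus structured attributes."""
    name: str
    text: str
    attributes: dict = field(default_factory=dict)


def search(assets, free_text="", **attribute_filters):
    """Return the names of assets that contain every free-text term
    and match every attribute filter exactly. Either part may be omitted."""
    terms = free_text.lower().split()
    results = []
    for asset in assets:
        haystack = asset.text.lower()
        # Free-text part: every term must occur somewhere in the text.
        text_ok = all(term in haystack for term in terms)
        # Attribute part: every requested attribute must match exactly.
        attrs_ok = all(
            asset.attributes.get(key) == value
            for key, value in attribute_filters.items()
        )
        if text_ok and attrs_ok:
            results.append(asset.name)
    return results


# Hypothetical sample assets, invented for this illustration.
ASSETS = [
    Asset("SpellChecker", "checks spelling of a document buffer",
          {"language": "Java", "extends": "TextFilter"}),
    Asset("GrammarChecker", "checks grammar of a document buffer",
          {"language": "Java", "extends": "TextFilter"}),
    Asset("spell_util", "spelling helper routines",
          {"language": "C++", "extends": None}),
]

print(search(ASSETS, "spelling"))                        # free-text only
print(search(ASSETS, extends="TextFilter"))              # attribute only
print(search(ASSETS, "spelling", extends="TextFilter"))  # mixed query
```

In this sketch the free-text query alone returns both spelling-related assets regardless of language, the attribute query alone returns every asset that inherits from the hypothetical `TextFilter` class, and the mixed query narrows the result to the single asset satisfying both conditions. Extracting attributes such as `extends` from source code would in practice require language-specific analysis, which this example does not attempt.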