The present invention relates generally to a method for estimating the amount of effort required to develop computer software, and more particularly, to a method and apparatus for analyzing text statements of requirements, using a natural language engine, to estimate the amount of effort required to develop computer software.
A requirements document is a free text document that describes functionality requirements for a software development program. Examples of requirements documents include a Request for Proposal (RFP), a Statement of Work (SOW), a Statement of Objectives, a Statement of Concept, or a Statement of Requirements. Estimating the amount of effort required to develop software from a requirements document is currently a human-intensive activity and is inherently subjective. Even experienced programmers often have to guess, leading to inaccurate estimates.
One measurement of software complexity and cost is lines of source code. Programmers often provide an estimate in terms of the number of lines of source code required to provide the requested functionality. As can be appreciated, that number is usually a guess, because a line count oversimplifies the complexity of developing source code to meet the functionality requested in a requirements document. One piece of code that has a small number of lines may require as much effort to develop as another piece of code having significantly more lines. Compounding the problem is the way lines of source code are measured. Two programmers looking at already developed source code will frequently differ as to how many lines of source code are required to provide the requested functionality. For example, some programmers count blank lines and other programmers do not. Thus, software development programs relying on a bare estimate of lines of source code are too often over budget and late.
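The measurement problem described above can be illustrated with a minimal sketch. The sample source text and the three counting conventions below are assumptions chosen for illustration; they are not taken from any particular counting standard.

```python
# Hypothetical five-line source sample: two lines of code, one comment,
# and two blank lines.
SAMPLE_SOURCE = """\
def ground_speed(true_air_speed, wind_component):

    # Wind component is positive for a tailwind.
    return true_air_speed + wind_component

"""


def count_all_lines(source: str) -> int:
    """Count every physical line, including blank lines and comments."""
    return len(source.splitlines())


def count_non_blank_lines(source: str) -> int:
    """Count lines that contain at least one non-whitespace character."""
    return sum(1 for line in source.splitlines() if line.strip())


def count_code_lines(source: str) -> int:
    """Count non-blank lines that are not whole-line '#' comments."""
    return sum(
        1
        for line in source.splitlines()
        if line.strip() and not line.strip().startswith("#")
    )


if __name__ == "__main__":
    print(count_all_lines(SAMPLE_SOURCE))        # 5
    print(count_non_blank_lines(SAMPLE_SOURCE))  # 3
    print(count_code_lines(SAMPLE_SOURCE))       # 2
```

The same five-line file yields three different sizes depending on the convention, which is the kind of disagreement between programmers noted above.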
To overcome the guesswork of estimating the lines of source code, function points were developed. Function points use a standard body of rules and the judgment of an experienced programmer to provide an estimate of the amount of effort to develop computer software. Function points measure software by quantifying the functionality provided to the customer based primarily on logical design, independent of the technologies used for implementation. Function points can be used to measure both the functionality requested by a customer (in a requirements document) and the functionality received. A requirements document is evaluated to determine the number of function points required to meet the functionality requested in the requirements document. Although rules are promulgated by the International Function Point Users Group (IFPUG), the application of these rules requires interpretation. Function points can provide better estimates than simply estimating the number of lines of source code, but their use is still subject to human judgment. Thus, two programmers evaluating the same requirements document using function points may still arrive at two different estimates of the effort needed to develop source code that meets it.
Function points provide a mechanism that both software developers and users could utilize to define functional requirements. It was determined that the best way to gain an understanding of the needs of users was to approach their problem from the perspective of how they view the results an automated system produces. Therefore, one of the primary goals of Function Point Analysis is to evaluate the capabilities of a system from the point of view of a user. To achieve this goal, the analysis is based upon the various ways users interact with computerized systems. From the perspective of a user, a system assists the user by providing five (5) basic functions. These functions are depicted in FIG. 1. Two of these functions address the data requirements of an end user and are referred to as Data Functions. The remaining three functions address the need of a user to access data and are referred to as Transactional Functions.
The Five Components of Function Points
a) Data Functions
1) Internal Logical Files
2) External Interface Files
b) Transactional Functions
1) External Inputs
2) External Outputs
3) External Inquiries
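The two-level grouping listed above can be expressed as a small data structure. This is only an illustrative sketch; the class and function names (`FunctionCategory`, `FunctionType`, `types_in`) are hypothetical and do not come from the specification or from IFPUG.

```python
from enum import Enum


class FunctionCategory(Enum):
    """The two groups into which the five components are divided."""
    DATA = "Data Function"
    TRANSACTIONAL = "Transactional Function"


class FunctionType(Enum):
    """The five function point components, each tagged with its group."""
    ILF = ("Internal Logical File", FunctionCategory.DATA)
    EIF = ("External Interface File", FunctionCategory.DATA)
    EI = ("External Input", FunctionCategory.TRANSACTIONAL)
    EO = ("External Output", FunctionCategory.TRANSACTIONAL)
    EQ = ("External Inquiry", FunctionCategory.TRANSACTIONAL)

    def __init__(self, label: str, category: FunctionCategory):
        # Unpack the tuple value into readable attributes.
        self.label = label
        self.category = category


def types_in(category: FunctionCategory) -> list:
    """Return the component types that belong to the given group."""
    return [t for t in FunctionType if t.category is category]


if __name__ == "__main__":
    for group in FunctionCategory:
        names = ", ".join(t.label for t in types_in(group))
        print(f"{group.value}s: {names}")
```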
Internal Logical Files
The first data function allows users to utilize data the user is responsible for maintaining. For example, a pilot may enter navigational data through a display in the cockpit prior to departure. The data is stored in a file for use and can be modified during the mission. Therefore the pilot is responsible for maintaining the file that contains the navigational information. Logical groupings of data in a system, maintained by an end user, are referred to as Internal Logical Files (ILF).
External Interface Files
The second data function is also related to logical groupings of data. In this case the user is not responsible for maintaining the data. The data resides in another system and is maintained by another user or system. The user of the system being counted requires this data for reference purposes only. For example, it may be necessary for a pilot to reference position data from a satellite or ground-based facility during flight. The pilot does not have the responsibility for updating data at these sites but must reference it during the flight. Groupings of data from another system that are used only for reference purposes are defined as External Interface Files (EIF). The remaining functions address the user's capability to access the data contained in ILFs and EIFs. This capability includes inputting, inquiring and outputting of data. These are referred to as Transactional Functions.
External Input
The first Transactional Function allows a user to maintain Internal Logical Files (ILFs) through the ability to add, change and delete the data. For example, a pilot can add, change and delete navigational information prior to and during the mission. In this case the pilot is utilizing a transaction referred to as an External Input (EI). An External Input gives the user the capability to maintain the data in ILFs through adding, changing and deleting its contents.
External Output
The second transactional function gives the user the ability to produce outputs. For example, a pilot has the ability to separately display ground speed, true air speed and calibrated air speed. The results displayed are derived using both data that is maintained and data that is referenced. In function point terminology the resulting display is called an External Output (EO).
External Inquiries
The third transactional function addresses the requirement to select and display specific data from files. To accomplish this a user inputs selection information that is used to retrieve data that meets the specific criteria. In this situation there is no manipulation of the data. It is a direct retrieval of information contained on the files. For example, if a pilot displays terrain clearance data that was previously set, the resulting output is the direct retrieval of stored information. These transactions are referred to as External Inquiries (EQ).
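To show how the five components combine into a single size figure, the following sketch tallies the cockpit examples used above into an unadjusted function point count. Two things here are assumptions not stated in the text: the weights are the average-complexity values from the standard IFPUG weighting table, and the per-component counts (for instance, treating add, change and delete as three separate External Inputs) are one plausible reading of the pilot examples.

```python
# Average-complexity weights from the standard IFPUG table.
# These values are not given in the text above.
AVERAGE_WEIGHTS = {"ILF": 10, "EIF": 7, "EI": 4, "EO": 5, "EQ": 4}

# Hypothetical tally of the cockpit examples.
PILOT_EXAMPLE_COUNTS = {
    "ILF": 1,  # navigational data file maintained by the pilot
    "EIF": 1,  # position data referenced from a satellite or ground facility
    "EI": 3,   # add, change and delete navigational data
    "EO": 3,   # ground speed, true air speed, calibrated air speed displays
    "EQ": 1,   # retrieval of previously set terrain clearance data
}


def unadjusted_function_points(counts, weights=AVERAGE_WEIGHTS):
    """Sum count * weight over every component type in `counts`."""
    unknown = set(counts) - set(weights)
    if unknown:
        raise ValueError(f"unknown component types: {sorted(unknown)}")
    return sum(weights[kind] * n for kind, n in counts.items())


if __name__ == "__main__":
    # 1*10 + 1*7 + 3*4 + 3*5 + 1*4 = 48
    print(unadjusted_function_points(PILOT_EXAMPLE_COUNTS))
```

A different analyst might count the three speed displays as a single External Output, or rate the navigational file as high rather than average complexity, and so arrive at a different total from the same description.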
With this brief discussion of function points in mind, it can be seen that although function point analysis provides an excellent framework for estimating software development time and cost, it is still subject to human subjectivity. Thus, two programmers may differ as to how many function points are contained in a document. A need therefore still exists for a method of estimating the amount of effort required to develop computer software that does not depend on such subjective judgment.
It is, therefore, an object of the invention to develop a method of using function point analysis which eliminates to a large extent the subjectivity inherent in estimating the amount of effort required to develop software to meet functionality requirements outlined in a free text document.
It is another object of the present invention to develop a method of using function point analysis which eliminates to a large extent the subjectivity inherent in estimating the number of function points or feature points in a piece of source code.
It is yet another object of the present invention to use a natural language engine to determine the number of function points or feature points represented in a free text document.
It is yet a further object of the present invention to use a natural language engine to determine the number of function points or feature points in a piece of source code.
These and other objects of the present invention are achieved by analyzing an electronic version of a free text document. A natural language engine is trained to locate function points. The natural language engine performs an analysis of the electronic version of the free text document to locate function points in the electronic version of the free text document. Advantageously, the natural language engine eliminates human subjectivity from the identification of and counting of function points. Other types of functional counting methodologies can also be used in the present invention such as feature points.
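The idea of locating function points in free text can be sketched as follows. This is not the natural language engine of the invention: a fixed table of regular-expression cue phrases stands in for a trained engine, and the cue phrases, the function name `locate_function_points`, and the sample requirements text are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical cue phrases per component type. A trained engine would
# learn such patterns from example documents rather than hard-code them.
CUE_PATTERNS = {
    "EI": re.compile(r"\b(enter|add|change|modify|delete|update)\b", re.I),
    "EO": re.compile(
        r"\b(display|report|print)\b.*\b(derived|calculated|computed)\b", re.I
    ),
    "EQ": re.compile(r"\b(retrieve|query|look up)\b", re.I),
    "ILF": re.compile(r"\b(store|maintain)\b", re.I),
    "EIF": re.compile(
        r"\b(reference|receive)\b.*\b(satellite|external|another system)\b",
        re.I,
    ),
}

REQUIREMENTS = (
    "The pilot shall enter navigational data before departure. "
    "The system shall store the navigational data in a file. "
    "The system shall reference position data from a satellite. "
    "The system shall display ground speed calculated from stored and "
    "referenced data. "
    "The pilot shall retrieve the terrain clearance setting."
)


def split_sentences(text):
    """Split on whitespace that follows sentence-ending punctuation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def locate_function_points(text):
    """Return (component type, sentence) pairs for every cue match."""
    hits = []
    for sentence in split_sentences(text):
        for kind, pattern in CUE_PATTERNS.items():
            if pattern.search(sentence):
                hits.append((kind, sentence))
    return hits


if __name__ == "__main__":
    counts = Counter(kind for kind, _ in locate_function_points(REQUIREMENTS))
    for kind, n in sorted(counts.items()):
        print(kind, n)
```

Because the same patterns are applied to every document, two runs over the same text always produce the same count, which is the property the natural language approach is intended to provide. The quality of the count still depends entirely on how well the patterns, or a trained model in their place, reflect real requirements language.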
An article, comprising: at least one sequence of machine-executable instructions; and a medium bearing the executable instructions in machine-readable form, wherein execution of the instructions by one or more processors causes the one or more processors to: train a natural language engine to recognize and search for phrases in textual documents which are representative of software functionality; and analyze the textual document using the trained natural language engine to determine software functionality requirements requested in the textual document.
A computer architecture for analyzing a textual document to determine the amount of effort required to develop software code to meet functionality requirements requested in the textual document, comprising: training means for training a natural language engine to recognize and search for phrases in textual documents which are representative of software functionality; and analyzing means for analyzing the textual document using the trained natural language engine to determine software functionality requirements requested in the textual document.
A computer system comprising: a processor; and a memory coupled to said processor, the memory having stored therein sequences of instructions, which, when executed by said processor, cause said processor to perform the steps of: training a natural language engine to recognize and search for phrases in textual documents which are representative of software functionality; and analyzing the textual document using the trained natural language engine to determine software functionality requirements requested in the textual document.
Still other objects and advantages of the present invention will become readily apparent to those skilled in the art from the following detailed description, wherein the preferred embodiments of the invention are shown and described, simply by way of illustration of the best mode contemplated of carrying out the invention. As will be realized, the invention is capable of other and different embodiments, and its several details are capable of modifications in various obvious respects, all without departing from the invention. Accordingly, the drawings and description thereof are to be regarded as illustrative in nature, and not as restrictive.