1. Field of the Invention
The present invention relates generally to methods and apparatus for automatically extracting data from a file. More specifically, the present invention relates to automatically extracting documentation information that is embedded within a software program for the purpose of creating code documentation related to the software program.
2. Description of the Related Art
The software industry is expanding at a frenetic pace. As a result of this expansion, software now touches almost every aspect of our lives. For example, we cook our food in stoves and microwaves that are likely to include programming capabilities. Many of us do business on a computer using various word processing and database applications. Various recreational outlets that utilize software are also available, such as video games and television sets.
To meet this high demand for software products and services, the software industry continuously strives to increase its software output and decrease its software development time. As a result, the software industry has found various ways to increase the pace of software development. For example, one way to increase software development efficiency is to modularize software so that a particular software module may be used within more than one program. Thus, a new software application may be quickly developed by combining existing software modules and possibly adding newly created modules. Modularization of software may also facilitate modification of software since, in theory, it's easier to determine where to make changes within a modularized program than in a nonmodularized program.
However, modularization of software is one of the reasons why there is a need for robust software documentation. In other words, in order for software to be shared efficiently, software must be adequately documented so that a programmer does not have to spend too much time determining how to implement or make changes to each software module. Additionally, software developers need documentation for a particular module in order to develop additional software that is compatible with the particular software module.
Although there are many techniques available today for automatically documenting code, these techniques fall into two broad categories. The first set of documentation techniques implements an "enforcement approach." In this first approach, the programmer develops code within a specific environment, such as a Computer Aided Software Engineering (CASE) tool, that enforces collection of information that is then used to document the code. For example, before the programmer can begin creating her code, the programmer is required to go through various planning stages, such as generating flowcharts, drawings, or text proposals. By way of another example, the programmer may be forced to document her code in a specified manner prior to checking in her completed code.
Although the enforcement approaches provide a mechanism for ensuring that the code is adequately documented, the enforcement approaches have several disadvantages. For example, the programmer must learn how to work within the specific environment and follow the specific enforcement rules of the particular CASE tool in a specified order. That is, the programmer must know what to do for each enforcement procedure. Thus, the enforcement approach requires many man-hours for the programmer to learn how to comply with the particular enforcement requirements and how to work within the specific enforcement environment. Additionally, an individual programmer may not need to follow certain enforcement rules (e.g., creating a flowchart prior to coding), and thus, valuable time is wasted by enforcing inflexible documentation rules on all programmers. In sum, the enforcement approach may get in the way of software production by forcing the programmer to perform pointless, and sometimes complicated, tasks.
The second set of techniques for automatically documenting code implements a kind of "artificial intelligence" (A/I) approach. In general, an A/I approach is implemented on the completed code. That is, the completed code is read by a code interpreter that converts the code into code documentation. Although A/I approaches give programmers more freedom than the enforcement approaches to develop code without stringent enforcement requirements, A/I approaches have their own disadvantages. For example, the A/I approach is typically unreliable and cannot accurately interpret certain portions of the code and convert the portions into meaningful documentation.
Additionally, the code interpreter is limited to interpreting and documenting what is contained within the code. That is, the interpreter may fail to include relevant documentation. For example, if the programmer uses information that is not within the code to write the code, this extrinsic information is typically not included within the code itself. By way of specific example, the code typically does not include such information as the intended purpose of a particular function over another. Thus, the intended purpose of a particular function is left out of the documentation, even though this information may be relevant for interpreting and implementing the code.
In view of the foregoing, there is a need for methods and apparatus for creating code documentation that are simple to use and produce meaningful documentation. Specifically, there is a need for flexible methods and apparatus that allow a programmer to choose any programming environment for generating code and inputting corresponding documentation that sufficiently describes the code. Additionally, there is a need for an easy-to-use technique and system for automatically generating code documentation directly from the generated code and corresponding documentation input.
Accordingly, the present invention provides an apparatus and method for automatically creating code documentation from the code itself. Simple tags are added to the code and used to extract relevant documentation information from the code and/or other documentation or code files.
In one embodiment, a method of creating or modifying a documentation output object that describes a portion of computer code is disclosed. A documentation input object within a code file that is associated with a first documentation information object is provided. The first documentation information object is extracted based on the documentation input object. The first documentation information object is output to the documentation output object.
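The three acts of this method (provide a tag, extract the associated information, output it) can be illustrated with a minimal Python sketch. The tag syntax used here, a comment line of the form `# @doc <text>`, and the function names are assumptions made purely for illustration; the specification does not define them at this point.

```python
import re

# Hypothetical tag syntax (an assumption): a comment of the form "# @doc <text>".
# The tag plays the role of the documentation input object; the text after it
# plays the role of the documentation information object.
TAG_PATTERN = re.compile(r"#\s*@doc\s+(.*)")


def extract_documentation(code_text):
    """Return the text attached to each tag, in the order the tags appear."""
    extracted = []
    for line in code_text.splitlines():
        match = TAG_PATTERN.search(line)
        if match:
            extracted.append(match.group(1).strip())
    return extracted


def build_output(code_text):
    """Join the extracted text into a simple documentation output string."""
    return "\n".join(extract_documentation(code_text))


SAMPLE = """\
# @doc add(a, b): returns the sum of two numbers.
def add(a, b):
    return a + b  # plain comment, not extracted
"""

if __name__ == "__main__":
    print(build_output(SAMPLE))
```

Only the tagged comment is carried into the output; the untagged comment is ignored, so the programmer decides what is documented.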
In a preferred embodiment, the extraction of the first documentation information object is based on a position of the documentation input object. In another embodiment, the method further includes the act of formatting the documentation output object based on the documentation input object. Additionally, the documentation information object may be located either within the code file or outside of the code file.
In other embodiments, the documentation input object is in the form of an engineering tag within the code file for identifying the first documentation information object for extraction to the documentation output object. The method may further include the act of providing a documentation tag within the code file for indicating how to extract a second documentation information object. The method may also include the act of filtering out the second documentation information object based on the documentation tag such that the second documentation information object is not output to the documentation output object.
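A rough Python sketch of how an engineering tag and a documentation tag might cooperate is given here. The `@eng` and `@doc(<mode>)` spellings and the `exclude` mode are hypothetical; they stand in for whatever tag syntax an actual embodiment uses.

```python
import re

# Hypothetical syntax (assumptions, not taken from the specification):
#   "# @eng <text>"          engineering tag: always extract <text>
#   "# @doc(<mode>) <text>"  documentation tag: <mode> says how to treat <text>
ENG_TAG = re.compile(r"#\s*@eng\s+(.*)")
DOC_TAG = re.compile(r"#\s*@doc\((\w+)\)\s+(.*)")


def extract(code_text):
    """Collect tagged text, filtering out items whose documentation tag says 'exclude'."""
    output = []
    for line in code_text.splitlines():
        eng = ENG_TAG.search(line)
        if eng:
            output.append(eng.group(1).strip())
            continue
        doc = DOC_TAG.search(line)
        if doc and doc.group(1) != "exclude":
            output.append(doc.group(2).strip())
    return output


SAMPLE = """\
# @eng parse(): tokenizes the input buffer.
# @doc(exclude) Internal note: rewrite before release.
# @doc(include) Thread-safe since version 2.
def parse(buffer):
    return buffer.split()
"""

if __name__ == "__main__":
    for item in extract(SAMPLE):
        print(item)
```

In this sketch the excluded item stays in the code file for the programmer's benefit but never reaches the documentation output object.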
In yet another embodiment, the documentation input object may also be in the form of a control tag within the code file for indicating from where to extract the first documentation information object. The method may also include outputting the first documentation information object to a data structure.
In another aspect of the invention, a method of creating a data structure is disclosed. A plurality of tags are provided, wherein each tag is associated with a documentation information object. The documentation information objects are extracted based on the associated tag(s). The documentation information objects are arranged in a predetermined order. Additionally, the data structure is formed from the arranged documentation information objects. In a preferred embodiment, the data structure is in the form of a binary tree. In the binary tree, the tags are in the form of a plurality of templates; each template is associated with a plurality of fields; and each field is associated with a text object. In an alternative embodiment, a first one of the templates is linked to the right with a subsequent template from the plurality of templates; each template is linked to the left with a first one of the associated fields, with each associated field being linked to the left with a subsequent one of the associated fields; and each field is linked to the right with an associated text object.
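The linking scheme of the alternative embodiment can be sketched in Python as follows. The `Node` class, the helper names, and the sample template and field names are all hypothetical and serve only to make the left and right links concrete.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """One node of the binary tree: a template, a field, or a text object."""
    label: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build_tree(templates):
    """Build the tree from a list of (template_name, [(field_name, text), ...]).

    Linking follows the description above:
      template.right -> next template
      template.left  -> first field of that template
      field.left     -> next field of the same template
      field.right    -> text object of that field
    """
    root = None
    previous_template = None
    for template_name, fields in templates:
        template = Node(template_name)
        if previous_template is None:
            root = template
        else:
            previous_template.right = template
        previous = template
        for field_name, text in fields:
            field_node = Node(field_name, right=Node(text))
            previous.left = field_node
            previous = field_node
        previous_template = template
    return root


def walk(root):
    """Return (template, field, text) triples in the arranged order."""
    rows = []
    template = root
    while template is not None:
        field_node = template.left
        while field_node is not None:
            rows.append((template.label, field_node.label, field_node.right.label))
            field_node = field_node.left
        template = template.right
    return rows


EXAMPLE = [
    ("function", [("name", "add"), ("purpose", "Adds two numbers")]),
    ("module", [("name", "math_utils")]),
]

if __name__ == "__main__":
    for row in walk(build_tree(EXAMPLE)):
        print(row)
```

Because every node needs at most two links, a single node type suffices for templates, fields, and text objects alike, and the arranged order is recovered by following right links across templates and left links down each template's fields.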
In yet another embodiment, a computer readable medium containing program instructions for creating or modifying a documentation output object that describes a portion of computer code is disclosed. The computer readable medium includes computer readable code for (1) providing a documentation input object within a code file that is associated with a first documentation information object, (2) extracting the first documentation information object based on the documentation input object, and (3) outputting the first documentation information object to the documentation output object.
In another embodiment of the present invention, a computer system for creating or modifying a documentation output object that describes a portion of computer code is disclosed. The system includes a code file having a documentation input object that is associated with a first documentation information object and a documentation device that is configured to extract the first documentation information object based on the documentation input object and output the first documentation information object to the documentation output object.
These and other features and advantages of the present invention will be presented in more detail in the following specification of the invention and the accompanying figures which illustrate by way of example the principles of the invention.