The invention relates to source code generators, more specifically to source code generated in the context of component-based programming. In particular, using a set of generation instructions and parameters, the generator produces nearly-repetitive and repetitive source code, ready for use by the developer.
Software development processes have changed dramatically in recent years as a result of the increasing usage of object-oriented models for programming, project management, and systems integration.
Enterprise applications are large, complex systems that turn abstract business practices into specific processes and event-driven interactions. These applications form the core of an organization's methods.
Firms can capture and distill these "best practices" in a number of ways. In many cases, the creation of custom software for the entire organization is both unprofitable and beyond the capabilities of the enterprise. As a result, these companies turn to enterprise application software vendors for software solutions. Variously referred to as Enterprise Resource Planning (ERP, from vendors like Peoplesoft and SAP), Customer Relationship Management (Vantive, Clarify, Siebel) and Professional Services Automation (Niku, Visto), these systems quite simply run the business.
Each company is different, however, and these differences are what make each firm competitive. To translate their distinct competencies into electronic processes, companies must tailor their enterprise applications to the way they work. This means tuning, changing, and growing the applications over time.
One approach to this is known as "parameterization". In parameterization, trained ERP technicians configure many parameters and variables in order to tailor the behavior of the application to the nature of the business.
A second approach to modifying existing enterprise applications is to take an existing, generic application and to modify its source code directly. Until recently, a company that needed a highly specialized application because of its unique business practices was faced with paying a license fee and rewriting huge volumes of code from one of these generic applications.
Companies use source code customization as a response to the lack of control and customization afforded them by off-the-shelf applications. Similarly, they use parameterized ERPs as a response to poor maintainability and complex coding of licensed software or internally developed systems. A powerful alternative to these two approaches is the object-oriented framework approach.
Application development always starts from a set of incomplete, imprecise, and sometimes contradictory or inconsistent requirements. It is a difficult task for the developers and analysts of a complex system to clearly define from the outset what is being built.
Object-oriented programming methodologies strive to correct this imprecision through a more accurate modeling of the real world. As a successor to procedural models of coding, object-oriented techniques create a common vocabulary and well-defined boundaries in order to ease the integration of disparate objects. This helps developers define the scope and interaction of an application as a set of discrete components, improving design and facilitating subsequent modifications to the software.
Developers and systems analysts use high-level design tools called modeling tools to describe the business purpose of an application in a logical way that is abstracted from the physical implementation of the specific programming language, operating system and hardware on which the application runs.
Having an abstract description of the application reduces the burden of creating and maintaining the various software iterations it will undergo in its lifetime. If the job is to locate a piece of functionality in a million-line program, the guidance a model can provide becomes invaluable.
To promote a consistent methodology among modeling vendors, the industry has developed the Unified Modeling Language (UML) in order to standardize the elements used during the analysis and design of applications. The wide acceptance of UML and the emergence of component-based development have made the use of modeling tools the starting point of any object-oriented development.
Rather than designing an application from scratch, companies develop multi-tier e-commerce applications that make use of framework-based solutions offering services like persistency, security and transactions. A specific framework is an object-oriented abstraction which provides an extensible library of cooperating classes that make up a reusable design solution for a given problem domain. In essence, a framework offers a generic solution for which developers need to implement specialized code acting as component-to-framework integration code in each component in order to specify its particular process in the framework.
The framework approach is a dramatic improvement to the informal alteration of licensed, procedural source code. Frameworks leverage capital-intensive software investments through re-use, and provide a much higher-level application programming interface so that applications can be developed far more quickly.
When a commercial framework is customized or when new functionality is added to an in-house framework, changes are often required in the component-to-framework integration code for all participating components. Maintaining the code over the course of such changes is a tedious, error-prone process. With the popularity of framework-based solutions, it is also an increasingly large part of a developer's work. Even when this code is produced by a code generator, developers still have to manually change the generated integration code in every component participating in a given framework because code generators cannot be customized to the degree where company requirements are properly implemented inside the framework. Also, most frameworks prevent a company from centralizing and capturing their corporate repository data because typical code generators are seldom well integrated with a modeling tool.
Frameworks are used not only to build integrated business applications in any industry, but also in the development of user interfaces, graphics libraries, multimedia systems, and printing solutions; low-level services such as drivers and network protocols; common sets of infrastructure services such as those provided by the Enterprise JavaBeans™ Component Model and CORBA (short for Common Object Request Broker Architecture) Object Services; and development tools that further speed application creation.
Because a framework links software components with a high-level generic solution, it is imperative that the framework and the components remain synchronized. This is generally achieved by modifying both the framework and the components manually.
In some specific cases, however, information within the modeling tool can be used to generate portions of the integration code, without manual intervention. This is known as code generation. Code generation systems promise to streamline application development by letting developers work at a more abstract level in their design.
Unfortunately, typical code generators tend to generate incomplete code that must be manually altered or even discarded entirely.
The two most time-consuming repetitive tasks for a developer are repetitive coding of methods, such as writing the specialized software methods associated with each property of a component, known as 'get' and 'set' selectors, and nearly-repetitive coding of specific methods that the framework calls. When a developer extends the functionality within a framework, it is done by sub-classing one of the existing components and extending its functionality. In order to reflect these changes in the component-to-framework integration code, developers must manually make the changes according to the new functionality. Nearly-repetitive coding is prone to error, and its monotony reduces the satisfaction and productivity of developers.
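As an illustration only, the repetitive 'get' and 'set' selectors mentioned above might look as follows in Python. The class name `Customer` and its two properties are hypothetical and are not taken from any particular framework.

```python
class Customer:
    """Hypothetical component with two properties (illustrative names only)."""

    def __init__(self):
        self._name = None
        self._customer_number = None

    # Each property needs the same pair of selectors; only the name changes.
    def get_name(self):
        return self._name

    def set_name(self, value):
        self._name = value

    def get_customer_number(self):
        return self._customer_number

    def set_customer_number(self, value):
        self._customer_number = value
```

Every additional property repeats the same two methods, which is the kind of code a generator is well placed to produce.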
Typical code generation solutions provided by modeling tools, persistence tools and integrated development environment vendors aim to reduce the impact of repetitive and nearly-repetitive coding, but in doing so they reduce the flexibility that a framework approach affords a developer. As a result, the developer cannot: change the method name being generated; modify the code being generated for a particular method; generate different content for the same method name, based on contextual information; generate personalized code for in-house developed frameworks; generate a method name across a set of components if it is not pre-defined in a modeling tool for all components concerned; or generate derived components with their associated required code for mainstream solutions such as the Enterprise JavaBeans™ Component Model.
There are several ways to produce the code to maintain the integration between a framework and its participating components.
Manual Coding
The simplest method of maintaining an application is to manually code the repetitive and nearly-repetitive portions of the application. While this affords a greater degree of control over the resulting software than typical code generators, it is a costly, error-prone, monotonous technique that puts projects at risk and leads to the dissatisfaction of developers.
Typical Integrated Code Generators
Many modeling tools include integrated code generators. These produce small quantities of repetitive or nearly-repetitive code, essentially skeletons of each intended method that guide developers in terms of the format of the method's content. Unfortunately, the resulting code can seldom be regenerated without losing previously inserted manual code, which is still required in most cases to fulfill the application's complete functionality.
Controlling the code generation process with such coding tools is a potential workaround to these limitations. Some modeling tool vendors expose their code generation algorithms in the form of a scripting language.
Modifying these scripts further complicates the development process, however, since the development team must now maintain the scripts in order to generate useful code, and doing so requires a far deeper understanding of the specific applications and the modeling tool than that required by an object-oriented development environment. Furthermore, by modifying scripts the organization cuts itself off from the vendors' code generator support staff and undermines upgrade efforts as the modeling vendor releases new software versions.
Scripting
In some cases, even without access to a modeling tool's code generation algorithms, a developer can still use a scripting language to actually build a personalized source code generator for a particular target language.
This approach requires skilled development teams, and adds to the burden of development and maintenance. Rather than maintaining code manually, developers must now maintain the scripts that generate the code. Unfortunately, the resulting code is hard to use and difficult to maintain and evolve because the target source code is found in broken pieces inside the scripting language and it must be modified in the proper sequence. This is a problem that extends far beyond the capabilities of these tools. Furthermore, altering the scripts to target a new language or development environment is an intimidating challenge.
4GL
Fourth-generation Languages (4GL) let the developer work entirely at an abstract level rather than coding directly in the target programming language. 4GL systems rely on either the UML notation, a proprietary language or a visual programming environment in order to provide the specifications for generating applications. Unfortunately, the 4GL developer pays a price in terms of code efficiency and the resulting performance: since the generated code is not tailored to the specific needs of a given algorithm, it cannot be optimized.
Code generated from 4GL tends to be difficult to maintain because of poor integration with existing systems and the use of generated numeric variables. For example, the generated code contains generically named variables (label1, label2, etc.) rather than meaningful names such as 'name' or 'customer number'.
Static Generation Instructions
The static approach begins with the creation of a set of static generation instructions as a piece of parameterized target source code. The generator then replaces any variables found in the set of static generation instructions with information contained in the modeling tool. These sets of static generation instructions are static because they access only the same type of information (particular method properties, for instance) that is contained in the modeling tool.
The static generation instructions approach does not adapt well to the generation of component-to-framework integration methods because such generation usually requires information which may be located in either the class, attribute or role. Static sets of generation instructions are often used to develop method skeletons that extract associated properties such as pre- and post-conditions from the model. Methods generated in this manner require manual intervention from the developer in order to complete the method content.
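The static approach described above can be sketched roughly as follows. This is a minimal illustration, not the mechanism of any specific tool: the skeleton text, the variable names (`method_name`, `pre`, `post`) and the sample method properties are all assumptions made for the example. Python's standard `string.Template` stands in for the parameterized target source code.

```python
from string import Template

# Static generation instructions: parameterized target source code.
# The variables refer only to method properties (a single kind of model data).
METHOD_SKELETON = Template(
    "def ${method_name}(self):\n"
    "    # pre-condition: ${pre}\n"
    "    # post-condition: ${post}\n"
    "    raise NotImplementedError  # body left for the developer\n"
)

# Hypothetical method properties, as a modeling tool might expose them.
model_methods = [
    {"method_name": "deposit", "pre": "amount > 0", "post": "balance increased"},
    {"method_name": "withdraw", "pre": "amount <= balance", "post": "balance decreased"},
]


def generate_skeletons(methods):
    """Substitute each method's properties into the fixed skeleton."""
    return [METHOD_SKELETON.substitute(m) for m in methods]


skeletons = generate_skeletons(model_methods)
```

The output is only a skeleton: the method body still has to be written by hand, which matches the limitation noted in the text.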
Generators using sets of static generation instructions can also translate programming instructions from object-oriented source code (e.g. C++) into a non-object-oriented source code (e.g. C) using systems such as described in U.S. Pat. No. 5,675,801 granted to Lindsey on Oct. 7, 1997.
Current generation technologies can be illustrated by the schematic block diagram in FIG. 1. A developer utilizes known input means, such as a keyboard 36, mouse 37 or other user interface 38 to interact textually or graphically with visual modeling tool or integrated development environment 30 and to describe the components involved in his application domain model. The tool 30 generates its own internal representation of the model declarations 31. Most parser/generator engines 33 produce generated code by mapping model declarations 31, using their corresponding subset of pre-defined static generation instructions 32, either directly into target code 35 or with the use of an intermediate representation like an abstract syntax tree 34. Using this approach, the generator initializes, at any given step of a set of generation instructions, its generator context with the currently traversed node of the syntax tree. The generator processes the instructions associated with that node. Although the generation process requires changing the generator context with any unprocessed nodes representing the process to execute, the generator context is static because it does not change during the execution of the generation instructions associated with that node.
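A rough sketch of the static-context traversal described above is given below. The `Node` type, the node kinds and the per-kind instruction strings are hypothetical and chosen only to show that the context is fixed to one node while that node's instruction runs.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str  # e.g. "class" or "attribute"
    name: str
    children: list = field(default_factory=list)


# Pre-defined static generation instructions, one per node kind (hypothetical).
INSTRUCTIONS = {
    "class": "class {name}:",
    "attribute": "    {name} = None",
}


def generate(node, out):
    """Depth-first walk; the context stays fixed while a node's instruction runs."""
    context = node  # generator context = currently traversed node
    out.append(INSTRUCTIONS[context.kind].format(name=context.name))
    for child in node.children:
        generate(child, out)  # the context changes only between nodes
    return out


tree = Node("class", "Account", [Node("attribute", "balance"), Node("attribute", "owner")])
lines = generate(tree, [])
```

Because an instruction sees only its own node, an attribute's instruction in this sketch has no way to mention the name of its owning class.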
A model declaration 31 would include a list of classes, potentially grouped by package name, the description of the operations and attributes for each class, the relationships among classes and, potentially, the attribute characteristics associated with a particular framework.
Static generation instructions 32 have been involved in code generation in order to translate what was provided by the developer and expressed in a different representation or abstraction than the target source code 35. Developers do not easily have access to the library of pre-defined templates 32 in order to modify them because they are not used to dictate what code to generate but are indirectly involved based on their relations with the model declarations 31 being translated.
Some application development tools, like those developed with the programming language called Smalltalk, provide a framework for source code generation in which the target source code 35 is created by binding parameterized target source code embedded within one or more Smalltalk operations with values representing a subset of the model declarations 31 provided by the tool and used as input for the application development tool. This kind of source code generation cannot easily be modified by the developer because he has to find out where to make his changes. Also, such changes might not be supported by the tool vendor because the developer has modified the tool's internal source code.
A current technology based on frames generates source code using the concept of an automated assembly line which manufactures software modules. A software module represents a component and is assembled from a hierarchy of frames. Frames contain data elements and/or methods, expressed in any language, and can be parameterized using frame parameters. Frame parameters are local to the frame that first sets their values. A frame typically initializes all of its parameters to the default values it uses, so that ancestor frames can set the parameters that need to be changed with string values representing target source code. This technology makes code generation hard to achieve because the user has to know in advance which portions of the generated code are repetitive in order to design the frame hierarchies.
FIG. 2 illustrates the sequence of events necessary when using a prior art system. A model 40 is created in a tool. Code 41 is generated, which has low usability as explained earlier. Significant amounts of manual coding 42 are necessary to render this code output 41 useful. Names of operations, for example, could be changed to clearly express what they are. Once these manual modifications 42 are done, the final code 43 is ready for use within the system. An example of such a prior art system is found in EP 0,219,993 wherein a system for generating software source code components is described.
Accordingly, an object of the present invention is to provide a system for automating repetitive developer tasks associated with component development and source code generation requiring a minimum of developer intervention in order to obtain the desired functions.
Another object of the present invention is to provide a source code generator associated with component development which produces source code files which are more complete and more accurately match the desired functions than those produced by using the previously mentioned generators.
Another object is reducing code maintenance activities by regenerating operations. By achieving a greater percentage of generated code, the amount of code that needs to be maintained is reduced because the generated code never needs to be manually altered. In addition, the actual number of changes needed in the set of generation instructions is significantly smaller.
Another object of the present invention is to be able to integrate it with all major visual modeling tools or major integrated development environments in order to avoid capturing the same information in more than one tool.
Another object of the present invention is to provide a user interface tool which lets the developer define textually or graphically what needs to be generated through the use of sets of generation instructions and lets him customize the source code generator options, once the target programming language has been specified to the tool. Once the developer saves sets of generation instructions and generator options into a file, the source code generator tool associated with component development can be invoked in batch mode from a modeling tool or an integrated development environment, without any developer interaction.
Yet another object of the present invention is to provide a source code generator tool which generates code for any object oriented programming language or any programming language supporting components.
Also, another object of the present invention is to provide an interface for creating target source code found in each set of generation instructions which requires a minimum of developer intervention to obtain the desired set of generation instructions.
Also, another object of the present invention is to provide traceability between a block of code and its set of generation instructions.
Yet another object of the present invention is to provide a generation instructions editor that meets the needs for editing the source code in a set of generation instructions, which can be used in all kinds of development environments such as web-based servers, framework environments, code generation tools or others.
Sets of generation instructions link the conceptual components in the high-level model to their respective coding implementation and their integration with given frameworks by establishing a reference to several types of properties.
Sets of generation instructions represent target source code that is parameterized with context variables. Each context variable indicates the referenced node (such as class or attribute) and the identifier (such as superclass name, name or type). The superclass name identifier could be used only when the referenced node is a class and the type identifier could be used only when the referenced node is an attribute. Also, the name identifier would give different kinds of results depending on whether the referenced node is a class or an attribute. In other words, the node that is being referenced can vary since the code generator's context is dynamic.
If the generator context contains an attribute of a class, the set of generation instructions can still refer to its class; the identifier called 'name' would return the class name or the attribute name based on whether the referenced node is viewed as a class or an attribute.
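The following is a minimal sketch of how context variables of the form node plus identifier might be resolved against a dynamic context. The `<node.identifier>` placeholder syntax, the `resolve` and `expand` helpers and the sample model are assumptions made for illustration; the text does not prescribe a concrete notation.

```python
import re

# Hypothetical model: one class with two attributes.
MODEL = {
    "name": "Customer",
    "superclass_name": "Party",
    "attributes": [
        {"name": "name", "type": "str"},
        {"name": "number", "type": "int"},
    ],
}

# Which identifiers are valid depends on the referenced node.
VALID = {"class": {"name", "superclass_name"}, "attribute": {"name", "type"}}


def resolve(variable, context):
    """Resolve a context variable such as 'attribute.name' against the context."""
    node, identifier = variable.split(".")
    if identifier not in VALID[node]:
        raise KeyError(f"{identifier!r} is not defined for node {node!r}")
    return str(context[node][identifier])


def expand(template, context):
    """Replace every <node.identifier> placeholder in the template."""
    return re.sub(r"<(\w+\.\w+)>", lambda m: resolve(m.group(1), context), template)


TEMPLATE = "def get_<attribute.name>(self) -> <attribute.type>:  # on <class.name>"

# The context holds an attribute and still gives access to its owning class.
generated = [
    expand(TEMPLATE, {"class": MODEL, "attribute": attr})
    for attr in MODEL["attributes"]
]
```

Here the same identifier, `name`, yields the class name or the attribute name depending on which node the variable references, and `superclass_name` is rejected for an attribute.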
Dynamic, context-sensitive code generation techniques generate greater amounts of more usable code because they leverage information within the model during the generation process. If a developer wishes to alter code during maintenance, he or she simply changes the set of generation instructions or the content captured in the class diagram of a modeling tool.
According to a first aspect of the present invention, there is provided a method for generating computer source code in an object-oriented programming language or a programming language supporting components. The method comprises the steps of providing a set of generation instructions, the set of generation instructions comprising definitions of at least two nodes and at least two corresponding node identifiers; target source code parameterized with at least two context variables, wherein each of the at least two context variables points to one of the nodes and the corresponding node identifiers and wherein the at least two context variables indicate at least two different nodes, wherein the target source code is a valid syntax statement for the programming language; at least two filter instructions each comprising a selection criterion for selecting at least one of the nodes according to the definitions; and the step of automatically generating, in response to the at least two filter instructions, a plurality of code segments in the programming language by replacing, in the parameterized target source code, the at least two context variables with a value of selected ones of the node identifiers.
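One possible reading of the method above is sketched below, with filter instructions modeled as simple predicates over nodes. The node data, the `persistent` flag, the predicate form of the filters and the template text are all hypothetical; they are not the claimed implementation.

```python
# Hypothetical node definitions: classes and their attributes, with identifiers.
CLASSES = [
    {"name": "Customer", "persistent": True,
     "attributes": [{"name": "name", "type": "str"},
                    {"name": "number", "type": "int"},
                    {"name": "cache", "type": "dict"}]},
    {"name": "Report", "persistent": False,
     "attributes": [{"name": "title", "type": "str"}]},
]

# Parameterized target source code whose context variables reference two
# different nodes (the class and the attribute). Each expansion is a
# syntactically valid Python statement.
TEMPLATE = ("def save_{attribute_name}(self): "
            "return ('{class_name}', '{attribute_name}', self.{attribute_name})")


# Filter instructions, assumed here to be predicates used as selection criteria.
def class_filter(cls):
    return cls["persistent"]


def attribute_filter(attr):
    return attr["type"] != "dict"


def generate_segments(classes):
    """Expand the template once per node pair selected by the two filters."""
    segments = []
    for cls in filter(class_filter, classes):
        for attr in filter(attribute_filter, cls["attributes"]):
            segments.append(TEMPLATE.format(
                class_name=cls["name"],
                attribute_name=attr["name"],
            ))
    return segments


segments = generate_segments(CLASSES)
```

Changing either filter changes which nodes receive generated code, without touching the template itself.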
Preferably, the steps are repeated until a set of code segments is obtained which defines a plurality of components.
Preferably, the method further comprises a step of selecting the at least one context variable from a list of context variables.
Preferably, the method further comprises automatically generating pointer data, for each of the code segments, pointing to locations in the set of generation instructions used in the code segment generating step. Preferably, the pointer data is stored in a separate file from the plurality of code segments or as comment lines throughout the code segments. Preferably, the method further comprises finding at least one portion of the source code requiring a change; using the pointer data from the portion to find locations in the set of generation instructions needing a corresponding change; changing the set of generation instructions at at least some of the locations; and repeating the step of automatically generating the plurality of code segments using the set of generation instructions as changed in the previous step.
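The pointer data mechanism might be sketched as follows, using the comment-line variant. The `# generated-from:` marker, the template identifiers and the `validate` call in the edited template are hypothetical names introduced only for this example.

```python
import re

# Hypothetical set of generation instructions, keyed by template identifier.
TEMPLATES = {
    "getter": "def get_{name}(self): return self._{name}",
    "setter": "def set_{name}(self, value): self._{name} = value",
}

POINTER = "# generated-from: {template_id}"


def generate(attribute_names, templates):
    """Emit each code segment preceded by a comment pointing at its template."""
    lines = []
    for name in attribute_names:
        for template_id, body in templates.items():
            lines.append(POINTER.format(template_id=template_id))
            lines.append(body.format(name=name))
    return lines


def template_for(lines, index):
    """Follow the pointer comment above a generated line back to its template id."""
    match = re.match(r"# generated-from: (\w+)", lines[index - 1])
    return match.group(1) if match else None


code = generate(["name"], TEMPLATES)

# Suppose the setter needs a change: locate its template, edit it, regenerate.
setter_line = next(i for i, line in enumerate(code) if line.startswith("def set_"))
origin = template_for(code, setter_line)
TEMPLATES[origin] = "def set_{name}(self, value): self._{name} = validate(value)"
regenerated = generate(["name"], TEMPLATES)
```

The change is made once in the generation instructions and then propagated by regeneration, rather than edited by hand in each generated segment.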
Preferably, the method comprises at least one of creating, ordering and customizing the set of generation instructions.
Preferably, the method further comprises a step of a programmer translating the at least one set of generation instructions written in a first programming language into a second programming language, whereby the plurality of code segments are generated in a second programming language.
According to another aspect of the present invention, there is provided a computer program comprising code means adapted to perform all steps of the method, embodied on a computer readable medium.
According to another aspect of the present invention, there is provided a computer program comprising code means adapted to perform all steps of the method, embodied as an electrical or electro-magnetical signal.
According to still another aspect of the present invention, there is provided a computer data signal embodied in a carrier wave comprising a plurality of code segments generated according to the steps of the method.
It will be appreciated that the set of templates may specify the creation of components (and also of operations or attributes) not originally intended to be specified in the model declaration data. This is achieved by specifying in the set of templates the creation of components related to components found in the model declaration data. This allows for certain sets of templates to be used to respond to the needs of particular programming frameworks.
An apparatus for generating computer source code in a component-based language is also provided, the apparatus comprising a template editor for creating or customizing sets of generation instructions, a template manager for selecting and ordering sets of generation instructions and setting generation parameters, a source code generator and a source code output. The apparatus can further comprise a pointer data generator and language databases for choosing the language in which the code segments will be generated.
Another apparatus is provided for modifying generated computer source code, comprising a source code editor for retrieving the source code, a template manager, a template editor, a source code generator and a source code output.
For the purpose of the present invention, the following terms are defined below.
The term "template" is intended to mean the target source code parameterized with at least one filter variable and at least one context variable.
The term "context variable" is intended to mean a variable which indicates a referenced node and an identifier.
The term "filter variable" is intended to mean a variable which specifies the selected components for which code is to be generated.
The term "object-oriented programming" is intended to mean a type of programming language in which developers not only define the data type of a data structure, but also the types of operations that can be applied to the data structure. In addition, developers can create relationships between one object and another.
The term "component" is intended to mean an entity which has the encapsulation and abstraction characteristics of an object. Additionally, they are part of a technology framework that defines the standards for their structure, capabilities and interaction.
The term "model declaration" is intended to mean a representation which captures the static structure of a system by showing the components in the system, relationships between the components, and the attributes and operations that characterize each class of components.
The term "bind" is intended to mean an association between an identifier or variable and a value for that identifier or variable that holds within a defined scope.
The term "set of generation instructions" is intended to mean a set of instructions comprising at least one template, filter variable and model declaration.