1. Field of the Invention
The present invention relates to a system for integrating existing application programs in a networked environment, and more particularly, to a system with mechanisms for transforming and manipulating data messages for transfer between different applications on the same computer or on different computers connected via a network or networks and having the same or different computer architectures.
2. Description of the Prior Art
Since the beginning of the computer age, computers and, in particular, computer software programs have been used in a variety of settings to automate processes which were previously conducted mechanically. This automation has typically led to improved efficiency and increased productivity. However, because of the costs of such automation, automation of large businesses and factories has often been conducted on a piecemeal basis. For example, different portions of an assembly line have been automated at different times and often with different computer equipment as a result of the varying functionalities of the various computer systems available at the time of purchase. As a result, many assembly lines and businesses have developed "islands of automation" in which different functions in the overall process are automated but do not necessarily communicate with one another. In addition, in the office environment LANs have been used to allow new computer equipment to communicate; however, software applications typically may not be integrated because of data incompatibilities.
Such heterogeneous systems pose a significant problem to the further efficiencies of automation since these different "islands of automation" and machines with incompatible data types connected to the same network cannot easily communicate with one another. As a result, it has been difficult and expensive to control an entire assembly line process for a large manufacturing facility from a central location except on a piecemeal basis unless the entire factory was automated at the same time with homogeneous equipment which can intercommunicate. Thus, businesses and factories which have already been automated on a piecemeal basis are faced with the choices of eliminating all existing equipment so that homogeneous equipment may be substituted therefor (with the associated prohibitive costs) or waiting for the existing system to become obsolete so that it can be replaced (again at significant expense).
One solution to the above problem has been to hire software programmers to prepare custom code which allows the different "islands of automation" to communicate with each other. However, such an approach is also quite expensive and is rather inflexible and assumes that the overall system remains static. In other words, when further equipment and application software must be integrated into the overall system, the software programmers must be called back in to rewrite the code for all applications involved and to prepare additional custom code for interface purposes. A more flexible and less expensive solution is needed.
The integration of existing heterogeneous applications is a problem which has yet to be adequately solved. There are numerous major problems in such integration of existing applications because of the differences in hardware and their associated operating systems and because of the differences in the applications themselves. For example, because computers are built on proprietary hardware architectures and operating systems, data from applications running on one system is often not usable on another system. Also, programmers must frequently change application code to create interfaces to different sets of network services because of the diversity of such network services. In addition, different applications use different data types according to their specific needs, and, as a result, programmers must alter a receiving application's code to convert the data from another application into types that the receiving application can use. Moreover, incompatible data structures often result because of the different groupings of data elements by the applications. For example, an element with a common logical definition in two applications may still be stored in two different physical ways (e.g., application A may store it in one two-dimensional array and application B may store it in two one-dimensional arrays). Furthermore, applications written in different languages usually cannot communicate with one another since data values are often interpreted differently. For example, C and FORTRAN interpret logical or boolean values differently.
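The two incompatibilities just described can be illustrated with a minimal sketch. The function names are hypothetical and are not taken from any system discussed herein. The sketch assumes that application A stores coordinate pairs in one two-dimensional array while application B stores the same pairs in two one-dimensional arrays, and it assumes, for illustration only, a FORTRAN implementation that tests the low-order bit of a logical word, whereas C treats any nonzero word as true.

```python
from typing import List, Sequence, Tuple

Pair = Tuple[float, float]


def split_pairs(table: Sequence[Pair]) -> Tuple[List[float], List[float]]:
    """Convert application A's layout (one N x 2 array) into
    application B's layout (two one-dimensional arrays)."""
    xs = [row[0] for row in table]
    ys = [row[1] for row in table]
    return xs, ys


def join_pairs(xs: Sequence[float], ys: Sequence[float]) -> List[Pair]:
    """Convert application B's layout back into application A's layout."""
    if len(xs) != len(ys):
        raise ValueError("both arrays must have the same length")
    return list(zip(xs, ys))


def c_truth(word: int) -> bool:
    """C convention: any nonzero integer is true."""
    return word != 0


def low_bit_truth(word: int) -> bool:
    """Assumed FORTRAN convention: only the low-order bit is tested."""
    return (word & 1) == 1


if __name__ == "__main__":
    table = [(1.0, 2.0), (3.0, 4.0)]
    xs, ys = split_pairs(table)
    print(xs, ys)
    print(join_pairs(xs, ys) == table)
    # The same stored word can mean different things to the two languages.
    print(c_truth(2), low_bit_truth(2))
```

Under these assumptions, the integer word 2 is true to the C program but false to the FORTRAN program, so a value passed unchanged between the two would be misread even though both applications agree on its logical definition.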
Partial solutions to the above problems have been proposed to provide distributed networks for allowing various applications to share data. In doing so, these applications have relied on transparent data sharing mechanisms such as Sun Microsystems' Network File System (NFS), AT&T's Remote File Sharing (RFS), FTAM (as defined by the MAP/TOP specifications), or Apollo's Domain File System. However, these systems are limited in that they allow data sharing but do not allow true integration of the different application programs to be accomplished.
Another example of a system for providing interprocess communication between different computer processes connected over a distributed network is the Process Activation and Message Support (PAMS) system from Digital Equipment Corp. This system generally allows processes to communicate with each other regardless of where the processes reside on a common network. Such processes may be located on a single CPU or spread across workstations, clusters, or local or wide area networks (LANs or WANs). The PAMS system manages all connections over the network and provides integration features so that processes on respective workstations, clusters and the like may communicate. In particular, the PAMS message processing system is a network layer which is implemented above other networks to transparently integrate new networks and events into a common message bus. Such a system enables both the network configuration and the message flow on the message bus to be monitored from a single point. The result is a common programming interface for all host environments to which the computer system is connected. Thus, all host environments appear the same to the user.
For example, an ULTRIX host environment running ULTRIX-PAMS is directly connected to a VMS host running VAX-PAMS on its networks, and ULTRIX-PAMS uses VAX transport processes to route all messages over the network. Specific rules are then provided for routing messages using ULTRIX-PAMS and VAX transport processes, where ULTRIX-PAMS functions as a slave transport in that it can only communicate with other PAMS processes via the network through a full function PAMS router. As a result, the PAMS system is limited in that there is no support for "direct" task-to-task communications between ULTRIX processes. In addition, since all traffic must be routed through a VAX-PAMS routing node, a single point of failure exists for the system.
Other systems have been proposed for an information processing environment in which various machines behave as one single integrated information system. However, to date such systems are limited to connecting various subroutines of homogeneous applications running on different machines connected to a common network. For example, the Network Computing System (NCS) of Apollo is a Remote Procedure Call (RPC) software package which allows a process (user application) to make procedure calls to the services exported by a remote server process. However, such RPC systems are typically not fit for the development of a networked transaction management system, for NCS does not provide a message and file handling system, a data manipulation system, a local and remote process control system and the like which allows for the integration of existing applications. Rather, NCS allows for the building of new distributed applications, and does not provide for the integration of existing heterogeneous applications. RPCs instead isolate the user from networking details and machine architectures while allowing the application developer to define structured interfaces to services provided across the existing network.
RPCs can be used at different levels, for the RPC model does not dictate how they should be used. Generally, a developer can select subroutines of a single application and run them on remote machines without changing the application or subroutine code. The simplest use of RPCs is to provide intrinsic access to distributed resources which are directly callable by an application, such as printers, plotters, tape drives for backup tasks, math processors for complex and time-consuming applications, and the like. A more efficient use of RPC at the application level would be to partition the application so that the software modules are co-located with the resources that they use. For example, an application which needs to extract data from a database could be partitioned so that the modules which access the database could reside on the database machine.
A diagram of NCS is shown in FIG. 1. The system 100 therein shown generally consists of three components: an RPC run time environment 132, 134 which handles packaging, transmission and reception of data and error correction between the user and server processes; a Network Interface Definition Compiler (NIDC) 136 which compiles high-level Network Interface Definition Language (NIDL) into C-language code that runs on both sides of the connection (the user and server computers); and a Location Broker 128 which lets applications determine at run time which remote computers on the network can provide the required services to the user computer. In particular, as shown in FIG. 1, a user application 102 interfaces with a procedure call translator stub 104 which masquerades as the desired subroutine on the remote computer. During operation, the RPC run time system 106 of the user's computer and the RPC run time system 108 of the server system communicate with each other over a standard network to allow the remote procedure call. Stub 110 on the server side, which masquerades as the application for the remote subroutine 112, then connects the remote subroutine 112 across the network to the user's system.
The NCS system functions by allowing a programmer to use a subroutine call to define the number and type of data to be used and returned by the remote subroutine. More particularly, NCS allows the application developer to provide an interface definition 114 with a language called the Network Interface Definition Language (NIDL) which is then passed through NIDL compiler 116 to automatically generate C source code for both the user and server stubs. In other words, the NIDL compiler 116 generates stub source code 118 and 120 which is then compiled with RPC run time source code 122 by C compilers 124 and 126 and linked with the application 102 and user-side stub 104 to run on the user's machine while the subroutine 112 and its server-side stub 110 are compiled and linked on the server machine. After the application 102 has been written and distributed throughout the network, location broker 128 containing network information 130 may then be used to allow the user to ask whether the required services (RPC) are available on the server system.
Thus, with NCS, the NIDL compiler automatically generates the stubs that create and interpret data passed between an application and remote subroutines. As a result, the remote subroutine call appears as nothing more than a local subroutine call that just happens to execute on a remote host, and no protocol manipulations need to be performed by the application developer. In other words, the NCS system is primarily a remote execution service and does not need to manipulate data for transfer by restructuring a message to allow for conversion from one data type to another. A more detailed description of the NCS system can be found in the article by H. Johnson entitled "Each Piece In Its Place," Unix Review, June 1987, pages 66-75.
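The stub arrangement described above can be pictured with the following minimal sketch. It is not NCS code and does not reproduce NIDL compiler output; all names are hypothetical, the network is replaced by a direct in-process call, and a fixed wire format of 32-bit big-endian integers is assumed. The user-side stub packs the arguments and unpacks the reply, while the server-side stub does the reverse around the real subroutine.

```python
import struct
from typing import Callable


def remote_add(a: int, b: int) -> int:
    """The real subroutine, which would reside on the server machine."""
    return a + b


def server_stub(request: bytes) -> bytes:
    """Server-side stub: unpack the request, call the subroutine, pack the reply."""
    a, b = struct.unpack("!ii", request)      # two 32-bit big-endian integers
    return struct.pack("!i", remote_add(a, b))


def client_stub(a: int, b: int,
                transport: Callable[[bytes], bytes] = server_stub) -> int:
    """User-side stub: looks like a local call to the application.

    `transport` stands in for the RPC run time and the network; here it
    is simply the server stub called directly (an assumption made to keep
    the sketch self-contained).
    """
    request = struct.pack("!ii", a, b)
    reply = transport(request)
    (result,) = struct.unpack("!i", reply)
    return result


if __name__ == "__main__":
    # The application calls the stub exactly as it would call a local routine.
    print(client_stub(2, 3))
```

Note that the stubs only carry values between two parties that already agree on the argument types; nothing in this arrangement restructures a message for a receiving application that expects a different data type or grouping.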
The RPC system of the NCS primarily provides a remote execution service which operates synchronously in a client/server relationship in which the client and server have agreed in advance on what the requests and replies will be. Applications must be developed specifically to run on NCS or substantially recoded to run on NCS. Moreover, because a remote procedure cannot tell when it will be invoked again, it always initiates communications at the beginning of its execution and terminates communications at the end. The initiation and termination at every invocation makes it very costly in performance for a remote procedure to set up a connection with its caller. As a result, most RPC systems are connectionless. This is why RPC systems such as NCS must build another protocol on top of the existing protocol to ensure reliability. This overhead causes additional processing to be performed which detracts from performance.
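The additional protocol mentioned above can be sketched as follows. This is one common approach (retransmission combined with duplicate suppression on the server) and is offered as an assumption, not as the protocol NCS actually uses. All class and function names are hypothetical, and message loss is simulated deterministically instead of using a real datagram network.

```python
from typing import Dict, Optional


class Server:
    """Executes each request at most once by caching replies per request id."""

    def __init__(self) -> None:
        self.replies: Dict[int, int] = {}
        self.executions = 0

    def handle(self, request_id: int, value: int) -> int:
        if request_id not in self.replies:
            self.executions += 1
            self.replies[request_id] = value * 2   # the "remote procedure"
        return self.replies[request_id]


class LossyLink:
    """Simulated connectionless transport that loses a fixed number of
    requests and then a fixed number of replies."""

    def __init__(self, server: Server, drop_requests: int, drop_replies: int) -> None:
        self.server = server
        self.drop_requests = drop_requests
        self.drop_replies = drop_replies

    def send(self, request_id: int, value: int) -> Optional[int]:
        if self.drop_requests > 0:
            self.drop_requests -= 1
            return None                            # request never arrived
        reply = self.server.handle(request_id, value)
        if self.drop_replies > 0:
            self.drop_replies -= 1
            return None                            # executed, but reply lost
        return reply


def reliable_call(link: LossyLink, request_id: int, value: int, max_tries: int = 5) -> int:
    """Caller side: retransmit the same request id until a reply arrives."""
    for _ in range(max_tries):
        reply = link.send(request_id, value)
        if reply is not None:
            return reply
    raise TimeoutError("no reply after %d tries" % max_tries)


if __name__ == "__main__":
    server = Server()
    link = LossyLink(server, drop_requests=1, drop_replies=1)
    print(reliable_call(link, request_id=7, value=21), server.executions)
```

In this sketch the caller must send the request three times before a reply gets through, and the server must retain a table of past replies so that a retransmitted request is not executed twice. Both are examples of the extra processing that such a layered protocol imposes.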
Accordingly, although NCS provides a consistent method for remote execution in a heterogeneous network environment, it is designed primarily to broker distributable services such as printing and plotting across the network, where the user may not care which printer prints the information as long as it gets printed. Another type of service might be providing processing time for applications where a small amount of data in a message can trigger an intensive and time consuming calculation effort to achieve an answer that can itself be turned into a message. However, the NCS system cannot provide a truly integrated system for incompatible node type formats and data processing languages.
None of the known prior art systems addresses the substantial problems of integrating existing heterogeneous applications in a heterogeneous and/or homogeneous network environment. Accordingly, there is a long-felt need in the art for an integration system which provides for flexible data transfer and for transformation and manipulation of data among existing application programs in a networked environment of heterogeneous and/or homogeneous computers in a manner that is transparent to the user. The present invention has been designed to meet these needs.