As computer software and hardware have evolved, multiple-processor systems have become more widely available and high-speed networks have increased the connectivity between systems. This in turn has led to the increased development of distributed systems. Compared to other software development efforts, a key distinction of heterogeneous distributed systems is the variety of their components. Each distributed system is unique, and can consist of a diverse combination of multiple platforms, databases, and types of interfaces; geographical distribution; legacy, off-the-shelf, and custom components; and so on. Many large distributed systems are built by adding new components onto legacy applications, or by integrating new and existing components in ways not planned by the initial component designers.
This variety of components also introduces a variety of performance patterns. Multimedia applications are often I/O-bound, while other components may be limited by peripherals, memory, or the network; different parts of a large system may therefore have significantly different bottlenecks. One of the architectural challenges is identifying the changes in performance, and the potential bottleneck shifts, caused by adding a new component to an existing distributed system.
Large-scale distributed systems also typically have performance and reliability requirements that are more stringent than those of other types of software. One distributed system may have hundreds or thousands of users accessing it through several types of interfaces, as in a corporation's customer information database. Another may need to combine and interpret data from a variety of sensors to support hard-real-time decision-making, as in a command-and-control weapons system. One of the most difficult situations occurs when an existing system must be scaled up and integrated with other existing systems at the same time, as when a departmental application is incorporated into the systems that support an entire organization.
The usefulness of distributed systems is due primarily to six characteristics: resource sharing, openness, concurrency, scalability, fault tolerance, and transparency . . . . It should be noted that they are not automatic consequences of distribution; system and application software must be carefully designed in order to ensure that they are attained.
When designing a distributed system, there are architectural decisions that determine how the software is mapped to the hardware, resulting in a system configuration. These decisions must be made early in the development cycle of a large system, and they have a significant impact throughout the lifetime of the distributed system. An inappropriate architecture may be difficult to implement, have poor performance or scaling problems, and/or be hard to maintain.
The analysis and design of a system configuration typically has two primary steps: partitioning and allocation. Partitioning determines how to group the data and processes into components that have high cohesion and low coupling, and allocation determines how the components are mapped to the available hardware suite. One method of partitioning the system, functional decomposition, focuses solely on the system's functionality as the criterion for modularity. Functional decomposition, however, often results in systems with poor performance because it ignores the other design criteria. In particular, using functional decomposition to determine component boundaries provides poor guidance for allocating across a network. For example, minimizing network traffic, or considering the impact of network latency on real-time requirements, yields better divisions between local and remote access than does dividing along functional boundaries.
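The partitioning step described above can be sketched as a greedy clustering that repeatedly merges the two components exchanging the most communication, so that heavily interacting units end up together (high cohesion) and cross-component traffic (coupling) stays low. This is only an illustrative sketch, not the method of the present invention; the unit names and communication weights below are hypothetical assumptions.

```python
# Greedy partitioning sketch: merge the pair of components with the heaviest
# inter-communication until the target component count is reached.
# All names and weights are illustrative assumptions.

def partition(units, comm, target_components):
    """units: list of unit names; comm: dict frozenset({a, b}) -> weight."""
    components = [{u} for u in units]
    while len(components) > target_components:
        # Find the pair of components with the largest total inter-communication.
        best, best_weight = None, -1.0
        for i in range(len(components)):
            for j in range(i + 1, len(components)):
                w = sum(comm.get(frozenset({a, b}), 0.0)
                        for a in components[i] for b in components[j])
                if w > best_weight:
                    best, best_weight = (i, j), w
        i, j = best
        components[i] |= components.pop(j)   # merge the tightly coupled pair
    return components

units = ["ui", "report", "db", "cache"]
comm = {frozenset({"db", "cache"}): 9.0,
        frozenset({"ui", "report"}): 7.0,
        frozenset({"report", "db"}): 2.0}
print(partition(units, comm, 2))  # → [{'ui', 'report'}, {'db', 'cache'}]
```

With these weights the database and its cache are grouped first, then the user interface and reporting units, yielding two components whose mutual coupling (only the report-to-db traffic) is minimal.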
In general, the partitioning process starts with a distributed system's individual units of code and data, and results in the system's logical view, the application architecture. The allocation process generates its physical view, the system architecture or system configuration. The logical view shows the functional sub-applications and their inter-communication and external integration, independent of how these functional entities relate to the actual components that implement the functionality. The physical view shows the actual components and how they are mapped to the hardware suite, including replication. The process of defining the logical view and transforming it into the physical view is a major part of defining the architecture of a distributed system.
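One way to make the distinction between the logical and physical views concrete is with two small data structures: the logical view records functional components and their inter-communication, while the physical view records which hosts each component is deployed on, including replication. The class and host names below are hypothetical illustrations, not part of the source.

```python
# Minimal sketch of a logical view (components and links) and a physical view
# (component-to-host placement, allowing replication). Names are illustrative.

from dataclasses import dataclass, field


@dataclass
class LogicalView:
    components: set      # functional sub-applications
    links: set           # pairs of communicating components


@dataclass
class PhysicalView:
    placement: dict = field(default_factory=dict)  # component -> set of hosts

    def deploy(self, component, host):
        # A component may be deployed on several hosts (replication).
        self.placement.setdefault(component, set()).add(host)


logical = LogicalView(components={"order-entry", "billing"},
                      links={("order-entry", "billing")})

physical = PhysicalView()
physical.deploy("order-entry", "host-a")
physical.deploy("order-entry", "host-b")   # replicated on two hosts
physical.deploy("billing", "host-c")
```

The transformation from logical to physical view is then the act of filling in `placement` for every component in `logical.components`, which is exactly the allocation step described above.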
Good software design is just as important here as when building smaller applications, but designing the system configuration, or architecture, becomes more important as the application increases in system-level complexity. In contrast to small software packages, where several applications may run on a single processor (PC or workstation), distributed systems exhibit complexity at the structural level, entirely apart from the algorithms they implement.
Many of these configuration decisions are made early in the development cycle of large systems. Performing configuration analysis and design early in the development/integration cycle minimizes the risk that the distributed system will perform poorly, either at initial roll-out or at major change points during the system's production and maintenance phases. Not only do the design decisions have a significant impact on performance, but the data and process locations affect each other recursively. Moreover, postponing the configuration decisions is possible only when the application components are developed with good modularity: many legacy systems consist of a single monolithic component per platform, no matter what comprises the hardware suite.
Configuration decisions are also based on conflicting goals, and the tradeoffs are often not obvious in a complex system. For example, network traffic is lowered by minimizing inter-processor communication (IPC), but may be increased by maximizing throughput. Predicting when a different configuration would improve a particular system's performance is not easy, nor is it always clear what changes would improve a specific application.
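The tradeoff described above can be made tangible with a simple cost measure: for a candidate allocation, network traffic is the communication between components placed on different hosts. The component names, hosts, and message rates below are illustrative assumptions, used only to show how moving one component shifts the cost.

```python
# Sketch of the tradeoff analysis: total network traffic for an allocation is
# the sum of communication between components on different hosts.
# All names and rates are illustrative assumptions.

def network_traffic(allocation, comm):
    """allocation: component -> host; comm: dict (a, b) -> messages/sec."""
    return sum(w for (a, b), w in comm.items()
               if allocation[a] != allocation[b])

comm = {("client", "app"): 5.0, ("app", "db"): 20.0}

colocated = {"client": "h1", "app": "h2", "db": "h2"}   # app beside its database
split     = {"client": "h1", "app": "h1", "db": "h2"}   # app beside the client

print(network_traffic(colocated, comm))  # → 5.0 (only client<->app crosses the net)
print(network_traffic(split, comm))      # → 20.0 (heavy app<->db traffic crosses)
```

Even in this toy example the better placement depends on the relative weights: if the client-to-app rate grew past the app-to-db rate, the ranking of the two allocations would reverse, which is precisely why such tradeoffs are hard to judge manually in a complex system.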
Architectural paradigms, such as 3-tier client/server and X-windows, have been developed to simplify the decision-making involved in distributed system design, but adopting a particular paradigm does not necessarily produce a satisfactory configuration for a specific application. In addition, although these paradigms were developed to address performance issues, they are often inadequate beyond a certain size or complexity of distributed system.
Commercial tools, such as Forte, that automate the generation of communication code still require developers to decide manually on the mapping and to provide this information to the tool via drag-and-drop. Manually analyzing the tradeoffs is a difficult task for a system of any complexity and scale.
Commercial tools implementing UML (Unified Modeling Language), such as Rational Rose™, allow for manual partitioning and allocation steps, and represent the results. However, neither the standard nor the supporting tools provide an automated process for determining the component diagrams or the deployment diagrams.
As a result, there is a need in the art for the present invention.