1. Field of the Invention
The present invention relates generally to digital computer input/output data transfer and transformation, and more specifically to algorithmic control of data streams.
2. Description of the Related Art
Numerous existing hardware and software technologies are used to interconnect computer systems, peripheral devices, software programs, and other elements. These technologies are stratified, from low-level hardware connections to high-level message and software protocols. FIGS. 1-5 illustrate existing connection technology, to clarify terminology and to set context.
Referring to FIG. 1, a desktop computer 102 illustrates typical connections 114 that link it to external devices 122 (such as monitors, keyboards, printers, and scanners) and to other computers 132 (such as web servers and file servers). The computers 102 and 132 are used to run programs 142 and 152 that utilize the connections 114 to communicate with each other and with the devices 122. Software-to-device communication is supported by messaging interfaces 162 that may be implemented in hardware, firmware, or software.
Computer connections 114 are most often considered in terms of their physical components (cables, connectors, etc.), but they also involve other stratified elements, all working together to connect programs 142 with remote resources 122 and 132. Proceeding to FIG. 2, a scanner 202, printer 204, and two computers 102 and 132 are linked with what appear to be three simple physical connections 114, 214, and 216. Data 210 travels from scanner 202 to computer 102 (after transiting intermediate devices such as routers 232 and hubs 236), where it is received by a software application 242. The software 242 sends different data to a program 252 on the second computer 132, which in turn sends still other data through a connection 216 to a printer 204, where output 262 is produced (in this example, a French translation of an English expression scanned as input).
An untrained observer might conclude that data 210 simply travels through the wires and devices, to emerge as output 262 from the printer 204 like water through a hose, or like electricity flowing to a light fixture. However, in modern data communications, structured protocols have replaced simple on/off settings and analog signals. Thus, layered connections link the programs 242 and 252 with each other and with remote devices 202 and 204.
Proceeding to FIG. 3, two connections are shown in greater detail. In the center are the lower-level or physical connections 214 and 216, which move messages between computers 102 and 132 and the printer 204. These connections might pass through a “cloud” of diverse lower-level devices and connection technologies 316; examples include routers, modems, hubs, “wedge” interfaces, and cables, as well as the standards and protocols they utilize, such as Ethernet, USB, RS-232, and IEEE 1394. These lower-level connections may incorporate hardware, firmware, drivers, or software components, providing transparent links or “tunnels” from end to end. The choice or configuration of such network elements can often be modified without affecting their external users. In the present invention, when we refer to lower-level or physical connection technologies, we mean this very broad range of device-to-device linkages.
The programs 242 and 252 also utilize higher-level or virtual connections 322 and 324 to exchange messages. These messages relate to but are different from the lower-level messages of physical connections 214 and 216, and typically are embedded within those lower-level messages. They conform to structured messaging interfaces 344, 354, and 364, which are built into the programs 242 and 252 and the printer 204, respectively. Messaging interfaces define rules for conversations between endpoints; they also provide virtual access 374, 376, and 378 to system software, firmware, and hardware. Such interfaces are usually hierarchical and interrelated, with a given program using many such interfaces in a single connection. Examples of higher-level connection strategies include use of common technologies (such as Named Pipes, Sockets, virtual devices, virtual circuits, “Hartmann Pipelines,” and RPC), use of standards and protocols (such as HTTP, SOAP, XML, YAML, the TCP/IP protocol stack, and the OSI seven-layer protocol stack), and use of interface paradigms intrinsic to specific development tools or frameworks (such as AJAX, C++, PHP, Python Pipelines, UML Statecharts, Prograph, Smalltalk, and Microsoft .NET). In the present invention, when we refer to higher-level or virtual connection technologies, we mean this very broad range of software-oriented linkages. The choice of levels, layers, protocols, standards, etc., is an implementation decision; different practitioners might make different choices for a given application. Moreover, there is no strict dividing line between higher-level and lower-level protocols. In general, the behavior of all such connections is determined by their messaging interfaces 344, 354, and 364, which in turn are controlled through software 242 and 252. Programs 242 and 252 thus utilize both lower-level connectivity 214, 216, and 316 and higher-level connectivity 322, 324, 344, 354, and 364.
Proceeding to FIG. 4, the stratified elements of FIG. 3 deliver messages at several levels, using various technologies, by which for example the print program 252 can utilize a physical connection 216 to output text on the printer 204. The two endpoints 252 and 216 participate in a structured conversation implemented in source logic 402 and destination logic 404, where software and firmware implement shared messaging interfaces. The program 252 thus utilizes system services 412 (including firmware and hardware components on the computer 132) to send messages via the physical connections 216 to the printer's firmware 414. These messages have layered content, in formats dictated by the messaging interfaces, and contain such elements as network data 424 (for addressing, etc.), printer control 426 (for page layout, etc.), and the text to be printed 428. Each message usually has a specific recipient: in this example, either a messaging interface within the printer firmware 414 or one of the other infrastructure components (such as a network card). All these messages and components work together to create the simplistic illusion of text flowing down the wire, appearing as ink at the print-head.
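By way of illustration only, the layered message content described above might be sketched as follows. This is a minimal sketch under stated assumptions, not a format used by any actual printer or network: the function names encapsulate and decapsulate, the JSON encoding, the field names, and the address string "printer-204" are all hypothetical. It shows only that the text to be printed 428 can be nested inside a printer-control layer 426, which is in turn nested inside a network layer 424.

```python
import json
from typing import Tuple


def encapsulate(text: str, layout: dict, address: str) -> bytes:
    """Wrap text in a printer-control layer, then in a network layer.

    All field names here are hypothetical; real protocols define their own.
    """
    printer_layer = {"control": layout, "text": text}          # cf. 426 and 428
    network_layer = {"to": address, "payload": printer_layer}  # cf. 424
    return json.dumps(network_layer).encode("utf-8")


def decapsulate(frame: bytes) -> Tuple[str, dict, str]:
    """Peel the layers off in reverse order, as a recipient would."""
    network_layer = json.loads(frame.decode("utf-8"))
    printer_layer = network_layer["payload"]
    return printer_layer["text"], printer_layer["control"], network_layer["to"]


# The sender builds one frame; the recipient recovers each layer's content.
frame = encapsulate("Bonjour", {"orientation": "portrait"}, "printer-204")
text, layout, address = decapsulate(frame)
```

In such a scheme each layer is read by a different recipient: infrastructure components need only the outer addressing layer, while the printer firmware interprets the inner control and text layers.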
As described above, FIGS. 1-4 illustrate typical data communications strategies. A common feature of connectivity applications, and indeed of most computer applications, is the need for continuous improvement. In engineering, this is an unusual need. A dam, building, or vehicle may never need alteration, if designed and built properly; but computer systems are expected to evolve, in step with environment changes and technology advances. However, this computer system evolution is constrained by the high cost of creating and changing software.
Many techniques have been developed for improving software productivity, such as operating systems, higher-level languages, structured methods, databases, automated testing, and object-oriented systems, as well as advances in communications technology such as interface standards, protocol stacks, automatic error correction, and diagnostic tools. These have all helped. Nevertheless, data communications applications remain complex and costly to change, and have distinctive, challenging problems that are well known to practitioners.
Today, software modification is complicated by the available data communication strategies, which generally focus on moving intact messages between endpoints, subject to messaging interfaces and protocols dictated by those endpoints. FIGS. 5A-5D illustrate this problem, by considering how to enhance an existing connectivity application, using the pair of example programs 242 and 252 from FIGS. 2-4, which process and print scanned input. Suppose it is desired to change these programs, for example to support a new model of scanner having a different data format, or to monitor scanned input for a list of keywords. Proceeding to FIG. 5A, a common modification technique is shown, where changes are made to one or both programs 502 and/or 504. The changed programs must conform both to the existing physical connectivity (via the link 214) and logical connectivity (via the links 322, 344, and 354). Logic changes 506 and 508 might also alter the messaging interfaces 512, 514, and 516. Software changes of this nature are well understood in the art, but the conventional programming used often involves significant effort and cost.
Proceeding to FIG. 5B, another modification technique is shown, where a new program 522 is added, rather than modifying the existing programs 502 and 504. This approach might implement the same functionality changes (in this example, adding a new scanner model or monitoring for keywords) without altering the existing components, perhaps because they are not easily modified, because they use dissimilar technology, or because integrating the new logic would be difficult. The new program 522 would be inserted between the two communication endpoints 242 and 252, using program logic 524 that conforms to the existing messaging interfaces 526 and 528, and that is hosted on some platform 532, such as a new computer system connected via extensions of existing physical 214 and 534 and logical 322 and 536 connectivity. This configuration is recognized as a “three-tier architecture” (i.e., three cooperating programs that all operate independently), which is a well-known but complex communications system design. This technique is sometimes employed in middleware layers of large enterprise systems, in “server farms,” and in other sophisticated applications, and usually involves high costs due to development, testing, error recovery, deployment, management, and other challenges.
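By way of illustration only, the inserted-program technique of FIG. 5B might be sketched as follows. This is a minimal in-memory sketch under stated assumptions: the function names scanner_program, intermediary, and print_program are hypothetical stand-ins for programs 242, 522, and 252; the "format conversion" is reduced to trimming whitespace; and Python generators stand in for the physical and logical connections. It omits the deployment, error recovery, and management concerns that make a real three-tier system costly.

```python
from typing import Iterable, Iterator, List


def scanner_program(pages: Iterable[str]) -> Iterator[str]:
    """Stand-in for program 242: emits scanned text unchanged."""
    for page in pages:
        yield page


def intermediary(messages: Iterable[str], keywords: List[str],
                 alerts: List[str]) -> Iterator[str]:
    """Stand-in for the new program 522, placed between the two endpoints.

    It converts a (hypothetical) new scanner format and records keyword hits,
    while presenting the same message shape to the downstream program.
    """
    for message in messages:
        normalized = message.strip()  # hypothetical format conversion
        for keyword in keywords:
            if keyword.lower() in normalized.lower():
                alerts.append(keyword)
        yield normalized


def print_program(messages: Iterable[str]) -> List[str]:
    """Stand-in for program 252: formats each message for output."""
    return [f"PRINT: {message}" for message in messages]


alerts: List[str] = []
pages = ["  Please pay this invoice  ", "Hello world"]
printed = print_program(intermediary(scanner_program(pages), ["invoice"], alerts))
```

Neither endpoint function is altered in this sketch; all of the new behavior resides in the third, separately maintained program.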
The two approaches shown in FIGS. 5A and 5B each use standard software development methods, which historically and by definition focus on the creation of programs and their external interfaces. Specific implementation choices would normally be dictated by the programming methods chosen, which each treat connectivity in different ways, e.g. via simplifying models (as in stream-oriented systems), via built-in constructs (as in message-oriented systems) or via standard protocols and messaging interfaces (as in most communication systems). Regardless of method, the building or modifying of necessary software would involve significant effort, costs, and risks, particularly when the existing systems are complex.
Proceeding to FIG. 5C, the elements of FIGS. 5A and 5B are presented in a different way, to illustrate at least one problem to be solved: how to modify the behavior of a communications application without either modifying or adding programs. One desired solution would be to avoid programming by instead reconfiguring the connection 542. If the parameters of the existing connection 322 could be changed (for example, so that the connection 322 automatically translated a new scanner format, or monitored data for keywords), then the desired behavior changes could be achieved, while preserving the existing two-tier application. Unfortunately, today's connectivity tools divide responsibility unevenly between programs and connections. All application knowledge resides in the programs, whereas connections merely move messages between endpoints. This is because today's connectivity tools use a program-centric model.
Proceeding to FIG. 5D, current software development is constrained by its focus on “islands” of computation, which communicate through mechanisms intrinsic to each “island.” Thus, the two programs 242 and 252 exchange messages 322 according to their built-in messaging interfaces. If the application uses a third program 562, this is done via an independent connection 564, with only Program B 252 being aware of both connections 322 and 564. Changing any interface requires changing the associated programs. But what if these programs 242, 252, and 562 communicated through active pathways 572 and 574, capable of filtering, transforming, or redirecting their content? What if the behavior of such pathways could be altered at will, without changing the programs, e.g. to correlate their content or to divert data to other systems? Today, the only option for making such application changes would be to modify the programs, such as shown in FIG. 5A, or to break the connections and insert new programs, such as shown in FIG. 5B. No mechanism exists for creating “smart” connections, such as shown in FIGS. 5C and 5D.
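By way of illustration only, the behavior contemplated for the active pathways 572 and 574 might be sketched as follows. This is a minimal sketch of the desired behavior under stated assumptions, not a description of any existing mechanism or of the mechanism of the present invention: the class name ActivePathway, its stages attribute, and the example stages are all hypothetical. It shows a connection whose filtering, transforming, and diverting behavior is changed by reconfiguring the pathway, while the sending and receiving code stays the same.

```python
from typing import Callable, List, Optional

# A stage returns a (possibly transformed) message, or None to filter it out.
Stage = Callable[[str], Optional[str]]


class ActivePathway:
    """Hypothetical connection that applies reconfigurable stages in order."""

    def __init__(self, deliver: Callable[[str], None]) -> None:
        self._deliver = deliver
        self.stages: List[Stage] = []

    def send(self, message: str) -> None:
        current = message
        for stage in self.stages:
            result = stage(current)
            if result is None:
                return  # message filtered out; nothing is delivered
            current = result
        self._deliver(current)


received: List[str] = []   # stands in for the receiving program
diverted: List[str] = []   # stands in for some other system


def divert(message: str) -> str:
    """Copy the message elsewhere, then pass it along unchanged."""
    diverted.append(message)
    return message


pathway = ActivePathway(received.append)
pathway.send("hello")                      # no stages: plain pass-through

pathway.stages = [str.upper, divert]       # reconfigure: transform, then divert
pathway.send("world")

pathway.stages = [lambda m: None if "secret" in m else m]  # reconfigure: filter
pathway.send("secret plan")
pathway.send("public note")
```

In this sketch the sender always calls send and the receiver always receives through the same callback; only the list of stages held by the pathway changes between calls.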