1. Technical Field
This invention pertains to computer communication and information networking. More particularly, it concerns generic application level connectivity without depending on an end-to-end network address space.
2. Description of Background Art
Prior networking art embodies a long-standing perception that even logical, or application-level, connections must be determined by uniquely identifying the physical end points, i.e., by globally unique addresses. The belief is central to the Internet Protocol (IP) suite and is enforced by almost all network application programming interfaces (APIs), including the Berkeley sockets. A symptom of this approach is that the application end points are directly exposed, in the form of (IP_address, port_number) tuples, allowing room for inadvertent or malicious connections unless protected by a firewall; applications must defend themselves by validating the protocol (e.g., HTTP), using magic cookies (e.g., the X protocol), or by encryption (e.g., SSL). It would also improve security somewhat if the identity of the final destination were concealed or eliminated from the data packets, because anonymous data is often less useful to an interceptor.
More importantly, an IP address is merely a symbolic substitute for the network end point, which means that, notwithstanding its role as an inter-networking architecture, IP does not really solve the problem of confinement by network address boundaries, but works around it by emulating a global end-to-end virtual network: IP applications can run across heterogeneous component networks, but only if the end-point physical hosts (or network interfaces) bear unique IP addresses. This was a strength in the early days of the Internet, because fixing the basic transport format first was crucial to the collaborative development and deployment of the infrastructure protocols of the IP suite. Nevertheless, it has left IP inherently restricted to a finite address space, which means one must currently resort to embedding techniques, such as tunnelling and Network Address Translation (NAT), to extend the existing infrastructure. The restriction more generally means that the associated data structures and protocols must be hardcoded into application software or middleware, and it limits the flexibility and power available to these programs in ways that will become clear from the description of the present invention.
Accordingly, it is important to avoid application-level dependence on addressing, but because of its crucial role in the development of the Internet, IP-like addressing is commonly assumed to be both necessary, in that any sound scheme for internetworking is expected to critically depend on end-to-end fixed-length addresses, and sufficient even for future internetworking frameworks and applications. These notions are reflected in the view, formalised by the ISO open systems interconnection (OSI) seven-layer model, that the transport mechanism must be solved independently of the other layers, without any help from the latter. This is an undue restriction and a mistaken assumption, as will be particularly shown by the description of the present invention, which exploits simple techniques from the client-server model and the operating system (OS) and compilation domains to solve the transport problem in a fundamentally different way. The assumption is manifest in the existing specifications of switched networks, including X.25 and Asynchronous Transfer Mode (ATM), which require the signalling to depend on preassigned globally unique multi-byte addresses for the individual switches and host interface adapters. The addressing is currently needed to enable application processes to identify the final destinations without intimate knowledge of the network configuration, only because the current frameworks, such as ATM's network-network interface (NNI) and user-network interface (UNI), were once again conceived without considering higher layer techniques. The result is double addressing and signalling when ATM is used as transport under the IP suite, once for establishing the virtual circuits and once more for emulating IP subnets and virtual LANs (VLANs). Surprisingly, the two-level approach of the present invention, involving connection-oriented networking even over IP, manages to eliminate this duplication.
Another duplication of function concerns the name service. In the earlier Unix-to-Unix Copy (UUCP) system, client applications were required to identify successive hosts all the way to the destination, which put the burden of route discovery and specification squarely on the clients and made the system quite unscalable. The Domain Name System (DNS) name strings still trace out a logical path to the destination through the DNS hierarchy, but IP goes to the other extreme of not using this logical path structure at all in the routing of data. Instead, IP server applications simply listen at port numbers on their own respective hosts, and their clients are expected to locate them by their host names. This makes the hierarchical organisation of the DNS critical to its operability, as each client's nameserver would otherwise need to be able to locate every server in the IP universe with no geographical hints from the client applications whatsoever.
Also, IP's prescribed use of the addresses for routing is turning out to be inefficient in some ways, and the functionality is now being replaced by Multiprotocol Label Switching (MPLS), in which routing labels are affixed to the packets within the network, introducing further duplication into the scheme. Every duplication means avoidable computational or bandwidth overheads, in addition to increased development and maintenance costs. Furthermore, the packet address fields are being extended to implement IPv6, along with the corresponding infrastructure, processing and communication overheads, in every packet, application program, host and router, principally in order to accommodate the growing IP membership. These costs could have been largely avoided had the Internet not been address space-dependent.
Additionally, per the traditional prescription, the final destination addresses must be interpreted at every router, bridge and gateway along the way. There is no protocol-independent notion of logical connectivity in the IP suite, nor in other address-oriented internetworking suites of the past, so that every logical transport path must be freshly established on a per-packet basis. The difficulty and limitation this imposes is that between any given pair of application end-points, the connectivity must be independently established for each transport stream, and depending on the protocol, may not be possible at all. This is becoming especially clear with the emergence of streaming multimedia applications, where the clients conceptually make logical connections over TCP using HTTP, but the preferred media streams involve RTP over UDP and are stopped by most corporate firewalls. The problem is currently addressed by application-level proxies, but this is a piecemeal approach, as newer protocols are being formulated all the time, and is an impediment to the development of newer network applications. For example, the SOCKS V5 protocol finally supports UDP relay, but applications still need to be specially compiled and linked, or the OS specially SOCKSified, for it to work, and it provides only one-way traversal across a set of firewalls. If the networking were instead inherently connection-oriented, firewall traversals would never have been a special issue, since the authentication could then be applied to the logical connection once, as will be demonstrated by the present invention, for any number of firewalls and transport streams.
In any case, IP addresses are losing their one-time significance as long-term identifiers of client hosts, as more and more clients use dial-up connections via Internet Service Providers (ISPs) and even office equipment is migrating to Dynamic IP. A similar trend may be noticed in the server space, as servers of every kind migrate to the Hypertext Transfer Protocol (HTTP) and its derivatives, and are referenced by DNS names rather than by IP addresses in the prescribed Uniform Resource Locators (URLs). The client references are likewise transparently redirected for load balancing and geography-specific service. As IP addresses thus become all but invisible to the users, it would seem that the application and user level functionality should be eventually served solely by the DNS, and that IP addresses would be essentially confined to the routing and transport layers, but, as mentioned, even these functions are being taken over by switching.
It should be understood that it is neither feasible nor intended to eliminate addressing altogether. Addressing, in the sense of distinguishing destinations, is unavoidable at the lowest level, for instance, between interfaces to a switch and successive Local Area Network (LAN) drops, and indispensable, in the sense of fixed-length addresses, for routing efficiency within LANs. It is between networks, especially disparate networks like ATM and Ethernet, and at corporate, political and geographic boundaries, that the efficiency benefits of fixed-length addressing appear to be outweighed by the address space limitation inseparable from the fixed-length property, and by the administrative costs of address management in addition to those for DNS-like name space allocation, which is unavoidable in any case. Fixed-length addressing is thus an inherently low level issue of primarily local, implementational significance, and deserves to have no visibility at the level of applications and users, which would be better served by some form of indirection that would achieve the same efficiency but without the address space limitation.
Another limitation of the addressing approach is that the associated network APIs become inherently oriented toward point-to-point connectivity, and are neither elegant nor sufficient for encapsulating multipoint connectivity required in distributed parallel applications. This functionality is currently addressed, for example, by the Message Passing Interface (MPI) and the Parallel Virtual Machine (PVM) libraries, and related functions like message queuing, quality-of-service (QoS) negotiation, etc. are currently handled by custom libraries and services. A simple and elegant OS abstraction is needed to provide these functions in the future. A networking abstraction is also needed to complement such facilities, for instance, by providing in-network application-oriented functionality, which is currently being explored in the active networks field.
Accordingly, an object of this invention is to provide a uniform and infinitely scalable device or process for establishing and managing communication between application processes across diverse networks and internetworks. A further object is to provide such a device or process means with the least duplication of functionality. Another related object is to eliminate the existing need for a single end-to-end network address space and the identifiability of the final destination host from the contents of data packets.
Another object of the invention is to provide uniform means in the operating system to support both point-to-point and multipoint connectivity. A related object is to provide uniform support in the network for application-oriented facilities.
These objects, and others which will be apparent, are achieved in the present invention essentially by providing means for application processes to request connectivity and for setting up the requested connections without requiring an end-to-end address space. Generally, the illustrative system and method according to the invention comprises a plurality of hosts executing application processes, a network of transport media connecting the hosts, a service network of nameservers to enable application processes executing on respective hosts to define and reference by name, one or more shared contexts of communication on specific nameservers, and to translate the service paths over the service network, obtained from the defining and referencing requests, into transport paths for data between the application processes. More particularly, each defining or referencing request identifies a nameserver on which to define or reference a context by a pathname given with the request; each host is configured to pass each defining or referencing request of its application processes directly to one or more neighbouring nameservers to be propagated through the service network to the identified nameserver; each such propagation path, comprising a sequence of names beginning with the requesting host and including the successive nameservers along the path, is construed as a request service path; end-to-end service paths are then constructed between the defining and referencing application processes by concatenating the corresponding request service paths; the concatenated service paths are translated, using configured or dynamically generated routing rules, into transport paths within the transport network; and the transport paths are then realized as virtual paths by signalling performed by the nameservers and the requesting hosts to the routing entities, including routers, bridges, gateways or switches, of the transport network.
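The construction of an end-to-end service path from two request service paths can be illustrated with a minimal Python sketch. It assumes that a request service path is simply the list of names from the requesting host up to the identified nameserver, and that concatenation joins the referencing path to the reversed defining path at that shared nameserver. The function name `concatenate_service_paths` and all host and nameserver names are hypothetical and are not taken from the specification.

```python
from typing import List


def concatenate_service_paths(referencing: List[str], defining: List[str]) -> List[str]:
    """Join two request service paths that end at the same nameserver.

    Assumption of this sketch: each path starts at the requesting host and
    lists the successive nameservers up to the one holding the context.
    """
    if not referencing or not defining:
        raise ValueError("service paths must not be empty")
    if referencing[-1] != defining[-1]:
        raise ValueError("both requests must reach the same nameserver")
    # Follow the referencing path up to the shared nameserver, then walk the
    # defining path backwards (skipping the shared nameserver) to its host.
    return referencing + defining[-2::-1]


# Hypothetical request service paths for one defining and one referencing process.
defining_path = ["server-host", "ns-b", "ns-root"]
referencing_path = ["client-host", "ns-a", "ns-root"]
end_to_end = concatenate_service_paths(referencing_path, defining_path)
print(" -> ".join(end_to_end))
```

The resulting list is only the service path; translating it into a transport path would additionally require the routing rules and signalling described above, which this sketch does not model.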
At each node of the service network, more than one nameserver system may be configured to serve in parallel or as standby, for load balancing, robustness and fault tolerance. For the same purposes, as well as for ensuring adequate bandwidth or other qualities of service, more than one service path may be constructed for each defining or referencing request and more than one transport path may be constructed for each referencing request. Furthermore, the service network and the transport paths may be constructed over the same or different transport media, as well as utilize any kind or combination of media for a given transport path, including but not limited to dedicated point-to-point lines, address-oriented networks, such as Ethernet, and circuit-oriented networks, such as ATM. In particular, the last segment of a virtual path leading to a simple client host handling only one active application connection, for example, a networked temperature sensor, would require no information within the data stream to distinguish between virtual paths, so the virtual path would be trivially equivalent to the physical path itself. Similarly, a virtual path whose end-point hosts happen to lie in the same Ethernet or IP network would not need signalling because the destination address provided by the underlying network suffices as the virtual path.
Since references are made to a context at a nameserver, rather than to a server host address, the invention does not have the inherent point-to-point flavour of prior art networking, and the user and programming interfaces necessary for the invention are equally convenient for providing multipoint communication between application processes, suitable, for instance, for distributed parallel processing. Additionally, the necessary peer-to-peer service and transport paths are automatically obtainable by combining the corresponding end-to-end paths initially obtained individually for each peer process. In prior art, the end points of the transport paths are provided to the application processes as handles or file descriptors. Advantageously in the present invention, the end points obtained from a given context are uniquely identifiable within that context and thereby form a virtual network address space special to that context.
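The idea that end points are unique only within their context can be sketched as follows. This is an illustrative model, not the specification's data structure: the class `Context`, its methods, and the use of small integers as end-point identifiers are all assumptions made for the example.

```python
import itertools
from typing import Dict, List


class Context:
    """Hypothetical model of a named context held on a nameserver."""

    def __init__(self, pathname: str) -> None:
        self.pathname = pathname
        self._ids = itertools.count(1)      # identifiers are local to this context
        self.endpoints: Dict[int, str] = {}

    def join(self, process_name: str) -> int:
        """Register a process and return its context-local end-point identifier."""
        endpoint_id = next(self._ids)
        self.endpoints[endpoint_id] = process_name
        return endpoint_id

    def peers_of(self, endpoint_id: int) -> List[int]:
        """Return the other end points in the same context (multipoint peers)."""
        return sorted(e for e in self.endpoints if e != endpoint_id)


# Two independent contexts may reuse the same identifiers without conflict.
render = Context("/org/render")
chat = Context("/org/chat")
w1, w2, w3 = (render.join(name) for name in ("worker-1", "worker-2", "worker-3"))
alice = chat.join("alice")
print(render.peers_of(w2), alice)
```

Because identifiers are allocated per context, the same value can appear in unrelated contexts, which is the sense in which each context acts as its own virtual address space in this sketch.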
Further, a concatenated service path itself constitutes an end-to-end route between the respective hosts of a pair of defining and referencing application processes via the nameserver network, and suffices as an end-to-end signalling framework for setting up the requisite transport paths. Whether setting up an individual link of the service network or a segment of a transport path, only the physical path or medium directly leading to the next nameserver, routing entity or host needs to be identified, requiring only local addressing to distinguish the immediate destination from other such entities already plugged, or likely to be plugged, into the same path or medium. Thus, only local addressing is at all used within the system, and global addressing occurs only in the form of the request pathnames. Since the pathnames and their service paths can be of any length, the present invention is inherently capable of serving an unlimited number of physical hosts and contexts, with each of the latter constituting a virtual network address space in its own right.
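One way to picture purely local addressing along a path is a per-node table in which each entry names only the next node and a label meaningful at that node. The sketch below is an assumption-laden illustration in the style of label swapping; the specification does not prescribe this mechanism, and the functions `install_virtual_path` and `trace` and the node names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Per-node table: local label -> (next node, label to use at that node).
Entry = Tuple[Optional[str], Optional[int]]
Tables = Dict[str, Dict[int, Entry]]


def install_virtual_path(path: List[str], tables: Tables) -> int:
    """Install one entry per node, working back from the destination.

    Returns the label to use at the first node. Each node learns only its
    immediate successor, never the final destination.
    """
    next_node: Optional[str] = None
    next_label: Optional[int] = None
    for node in reversed(path):
        table = tables.setdefault(node, {})
        label = len(table) + 1              # unique within this node only
        table[label] = (next_node, next_label)
        next_node, next_label = node, label
    return next_label


def trace(start: str, label: int, tables: Tables) -> List[str]:
    """Follow the installed entries hop by hop and list the nodes visited."""
    hops = [start]
    node = start
    while True:
        next_node, next_label = tables[node][label]
        if next_node is None:
            return hops
        hops.append(next_node)
        node, label = next_node, next_label


tables: Tables = {}
path_a = ["host-a", "switch-1", "switch-2", "host-b"]
path_c = ["host-c", "switch-1", "switch-2", "host-b"]
label_a = install_virtual_path(path_a, tables)
label_c = install_virtual_path(path_c, tables)
print(trace("host-a", label_a, tables))
print(trace("host-c", label_c, tables))
```

In this sketch no entry records the final destination: two paths that share the same switches are kept apart only by labels that are meaningful at each individual node.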
The notion of locality extends to the computation of possible transport paths for a given service path, because this too requires knowledge only of the physical links of the transport network, together with access to the corresponding switches, within a reasonable physical or geographical neighbourhood of the service path. It is thus assumed, for the workability of the present invention, that each nameserver can indeed be configured with access to an adequate number of switches in its neighbourhood, and can be arranged to learn, from configuration files, signalling interaction with the switches or other means, of the links configured on these switches. An analogous assumption of local configurability exists in prior art, since the DNS servers must be configured with the hostnames and allocated IP addresses, hosts are configured, statically or dynamically, with the DNS server addresses, and both hosts and routers are configured with “default routes” to known routers.
An illustrative embodiment according to the present invention includes availability of routing rules at one or more of the nameservers along the concatenated service paths for computation of transport path segments at these nameservers; signalling communication between these nameservers and the routing entities of the transport network to setup or teardown corresponding segments of the transport paths; and linkage communication among these nameservers and the requesting hosts to connect the transport path segments for completing the end-to-end transport paths. Alternatively, the whole or portions of the concatenated service paths may be passed back to the requesting hosts and the signalling for realizing the corresponding transport segments or paths performed directly by the requesting hosts. Additionally, one or more of the hosts may include operating system means to encapsulate the service communication with the nameservers, including the defining and referencing requests, the signalling communication if any, and the linkage communication as operating system services available to the application processes executing on those hosts.
An alternative embodiment of the invention configures the service network as a directory tree of nameservers and interprets the request pathnames uniformly with reference to this tree. The present invention is not as critically dependent on a strict hierarchical configuration of the name service as the prior art, since the application processes supply the complete name service paths, via the request pathnames, and the nameservers are not as burdened with the responsibility of discovering remote hosts. It becomes feasible, therefore, as well as efficient and cost-effective, to implement multiple hierarchies to suit political, geographical and corporate boundaries, since applications seeking connections across such boundaries are empowered to expressly request such traversals.
At the other extreme, an embodiment may choose to implement no hierarchy at all, in which case the pathnames would need to be literally interpreted as the request service paths, but here, too, the invention makes a fundamental improvement over the prior art analogy of UUCP, since the nameservers would then act purely as exchanges where the clients and servers can meet, with reduced burden of routing and none of discovery. The only difference visible to the application processes between the hierarchical and the “flat” configurations is that the defining and referencing request pathnames would be identical in the first and complementary in the second.
Generally, the illustrative method of the present invention favors circuit or path oriented routing over the packet routing of IP. This is not at all a detriment, because the long haul routes are already based on ATM, an increasing number of corporate networks use ATM VLANs and optical fibre is becoming the medium of choice for future building networks and home connections. The present invention would thus eventually provide a uniform application level service appropriate to these emerging transport media without emulating IP, as in the current generation of ATM VLANs, and without requiring severe redesign of existing applications in adapting to the ATM paradigms, since it essentially generalises the bind-connect paradigm of client-server design. It will become clear from the detailed description that IP and packet routing remain convenient and usable as local transports under the present invention, as limited extensions of the basic Ethernet. The invention provides a way to integrate multiple local IP networks in a manner more directly reflecting the path oriented character of long-haul transport, and conversely, confines the direct use of packet or address-oriented routing to local routes. This also means that the present invention can be easily deployed over existing IP, the effect being to merely replace calls to the DNS, via gethostbyname and related functions in the sockets API, with those to the service network of the present invention. Performance is also not adversely impacted, but appears likely to be improved since the nameservers are not burdened with network discovery and the only changing data to be handled is that of the application-defined contexts and the associated service and transport paths, all of which can be extremely transient, as while debugging an application, or long lived as services well beyond the useful lives of their host hardware, and to which known techniques like caching can be easily applied for performance.
The present invention also achieves an improvement in host security, since client applications cannot even request connections until their respective servers have advertised their services as contexts on the nameservers. By definition, as it were, network security cannot be perfect because data can always be physically intercepted, any set of network protocols broken into and mimicked, and every network of non-trivial dimensions invariably contains both intentional and inadvertent vulnerabilities. In the absence of end-to-end addressing, however, hosts outside the immediate LAN cannot be easily identified by intercepting their data, thus providing some measure of security-by-anonymity. Fundamentally achieved are inherent isolation and potential for incorporating security mechanisms, which can both be appreciated by analogy to the Unix operating system, in which the in-memory data of an individual process is first protected from inadvertent access or corruption by other processes by isolating it in a per-process virtual memory address space. The present invention likewise isolates the in-network data of individual application contexts by context-specific virtual address spaces and virtual paths, as well as confining address-oriented routing to local segments, analogous to the use of physical memory in Unix. More significantly, since the present invention provides a generic connection-oriented framework, in the form of the end-to-end service paths, tied to the actual transport paths used for the application data, it is ideal for providing both traditional and multi-media services traversing any number of gateways and firewalls.
Additionally, in-memory and file data are protected in Unix by authentication and access control mechanisms that have been added to and considerably improved over the default mechanisms of the early versions of the operating system. Analogous default mechanisms are meaningless in the networking context, but the present invention does provide room for implementing these functions within the service network, to allow defining requests to specify application of these functions to their contexts, thus protecting server applications, and to correspondingly verify the privileges of both defining and referencing requests, thereby preventing unauthorized application processes from posing as legitimate services and, among other possibilities, stealing sensitive data such as passwords from unwary clients. Alternatively, an embodiment may pass referencing requests, via the end-to-end service path, to the server application process for verification and approval by the latter before realising the transport paths to fulfill the requested connection. A further advantage of the present invention is the sharing or transfer of the server application's responsibility for authentication and access control among the processes sharing its context. Another option is the granting of access for limited time, and automatic initiation of teardown signalling on expiration of the granted period, to limit the exposure of server applications to network attacks.
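The time-limited access option can be sketched as a small bookkeeping table. This is a minimal illustration under stated assumptions: the class `AccessTable`, its methods, and the client names are hypothetical, time is passed in explicitly as a number of seconds, and the actual teardown signalling is represented only by the list of expired clients that the caller would then act on.

```python
from typing import Dict, List


class AccessTable:
    """Hypothetical record of time-limited access grants for one context."""

    def __init__(self) -> None:
        self._expiry: Dict[str, float] = {}

    def grant(self, client: str, now: float, duration: float) -> None:
        """Allow `client` to use the context until `now + duration`."""
        self._expiry[client] = now + duration

    def is_allowed(self, client: str, now: float) -> bool:
        """Return True while the client's grant has not yet expired."""
        expires = self._expiry.get(client)
        return expires is not None and now < expires

    def expire(self, now: float) -> List[str]:
        """Remove lapsed grants and return the clients whose paths need teardown."""
        lapsed = sorted(c for c, t in self._expiry.items() if now >= t)
        for client in lapsed:
            del self._expiry[client]
        return lapsed


table = AccessTable()
table.grant("client-1", now=100.0, duration=30.0)
table.grant("client-2", now=100.0, duration=90.0)
to_tear_down = table.expire(now=140.0)
print(to_tear_down)
```

A nameserver or host following this pattern would call `expire` periodically and initiate teardown signalling for each returned client; how that signalling is carried out depends on the transport network and is outside the sketch.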
Other objects, features and advantages of the present invention will be apparent when the detailed description of the preferred embodiment is considered in conjunction with the drawings, which should be construed in an illustrative and not limiting sense as follows.