In recent years, the evolution of distributed computing offered as a service to various clients has been driven by the concept of leasing hardware and software as metered services. One such model is cloud computing. Cloud computing is a style of computing in which dynamically scalable and often virtualized resources are provided as a service over the Internet to interested clients. The clients need not have knowledge of, expertise in, or control over the technology infrastructure “in the cloud” that supports them.
The concept incorporates one or more of infrastructure as a service, platform as a service and software as a service, as well as other recent technology trends that share the common theme of reliance on the Internet for satisfying the computing needs of the users. Cloud computing services usually provide common business applications online, which can be accessed from a web browser, while the software and data are stored on the servers. Such a scenario is illustrated in FIG. 1, in which a user 10 connects via, for example, the Internet 12, to a network 14. The network 14 may include one or more processing nodes (PN) 16. The processing nodes PN 16 may belong to one or more providers.
Common to the operational cloud platforms is the implementation of data centers (often of massive size) hosting clusters of servers. These servers may be logically sliced using virtualization engines. Cloud platforms are traditionally distributed across multiple data centers to achieve robustness and global presence. However, this distributed presence is coarse-grained, i.e., data center-based clouds treat the network operator's entire network simply as first mile connectivity. The closeness of the data centers to end-users is thus limited.
However, some end-users may benefit from having processing elements/nodes of the network closer to them than data center-based clouds can provide. One example is a system deployed on a distributed platform, such as a video-on-demand system or an enterprise information system accelerator. Such a system should be robust and scalable so that, for instance, failing equipment or transiently increased loads do not jeopardize the system's operation, stability and availability.
Providing servers closer to end-users implies more distributed and geographically scattered server constellations. For example, when the processing elements/nodes are highly distributed and geographically scattered across an operator's network so as to be situated as close as possible to the end-users, one or more of the following problems may appear.
As the end-users may be concerned with selecting processing elements/nodes that are geographically located in a desired area, the end-users, e.g., system and/or software developers, may have to know which processing elements/nodes are available, where they are located, how they can be accessed, and which specific processing element/node should be used for a certain component of an application.
Selecting appropriate processing elements/nodes in response to all these questions, especially when the number of resources/servers in a large network may be in the range of hundreds or thousands, is challenging, i.e., time consuming and/or prone to mistakes. Supplementary resources have to be employed merely to correctly distribute the existing tasks/applications across the large network. The complexity of the selection itself becomes a problem, which may overwhelm the end-user, especially if the platform is a simple constellation of independent servers, i.e., servers that are “glued” together by nothing more than plain IP connectivity.
Further complications arise when the network operator, i.e., the operator of network 14 in FIG. 1, maintains confidentiality of the design and topology of the network, e.g., the locations of the processing elements/nodes, the available resources, the particular division of the real nodes into virtual nodes, etc. In this case, even if the end-user 10 has the capability to determine which machine (real or virtual) will process which component of an application, the end-user 10, not knowing the topology and availability of the network 14, cannot take full advantage of the network 14.
To illustrate the limitations of the traditional methods and networks, the following two examples are considered. Two real-life tasks are to be deployed in an operational network-based processing platform. The first task is to dispatch an application to every processing element located near the edge of the network 14 that is closest to the end-user 10. The second task is to execute a distributed application, which includes two software modules (x and y), on separate processing elements while fulfilling the condition that x is always upstream of y (with respect to the end-user 10).
Having to manually process such tasks, as well as to implement the distribution and communication aspects of the software components and their interworking, is challenging and time consuming for the system and/or software developer, especially when the number of processing elements/nodes is large. In one example, FIG. 2 generically illustrates the minimum effort that goes into such a manual process. Initially, in step 20, the user 10 determines, on his/her side, which processing elements from the network are necessary, where they are located, etc. Then, in step 22, after figuring out the allocation of processing elements, the user 10 contacts the network 14 and requests the necessary resources. The network 14 replies to the user 10 in step 24, after running the applications desired by the user. In this step, the network 14 provides the user with the results of the applications that were run on the processing elements.
However, if for any reason the network changes in step 26, i.e., its topology and/or a characteristic required by the user changes (other characteristics that may change are discussed later), the user 10 has to determine again, in step 27, which processing elements of the network are to be used for which application. Then, in step 28, the user 10 manually re-enters the multiple requests to be transmitted to the network 14. Thus, the network configuration change in step 26 forces the user to redo all the work previously performed in steps 20 and 22, which is inconvenient for the user.
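The rework triggered by step 26 can be illustrated with a minimal sketch. It is not taken from FIG. 2. The dictionary-based topology, the function names `allocate` and `is_still_valid`, and the node names are all hypothetical and are chosen only to show that an allocation computed in step 20 may become stale and must be recomputed in step 27.

```python
def allocate(topology, needed):
    """Steps 20/27 (illustrative): pick `needed` available processing elements.

    `topology` is a hypothetical mapping of element name -> availability flag.
    """
    available = sorted(name for name, up in topology.items() if up)
    if len(available) < needed:
        raise ValueError("not enough processing elements available")
    return available[:needed]


def is_still_valid(allocation, topology):
    """True only if every allocated element is still available."""
    return all(topology.get(name, False) for name in allocation)


topology = {"pn-a": True, "pn-b": True, "pn-c": True}
first = allocate(topology, needed=2)          # step 20: initial manual choice
topology["pn-a"] = False                      # step 26: the network changes
stale = not is_still_valid(first, topology)   # the earlier choice no longer holds
second = allocate(topology, needed=2) if stale else first  # step 27: redo the work
print(first, stale, second)
```

In this toy model the whole allocation is recomputed from scratch after the change, which mirrors the repetition of steps 20 and 22 described above.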
More specifically, with regard to the first task discussed above, the user determines in step 20 of FIG. 2 the number of processing elements needed to run the desired task and also, based on his/her geographic location and the limited geographic location information provided by the network, only those processing elements that are closer to the user. With regard to the second task, the user determines in step 20 of FIG. 2 which processing elements would execute software module x and which processing elements would execute software module y. Then, the user has to determine in the same step 20 which processing elements satisfy the condition that x is always upstream of y with respect to the user 10.
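The two selections that the user performs by hand in step 20 can be sketched as follows. This is only an illustration under stated assumptions: the `ProcessingElement` type, the `hops_to_user` distance metric and the node names are hypothetical, and "upstream" is modeled here as "farther from the end-user", which the text above does not define precisely.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class ProcessingElement:
    name: str
    hops_to_user: int  # hypothetical distance metric, not defined in the source


def edge_elements(elements, max_hops):
    """Task 1: elements the user considers close to the network edge."""
    return [pe for pe in elements if pe.hops_to_user <= max_hops]


def upstream_placements(elements):
    """Task 2: (x, y) placements on separate elements with x upstream of y.

    Assumption: x is "upstream" of y when x is farther from the end-user.
    """
    return [(x, y) for x, y in permutations(elements, 2)
            if x.hops_to_user > y.hops_to_user]


elements = [
    ProcessingElement("pn-a", 1),
    ProcessingElement("pn-b", 1),
    ProcessingElement("pn-c", 3),
]
edge = edge_elements(elements, max_hops=1)
placements = upstream_placements(elements)
print([pe.name for pe in edge])
print([(x.name, y.name) for x, y in placements])
```

Even in this three-node toy case the second task enumerates every ordered pair of elements, so the number of candidates the user must examine grows quadratically with the number of processing elements.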
From these simplified examples, which require resources for only one application, it can be seen that the amount of calculation that takes place on the user side is large and time consuming.
Another example that illustrates the problems of the traditional systems is discussed next. Consider a video-on-demand system. The core functionality of this system includes providing video content using appropriate codecs, interacting with the end-user client, performing access and conditional control functions, providing a searchable database of video titles and an associated user interface, billing, storing video files, running transcoding proxies that adapt video streams to the capabilities of the end-user client, etc.
The software components realizing these functionalities should themselves be robustly implemented internally (e.g., the code should deal with error conditions in an appropriate manner, use thread pools if request rates are expected to be high, etc.). In order to make the overall system robust and scalable, the components need to be combined and orchestrated appropriately. These desired properties of the system may require having redundant hot standby components that can step in if a certain component fails, or dormant components that can be activated to scale up the system if the load suddenly increases. However, achieving these features requires extensive experience and skills. This means that developers holding expertise in video coding, video transport and video rendering, as in the video-on-demand example above, may lack such competence. Thus, creating such a system is demanding and labor intensive. Moreover, even supposing that the developer is able to determine which processing elements of the network should execute his or her applications, a change in the configuration of the network (for example, a failed connection or component) may alter the allocation of processing elements. The developer may then be forced to redo the allocation, which results in more wasted time and resources.
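As an illustration of the kind of orchestration logic a developer would otherwise have to write by hand, the following minimal sketch shows a hot standby component stepping in when a primary component fails. The `Component` and `FailoverGroup` classes and the component names are hypothetical. A real system would additionally need health monitoring, state replication and activation of dormant components under load, none of which is shown here.

```python
class Component:
    """A hypothetical service component that may fail."""

    def __init__(self, name):
        self.name = name
        self.healthy = True

    def handle(self, request):
        if not self.healthy:
            raise RuntimeError(f"{self.name} is down")
        return f"{self.name} served {request}"


class FailoverGroup:
    """Routes requests to the primary and falls back to the hot standby."""

    def __init__(self, primary, standby):
        self.primary = primary
        self.standby = standby

    def handle(self, request):
        try:
            return self.primary.handle(request)
        except RuntimeError:
            # The primary failed, so the standby steps in for this request.
            return self.standby.handle(request)


group = FailoverGroup(Component("transcoder-1"), Component("transcoder-2"))
before = group.handle("title-42")
group.primary.healthy = False  # simulate a component failure
after = group.handle("title-42")
print(before)
print(after)
```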
Accordingly, it would be desirable to provide devices, systems and methods that avoid the afore-described problems and drawbacks.