1. Field of the Invention
The present invention relates to the field of computer resource management. More particularly, the invention relates to a method for storing, updating and retrieving information related to such matters as the allocation of computing resources that are associated with a plurality of clients/users hosted by a web host, where the resources may include both hardware (CPU, disk space, bandwidth, etc.) and services (FTP, Web, etc.).
2. Prior Art
Although reference is made herein to Web hosting companies as possible users of the invention, it should be noted that the invention can be used for retrieving information of other sorts as well, and Web hosting companies are mentioned for illustrative purposes only.
Web hosting companies, or WHCs, are companies that own, control and allocate many powerful computers to consumers. Their business is to sell their computer resources to smaller clients, who use these resources for their own business. Each one of the WHC's clients is given a private domain, website, e-mail address, etc. The task of a WHC is to allocate its computer resources to its clients, according to the level of service each client purchased from the WHC. This allocation might include some limit on the resources. This way, no client of the WHC can consume more of the resources of the WHC's computer than he has paid for, which guarantees that the other clients on the same machine get the resources they paid for. Some WHCs, however, only monitor the actual use of their resources, and bill their clients accordingly.
In addition, there are companies with an even larger number of computers, which in turn lease or sell them to a multitude of WHCs for a share of the profits. They have a similar goal of allocating their resources among the WHCs. WHCs that buy these resources from a larger WHC and sell them to their own clients are referred to as resellers.
In order to manage computer resources efficiently, it is important for the WHC manager to have convenient access that allows easy retrieval and updating of information regarding the availability of the computer resources. This information is either static (e.g., the amount of CPU resources allocated to a certain client) or dynamic (e.g., the client's current CPU usage), and is required for several reasons.
A first reason is that the WHC manager can obtain an indication of the extent to which a client used the resources during a certain period, and formulate the billing of this client by using this information. This billing could take any form, the common way being a payment for some fixed package of computer resources; if the client ends up exceeding this package, he pays an incremental fee, at a higher rate, for the amount by which he exceeded the fixed package. In order to prevent one client from using too many resources, the extra resources he may use are limited. For example, a package of disk memory could cost $200 for 20 gigabytes, plus $15 for each additional gigabyte, for up to 20 more gigabytes. This example illustrates how the WHC would determine how much to charge the client, after analyzing his usage of computer resources. Conversely, the client can use this information to see how much of the computer resources he used during a certain period, and verify that his billing was correct.
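The disk package in the example above can be expressed as a short calculation. The following Python sketch is illustrative only: the function name and parameter names are hypothetical, and it assumes that a partial additional gigabyte is billed proportionally, a detail the example does not specify.

```python
def monthly_disk_charge(used_gb, base_price=200.0, included_gb=20.0,
                        overage_price_per_gb=15.0, max_overage_gb=20.0):
    """Return the charge for one billing period under the example package.

    Assumption (not stated in the text): overage is prorated per gigabyte.
    """
    if used_gb < 0:
        raise ValueError("usage cannot be negative")
    if used_gb > included_gb + max_overage_gb:
        # The extra resources a client may use are limited, so usage
        # beyond the cap should never have been granted.
        raise ValueError("usage exceeds the hard limit of the package")
    overage_gb = max(0.0, used_gb - included_gb)
    return base_price + overage_gb * overage_price_per_gb


print(monthly_disk_charge(15))  # within the package: 200.0
print(monthly_disk_charge(25))  # 5 GB over the package: 275.0
```

Under these assumptions the same function serves both sides: the WHC uses it to produce the bill, and the client can use it to check the bill against his recorded usage.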
A second reason is that this information can show the WHC manager whether there is a need to change the allocation of computer resources for a certain client. For example, if a client buys 10 gigabytes of disk memory each month, but actually uses only half of it, then the WHC manager could alert the client that it may be better for him to purchase a smaller package, which would free the unused memory for other clients who may need it. This type of manipulation could be beneficial for the WHC manager. For example, if the 10 GB package that this client bought costs $15, while a 5 GB package costs $10, then it is more beneficial for the WHC manager to have two clients that are entitled to 5 GB each (for a total of $20), instead of one client with 10 GB (for $15).
A third reason is that keeping statistics of the usage of the resources can help the WHC manager improve his computers' performance, avoid accidents, and learn how to utilize the computers more efficiently. These statistics can take many different forms, for example the weekly throughput of FTP traffic, and thus can be used to analyze a variety of different aspects of the computers' performance.
All the information regarding computer resources is organized by utilizing a repository. In information technology, a repository is a central place in which an aggregation of data is kept, maintained and managed in an organized way. A repository may be directly accessible to users or may be a place from which specific databases, files or documents are obtained for further relocation or distribution in a network. A database is defined as a collection of data, organized in such a way that its contents can be easily accessed, managed, and updated. Typically, the repository is hierarchical. For example, a network might comprise a group of computers, each having its own configuration and user list. Each user on the list further has its own configuration (e.g., disk quota, permissions), and so forth.
A tree structure is a method for placing and locating hierarchical data entities in a repository. A node (or a decision point) in the tree can have an unlimited number of branches (also called “children”), though usually no more than several dozen, or as few as one branch. The endpoints are called leaves. Data can be stored in both nodes and leaves. The starting point is called the root. The maximum number of children per node is commonly referred to as the “order of the tree”.
Some nodes/leaves can reflect information that is local to the network object. For example, the current CPU load can be a leaf on the tree. In order to reflect an up-to-date value, this leaf should be frequently updated.
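The tree terminology above can be illustrated with a minimal Python sketch. The class name Node, the helper functions and the sample entries (host names, quota and load values) are hypothetical and serve only to show nodes, leaves, the root and the order of the tree.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    value: object = None                      # data may live in nodes and leaves
    children: list = field(default_factory=list)

    def add(self, child):
        self.children.append(child)
        return child

    def is_leaf(self):
        return not self.children


def order(node):
    """Maximum number of children of any node in the subtree."""
    return max([len(node.children)] + [order(c) for c in node.children])


def leaves(node):
    """Return the leaves of the subtree, left to right."""
    if node.is_leaf():
        return [node]
    found = []
    for child in node.children:
        found.extend(leaves(child))
    return found


# Hypothetical repository: the root groups two computers.
root = Node("network")
host1 = root.add(Node("host1"))
host2 = root.add(Node("host2"))
cpu1 = host1.add(Node("cpu_load", 0.42))
host1.add(Node("disk_quota_gb", 20))
host1.add(Node("bandwidth_mbps", 100))
host2.add(Node("cpu_load", 0.10))

# A dynamic leaf such as the CPU load is simply overwritten on each update.
cpu1.value = 0.57
```

In this sketch host1 has three children, so order(root) evaluates to 3, and leaves(root) returns the four leaf entries in insertion order.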
Since nodes and leaves can be expressed as Extensible Markup Language (XML) tags (XML is defined below), an XML structure can be used as a hierarchical database. This notion is disclosed in WO 0133433.
XML is a flexible way to create common information formats, according to a formal recommendation from the World Wide Web Consortium (W3C). It is similar to the well-known HTML language, which is used to generate Web page content. Both XML and HTML contain markup symbols to describe the contents of a page or file. However, while HTML describes a Web page (mainly text and graphic images) only in terms of how it should be graphically displayed, XML describes the actual content of the information in the page, such as the number representing CPU usage.
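As an illustration of XML tags acting as the nodes and leaves of a hierarchical database, the following Python sketch parses a small document with the standard xml.etree.ElementTree module. The tag names, attribute names and values are hypothetical and are not taken from any actual repository layout.

```python
import xml.etree.ElementTree as ET

# Hypothetical repository content: tags describe what the data is,
# not how it should be displayed.
REPOSITORY_XML = """
<repository>
  <computer name="host1">
    <cpu_usage unit="percent">37</cpu_usage>
    <user name="alice">
      <disk_quota unit="GB">20</disk_quota>
    </user>
  </computer>
  <computer name="host2">
    <cpu_usage unit="percent">12</cpu_usage>
  </computer>
</repository>
"""

root = ET.fromstring(REPOSITORY_XML.strip())


def cpu_usage_by_computer(repository):
    """Collect the cpu_usage leaf of every computer node under the root."""
    return {
        computer.get("name"): int(computer.findtext("cpu_usage"))
        for computer in repository.findall("computer")
    }


print(cpu_usage_by_computer(root))  # {'host1': 37, 'host2': 12}

# A deeper leaf is reached by walking down the hierarchy.
quota = root.find("./computer[@name='host1']/user[@name='alice']/disk_quota")
print(quota.text, quota.get("unit"))  # 20 GB
```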
Keeping the information in an XML structure, as will be later described in the invention, has several advantages:
The XML structure supports a hierarchical structure of a repository. This allows the operator (e.g., a WHC manager) to access the information regarding all the computers associated with the same repository from one place (i.e., the root), instead of accessing, and retrieving information from, individual computers. The computers, including the root computer, may be part of a LAN. This approach is easier, faster, and more general, as it requires handling only one computer instead of several.
Extensibility: new nodes and information can be added without the need to modify existing information.
The information is retained in a human-readable format, which allows browsing and locating the information using any browser.
Determined information: the place (and format) for locating each information entity is defined as part of the XML of the specific element.
The tree can be of an unlimited depth, as new levels can be added underneath existing ones.
Currently, the market offers a variety of tools for reading and writing XML files, and, therefore, managing the information can be carried out using external programs that the user can write.
Although an XML-based repository can be stored as a single file, it is advantageous to use multiple smaller files when implementing it, for the following reasons:
Smaller files lower the computational effort of each update. Since an XML file containing a database may comprise a substantial quantity of data, the computational effort of updating such a file can be substantial as well, and, hence, updating, adding or deleting data in the database can be a very slow process. Furthermore, XML files are sequential. From a data processing point of view, sequential files have several drawbacks. For example, records may have different sizes, which makes in-place modifications nontrivial. Therefore, the smaller the file, the lower the computational effort of updating it.
Unlike relational databases, which have a mechanism that ensures the integrity of the information, an XML file is a standard system file, and thus can be corrupted by a system crash during the update process. With smaller files, each update is shorter and touches less data, which lowers the likelihood that a crash occurs during an update and limits the amount of data that can be corrupted if one does.
Handling an XML file typically requires loading the whole file into fast-access memory. Hence, handling an XML-based database in which the data is broken into many smaller files requires a significantly smaller amount of memory.
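The reasons listed above can be illustrated with a Python sketch in which each computer's data is kept in its own small XML file, so that an update loads and rewrites only that file. The one-file-per-computer layout, the function names and the tag names are hypothetical. The sketch also writes each update to a temporary file and then renames it over the original, a common technique for reducing the corruption risk mentioned above; this technique is an assumption added for illustration and is not described in the text.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def write_atomically(path, element):
    """Write the element to a temp file in the same directory, then rename."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            ET.ElementTree(element).write(
                tmp, encoding="utf-8", xml_declaration=True)
        # os.replace swaps the file in one step, so a reader sees either
        # the old content or the new content, never a partial write.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def update_cpu_usage(repo_dir, computer, percent):
    """Load, modify and rewrite only the small file of one computer."""
    path = os.path.join(repo_dir, computer + ".xml")
    root = ET.parse(path).getroot()
    root.find("cpu_usage").text = str(percent)
    write_atomically(path, root)
```

A call such as update_cpu_usage("/path/to/repository", "host1", 55) would then parse and rewrite host1.xml only, leaving the files of all other computers untouched; the directory path shown is a placeholder.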
A calling application is an external program that accesses the repository to retrieve or update required information about the resources or configuration of the computers. It may originate from either the WHC manager's side or the client's side. For example, it could be a program that is responsible for billing the clients for the resources they consumed. This program sends a request to the repository to retrieve the information about each client. Conversely, a client may wish to verify the correctness of his billing, so he could use a program of his own to check his account in the repository. Normally, a calling application must direct the repository to the exact place from which the required information should be retrieved.
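The following Python sketch illustrates a calling application of the kind described above: a billing program that must name the exact location of every value it retrieves. The repository content, the function names and the path format are hypothetical; the paths use the limited XPath syntax supported by the standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET

# Hypothetical repository content held by a single computer.
REPOSITORY = ET.fromstring(
    "<repository>"
    "<computer name='host1'>"
    "<user name='alice'><disk_used unit='GB'>12</disk_used></user>"
    "<user name='bob'><disk_used unit='GB'>31</disk_used></user>"
    "</computer>"
    "</repository>"
)


def retrieve(repository, exact_path):
    """Return the text stored at exact_path, or fail if nothing is there."""
    node = repository.find(exact_path)
    if node is None:
        raise KeyError(exact_path)
    return node.text


def disk_usage_report(repository, computer, users):
    """Billing-side calling application: one exact path per client."""
    report = {}
    for user in users:
        # The caller has to know on which computer, and under which
        # tags, the value is stored.
        path = f"./computer[@name='{computer}']/user[@name='{user}']/disk_used"
        report[user] = float(retrieve(repository, path))
    return report


print(disk_usage_report(REPOSITORY, "host1", ["alice", "bob"]))
# {'alice': 12.0, 'bob': 31.0}
```

If a client were moved to another computer, every path built by this hypothetical program would have to change, which is the kind of rigidity that results from requiring exact locations.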
Although using XML as a hierarchical database is known in the art, the current solutions suffer from several drawbacks. For example, they are not easy to operate, and offer very few capabilities beyond being a mere tool for storing information. One important feature these XML databases lack is the ability to access information in a way that is transparent to the operator. In other words, in these types of databases, the operator (i.e., user) must be directly involved in the process of accessing the exact location in the database in order to retrieve the required data. This procedure can be wasteful in terms of computer time, and imposes rigidity and encumbrance, as it requires that the calling application always know where to seek the information in the database, instead of just asking for it. Another major drawback is that in order to access data that is contained in a certain computer, it is necessary to access the computer itself, and there is no single central computer to handle the requests. This can be problematic when trying to retrieve information from several computers, as a separate request is needed for each one.
SNMP (Simple Network Management Protocol) is used for a purpose similar to that of XML-based databases. It is a widely used network management protocol. SNMP is designed for the application level in order to enable the transfer of management-related information between an NMS (Network Management System) and “agents” installed in network devices, which collect this information from the network devices and perform management operations on the relevant network device, such as setting parameters. The use of SNMP enables a network manager to access information related to the network. It uses a repository, which stores information about the managed objects, and a program that manages the repository (and that may reside on a remote computer). In order to control several devices from a central place, there should also be a “root” computer through which information about the network's computers passes, to and from the managing computer.
However, SNMP suffers from several drawbacks:
The information exchanged between the computer that manages the repository and the “root” computer is passed as clear text (not encrypted). Therefore, the information is exposed to malicious parties.
The tree depth of an SNMP repository is limited, and, hence, the information presented is limited as well.
SNMP uses the User Datagram Protocol (UDP), which is a communication protocol that is not “delivery-guaranteed”, i.e., there is no mechanism to guarantee that information transmitted from a source will arrive at the intended destination, and failed messages are not returned as an indication of transmission failure.
SNMP uses the ASN.1 format, which is based on assigning a unique number to each property, in order to describe the elements to be changed. This format, which uses numbers rather than text-based commands and names, makes SNMP commands very hard to understand. Moreover, it forces users to use cumbersome tools for every SNMP operation, even a simple one, instead of writing a simple command line.
SNMP uses UDP ports 161 and 162, which may be closed when a firewall is installed in front of the computer at the data center. Consequently, users cannot access their information from their home computers.
SNMP is incapable of switching from one computer to another. In other words, it does not support the hierarchical tree property of XML, as suggested in the invention. Thus, when accessing several computers, each command must be directed to the relevant computer, instead of dispatching all of the commands together to the root. Furthermore, this requires that each computer have its own unique IP address in order to allow the calling application access to the relevant computer. The latter requirement is undesirable, as such public IP addresses are becoming scarce. To solve this problem, a load-balancer can be used to regulate the traffic of information among a plurality of computers under a single IP address. However, when the managed computer is located behind a load-balancer, it has only a local address, and thus SNMP cannot be used. Therefore, SNMP is a technology that may be used only at the single-computer level.
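The readability drawback of numeric identifiers noted in the list above can be seen in a small Python sketch. The three object identifiers shown are standard MIB-II system objects; the lookup table and the function names are hypothetical and stand in for the translation tools a user would otherwise need.

```python
# Numeric object identifiers as they appear in SNMP requests, mapped to
# the names a person would recognize (standard MIB-II system group).
WELL_KNOWN_OIDS = {
    "1.3.6.1.2.1.1.1.0": "sysDescr.0",
    "1.3.6.1.2.1.1.3.0": "sysUpTime.0",
    "1.3.6.1.2.1.1.5.0": "sysName.0",
}


def parse_oid(oid):
    """Split a dotted identifier into its numeric components."""
    return tuple(int(part) for part in oid.split("."))


def describe(oid):
    """Translate a numeric identifier, if this small table knows it."""
    return WELL_KNOWN_OIDS.get(oid, "<unknown OID>")


print(parse_oid("1.3.6.1.2.1.1.3.0"))  # (1, 3, 6, 1, 2, 1, 1, 3, 0)
print(describe("1.3.6.1.2.1.1.3.0"))   # sysUpTime.0
```

Without such a table the request carries only the sequence of numbers, whereas a text-based tag such as a CPU usage element in an XML repository is readable as written.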
SOAP (Simple Object Access Protocol) is a protocol for communicating between programs running on different computers or operating systems by using HTTP and XML as a mechanism for exchanging information. SOAP specifies exactly how to encode an HTTP header and an XML file, so that a program in one computer can call a program in another computer and exchange information. It also specifies how the program thus contacted can respond. However, SOAP itself does not provide for encrypting the information, and it does not use XML exclusively for the information inside the request. Moreover, a request for information about a certain computer must be sent directly to that computer, so that several requests are required in order to gather information from several computers.
The solutions specified above have not provided a satisfactory solution to the problem of easily and efficiently managing computer resources, or to other problems that pose a similar challenge.
It is an object of the present invention to provide a method for efficiently managing computer resources consumed by remote users, in terms of both data memory and processing time.
It is another object of the present invention to provide a method for allowing a user to exploit computer resources that are distributed among several computers while communicating with only one (i.e., central) computer.
It is still another object of the present invention to provide a method for allowing a computer system manager to dynamically change the computer resources that are allocated to users, such that the changes are transparent to the users.
It is a further object of the present invention to provide a method for managing computer resources consumed by remote users, which gives calling applications easy and convenient access to the data required for management.
Further purposes and advantages of this invention will appear as the description proceeds.