This invention relates to systems and methods for discovering and monitoring relationships among network elements within a network.
Computer networks consist of a number of computers in communication with each other. Whilst these networks can be small and deliberately planned so that the infrastructure and communication are well understood, in practice networks are often complicated and/or built on an ad hoc basis. For example, in a school, computers will generally be added when they can be afforded and joined into the network one by one. With larger organisations the complexity of the network and the communication between the devices within them can be even greater. Further, since the rise in popularity of the internet, networks are no longer restricted to a single building or organisation and instead the computers in the network can be spread all over the world and across organisations. Since different parts of the network were constructed by different people, in many instances no one person knows the overall infrastructure of the network and how the elements are connected together. Even without the addition of new computers it is quite common for the implementation of a real system to be changed many times during its building and operation (servers swapped, maintenance, repair, etc.).
Networks may include servers (which provide a service by delivering requested data), clients (which request data and are generally attended by end users of the service), firewalls, proxy servers, and other intermediaries. Further, any particular general purpose computer can act as more than one of these for different programs and different sets of data so that a single computer can be used as a client for one application but as a server for another application. As well as conventional personal computers, computers in a network may include other devices with a processor such as mobile telephones or printers.
The arrangement of devices in the network and how they communicate with each other is generally called the network topology. The term “physical topology” can be used to refer to the arrangement of hardware and cabling but generally it is at least as important to know the manner of communications and the paths of the signals between the computers (sometimes referred to as a logical or signal topology).
There are many applications and circumstances for which it is beneficial to understand the network and how computers are connected to each other i.e. to acquire a mapping of the topology. Many technical benefits are well documented.
Different components such as gateways, network address translators, firewalls, load balancers, searching machines, application servers, message queues, databases and other data sources can all have their own technology specific means for configuration. Occasionally a re-design of a system may be required and therefore the whole end to end design of the existing system must be resurrected along with any other amendments that have been made in order that the correct configuration for each is applied.
Another application is where an ICT (information and communication technology) company wants to offer a service level agreement to all its customers. This cannot be achieved unless the service company understands how each service or application actually works. In addition, auditing of software licenses and disputing a network environment relies on knowledge of that network.
Where a software application or network fails because a particular component has failed, identifying the location of that failed component can be extremely difficult when the manner in which the computers are connected together is not understood. This can be called root cause analysis where network analysis tools utilise data provided by suitable systems regarding the topology of the network to diagnose a root cause event, such as a failure or interrupt. Accordingly knowledge of the system can be used for effective isolation of failure points in the network. In risk analysis, knowing which application or network services will be affected by failure of a specific network element is useful. In development of security for a network, understanding the topology is vital to planning how to protect it, and allowing for study of likely points of attack for hackers.
For effective load balancing of a network, knowing which servers and clients rely on each other for services and applications allows for efficient and correct planning of resources such as which computers to upgrade etc.
Traffic generators can be aided by knowledge of the topology of the network.
In some industries, such as aerospace, knowledge of the infrastructure is so important that each amendment is tracked back and forth using both IT systems and raw manpower. However, this is very costly and may be prone to human error in recording the topology correctly. In most industries such careful checking does not exist.
It is known to attempt to map and then monitor the topology of networks using various systems and methods, but unfortunately none of these are satisfactorily efficient or effective for many of the situations described above.
One known method is to install a software agent on each computer in the network. Each agent then searches the file system of the computer it is hosted on to determine what software is running on it. Based on what software is running it attempts to deduce whether it is a client (because it has found software to be used by the end user), a server (by having hosting software), or a firewall, etc. Since applications and software that may be loaded on the computer come in many different forms which change frequently, having an agent that can successfully identify all relevant software is difficult, and such agents must constantly be updated to accommodate new software. Further, finding what software is on the system only tells you that that software is installed, not that it is running and in active use.
Another approach is to attempt to establish data paths by using software such as ‘trace route’. Trace route is a computer networking tool that is used to determine the route taken by packets across an IP network. Trace route and similar programs work by sending successive batches of packets over the network and calculating the route from this. Trace route relies on ICMP (Internet Control Message Protocol). A number of problems are associated with this approach. Firstly, it relies on new data being sent through the system, thereby changing the traffic flowing. This in itself may not be a problem where there is sufficient bandwidth, but due to security concerns it is relatively common for firewalls to identify such foreign packets and to stop them from proceeding any further into the network. Further, using ICMP only gives information about IP routing. In fact, many modern application routers, such as ‘Solace’ and various XML routers, do not work at the IP level and may for instance redirect packets at a service level rather than the IP level. Systems that trace IP routes will not be able to see when traffic changes IP address. In fact, some devices such as firewalls and load balancers change the IP address of incoming traffic before forwarding it to a further computer.
These existing systems also fail to detect duplicates which may be used for load balancing or redundancy. This is particularly problematic where the analysis of the network topology is for the purpose of load balancing and planning network resources since it can give an overly pessimistic view of the current resources. The existing systems also struggle where traffic is being distributed over a number of servers.
US2005/0157654 suggests using an installed agent together with ICMP trace route and therefore suffers from both sets of disadvantages discussed above.
Further, many existing systems rely on installing agents on the entire computer network including clients. Often the people in charge of running the network do not have control of the users of client computers and therefore are not able to insist that they install agents or keep them on their computers.
Existing systems also only detect client-server behaviour. Rather than simply provide a service to requesting clients, in practice servers often depend on other computers. For example a web server may depend on other services such as DNS (domain name system) servers, a database or web service calls to another server. Because the server is dependent on these, the client is indirectly dependent on these too and therefore they should form part of a complete topology of a network.
Another method is so called ‘port scanning’ where a central manager probes ports on remote machines to determine which services they support. This can trip security features of the host, as port scanning is used by hackers and zombie machines. Further, determining which services they support is not the same as determining which services they use, and using port scanning to produce the complete picture of a complex network is not straightforward even where it is allowed.
In U.S. Pat. No. 7,318,105 (Bongiovanni et al), a method of detecting the topology of a communication network is described. The method comprises: obtaining a data set including times of arrival, durations, and source nodes for chunks of data in the network; identifying most recent chunks of data arriving from source nodes other than a source node of interest in which arrival times of the most recent chunks occur before a chunk arrival time associated with the source node of interest; calculating weights for the other source nodes based on time differences between the chunk arrival time associated with the source node of interest and the most recent chunks of data; updating a probability matrix based on the weights for the other source nodes; repeating the identifying, calculating, and updating for other times of arrival and associated source nodes of interest in the data set; determining the topology of the network from the probability matrix; and outputting the topology of the network. This method is designed to work even when identifying information associated with messages transmitted through a system is encrypted. In particular, the method has application to wireless networks that use encryption.
Accordingly, being able to map and monitor the topology of a computer network gives rise to many technical benefits and applications but all the existing attempts at solutions add their own technical problems.
It is an object of the present invention to overcome or mitigate one or more of the above referenced problems.
According to a first aspect of the invention there is provided a method of determining the topology of at least part of a network comprising the steps of: monitoring traffic to, and/or from, a plurality of computers in the network, storing information relating to the monitored traffic for each of the plurality of computers, the information including an identifier of a requested service, selecting a first computer of the plurality of computers; reading the stored information related to the first computer and identifying, using the stored identifier of the requested service, at least one traffic flow to or from the first computer that corresponds to the requested service; using the stored information to identify the destination or origin of the identified traffic flow for the first computer, which traffic flow information includes the identifier of the requested service; using the identified destination or origin to identify one or more computers that are immediately upstream or downstream of the first computer, and determining a topology based on the identified one or more upstream or downstream computers.
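By way of illustration only, the steps of the first aspect for a single first computer might be sketched as follows. The record layout, the names `TrafficRecord`, `neighbours_for_service` and `local_topology`, and the use of a dictionary keyed by computer address as the stored information are all assumptions made for this sketch and are not taken from the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrafficRecord:
    source: str       # source address of the monitored traffic
    destination: str  # destination address of the monitored traffic
    service_id: str   # identifier of the requested service (hypothetical format)


def neighbours_for_service(stored, first_computer, service_id):
    """Return (origins, destinations) of flows for one service at one computer."""
    origins, destinations = set(), set()
    for record in stored.get(first_computer, []):
        if record.service_id != service_id:
            continue  # traffic for a different requested service
        if record.destination == first_computer:
            origins.add(record.source)            # traffic arriving at the computer
        elif record.source == first_computer:
            destinations.add(record.destination)  # traffic leaving the computer
    return origins, destinations


def local_topology(stored, first_computer, service_id):
    """Express the immediate neighbours as directed (from, to) links."""
    origins, destinations = neighbours_for_service(stored, first_computer, service_id)
    links = {(origin, first_computer) for origin in origins}
    links |= {(first_computer, destination) for destination in destinations}
    return links
```

In this sketch the topology for the first computer is simply the set of directed links to and from its immediate neighbours for the chosen service; other representations are equally possible.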
According to a second aspect of the invention there is provided a method of determining the topology of at least part of a network comprising the steps of: receiving and storing information relating to traffic to, and/or from, a plurality of computers in the network, for each of the plurality of computers on the network the information including an identifier of a requested service, selecting a first computer, of the plurality of computers, reading the stored information related to the first computer and identifying, using the stored identifier of the requested service, at least one traffic flow, to or from the first computer, that corresponds to the requested service; using the stored information to identify the destination or origin of the identified traffic flow for the first computer, which traffic flow information includes the identifier of the requested service; using the identified destination or origin to identify one or more computers that are immediately upstream or downstream of the computer with a determined role, and determining a topology based on the identified one or more upstream or downstream computers.
Preferably aspects of the invention further include the steps of: determining the role of at least one of the plurality of computers based on the stored information by comparing the stored information relating to the traffic for one or more computers with at least one expected behaviour of traffic for a computer fulfilling a role. More preferably wherein the first computer is a computer which has had its role determined by the step of determining the role of at least one of the plurality of computers.
Preferably aspects of the invention further include the steps of: using the stored information to identify the destination and/or origin of traffic to and/or from the one or more, and preferably each of the, identified upstream or downstream computers, which traffic includes the identifier of the requested service; and using the identified destination or origin to identify one or more computers that are immediately upstream of an identified upstream computer or downstream of an identified downstream computer.
Preferably the identified computers comprise one, some or all of the plurality of computers.
Preferably the steps of using the stored information and identifying upstream and/or downstream computers are repeated until the origin and/or destination of traffic does not correspond to one of the plurality of computers or comes from an unknown computer or until the final destination and/or original origin of the traffic has been identified by those steps.
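One possible way of repeating these steps is sketched below, purely as an example. It assumes, for the purposes of the sketch only, that the stored information is a dictionary mapping each monitored computer's address to a list of `(source, destination, service_id)` tuples for traffic to or from that computer; the function name `trace_service` is hypothetical.

```python
from collections import deque


def trace_service(stored, first_computer, service_id):
    """Follow one service outwards from first_computer through monitored computers.

    Returns the set of directed (source, destination) links found. Tracing along
    a path stops when an address does not correspond to a monitored computer.
    """
    links = set()
    visited = {first_computer}
    pending = deque([first_computer])
    while pending:
        computer = pending.popleft()
        for source, destination, sid in stored.get(computer, []):
            if sid != service_id:
                continue
            links.add((source, destination))
            # The neighbour is whichever end of the traffic is not this computer.
            neighbour = source if destination == computer else destination
            if neighbour in stored and neighbour not in visited:
                visited.add(neighbour)
                pending.append(neighbour)
    return links
```

An unmonitored address, such as a client, still appears as an end point of a link, but no further steps are taken from it because there is no stored information to read.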
Preferably the stored information includes identifiers for a plurality of services, a plurality of traffic flows corresponding to the computer with determined role are identified, and the steps of using the stored information and identifying upstream and downstream computers are performed for more than one, and preferably each, of the plurality of traffic flows.
Preferably the step of determining the role of at least one of the plurality of computers identifies a server by finding a computer with a terminating traffic flow that is not resent to another computer and/or comparing to the expected behaviour of a server as a computer with a terminating traffic flow that is not resent to another computer.
Preferably the step of determining the role of at least one of the plurality of computers identifies a firewall or proxy by finding a computer which redirects an incoming traffic flow to another computer and/or identifies a load balancer by finding a computer which redirects an incoming traffic flow to more than one computer.
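The expected behaviours described above might, as a non-limiting sketch, be compared against the stored information as follows. The sketch assumes that the records for a computer are `(source, destination, service_id)` tuples and that only request traffic is recorded; the function name `determine_role` and the order in which the behaviours are tested are choices made for this example.

```python
def determine_role(address, records):
    """Classify one computer from the traffic recorded to and from it."""
    # Service identifiers seen on traffic arriving at this computer.
    incoming = {sid for _, dst, sid in records if dst == address}
    # For each incoming service, the computers the traffic is resent to.
    resent_to = {}
    for src, dst, sid in records:
        if src == address and sid in incoming:
            resent_to.setdefault(sid, set()).add(dst)
    if not incoming:
        return "unknown"
    if any(len(targets) > 1 for targets in resent_to.values()):
        return "load balancer"      # one incoming flow redirected to several computers
    if resent_to:
        return "firewall or proxy"  # incoming flow redirected to another computer
    return "server"                 # terminating flow that is not resent
```

A computer that both terminates some flows and forwards others would need a finer rule than this sketch applies, for example one that classifies each traffic flow separately.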
Preferably the role of two or more computers and more preferably each of the plurality of computers is determined.
Preferably the first computer is a computer identified as a server. More preferably the steps of reading the stored information, using the stored information and using the identified destination or origin performed for the first computer, are repeated for each computer determined to be a server.
Preferably the stored information includes the source address and destination address of traffic. More preferably a traffic flow is defined as traffic with the same service identifier and wherein when it is incoming traffic it is traffic with the same source address and when it is outgoing traffic it is traffic with the same destination address.
Preferably terminating traffic that is not resent is defined as traffic for which the destination address is the server's address and there is no traffic with the same URI for which the source address is the server's address.
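As an illustrative sketch of this definition, items of traffic for one computer may be grouped into flows keyed by direction, service identifier and the address at the far end. The tuple layout `(source, destination, service_id)` and the name `group_flows` are assumptions of the sketch.

```python
from collections import defaultdict


def group_flows(address, records):
    """Group traffic items for one computer into incoming and outgoing flows."""
    flows = defaultdict(list)
    for src, dst, sid in records:
        if dst == address:
            # Incoming: same service identifier and same source address.
            flows[("in", sid, src)].append((src, dst, sid))
        elif src == address:
            # Outgoing: same service identifier and same destination address.
            flows[("out", sid, dst)].append((src, dst, sid))
    return dict(flows)
```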
Preferably the service identifier is a URI or is a representation or identifier of a URI.
Preferably the next upstream or downstream computer is found by reading the destination or source address respectively of the traffic flow in the stored information and matching this to the address of one or more computers, such as of the plurality of computers. Preferably if the address for matching does not match to any of the plurality of computers, it is matched using a database of other computer addresses or determined or approximated using geolocation techniques.
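The matching step might be sketched as follows, by way of example only. The sketch assumes IP addresses held as strings, uses Python's standard `ipaddress` module so that differently written forms of the same address compare equal, and represents both the plurality of computers and the database of other computer addresses as plain dictionaries of name to address. The name `match_address` is hypothetical, and the geolocation fallback is deliberately left out of the sketch.

```python
import ipaddress


def match_address(address, monitored, other_known=None):
    """Match a flow's address to a monitored computer, else to other known ones."""
    target = ipaddress.ip_address(address)
    for name, known in monitored.items():
        if ipaddress.ip_address(known) == target:
            return ("monitored", name)
    # Fall back to a database of other computer addresses (assumed here: a dict).
    for name, known in (other_known or {}).items():
        if ipaddress.ip_address(known) == target:
            return ("known", name)
    return ("unmatched", None)
```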
Preferably the destination, source and/or computer addresses comprise an IP address.
Preferably aspects of the invention further include the step of marking a first item of traffic or a first traffic flow, in the stored information, as corresponding to a second item of traffic or a second traffic flow, in the stored information, for one or more and preferably each of the plurality of computers, when the identifier of the first and second traffic items/flows are the same but the first item/flow of traffic is traffic to the computer to which the stored information relates and the second item/flow is traffic from the computer to which the stored information relates. More preferably whether the traffic is moving to or from the computer to which the stored information relates is measured by reading the stored destination or source address and comparing to the address of the computer to which the stored information relates.
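A minimal sketch of such marking, under the assumption that the stored information for a computer is a list of `(source, destination, service_id)` tuples, is given below. The marks are represented as pairs of list positions, and the name `mark_corresponding` is hypothetical.

```python
def mark_corresponding(address, records):
    """Pair each item of traffic to the computer with items from it that share an identifier.

    Returns a list of (incoming_index, outgoing_index) positions in `records`.
    """
    marks = []
    for i, (_, dst_in, sid_in) in enumerate(records):
        if dst_in != address:
            continue  # not traffic to this computer
        for j, (src_out, _, sid_out) in enumerate(records):
            # Direction is decided by comparing the stored address to the computer's own.
            if src_out == address and sid_out == sid_in:
                marks.append((i, j))
    return marks
```

An incoming item with no mark is then a candidate for terminating traffic, while a marked item indicates traffic that has been passed on.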
Preferably the step of determining the role uses the marking of corresponding traffic or absence of marking, such as by checking that there is no marked corresponding traffic to a terminating traffic flow when identifying a server or noting that there is marked corresponding traffic when identifying a proxy or firewall.
Preferably the stored information includes the content type of the traffic and the step of determining the topology is further based on the content type of traffic in the stored information. Preferably wherein the plurality of computers comprises all computers in the network except client computers or those that are solely client computers and/or the monitoring of traffic is done by IP sniffing.
According to a third aspect of the invention there is provided computer apparatus for determining the topology of at least part of a network, the apparatus comprising a plurality of computers, which computers form at least part of a network and each comprises a memory and a processor, each of the plurality of computers configured to monitor traffic to and/or from one of the plurality of computers in the network, and wherein at least one of the plurality of computers is configured to: select a first computer of the plurality of computers; read the stored information related to the first computer, identify, using the stored identifier of the requested service, at least one traffic flow to or from that computer that corresponds to the requested service; use the stored information to identify the destination or origin of the identified traffic flow for the first computer, which traffic flow information includes the identifier of the requested service; use the identified destination or origin to identify one or more computers that are immediately upstream or downstream of the first computer, and determine a topology based on the identified one or more upstream or downstream computers.
Preferably wherein at least one of the computers is configured to determine its role or the role of at least one of the other plurality of computers based on the stored information by comparing the stored information relating to the traffic for one or more computers with at least one expected behaviour of traffic for a computer fulfilling a role.
According to a fourth aspect of the invention there is provided computer apparatus comprising a processor, a memory and an input in communication with a plurality of computers which form at least part of a network, each of which computers has been configured to monitor traffic to and/or from one of the plurality of computers in the network, and transmit information relating to the monitored traffic, the information including an identifier of a requested service, the computer apparatus configured to: select a first computer, of the plurality of computers; read the stored information related to the first computer and identify, using the stored identifier of the requested service, at least one traffic flow, to or from the first computer, that corresponds to the requested service; use the stored information to identify the destination or origin of the identified traffic flow for the first computer, which traffic flow information includes the identifier of the requested service; use the identified destination or origin to identify one or more computers that are immediately upstream or downstream of the first computer, and determine a topology based on the identified one or more upstream or downstream computers.
According to a fifth aspect of the invention there is provided computer apparatus comprising a computer, the computer apparatus configured to monitor traffic to and/or from the computer, and configured to determine the role of the computer by comparing the monitored traffic to at least one expected behaviour of traffic for a computer fulfilling a role.
Apparatus according to any aspect of the invention may be configured to perform any of the preferable features/steps of a method in accordance with the invention such as the preferable features/steps listed above for the first and second aspect.
According to another aspect of the invention there is provided a computer readable medium containing computer executable instructions which when run on a plurality of computers on a network causes the computers to perform the method of the first aspect of the invention.
According to another aspect of the invention there is provided a computer readable medium containing computer executable instructions which when run on a central processor in communication with a plurality of computers in a network which have been configured to perform the step of monitoring traffic to and/or from a plurality of computers in the network, and transmit information relating to the monitored traffic, the information including an identifier of a requested service, cause the central processor to perform the steps of the second aspect of the invention.