An online service can be provided by a service provider for use by a plurality of customers (or tenants). One example of such an online service is a cloud-based electronic mail (e-mail) service that customers can use to, among other things, send and receive e-mails. Such an online service can be hosted in a data center comprising a plurality of geographically distributed server clusters. When a given customer uses a client computing device to connect to the online service, the signals transmitted from the client computing device often traverse a complex route over a network topology before arriving at a destination server of the data center. For example, a signal transmitted from the client computing device can be relayed by one or more Internet service providers (ISPs) before reaching the destination server.
Various problems can arise at different points along the network route from the client to the server. Regardless of where they occur, these problems often manifest as connection issues at the client computing device. Sometimes, however, these issues are caused by problems at points outside of the service provider's domain. For example, a particular ISP's equipment might fail, or a client-side setting might be misconfigured, either of which can cause the customer to experience connection issues when attempting to connect to the online service. Customers often perceive these as issues with the online service itself, even though they actually stem from problems outside of the service provider's domain.
Currently, there is no easy way for a service provider to identify problems that occur at points outside of the service provider's domain. For example, the service provider cannot actively monitor a third-party ISP's equipment or a client-side configuration to determine whether a problem has occurred at those points along the client-server network route, so identifying such a problem can take an extended amount of time. This, in turn, increases the time needed to restore operation of the online service, and the reliability of the online service is compromised as a result.