1. Technical Field
The present teaching relates to methods, systems, and programming for serving web pages. Particularly, the present teaching is directed to methods, systems, and programming for fault tolerant web servers.
2. Discussion of Technical Background
Occasionally, servers may stop serving content because of, for example, erroneous code pushes, editorial mistakes, or capacity overload. These stoppages give users a bad experience because users are either shown a standard error page with little helpful information, or the software that the user is using stops working or freezes. Such experiences cause users to become frustrated and migrate to alternative content providers. This may cause the owner of the servers to lose revenue, or may cause a company whose employees use the service to lose money because of the disruption.
Servers are usually configured to notify service engineers when problems arise with delivering content. The service engineers are often notified either automatically through some monitoring system or manually by customers or customer care representatives. The notifications may happen in a timely manner, but the service engineers may still need time to find the cause of the problem. Further, the service engineers may have to enlist the help of other teams and engineers to tackle the problem. During this time, the users may continue to receive poor service. The need to restore the service quickly means that a large number of qualified engineers need to be on staff 24 hours a day. This is expensive for the owner of the server, who would prefer to fix problems during normal working hours when the costs are lower.