US 12,169,434 B2
System, method, and computer program to improve site reliability engineering observability
Mahesh Napa, Secaucus, NJ (US); Gordon Robert MacDonald, Glasgow (GB); and Mark Leslie Gibbons, Kent (GB)
Assigned to JPMORGAN CHASE BANK, N.A., New York, NY (US)
Filed by JPMorgan Chase Bank, N.A., New York, NY (US)
Filed on Feb. 2, 2023, as Appl. No. 18/105,052.
Prior Publication US 2024/0264894 A1, Aug. 8, 2024
Int. Cl. G06F 11/07 (2006.01); G06F 9/54 (2006.01)
CPC G06F 11/079 (2013.01) [G06F 9/542 (2013.01); G06F 11/0778 (2013.01)]
20 Claims
 
1. A method for improving site reliability engineering (SRE) observability by utilizing one or more processors along with allocated memory, the method comprising:
defining a schema in a common manner;
causing any application included across a distributed set of applications to utilize the schema to describe an error associated with a downstream application such that a root failing component associated with the error is always at a bottom error frame in a response;
implementing a common structure for distributed error propagation in a chain of applications across the distributed set of applications in connection with an error message;
generating error logs received from the chain of applications;
storing the error logs in a centralized location accessible by all SRE users and application owners;
calling a corresponding application programming interface (API) to access the error logs from the centralized location; and
automatically implementing a remedial algorithm to correct the root failing component of the error message identified in the error logs.
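
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. Every name in it (ErrorFrame, ErrorResponse, propagate, CENTRAL_LOG, REMEDIES, and the application names) is hypothetical, the "centralized location" is stood in for by an in-memory list, and the "remedial algorithm" is a placeholder lookup table. The sketch only shows how a shared schema can keep the root failing component at the bottom error frame as an error propagates up a chain of applications.

```python
from dataclasses import dataclass, field, asdict
from typing import Callable, Dict, List, Optional


@dataclass
class ErrorFrame:
    """One frame of the shared error schema (field names are assumptions)."""
    application: str
    code: str
    message: str


@dataclass
class ErrorResponse:
    """Ordered frames: the reporting application first, the root cause last."""
    frames: List[ErrorFrame] = field(default_factory=list)

    def root_failing_component(self) -> ErrorFrame:
        # By convention of the schema, the bottom frame is the root failure.
        return self.frames[-1]


def propagate(application: str, code: str, message: str,
              downstream: Optional[ErrorResponse] = None) -> ErrorResponse:
    """Wrap a downstream error: put this application's frame on top and keep
    the downstream frames underneath, so the root stays at the bottom."""
    below = list(downstream.frames) if downstream else []
    return ErrorResponse([ErrorFrame(application, code, message)] + below)


# Stand-in for the centralized log location (assumption: in-memory list).
CENTRAL_LOG: List[dict] = []


def log_error(response: ErrorResponse) -> None:
    """Store the error in the central log as plain dictionaries."""
    CENTRAL_LOG.append({"frames": [asdict(f) for f in response.frames]})


def fetch_error_logs() -> List[dict]:
    """Stand-in for the API call that reads logs from the central location."""
    return list(CENTRAL_LOG)


# Hypothetical remediation table keyed by root failing application.
REMEDIES: Dict[str, Callable[[dict], str]] = {
    "ledger-db": lambda frame: f"restart {frame['application']}",
}


def remediate(log_entry: dict) -> Optional[str]:
    """Pick a remedial action based on the bottom (root) frame, if one exists."""
    root = log_entry["frames"][-1]
    action = REMEDIES.get(root["application"])
    return action(root) if action else None


# Example chain: api-gateway -> payments-service -> ledger-db (root failure).
db_err = propagate("ledger-db", "DB_TIMEOUT", "query timed out")
svc_err = propagate("payments-service", "UPSTREAM_FAILURE", "ledger call failed", db_err)
gw_err = propagate("api-gateway", "BAD_GATEWAY", "payments call failed", svc_err)
log_error(gw_err)
chosen_action = remediate(fetch_error_logs()[0])
```

In this sketch each application prepends its own frame rather than rewriting the downstream frames, so any consumer of the logs can find the root failing component by reading the last frame, regardless of how long the chain is.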