
Observability Patterns

Topic: Software Design                                                                                  Level: Intermediate

Observability Patterns - What?

Observability patterns cover the logging, tracing, and monitoring of multiple instances of distributed services running across numerous servers.

1. Log Aggregation

In a microservices design, an application is composed of multiple services that fulfil user requests in a discrete, loosely coupled manner, each isolated within its own process boundary. Several instances of a service may also run on additional machines to support load balancing and scaling with demand.
As each service processes a request, it generates log statements (information, warning, error, debug) about its processing, in a specified format, directed to a defined log file.
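As a minimal sketch of what "a specified format" could look like, the snippet below renders every log statement as one JSON object per line using Python's standard `logging` module. The field names and the `build_logger` helper are assumptions made for illustration, not something the article prescribes.

```python
import json
import logging
from datetime import datetime, timezone


class JsonLogFormatter(logging.Formatter):
    """Render each log record as a single JSON object per line."""

    def __init__(self, service: str, instance: str):
        super().__init__()
        self.service = service    # hypothetical service name
        self.instance = instance  # hypothetical instance identifier

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "level": record.levelname,
            "service": self.service,
            "instance": self.instance,
            "message": record.getMessage(),
        }
        return json.dumps(entry)


def build_logger(service: str, instance: str, stream) -> logging.Logger:
    """Hypothetical helper: one logger per service instance."""
    logger = logging.getLogger(f"{service}.{instance}")
    logger.setLevel(logging.DEBUG)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonLogFormatter(service, instance))
    logger.handlers = [handler]
    return logger
```

To write to a defined log file instead of a stream, `logging.FileHandler` could be swapped in for `StreamHandler`; a consistent, machine-readable format is what makes later aggregation straightforward.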

Log Aggregator Illustration
Log aggregation consolidates the logs from all of these instances into a centralized service, so that we can understand the sequence of events, debug an issue, filter and validate warnings, and troubleshoot errors and exceptions.
A centralized logging service also makes log tracing easier: when the same request CorrelationID is passed through all the services, the logs can be parsed to follow the initial request as it traverses multiple services.
It enables alerting and monitoring agents to be set up on a single source of log information accumulated across services.
It helps in understanding the data transitions and the chronology that led to an error.
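The ideas above can be sketched in a few lines: merge per-instance logs into one chronological stream, filter by correlation ID, and derive a simple alert from the combined data. The record layout, sample data, and function names below are all hypothetical; a real deployment would typically rely on a dedicated log aggregation stack rather than hand-written code.

```python
import heapq
from collections import Counter

# Hypothetical records, already ordered per instance:
# (ISO-8601 UTC timestamp, instance, level, correlation ID, message)
ORDERS_1 = [
    ("2024-01-01T10:00:00.100Z", "orders-1", "INFO", "req-a1", "order received"),
    ("2024-01-01T10:00:00.900Z", "orders-1", "ERROR", "req-a1", "payment declined"),
]
PAYMENTS_1 = [
    ("2024-01-01T10:00:00.400Z", "payments-1", "INFO", "req-a1", "charging card"),
    ("2024-01-01T10:00:00.600Z", "payments-1", "WARNING", "req-b2", "retrying gateway"),
]


def aggregate(*instance_logs):
    """Merge per-instance logs into one stream ordered by timestamp.

    Assumes every timestamp uses the same fixed-width UTC format, so
    string comparison matches chronological order.
    """
    return list(heapq.merge(*instance_logs, key=lambda record: record[0]))


def for_request(records, correlation_id):
    """Keep only the records that belong to one request."""
    return [r for r in records if r[3] == correlation_id]


def error_alerts(records, threshold=1):
    """Return the instances whose ERROR count reaches the threshold."""
    errors = Counter(r[1] for r in records if r[2] == "ERROR")
    return sorted(name for name, count in errors.items() if count >= threshold)
```

Calling `for_request(aggregate(ORDERS_1, PAYMENTS_1), "req-a1")` yields the three records for that request in the order they happened, which is the chronology needed to see how an error came about.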

2. Performance Metrics

Performance metrics are enabled by instrumenting the decoupled services to capture response latency, error rate, request threshold, thread consumption, and CPU and memory utilization. Evaluating these metrics helps in fine-tuning the service or server instance for better scalability, robust fault tolerance, and graceful termination.

Metrics compiled across the services of an application provide insight into end-to-end (E2E) system usage, help determine the threshold points for optimization, and narrow down the service responsible for a degradation.
The analysis can also identify services that slow down incoming requests and should be scaled, enabling the application to expand dynamically in real time.
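A minimal sketch of how such metrics might be summarized per service is shown below. The `Sample` type, the nearest-rank percentile, and the choice of p95 latency as the degradation signal are assumptions made for illustration only.

```python
import math
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Sample:
    """One instrumented request (hypothetical shape)."""
    service: str
    latency_ms: float
    ok: bool


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = math.ceil(pct / 100 * len(ordered))
    return ordered[max(rank, 1) - 1]


def summarize(samples):
    """Compute request count, error rate, and p95 latency per service."""
    by_service = defaultdict(list)
    for sample in samples:
        by_service[sample.service].append(sample)
    summary = {}
    for service, items in by_service.items():
        failures = sum(1 for s in items if not s.ok)
        summary[service] = {
            "requests": len(items),
            "error_rate": failures / len(items),
            "p95_ms": percentile([s.latency_ms for s in items], 95),
        }
    return summary


def slowest_service(summary):
    """Name the service with the highest p95 latency."""
    return max(summary, key=lambda name: summary[name]["p95_ms"])
```

Comparing the per-service summaries is one way to narrow down which service is responsible for a slowdown and is therefore a candidate for scaling.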

3. Distributed Tracing

Distributed tracing follows a particular user request as it traverses multiple services, in order to understand its effect, pinpoint where it was handled in each service, and obtain the final response data for that request.

A unique transaction ID (correlation ID) is passed through the call chain of each transaction in a distributed topology. One example of a transaction is user interaction with a website.
The unique ID is generated at the entry point of the transaction. The ID is then passed to each service that is used to finish the job and is written as part of each service's log information. It's equally important to include timestamps in your log messages along with the ID. The ID and timestamp are combined with the action that a service is taking and the state of that action.
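The flow described above can be sketched as follows. The header name `X-Correlation-ID`, the three service functions, and the in-memory `sink` list standing in for a log file are assumptions for illustration; services in a real system would pass the ID over HTTP or messaging headers.

```python
import uuid
from datetime import datetime, timezone

# Assumed header name; the article does not prescribe one.
CORRELATION_HEADER = "X-Correlation-ID"


def log_event(sink, service, headers, action, state):
    """Record the ID and timestamp together with the action and its state."""
    sink.append({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "correlation_id": headers[CORRELATION_HEADER],
        "service": service,
        "action": action,
        "state": state,
    })


def payment_service(headers, sink):
    """Hypothetical downstream service."""
    log_event(sink, "payments", headers, "charge_card", "completed")


def order_service(headers, sink):
    """Hypothetical service that calls another service."""
    log_event(sink, "orders", headers, "create_order", "started")
    payment_service(headers, sink)  # the same headers travel downstream
    log_event(sink, "orders", headers, "create_order", "completed")


def gateway(incoming_headers, sink):
    """Entry point: generate the ID once, then hand it to every service."""
    headers = dict(incoming_headers)
    headers.setdefault(CORRELATION_HEADER, uuid.uuid4().hex)
    log_event(sink, "gateway", headers, "handle_request", "started")
    order_service(headers, sink)
    log_event(sink, "gateway", headers, "handle_request", "completed")
    return headers[CORRELATION_HEADER]
```

Using `setdefault` means an ID supplied by an upstream caller is preserved, and a new one is generated only when the request arrives without one.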

Distributed Tracing Illustration
For instance, in the illustration above, a unique alphanumeric ID is assigned to the request when it enters the application. As the request context flows through multiple services that carry out its business logic, the ID is passed along with the context, which makes tracing manageable in a distributed architecture.

With distributed tracing, you can:
  1. Chronologically track the sequence of processes performed
  2. Establish interlinkages between multiple services
  3. Resolve the request data lifecycle
  4. Investigate and troubleshoot issues in decoupled services
  5. Build an audit trail of events for the request
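As a rough sketch of the first two points, the snippet below rebuilds the chronological timeline of one request from log records carrying a correlation ID and derives the hand-offs between services. The record fields and function names are hypothetical, and inferring links from consecutive log entries is a simplifying heuristic rather than how dedicated tracing systems model spans.

```python
def trace_timeline(records, correlation_id):
    """Return the records of one request in chronological order.

    Assumes each record is a dict with "timestamp" (sortable),
    "correlation_id", and "service" keys.
    """
    own = [r for r in records if r["correlation_id"] == correlation_id]
    return sorted(own, key=lambda r: r["timestamp"])


def service_links(timeline):
    """List the distinct hand-offs between consecutive services."""
    links = []
    for previous, current in zip(timeline, timeline[1:]):
        link = (previous["service"], current["service"])
        if link[0] != link[1] and link not in links:
            links.append(link)
    return links
```

The ordered timeline doubles as an audit trail of events for the request.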

4. Health Check

Consistently monitoring the distributed services for availability can reduce unanticipated application downtime and thereby improve resiliency. A service client periodically invokes the service endpoints to inspect the health and state of each service instance, reconciles the result with the expected response (a predefined static response or a dynamically evaluated one), and otherwise raises an alert so that corrective action can be taken.
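A minimal sketch of such a client is given below. It assumes each endpoint returns JSON containing a `status` field and that `"UP"` is the expected value; both are illustrative conventions, not requirements from the article. The probe is injected so that the reconciliation logic can be exercised without a network.

```python
import json
import urllib.request
from typing import Callable, Dict, List


def http_probe(url: str, timeout: float = 2.0) -> Dict:
    """Fetch a health endpoint and parse its JSON body."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def check_instances(
    endpoints: List[str],
    probe: Callable[[str], Dict] = http_probe,
    expected_status: str = "UP",  # assumed convention
) -> List[str]:
    """Compare each instance's response with the expected one.

    Returns alert messages for instances that are unreachable or
    report an unexpected status.
    """
    alerts = []
    for url in endpoints:
        try:
            body = probe(url)
        except Exception as exc:  # treat any probe failure as unavailable
            alerts.append(f"{url}: unreachable ({exc})")
            continue
        if body.get("status") != expected_status:
            alerts.append(f"{url}: unexpected status {body.get('status')!r}")
    return alerts
```

To monitor continuously, `check_instances` could be called from a scheduler or a loop with a sleep interval, with the returned alerts forwarded to whatever notification channel is in use.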
At a minimum, a health check API is a separate REST service that is implemented within a microservice component that quickly returns the operational status of the service and an indication of its ability to connect to downstream dependent services. An advanced health check API can be extended to return performance information, such as connection times. The results must be returned as an HTTP status code with JSON data.
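As an illustration only, a health check endpoint of this kind could be sketched with Python's standard `http.server` module. The `/health` path, the JSON field names, the choice of 503 for an unhealthy service, and the `checks` mapping of dependency names to callables are all assumptions.

```python
import json
from http.server import BaseHTTPRequestHandler


def health_response(checks):
    """Build (HTTP status code, JSON body) from dependency checks.

    `checks` maps a dependency name to a callable that returns True
    when the dependency is reachable (hypothetical convention).
    """
    dependencies = {}
    for name, check in checks.items():
        try:
            reachable = bool(check())
        except Exception:
            reachable = False
        dependencies[name] = "UP" if reachable else "DOWN"
    healthy = all(state == "UP" for state in dependencies.values())
    body = {"status": "UP" if healthy else "DOWN", "dependencies": dependencies}
    return (200 if healthy else 503), json.dumps(body)


class HealthHandler(BaseHTTPRequestHandler):
    """Serves GET /health; set `checks` before starting the server."""

    checks = {}  # hypothetical: dependency name -> callable

    def do_GET(self):
        if self.path != "/health":
            self.send_error(404)
            return
        status, body = health_response(self.checks)
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)
```

The handler could be served with `http.server.HTTPServer(("", 8080), HealthHandler).serve_forever()`, where the port is arbitrary. Keeping the status logic in `health_response` separates it from the HTTP plumbing, so it can be tested on its own.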
The health check APIs can be further enhanced to assess the following on the instrumented services:
  • Bugs
  • Memory Leaks
  • Thread Leaks
  • Configuration Issues
  • Deadlocks
  • Connection Pool Management
  • Process Redundancies
  • External Connection Dependencies


