What is Istio? A dummies guide to service mesh

Background

Microservices architecture is one of the most popular ways of architecting distributed systems. It breaks down business functions into individually deployable components that can be distributed. Often these microservices communicate with each other synchronously via protocols like HTTP (think REST APIs) or gRPC, or asynchronously via some kind of messaging technology like RabbitMQ or PubSub.

Challenges

While breaking the functionality into several microservices brings agility to deployments and promotes loose coupling, it also poses some challenges.

Fig 1: Microservices World

Challenge 1: Resiliency and Load Balancing

In any modern, decent-sized application, there could be tens of instances of a given microservice running. Often these instances crash owing to random hardware failures, or are overwhelmed by an unanticipated number of requests, which can lead to failures. Typically, engineers handle these situations with techniques like retry mechanisms or circuit breakers. These techniques safeguard the system against cascading failures, and they need to be implemented at the microservice level.
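To make the retry and circuit breaker ideas above a bit more concrete, here is a minimal Python sketch. It is only an illustration: the names `CircuitBreaker`, `CircuitOpenError` and `retry`, and the default thresholds and delays, are assumptions made for this example and do not come from any particular library.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker is open and calls are rejected immediately."""


class CircuitBreaker:
    """Hypothetical breaker: opens after N consecutive failures."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                # Fail fast instead of hammering a struggling service.
                raise CircuitOpenError("circuit open; failing fast")
            # Timeout elapsed: let one trial call through (half-open state).
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        # A success closes the circuit and resets the failure counter.
        self.failures = 0
        self.opened_at = None
        return result


def retry(func, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call func up to `attempts` times with exponential backoff."""
    for attempt in range(attempts):
        try:
            return func()
        except CircuitOpenError:
            raise  # retrying against an open circuit is pointless
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))
```

A caller would typically combine the two, for example `retry(lambda: breaker.call(check_balance))`, where `check_balance` stands for any remote call. Note that every microservice, in every language, would need its own version of this logic.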

Challenge 2: Encryption

When we have many microservices written in different languages, some of them may require messages to be encrypted. For example, in a FinTech organization, there could be two groups, Banking and Trading, each having many microservices of its own. When the services in Trading interact with services in Banking, they will often encrypt the data. The problem is compounded when these services are not written in the same language: the encryption/decryption logic then has to be implemented in each language at the microservice level.

Challenge 3: Tracing

In our earlier FinTech example, imagine a client placing a BUY request for shares of some company. Typically this request will originate in the Trading services, which in turn will interact with the Banking services to check the balance, allocate the appropriate amount, place a BUY request with the stock exchange and then reflect the success/fail/queued status in the Trading services. This constitutes a single transaction. For debugging purposes, we should be able to trace the request from its origin to its destination. This is called Distributed Tracing. Each of these services will also have its own logging. For a support engineer, it is highly inefficient to trace the request through each of the interacting services and scan through their logs. Distributed Tracing and Logging come under the broad topic of Observability.
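The core idea behind distributed tracing can be sketched in a few lines of Python: the first service generates a trace ID, and every downstream service forwards it and records a span under it. Everything here is hypothetical and simplified: the header name `x-trace-id`, the in-memory `SPANS` list (a stand-in for a real tracing backend) and the three service functions are assumptions for illustration only.

```python
import uuid

SPANS = []  # stand-in for a tracing backend
TRACE_HEADER = "x-trace-id"  # hypothetical header name


def record_span(service, headers):
    """Record that `service` handled the request carrying this trace ID."""
    SPANS.append({"trace_id": headers[TRACE_HEADER], "service": service})


def exchange_service(headers):
    record_span("exchange", headers)
    return "queued"


def banking_service(headers):
    record_span("banking", headers)
    # Forward the same headers so the trace ID survives the hop.
    return exchange_service(headers)


def trading_service(headers=None):
    headers = dict(headers or {})
    # The first service in the chain creates the trace ID if none exists.
    headers.setdefault(TRACE_HEADER, uuid.uuid4().hex)
    record_span("trading", headers)
    return banking_service(headers), headers[TRACE_HEADER]


def spans_for(trace_id):
    """Return the services a given request passed through, in order."""
    return [s["service"] for s in SPANS if s["trace_id"] == trace_id]
```

With this in place, `status, trace_id = trading_service()` followed by `spans_for(trace_id)` gives the full path of one BUY request without scanning each service's logs separately.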

Challenge 4: Discovery

When service A invokes an API endpoint of service B, A needs to know the network location (i.e. IP address and port number) of the endpoint. Only then can communication between the two be established. When we have many such microservices, each needs to know the location of the other microservices. This leads to a ‘mesh’ kind of communication. In a non-cloud environment the IP address would typically be static. But in a Kubernetes-based cloud environment, where IP addresses are assigned dynamically, registering and discovering services becomes a challenge. API gateways are used to solve this problem. However, it should be noted that API gateways are used for many other purposes as well.
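Registration and discovery can be pictured as a lookup table that services update as instances come and go. The sketch below is a toy, in-memory version written in Python; the `ServiceRegistry` class, its method names and the round-robin choice of instance are assumptions made for this example, not the API of any real registry or gateway.

```python
import itertools


class ServiceRegistry:
    """Hypothetical in-memory registry mapping service names to instances."""

    def __init__(self):
        self._instances = {}  # service name -> list of (host, port)
        self._cursors = {}    # service name -> round-robin iterator

    def register(self, name, host, port):
        self._instances.setdefault(name, []).append((host, port))
        self._reset(name)

    def deregister(self, name, host, port):
        self._instances[name].remove((host, port))
        self._reset(name)

    def _reset(self, name):
        # Rebuild the round-robin iterator whenever membership changes.
        self._cursors[name] = itertools.cycle(list(self._instances[name]))

    def resolve(self, name):
        """Return the next (host, port) for `name`, rotating over instances."""
        if not self._instances.get(name):
            raise LookupError(f"no instances registered for {name!r}")
        return next(self._cursors[name])
```

Service A would then call `registry.resolve("banking")` instead of hard-coding an IP address, and a newly started instance would call `register` with whatever address it was assigned.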

Service Mesh

As you may have noticed, each microservice does more than it is supposed to do. Discovery, security, tracing and so on are not the prime responsibility of the service. This needed to change: these concerns had to be abstracted out into a separate component. This led to the birth of the Service Mesh.

Fig 2: Service Mesh

As you can see from Fig 2, after separating out the unrelated things (unrelated from the point of view of the microservice), each microservice has become lightweight and deals only with its own business function. All the other functionality is extracted out to a separate layer called the Control Plane. A new proxy is injected into each microservice; its responsibility is to communicate with the other proxies and to collect metrics to pass on to the Control Plane. Together, these proxies constitute the Data Plane.
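The split described above can be mimicked with a toy Python model: each service keeps only its business logic, a proxy in front of it handles the traffic, and the proxies report metrics to a central component. This is a rough sketch under stated assumptions, not how any real mesh is implemented; the `SidecarProxy` and `ControlPlane` classes and their methods are invented for this example.

```python
class ControlPlane:
    """Hypothetical control plane that only aggregates proxy metrics."""

    def __init__(self):
        self.metrics = {}  # service name -> {"requests": int, "errors": int}

    def report(self, service, ok):
        m = self.metrics.setdefault(service, {"requests": 0, "errors": 0})
        m["requests"] += 1
        if not ok:
            m["errors"] += 1


class SidecarProxy:
    """Hypothetical proxy that fronts one service (the data plane)."""

    def __init__(self, name, handler, control_plane):
        self.name = name
        self.handler = handler  # the service's pure business logic
        self.control_plane = control_plane
        self.peers = {}  # service name -> peer proxy

    def connect(self, other):
        self.peers[other.name] = other

    def handle(self, request):
        """Inbound path: call the service and report the outcome."""
        try:
            response = self.handler(request)
        except Exception:
            self.control_plane.report(self.name, ok=False)
            raise
        self.control_plane.report(self.name, ok=True)
        return response

    def send(self, target, request):
        """Outbound path: proxies talk to proxies, never service to service."""
        return self.peers[target].handle(request)
```

The service handlers here are plain functions with no retry, tracing or discovery code in them, which is the point of the separation: those concerns would live in the proxy and control plane layers instead.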

Did we directly jump to Service Mesh?

No. Like everything in computer science, the Service Mesh went through a series of small evolutions. First, each piece of extra functionality handled by the microservices was extracted out into its own library. For example, Netflix developed the Hystrix library for circuit breaking, and Eureka was another library created for service registry and discovery. Soon it became cumbersome to manage these individual libraries. The Service Mesh then came into being, offering a neat grouping of these concerns into a Control Plane and a Data Plane.

What is Istio?

Istio is one implementation of Service Mesh. Other options are

Istio Architecture

Fig 3: Istio Architecture. Image Credit: Istio.io

Congratulations if you have made it this far. If you liked my article, consider giving it a few claps and following me on Medium or LinkedIn.

Just in case you were wondering what Istio means: it is Greek for "sail" :-)

Engineer and Water Color Artist @toashishagarwal