Communication between Microservices

Dec 16, 2018

Nowadays, it’s hard to make a living as a software engineer without having heard the term “microservice architecture”. Giants like Google, Facebook and Uber use microservices in favor of a monolithic codebase for advantages such as the freedom to use different tech stacks, independent scaling of applications, and decoupled deployments (hence less chance of a bad deployment bringing the whole system down).

Microservice communication

Breaking a large codebase down into microservices brings the challenge of communication and data security between the services. For example, the payment service needs card details from the user service in order to proceed, and then sends the payment status to the order service, and so on.

This type of communication falls into two categories:

Synchronous communication: In this kind of communication, one service can’t proceed until it gets a response (or a timeout) from another service. Application Programming Interfaces (APIs) are the main medium for this type of communication. The system supporting this kind of communication is typically stateless in nature.

Asynchronous communication: When one service doesn’t need another’s response in order to proceed, we use events as the trigger for communication. This is also called event-driven architecture. The system supporting this kind of communication maintains a message log or queue state so that processing can resume where it left off instead of starting over.

While the synchronous way is more common and has plenty of resources on the internet, this article covers the less discussed asynchronous communication between microservices.
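To make the two categories concrete, here is a minimal in-process sketch in Python. It is not a real service setup: the function names (user_service_get_card, order_service_worker, pay_and_publish) and the event fields are hypothetical, and a standard-library queue stands in for a message broker. The payment step blocks on the user service (synchronous), then publishes an event and returns without waiting for the order service (asynchronous).

```python
import queue
import threading

def user_service_get_card(user_id):
    # Hypothetical synchronous endpoint: the caller blocks until this returns.
    return {"user_id": user_id, "card_last4": "4242"}

def pay_synchronously(user_id):
    # Synchronous: payment cannot proceed without the card details.
    card = user_service_get_card(user_id)
    return f"charged card ending {card['card_last4']}"

events = queue.Queue()   # stands in for a message broker
handled = []             # what the order service has processed so far

def order_service_worker():
    # Asynchronous consumer: reacts to events whenever they arrive.
    while True:
        event = events.get()
        if event is None:    # sentinel used here to stop the worker
            break
        handled.append(("order-service", event["type"], event["order_id"]))

def pay_and_publish(user_id, order_id):
    status = pay_synchronously(user_id)
    # Publish and move on; the payment service does not wait for a reply.
    events.put({"type": "payment_succeeded", "order_id": order_id})
    return status

worker = threading.Thread(target=order_service_worker)
worker.start()
result = pay_and_publish(user_id=1, order_id=99)
events.put(None)
worker.join()
print(result)
print(handled)
```

In a real deployment the queue would live outside both processes, which is what the messaging systems discussed in this article provide.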

To ensure high scalability and throughput, these messaging systems use a queue data structure hosted in the cloud: producers put messages on the queue, and consumer processes on the same or different machines take them off and process them.

Three messaging systems are particularly popular and widely used.

1. Amazon Simple Queue Service (SQS)

SQS stands for Simple Queue Service. It provides an easy-to-use, reliable, fully managed queue at low cost, and becomes extremely powerful when combined with other AWS products like SNS, Lambda and DynamoDB. It suits low-throughput applications (~50 TPS per worker), though throughput can be increased by adding producers and consumers or by sending requests in batches.

Each consumer polls the queue and gets a batch of messages (up to 10 per request). With long polling, a request waits up to 20 seconds for messages to arrive; the default wait is 0, which returns immediately. A message being processed is called an ‘in-flight message’ and is not visible to other consumers until the consumer deletes it after processing or its visibility timeout expires, at which point it becomes visible again. A Dead Letter Queue can be configured to collect messages that repeatedly fail, so they can be processed again or inspected to find the cause of failure.
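The interplay between in-flight messages, the visibility timeout and the Dead Letter Queue can be sketched with a toy in-memory model. This is not the SQS API: ToyQueue, its methods and the max_receives parameter are made up for illustration, and time is passed in explicitly as now so the behaviour is easy to follow.

```python
import itertools

class ToyQueue:
    """Toy model of a queue with a visibility timeout and a dead-letter queue."""

    def __init__(self, visibility_timeout=30, max_receives=3):
        self.visibility_timeout = visibility_timeout
        self.max_receives = max_receives
        self.messages = {}       # id -> {"body", "visible_at", "receives"}
        self.dead_letters = []   # bodies of messages that failed too often
        self._ids = itertools.count(1)

    def send(self, body):
        msg_id = next(self._ids)
        self.messages[msg_id] = {"body": body, "visible_at": 0, "receives": 0}
        return msg_id

    def receive(self, now):
        for msg_id, msg in list(self.messages.items()):
            if msg["visible_at"] > now:
                continue         # in flight: hidden from other consumers
            if msg["receives"] >= self.max_receives:
                # Received too many times without being deleted: dead-letter it.
                self.dead_letters.append(msg["body"])
                del self.messages[msg_id]
                continue
            msg["receives"] += 1
            msg["visible_at"] = now + self.visibility_timeout
            return msg_id, msg["body"]
        return None

    def delete(self, msg_id):
        # Called by a consumer after it has processed the message successfully.
        self.messages.pop(msg_id, None)

q = ToyQueue(visibility_timeout=30, max_receives=2)
q.send("charge order 99")
first = q.receive(now=0)     # delivered; now in flight
hidden = q.receive(now=10)   # still in flight, so nothing is returned
second = q.receive(now=30)   # never deleted, so it is visible again
third = q.receive(now=60)    # receive limit reached: moved to dead letters
```

A consumer that succeeds would call q.delete(msg_id) before the timeout expires, and the message would never reappear.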

2. RabbitMQ

RabbitMQ is a solid, mature, general-purpose message broker that supports several standardized protocols such as the Advanced Message Queuing Protocol (AMQP). In contrast to SQS, RabbitMQ is a push-based system, i.e. a consumer sits and waits for messages to be delivered to it. Messages are published to an exchange, which distributes them to queues using rules called bindings. Depending on the type of exchange, a message is delivered to a single queue or to multiple queues based on the binding pattern.

Different types of RabbitMQ exchanges. Source: CloudAMQP
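How an exchange picks queues from its bindings can be sketched in a few lines of Python. This is a simplified model and not RabbitMQ client code: route and topic_matches are hypothetical helpers, the queue names are invented, and only the direct, fanout and topic exchange types are covered (headers exchanges are left out). In topic patterns, * stands for exactly one word and # for zero or more words.

```python
def topic_matches(pattern, routing_key):
    """AMQP-style topic match on dot-separated words."""
    def match(p, k):
        if not p:
            return not k
        if p[0] == "#":
            # '#' may absorb zero or more words of the routing key.
            return any(match(p[1:], k[i:]) for i in range(len(k) + 1))
        if not k:
            return False
        return (p[0] == "*" or p[0] == k[0]) and match(p[1:], k[1:])
    return match(pattern.split("."), routing_key.split("."))

def route(exchange_type, bindings, routing_key):
    """Return the queues that receive a message.

    bindings is a list of (queue_name, binding_key) pairs.
    """
    if exchange_type == "fanout":
        # Fanout ignores the routing key and copies to every bound queue.
        return [q for q, _ in bindings]
    if exchange_type == "direct":
        # Direct requires an exact match between binding key and routing key.
        return [q for q, key in bindings if key == routing_key]
    if exchange_type == "topic":
        return [q for q, key in bindings if topic_matches(key, routing_key)]
    raise ValueError(f"unsupported exchange type: {exchange_type}")

bindings = [
    ("payments", "order.paid"),
    ("audit", "order.#"),
    ("eu_orders", "order.*.eu"),
]
direct_result = route("direct", bindings, "order.paid")
fanout_result = route("fanout", bindings, "order.paid")
topic_paid = route("topic", bindings, "order.paid")
topic_eu = route("topic", bindings, "order.created.eu")
```

With these bindings, a direct exchange delivers "order.paid" to the payments queue only, a fanout exchange delivers it to all three, and a topic exchange delivers it to payments and audit.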

3. Apache Kafka

Kafka started out at LinkedIn as a way to connect different internal data streams. It was designed to be durable, fast and scalable for data integration and real-time stream processing applications. At the time of writing, it requires ZooKeeper to coordinate the cluster nodes and keep track of cluster metadata such as topics (older versions also stored consumer offsets there). It can handle 100,000 messages/sec on a single machine. Recent versions of Kafka come close to “exactly-once delivery” (the single biggest challenge in distributed messaging systems) using idempotent producers and transactions.
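The idea behind an idempotent producer can be shown with a toy sketch: the producer tags each message with its own ID and a sequence number, and the partition drops anything it has already appended, so a retry after a lost acknowledgement does not create a duplicate. The class below is an illustration only. ToyPartition and its fields are hypothetical, and real Kafka additionally handles out-of-order sequences, producer epochs and transactions.

```python
class ToyPartition:
    """Toy partition that de-duplicates retried writes by sequence number."""

    def __init__(self):
        self.log = []
        self.last_seq = {}   # producer_id -> highest sequence appended so far

    def append(self, producer_id, seq, value):
        if seq <= self.last_seq.get(producer_id, -1):
            return False     # duplicate caused by a retry: drop it
        self.log.append(value)
        self.last_seq[producer_id] = seq
        return True

partition = ToyPartition()
first_write = partition.append("producer-1", 0, "payment_succeeded")
# The acknowledgement is lost, so the producer sends the same message again.
retry_write = partition.append("producer-1", 0, "payment_succeeded")
next_write = partition.append("producer-1", 1, "order_shipped")
```

After the retry, the log still holds each message exactly once.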

When to use what?

There are a few major factors to consider in order to decide which of these systems to use.

Data persistence: Both SQS and RabbitMQ remove a message once a consumer has successfully processed it, so that it is not read again. Kafka persists messages in a log and only advances the offset for each consumer group. This becomes very important when different actions need to be performed on the same message.
Throughput: Kafka is well suited to high-volume systems like analytics and IoT, whereas low-cost SQS and RabbitMQ can be easily set up for a system handling around 20k messages/sec. A LinkedIn benchmark reported 2 million writes per second on just three machines using Kafka.
Fault tolerance and availability: SQS is fully managed, so you cannot configure or control queue replication yourself; if the service is down, there is little you can do. Communication between brokers in a RabbitMQ cluster has limitations. Kafka supports leader-follower replication at the partition level and hence provides high availability.
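The persistence difference is easiest to see in code. Below is a toy sketch of a Kafka-style log, assuming a single partition and offsets that advance automatically on every poll; ToyLog and the group names are hypothetical. Records are never removed by reading, and each consumer group keeps its own offset, so two groups can act on the same messages independently.

```python
class ToyLog:
    """Toy append-only log with one read offset per consumer group."""

    def __init__(self):
        self.records = []
        self.offsets = {}    # group name -> next offset to read

    def append(self, value):
        self.records.append(value)

    def poll(self, group, max_records=10):
        start = self.offsets.get(group, 0)
        batch = self.records[start:start + max_records]
        # Reading only advances this group's offset; nothing is deleted.
        self.offsets[group] = start + len(batch)
        return batch

log = ToyLog()
for event in ["order_created", "payment_succeeded", "order_shipped"]:
    log.append(event)

billing_batch = log.poll("billing")       # billing sees all three events
analytics_batch = log.poll("analytics")   # analytics sees the same three
billing_again = log.poll("billing")       # nothing new for billing
```

With a delete-on-consume queue, the second group would find nothing to read unless the message had been copied to a separate queue for it.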

Along with the above factors, cost and team size also play an important role in deciding which one to use (considering that a Kafka cluster has to be managed manually).

Hopefully you now have a clearer idea of which messaging system works well for your scale and team. If you liked this post, don’t forget to clap. If I’ve missed anything, please leave your feedback in the comment section below.