Kafka vs. RabbitMQ Queuing Systems
Queuing systems are asynchronous systems built to transfer data from a producer to a consumer (commonly known as the publish-subscribe, or PUB-SUB, model). This architecture is known as Message Oriented Middleware (MOM). It relies on the “queue” data structure so that data can be stored and processed later. Queuing systems play a part in building a “Service Oriented Architecture” and, if implemented correctly, can improve the user experience of a web page by reducing load times.
Queuing systems came into use for the following reasons:
- They are asynchronous in nature. Data is queued so that the consumer can consume it later.
- They decouple the producer from the consumer. The producer’s implementation does not depend on the consumer’s, and the two can run independently.
- They are resilient: if a producer or consumer fails, the whole system need not be stopped.
- They provide buffering, which helps to regulate and optimize the rate at which data travels between applications.
- They are scalable: more than one application can submit jobs to the queue.
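The decoupling and buffering described above can be illustrated with a minimal in-process sketch. It uses only Python's standard `queue` and `threading` modules as a stand-in for a real broker. The names `producer`, `consumer`, and `run_demo` are hypothetical and chosen for illustration. The producer finishes completely before the consumer even starts, and the queue holds the jobs in between.

```python
import queue
import threading
import time


def producer(q, n):
    """Submit n jobs to the queue, then a sentinel marking the end."""
    for i in range(n):
        q.put(f"job-{i}")
    q.put(None)  # sentinel: tells the consumer there is nothing more


def consumer(q, results):
    """Drain the queue at the consumer's own (slower) pace."""
    while True:
        item = q.get()
        if item is None:
            break
        time.sleep(0.01)  # simulate slow processing
        results.append(item)


def run_demo(n=5):
    q = queue.Queue()  # the buffer between the two sides
    results = []
    p = threading.Thread(target=producer, args=(q, n))
    c = threading.Thread(target=consumer, args=(q, results))
    # The producer runs to completion before the consumer starts:
    # neither side needs the other to be running at the same time.
    p.start()
    p.join()
    c.start()
    c.join()
    return results


if __name__ == "__main__":
    print(run_demo())
```

A real queuing system adds persistence, networking, and delivery guarantees on top of this idea, but the producer and consumer remain independent in the same way.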
Two widely used queuing systems are Apache Kafka and RabbitMQ. The two are compared below.
| S.No. | Apache Kafka | RabbitMQ |
|---|---|---|
| 1. | Kafka is essentially producer-centric: a client-server system built to ingest high-volume data streams at the publisher end and deliver the data to a group of subscribers (whether online or offline) that can be loaded together to consume it. | RabbitMQ is a server-only system (a message broker) built on the AMQP protocol. It is used for guaranteed delivery of messages between publisher and subscriber. |
| 2. | Kafka is designed to handle both slow batch consumers and online consumers. It can handle on the order of 100k+ events per second. It acts as a “shock absorber” for the enormous volume of events exchanged between publishers and subscribers. | If the volume of data is large and the consumers are too slow, RabbitMQ can fail. It can handle roughly 20k+ events per second. Handling of slow batch consumers was introduced in RabbitMQ 2.0. |
| 3. | Kafka persists all messages by writing them to its on-disk log as they arrive, so it behaves like a database storage system. Terabytes of messages can therefore be retained without a significant impact on performance. | RabbitMQ is designed to guarantee delivery of messages to consumers, or to buffer them for later use. |
| 4. | It is open source under the Apache License 2.0 and is written in Scala and Java (it runs on the JVM). | It is also open source, under the Mozilla Public License, and is written in Erlang. |
| 5. | Kafka does not have per-message acknowledgements. Instead, each consumer is tracked by how far it has consumed so far (its offset). In older Kafka versions the brokers and consumers used ZooKeeper to reliably maintain this delivery state across a cluster; newer versions store consumer offsets in an internal Kafka topic. | RabbitMQ does have message acknowledgements. The RabbitMQ broker uses Erlang’s Mnesia database to maintain delivery state across the cluster. |
| 6. | Kafka has a very simple routing approach: messages are routed by topic only. | RabbitMQ has rich routing capabilities because it uses AMQP 0.9.1’s exchange, binding and queue model. |
| 7. | Messages are distributed across the partitions of a topic, each of which acts as a “log stream”. Consumers read forward through this stream. | Messages are published through exchanges to queues. Online consumers receive the messages via the message broker. Delivery is attempted again in case of failure, so guaranteed delivery is provided for queues that have a consumer count of 1. |
| 8. | Kafka provides message ordering within a partition. If strict ordering of messages is required at the consumer end, the partitions and consumers have to be arranged carefully. | RabbitMQ does not guarantee ordered delivery at the consumer end in general because, under the AMQP 0.9.1 model, in-order delivery requires “one producer channel, one exchange, one queue and one consumer channel”. |
| 9. | A drawback is that each partition can have only a single consumer within a consumer group, so a slow message can block all the messages behind it in that partition. Different strategies have to be applied to work around this. One strategy is to create another consumer group and synchronize it with the existing consumer; the synchronization is needed to avoid processing messages twice. | If a slow message is being consumed, more consumers can be connected to the queue to take in the queued messages that are behind the slow message. |
| 10. | Kafka is mainly designed for “fast” and reliable consumers. | RabbitMQ is mainly designed for “slow” and unreliable consumers. |
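The difference between offset tracking (rows 5, 7 and 8) and acknowledgement-based delivery (rows 5 and 7) can be sketched with a toy in-memory model. This is not the API of either system: `PartitionedLog` and `AckQueue` are hypothetical classes written for illustration, and the byte-sum partitioner is a simple stand-in for whatever hash a real client uses.

```python
from collections import deque


class PartitionedLog:
    """Toy Kafka-style topic: append-only partitions read by offset."""

    def __init__(self, num_partitions):
        self.partitions = [[] for _ in range(num_partitions)]

    def publish(self, key, value):
        # Assumed partitioner: the same key always maps to the same
        # partition, so messages for one key keep their order.
        p = sum(key.encode()) % len(self.partitions)
        self.partitions[p].append(value)
        return p

    def read(self, partition, offset):
        # Reading removes nothing; the consumer owns its offset and can
        # re-read from an earlier position at any time.
        return self.partitions[partition][offset:]


class AckQueue:
    """Toy RabbitMQ-style queue: a message is gone only once acked."""

    def __init__(self):
        self.ready = deque()
        self.unacked = {}
        self.next_tag = 0

    def publish(self, value):
        self.ready.append(value)

    def get(self):
        # Hand out the next message and remember it until it is acked.
        value = self.ready.popleft()
        self.next_tag += 1
        self.unacked[self.next_tag] = value
        return self.next_tag, value

    def ack(self, tag):
        # Consumer confirmed processing: forget the message.
        del self.unacked[tag]

    def nack(self, tag):
        # Consumer failed: put the message back for redelivery.
        self.ready.appendleft(self.unacked.pop(tag))
```

In the log model the broker keeps every message and the consumer decides where to read from, which is why replay is cheap. In the queue model the broker tracks each outstanding message and redelivers it on failure, which is what makes per-message delivery guarantees possible.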