Bizety: Research & Consulting

Kafka vs RabbitMQ

There are many messaging products on the market, but two of the most popular open source messaging technologies available today are RabbitMQ and Apache Kafka. Which one is right for you? Each has its own story, design framework, set of features, use cases in which it is particularly efficient, integration potential, and developer experience. We will focus on the features in this post, and touch on some of the other areas along the way.

At its simplest, Kafka is a message bus optimized for high-ingress data streams and replay, while RabbitMQ is a mature, general-purpose message broker that supports several standardized protocols, including AMQP.
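To make the "replay" distinction concrete, here is a minimal in-memory sketch. It is not how either product is implemented; `ReplayableLog` and `WorkQueue` are hypothetical names invented for illustration, and the queue model is simplified to "a consumed message is gone".

```python
from collections import deque


class ReplayableLog:
    """Toy append-only log: records stay put, readers choose an offset."""

    def __init__(self):
        self._records = []

    def append(self, record):
        self._records.append(record)
        return len(self._records) - 1  # offset of the record just written

    def read_from(self, offset):
        # Reading never removes anything, so any offset can be read again.
        return self._records[offset:]


class WorkQueue:
    """Toy broker queue: a message is removed once it has been consumed."""

    def __init__(self):
        self._messages = deque()

    def publish(self, message):
        self._messages.append(message)

    def consume(self):
        return self._messages.popleft() if self._messages else None


log = ReplayableLog()
work_queue = WorkQueue()
for event in ["signup", "login", "purchase"]:
    log.append(event)
    work_queue.publish(event)

first_pass = log.read_from(0)
replayed = log.read_from(0)            # the same records again: replay
drained = [work_queue.consume() for _ in range(3)]
after_drain = work_queue.consume()     # None: nothing is left to re-read
```

In the log model the reader decides where to start, which is what makes reprocessing old data cheap. In the queue model the broker tracks what has been delivered, which keeps consumers simple.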

A streaming platform has four main capabilities:

In the early days of data processing, the main way to process and output data was batch-oriented data infrastructure. Today, however, in a world in which real-time analytics are necessary to keep up with network demands and functionality, stream processing has become essential.

The Publish/Subscribe pattern is a distributed interaction paradigm well suited to the deployment of scalable and loosely coupled systems.
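As a rough illustration of that loose coupling, the sketch below shows a tiny in-process broker. The `Broker` class and the topic names are assumptions made up for this example and do not correspond to the Kafka or RabbitMQ client APIs.

```python
from collections import defaultdict


class Broker:
    """Toy pub/sub broker: publishers and subscribers only share a topic name."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        # The publisher does not know who, if anyone, is listening.
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(message)
        return len(callbacks)  # number of subscribers that received the message


billing_inbox, audit_inbox = [], []
broker = Broker()
broker.subscribe("orders", billing_inbox.append)
broker.subscribe("orders", audit_inbox.append)

delivered = broker.publish("orders", {"id": 1})
undelivered = broker.publish("refunds", {"id": 2})  # no subscribers yet
```

Adding a third subscriber to "orders" would not require any change to the publishing code, which is the property that makes the pattern scale organizationally as well as technically.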

Main Features of Pub/Sub Systems

Founding Stories

Apache Kafka began life at LinkedIn as a way of making data ingestion into Hadoop from Apache Flume more straightforward. Ingesting and exporting data across several sources and destinations with tools like Flume meant writing a separate data pipeline for every source and destination pairing. Kafka allowed LinkedIn to standardize its data pipelines so that data was taken out of each system just once and put into each system just once, ultimately simplifying pipelines and operations. Kafka is now well integrated into the overall ecosystem of Apache Software Foundation projects. It is particularly tightly integrated with Apache ZooKeeper, which provides the backbone for Kafka’s distributed partitions and offers various clustering benefits for Kafka users.

RabbitMQ, meanwhile, was initially developed as a traditional message broker in order to implement a range of messaging protocols. At first, it was designed to implement AMQP, an open wire protocol for messaging that has powerful routing features. AMQP enabled cross-language flexibility for open source message brokers, thereby enabling the growth of non-Java applications that needed distributed messaging. RabbitMQ was one of the first open source message brokers to combine a strong feature set with robust documentation, developer tools and client libraries. As a result, with over 35,000 production deployments, RabbitMQ is the most widely deployed open source message broker on the market.

Use Cases

Kafka is used for two main classes of application:

  1. Building streaming data pipelines that operate in real time in order to reliably get data from one system or application to another;
  2. Building streaming applications that operate in real time in order to transform or react to the data streams.

Kafka describes itself as a distributed streaming platform; however, it is known mainly for being a durable storage repository with robust Hadoop/Spark support. Popular use cases include:

The best types of messaging scenarios in Kafka are:

RabbitMQ, meanwhile, tends to be used in situations in which web servers need to respond quickly rather than perform resource-intensive procedures while the user waits for the result. RabbitMQ is also frequently used to distribute a message to several recipients for consumption or to balance load between workers under high load (20,000+ messages per second). It also offers numerous features that extend beyond throughput, such as reliable delivery, routing, federation, security, management tools, HA, and others.
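The "respond quickly, do the heavy work later" scenario can be sketched with nothing but the Python standard library. This is a stand-in for a broker, not RabbitMQ client code: `handle_request`, `worker`, and the thumbnail job are hypothetical, and a `queue.Queue` plays the role of the broker queue.

```python
import queue
import threading

tasks = queue.Queue()          # stand-in for a broker queue
results = []
results_lock = threading.Lock()


def handle_request(image_id):
    """Hypothetical web handler: enqueue the slow job and answer right away."""
    tasks.put(image_id)
    return {"status": "accepted", "image_id": image_id}


def worker(name):
    """Competing consumer: each task is taken by exactly one worker."""
    while True:
        image_id = tasks.get()
        if image_id is None:   # sentinel value: shut this worker down
            tasks.task_done()
            return
        with results_lock:
            results.append((name, f"thumbnail-{image_id}"))
        tasks.task_done()


workers = [threading.Thread(target=worker, args=(f"w{i}",)) for i in range(3)]
for thread in workers:
    thread.start()

# The handler returns immediately; the work happens in the background.
responses = [handle_request(i) for i in range(6)]

for _ in workers:
    tasks.put(None)            # one sentinel per worker
tasks.join()
for thread in workers:
    thread.join()

processed = sorted(item for _, item in results)
```

Which worker handles which task is not deterministic here, and that is the point: the producer does not care, and adding workers spreads the load without touching the handler.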

Scenarios that work best in RabbitMQ include:

RabbitMQ shines when integrating with existing IT infrastructure.

Design

Apache Kafka

RabbitMQ

Languages and Libraries

Apache Kafka ships only a Java client; however, the catalog of community open source clients and ecosystem projects is growing. In addition, there is an adapter SDK, which lets you build your own system integration. Most of the configuration is performed programmatically or via properties files. Client libraries are available for many languages, including Ruby, Python, Node.js and Java.

RabbitMQ, by contrast, officially supports a range of languages and frameworks, including Java, Spring, .NET, PHP, Python, Ruby, JavaScript, Go, Elixir, Objective-C and Swift. It also supports various other clients and devtools via community plugins. Its client libraries are mature and well documented, and include Ruby, Python, Node.js, Clojure, Go, Java and C.

Since these two tools are so popular, most other software providers offer integrations so that RabbitMQ and Kafka work well with, or run on, their technology.

Security and Operations

The Kafka 0.9 release added TLS and JAAS role-based access control, in addition to Kerberos/PLAIN/SCRAM authentication, with a CLI to manage security policy. This was an improvement on previous versions, in which access could only be locked down at the network level, which made sharing and multi-tenancy tricky.

Kafka’s management CLI is made up of shell scripts, property files, and specifically formatted JSON files. Kafka Brokers, Producers, and Consumers emit metrics through Yammer/JMX, but they do not retain any history, so a third-party monitoring system is needed. Operations teams can use these tools to manage partitions and topics, check the consumer offset position, and make use of the HA and FT capabilities that Apache ZooKeeper offers for Kafka.
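Checking the consumer offset position usually comes down to one subtraction per partition: how far the consumer's committed offset trails the end of the log. The sketch below shows that arithmetic only. The function name and the offset numbers are made up for illustration, and treating a partition with no commit as offset 0 is a simplifying assumption.

```python
def consumer_lag(log_end_offsets, committed_offsets):
    """Per-partition lag: records written but not yet committed as consumed.

    Assumption: a partition with no committed offset is counted from 0.
    """
    return {
        partition: end - committed_offsets.get(partition, 0)
        for partition, end in log_end_offsets.items()
    }


# Hypothetical snapshot for one topic with three partitions.
log_end_offsets = {0: 1500, 1: 1480, 2: 1510}
committed_offsets = {0: 1500, 1: 1200}   # nothing committed for partition 2

lag = consumer_lag(log_end_offsets, committed_offsets)
total_lag = sum(lag.values())
```

Because the brokers keep no metric history, a monitoring system would typically sample values like these on a schedule and alert when the lag keeps growing.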

Security and operations are strengths of RabbitMQ. The RabbitMQ management plugin offers an HTTP API and a browser-based UI for monitoring and management, in addition to CLI tools for operators. External tools such as CollectD, Datadog, or New Relic are necessary for longer-term storage of monitoring data. RabbitMQ also offers APIs and tools for monitoring, auditing and other types of troubleshooting. In addition to offering TLS support, RabbitMQ ships with RBAC backed by a built-in data store, LDAP or external HTTPS-based providers, and it supports authentication using X.509 certificates as an alternative to username/password pairs. Additional authentication methods can be developed as plugins.

Additional Features and Information

Conclusion

Apache Kafka scales up to 100,000 msg/sec on a single server, so it easily outperforms RabbitMQ as well as the other message brokers in terms of raw throughput. That is often a key driver for people choosing to work with Kafka. However, while Kafka is well optimized to work with “fast” consumers, its partition-centric design makes it a little less successful at working with “slow” consumers. Its performance also depends in part on the developer writing the consumer code, who carries a significant amount of responsibility.
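A small sketch may help show why a partition-centric design is sensitive to slow consumers. It assumes records are routed to a partition by hashing their key and that each partition is read by one consumer in a group. The CRC32 hash is a stand-in chosen for this example rather than Kafka's own partitioner, and both helper functions are hypothetical.

```python
import zlib


def partition_for(key, num_partitions):
    """Route a record key to a partition (CRC32 is a stand-in hash)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def assign_partitions(num_partitions, consumers):
    """Spread partitions over consumers round-robin, one owner per partition."""
    assignment = {consumer: [] for consumer in consumers}
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment


assignment = assign_partitions(6, ["c1", "c2", "c3"])

# Every record with the same key lands on the same partition, so it is always
# read by the same consumer and queues up behind that consumer's backlog.
first = partition_for("customer-42", 6)
second = partition_for("customer-42", 6)
```

If `c2` is slow in this model, everything on its partitions waits, while the other consumers cannot simply pick up the slack without a reassignment. A broker that hands each message to whichever worker is free does not have that constraint.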

RabbitMQ supports a wide range of development platforms, is easy to use, and has all the benefits of a mature history behind it. It handles around 20,000 messages/second on a single server, and it also scales well as more servers are added. If overall throughput is sufficient for requirements, this message broker works adequately for “fast” consumers. “Slower” consumers, however, are the ones that really reap the benefits of RabbitMQ.
