Breaking Down the ELK Stack

The ELK Stack is a system that takes data from any source on your network, analyzes it and visualizes it for your convenience, all in real time. Recently, it’s been gaining traction as a new leader in the open source market for log analytics and visualization. Over the years, the ELK Stack has grown in usage: monthly downloads have exceeded 500,000, with companies like Google, Netflix and LinkedIn relying on it for their analytics.

The stack itself consists of three parts: Elasticsearch, Logstash, and Kibana. Together these pieces form a full system for analytics, which we’ll break down below.

Elasticsearch

Elasticsearch is the search and storage engine of the stack: it indexes information into a searchable data store, which is where the stack’s performance benefits and flexible mappings come from. Some features include:

  • Interactive Search Analytics: search your data with real-time insights and advanced analytics to help optimize your network performance
  • Scalability: as the name suggests, Elasticsearch is highly elastic, with a massively distributed architecture that lets you start small and scale horizontally by adding more nodes as your network expands
  • Reliability: Elasticsearch clusters detect failed nodes and reorganize and redistribute data automatically so your data stays accessible and secure. Any changes are recorded in transaction logs on multiple nodes to minimize data loss
  • Multitenancy: a cluster can contain more than one index, and indices can be grouped together or searched individually. Index aliases allow administrators to present filtered views of an index that can be updated transparently to your application
  • Extensive Search Parameters: Elasticsearch offers full-text search with a query API that supports multilingual search, geolocation, contextual suggestions, autocomplete and result snippets
  • Automation: simply index a JSON document and Elasticsearch will automatically detect its structure and make it searchable, with customization options available
  • RESTful API: everything is driven through a simple RESTful API using JSON over HTTP
  • Free: released under the Apache 2 license and built on Apache Lucene, it is all open source, allowing you to download, configure and modify it based on your needs
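To make the "JSON over HTTP" and "index a JSON document" points above more concrete, here is a minimal Python sketch that builds (but does not send) one indexing request and one full-text search request. The node address, the index name `logs-demo`, the sample event and the helper names are all assumptions made up for illustration, and the `/<index>/_doc/<id>` path reflects recent Elasticsearch releases; older versions lay out the URL differently, so check the documentation for the version you run.

```python
import json
from urllib.request import Request

# Assumed address of a local Elasticsearch node; adjust for your cluster.
BASE_URL = "http://localhost:9200"


def build_index_request(index, doc_id, document):
    """Build (but do not send) a request that indexes one JSON document."""
    # Assumption: the /<index>/_doc/<id> path used by recent releases.
    url = f"{BASE_URL}/{index}/_doc/{doc_id}"
    body = json.dumps(document).encode("utf-8")
    return Request(url, data=body, method="PUT",
                   headers={"Content-Type": "application/json"})


def build_search_request(index, field, text):
    """Build (but do not send) a full-text `match` query against one index."""
    query = {"query": {"match": {field: text}}}
    body = json.dumps(query).encode("utf-8")
    return Request(f"{BASE_URL}/{index}/_search", data=body, method="POST",
                   headers={"Content-Type": "application/json"})


if __name__ == "__main__":
    # Hypothetical log event, invented for this example.
    event = {"host": "web-01", "level": "error", "message": "upstream timeout"}
    for req in (build_index_request("logs-demo", "1", event),
                build_search_request("logs-demo", "message", "timeout")):
        print(req.get_method(), req.full_url, req.data.decode("utf-8"))
    # To actually send a request, pass it to urllib.request.urlopen(req).
```

Building the requests separately from sending them keeps the sketch runnable without a cluster; against a live node, each request could be passed to `urllib.request.urlopen`.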

Logstash

Logstash is an open source data collection tool that gathers data from multiple sources and ships log data to Elasticsearch. Features include:

  • Logs and Metrics: handles various types of logging data such as syslog, Windows event logs, networking and firewall logs, Apache logs and log4j for Java
  • Metrics collected from Ganglia, collectd, NetFlow, JMX and many other infrastructure and application platforms over TCP and UDP
  • Builds events from HTTP requests and can poll HTTP endpoints on demand
  • Unifies all your data streams
  • Handles logging data from almost any source, including IoT use cases that collect data from connected sensors on any device, from cars to homes to phones
  • Many tuning options to customize your Logstash pipeline, including pattern matching, geo mapping and dynamic lookups; it can also derive geo coordinates from IP addresses
  • Integration with Grok, whose pattern-based filters help structure your data
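The pattern matching mentioned in the list above is easier to picture with an example. The Python sketch below is not Logstash or Grok itself; it uses a hand-written regular expression as a stand-in for a Grok pattern, turning one Apache-style access log line into a structured event. The field names, the `parse_line` helper and the sample line are illustrative assumptions.

```python
import re

# A hand-written regex standing in for a Grok pattern (illustrative only).
# It targets the Apache common log format; field names are assumptions.
APACHE_COMMON = re.compile(
    r'(?P<clientip>\S+) (?P<ident>\S+) (?P<auth>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<verb>\S+) (?P<request>\S+) HTTP/(?P<httpversion>[\d.]+)" '
    r'(?P<response>\d{3}) (?P<bytes>\d+|-)'
)


def parse_line(line):
    """Turn one raw log line into a dict of named fields, or None if it does not match."""
    match = APACHE_COMMON.match(line)
    if match is None:
        return None
    event = match.groupdict()
    # Convert numeric fields so they can be aggregated later.
    event["response"] = int(event["response"])
    event["bytes"] = 0 if event["bytes"] == "-" else int(event["bytes"])
    return event


if __name__ == "__main__":
    sample = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
              '"GET /apache_pb.gif HTTP/1.0" 200 2326')
    print(parse_line(sample))
```

Once a line has been split into named fields like this, it can be shipped as a JSON document and searched or aggregated by field rather than as raw text.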

Kibana

Kibana takes the information from the data store and presents it in graphical format for log analysis. This includes:

  • Integration with Elasticsearch and Logstash: helps to visualize both structured and unstructured data that is indexed into Elasticsearch
  • Easily create charts, plots, histograms, maps, etc.
  • Visualize all of the analytics provided by Elasticsearch to find practical uses for your data and make it digestible for your team
  • User friendly: easy to share, and easy to set up on its own web server
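Kibana itself is driven through its web interface rather than code, but the idea behind a histogram of log events can be sketched in a few lines of Python. The example below is only an illustration of time bucketing, not Kibana's or Elasticsearch's implementation: it groups timestamps into one-minute buckets and renders them as a text bar chart. The timestamps and the function names are made up for the example.

```python
from collections import Counter
from datetime import datetime


def events_per_minute(timestamps):
    """Count events in one-minute buckets, keyed by the start of each minute."""
    buckets = Counter()
    for raw in timestamps:
        moment = datetime.fromisoformat(raw)
        buckets[moment.replace(second=0, microsecond=0)] += 1
    # Sort by time so the chart reads left to right (here, top to bottom).
    return dict(sorted(buckets.items()))


def render(buckets):
    """Render each bucket as a line of text: time, a bar, and the count."""
    return [f"{minute:%H:%M} {'#' * count} ({count})"
            for minute, count in buckets.items()]


if __name__ == "__main__":
    # Hypothetical event timestamps, invented for this example.
    sample = ["2016-03-01T10:00:05", "2016-03-01T10:00:40", "2016-03-01T10:01:10"]
    for line in render(events_per_minute(sample)):
        print(line)
```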

How It Compares

Since its inception in 2010, the ELK Stack has been disrupting the log analytics industry. The leading tool beforehand was Splunk, which was founded in 2003 and quickly grew into a global product, but with the rise of the ELK Stack, Splunk has finally found some competition. The biggest advantage Splunk had over open source logging tools was organization and reliability. Open source projects often fail because they cannot build the same kind of enterprise backing as privately funded companies. But the ELK Stack seems to have finally broken that mold.

And while the ELK Stack may be gaining a lot more users, it’s not the only open source logging tool that is making waves. Recently, Fastly published a post about how they are using Graylog, an open source competitor that has some similarities to ELK.

Their reasoning consisted of:
  • Ability to restrict access to sensitive logs
  • Ownership of their security to reduce attacks
  • Benefits of Elasticsearch technology, which makes indexing and searching straightforward
  • A web interface that competes with Kibana and offers similar features (Graylog relies on MongoDB to store its configuration data)
  • Streaming capabilities that allow multiple users to view subsections of logs at the same time

To better meet their needs, Fastly decided to develop their own homegrown logging system, which still uses some components of the ELK Stack and has a lot of similarities to it. One advantage of all these advancements in open source logging tools is that you can take whatever pieces work for you and modify them along the way.

That’s exactly what CloudFlare did once they outgrew the capabilities of PostgreSQL. Due to their rapid growth, CloudFlare needed a logging system that could scale out both performance and data storage while staying highly available. With all their experience running PostgreSQL, they wanted to keep using something compatible with it, and the best way for them to do that was to use CitusDB. So whether the ELK Stack seems best for your logging needs, or just pieces of it will work, there are plenty of options and modifications in the open source market that can help optimize your system analytics.

Copyright secured by Digiprove © 2016