Fastly Partners with Azure For Big Data

Fastly just announced its partnership with Microsoft Azure. It involves Fastly’s integration with a range of Azure services, such as Azure Blob Storage, Azure Event Grid, and Azure Data Explorer (ADX, formerly known as project “Kusto”), a big data cloud-based analytics platform. The goal, according to Fastly, is “to create a powerful, near real-time data analytics solution”.

ADX calls itself “a lightning-fast indexing and querying service to help you build near real-time and complex analytics solutions”. By running ad-hoc queries on large volumes of data, ADX aims to let developers rapidly identify behavioral patterns, trends, or anomalies across structured, semi-structured and unstructured data.

Use Case: Taboola

The Fastly/ADX partnership is already in use by global advertising business Taboola. Taboola delivers personalized recommendations to over one billion users monthly, rendering 3 billion web pages on a daily basis to leading publishers, marketers and agencies.

Fastly is now Taboola’s CDN provider, and Taboola deploys Fastly’s edge cloud platform to deliver and optimize online experiences. Taboola uses the platform to stream real-time logs straight into Azure Blob Storage (massively scalable object storage for unstructured data) and configures its ADX service to ingest the log data automatically. The company can then explore the data to identify valuable insights and act on them to improve customer experience, enhance products and maintain a competitive advantage.
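
As a rough illustration of the kind of exploration this pipeline enables, the sketch below groups a handful of log records by point of presence and computes simple statistics. It is plain Python rather than ADX's own query language, and the record format and field names ("pop", "status", "latency_ms") are assumptions made for the example, not Fastly's or Taboola's actual log schema.

```python
import json
from collections import defaultdict

# Hypothetical newline-delimited JSON log records. The field names
# ("pop", "status", "latency_ms") are assumptions for illustration only.
SAMPLE_LOGS = """\
{"pop": "AMS", "status": 200, "latency_ms": 12}
{"pop": "AMS", "status": 200, "latency_ms": 15}
{"pop": "AMS", "status": 503, "latency_ms": 250}
{"pop": "JFK", "status": 200, "latency_ms": 9}
{"pop": "JFK", "status": 200, "latency_ms": 11}
"""


def summarize(log_text):
    """Group records by point of presence and compute simple stats."""
    groups = defaultdict(list)
    for line in log_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        groups[record["pop"]].append(record)

    summary = {}
    for pop, records in groups.items():
        # Treat any 5xx status as a server-side error.
        errors = sum(1 for r in records if r["status"] >= 500)
        latencies = [r["latency_ms"] for r in records]
        summary[pop] = {
            "requests": len(records),
            "error_rate": errors / len(records),
            "max_latency_ms": max(latencies),
        }
    return summary


if __name__ == "__main__":
    for pop, stats in sorted(summarize(SAMPLE_LOGS).items()):
        print(pop, stats)
```

A managed service would run this sort of grouping over billions of records; the sketch only shows the shape of the question being asked of the logs.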

Taboola previously employed a home-grown solution, which involved stitching together technologies from several different vendors to store, manage and query its data. The company was unhappy with both its complexity and its pricing. Taboola needed a more reliable way to rapidly, simply and cost-effectively analyze over 22 billion records of edge delivery logs on a daily basis to provide reliable, top performing content recommendations.

The challenges were overwhelming for the Taboola team. Ariel Pisetzky, VP of IT at Taboola, said he and his teams are frequently “overwhelmed with logs, pinned down by the rate and volume of them”. Their job is to constantly consult the logs, manage them, work out how and where to store them, and process them in as insightful a way as possible.

According to Taboola, the Fastly/ADX solution outperforms its former one with both “a faster update time” and “an intuitive interactive interface”. Pisetzky said, “The solution was so simple that we were up and running in a week, ingesting and analyzing 17 TB of data per day”. He also said the company was now able to run queries “constantly against the system”, which was not something it could do previously. The Taboola R&D team is also finding ways to use the instant insights from the Fastly/ADX solution to improve its product, both by generating new ideas and by designing improved algorithms to better analyze performance, identify trends and anomalies, and troubleshoot problems in near real-time.
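
To put the quoted volumes in perspective, the short calculation below converts them to per-second rates. It assumes decimal terabytes (10^12 bytes) and treats "over 22 billion records" as exactly 22 billion, so the results are rough lower-bound estimates; dividing one figure by the other to get an average record size is also an assumption, since the two numbers come from separate statements.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Figures quoted in the article; exact values are assumptions
# ("over 22 billion" is treated as 22 billion, TB as 10**12 bytes).
RECORDS_PER_DAY = 22_000_000_000
BYTES_PER_DAY = 17 * 10**12


def per_second(daily_total):
    """Convert a daily total into an average per-second rate."""
    return daily_total / SECONDS_PER_DAY


if __name__ == "__main__":
    print(f"records/s: {per_second(RECORDS_PER_DAY):,.0f}")
    print(f"MB/s:      {per_second(BYTES_PER_DAY) / 1e6:,.1f}")
    print(f"bytes/record (average): {BYTES_PER_DAY / RECORDS_PER_DAY:,.0f}")
```

Under these assumptions the averages come to roughly 255,000 records and just under 200 MB every second, sustained around the clock.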

Other customers already using ADX include Siemens Healthineers, Ecolab, SnelStart and DocuSign.

What does ADX Do?

Azure Data Explorer (ADX) describes itself as a “fast and highly scalable data exploration service”. The service was built in response to the worldwide big data problem. Some 2.5 billion gigabytes of data are captured on a daily basis, and as more web-connected devices are rolled out (from smartphones to home security cameras), this number rises daily. ADX positions itself as a “big data interactive analytics service” that helps businesses maintain a competitive advantage by giving them a powerful tool to organize, analyze and act on insights gleaned from big data.

ADX offers assistance in four key areas:

  • Productivity – fast indexing and querying on large, diverse and fast data sets can lead to near real-time insights
  • Scalability – scale quickly and massively via the Azure global network
  • Simplicity – simplify data exploration with one solution that combines indexing, column store and time series capabilities
  • Flexibility – you can use ADX to build your own analytics solution

The service is still in public preview.

Related Azure Offerings

  • Azure Event Hubs – a streaming data platform that offers a managed, real-time data ingestion service. Event Hubs enables the streaming of millions of events per second from any desired source to construct dynamic data pipelines and respond immediately to unfolding business needs
  • Azure HDInsight Kafka Clusters – Kafka is similar to Event Hubs and works as an open-source distributed streaming platform, used for building real-time streaming data pipelines and applications. Kafka on HDInsight is a managed service that offers a simplified configuration process
  • Azure Stream Analytics – an on-demand, real-time analytics service that runs on IoT or non-IoT streams of data and uses a SQL-like language
  • Azure Databricks – an Apache Spark-based big data analytics service that allows you to quickly set up your Spark environment
  • Azure SQL Database and Power BI – you can connect to an Azure SQL Database and use Power BI Desktop to generate reports and identify trends and critical performance indicators

What ADX Does Differently

ADX brings together the related Azure capabilities, along with time series analytics features, to run queries over large volumes of data while offering the same kind of response times as a BI platform working over relatively small data sets. Microsoft claims ADX is capable of querying “billions of records in seconds”. Microsoft uses ADX to power its Azure Monitor and Azure Time Series Insights services.
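
For readers unfamiliar with time series analytics, the following minimal sketch shows the basic idea in plain Python: count events per fixed-width time bucket, then flag buckets that stand out from a baseline. It is an illustration only and not how ADX implements these features; the function names, the sample timestamps, and the median-based threshold are all assumptions chosen for the example.

```python
from collections import Counter
from statistics import median


def bucket_counts(timestamps, bucket_seconds):
    """Count events per fixed-width time bucket, keyed by bucket start.

    Buckets that contain no events are simply absent from the result.
    """
    counts = Counter((ts // bucket_seconds) * bucket_seconds for ts in timestamps)
    return dict(sorted(counts.items()))


def flag_anomalies(series, factor=3.0):
    """Return bucket starts whose count exceeds factor x the median count."""
    if not series:
        return []
    baseline = median(series.values())
    return [start for start, count in series.items() if count > factor * baseline]


if __name__ == "__main__":
    # Hypothetical event timestamps in seconds: two events per minute,
    # plus a burst of ten events in the minute starting at t=120.
    events = [0, 30, 60, 90, 180, 210, 240, 270] + [120 + i for i in range(10)]
    series = bucket_counts(events, bucket_seconds=60)
    print(series)
    print(flag_anomalies(series))
```

The median is used as the baseline here because a single burst barely moves it, whereas a mean would be pulled upward by the very spike being looked for.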

The Fastly/ADX Partnership – Why It’s Innovative

According to ZDNet, the combination of “columnar storage and index (both of which are implemented by ADX) can produce stunning results”. The other big takeaway ZDNet found is that “time series analytics on truly big data can actually be straightforward”. The simplicity and ease-of-use of the ADX offering is indeed its power.

Custom solutions can be strung together using, for instance, Apache Kafka and Apache Spark’s Spark Streaming, but they demand a range of skill sets and involve a great deal of complexity, in addition to the need to actively manage the solution and scale the infrastructure as required.

ADX, meanwhile, can be accessed on demand and scaled automatically, and in terms of expertise it requires little beyond learning its query language. As the source data lives in cloud storage, it can also be queried by open source technologies such as Spark or Hadoop.

Fastly says its partnership with Microsoft Azure takes ADX’s capabilities “a step further” by giving businesses the opportunity to collect data in real-time from the edge and pair it with insights from other monitoring and analytics tools in one central, integrated platform. Fastly’s edge platform distinguishes the combined solution, as the company is able to provide “100% of logs in real-time from the network edge, allowing businesses to monitor site performance and troubleshoot issues as they happen. In partnering with Microsoft, Fastly integrates its real-time data logging capabilities for automatic ingestion and analysis of application and engagement performance”.

Taboola’s Ariel Pisetzky said the solution allowed his team to spot a previously unseen problem. He said, “Our team was even able to instantly identify a problematic latency issue impacting the speed at which content was being served to customers and rectify the issue in real-time. In a time where speed of reaction and innovation is paramount to user experience, this capacity is indispensable.”

Edge Offering

“With Azure Data Explorer, engineers can instantly identify trends, patterns, or anomalies from their ever-growing and changing data. Joining forces with Fastly, we can provide innovative organizations with a solution to deeply understand across their edge workloads,” says Daniel Yu, Microsoft’s Director of Product Marketing. “Microsoft and Fastly share the same principles of speed, scalability, and flexibility, so we believe the integration of our technologies will provide a unique solution for our customers.”

“Fastly delivers more than three million log events per second, empowering our customers to easily view their traffic, understand their site health, and make the changes they need as quickly as possible,” explained Dana Wolf, SVP of Product and Marketing at Fastly. “To this end, we embrace and integrate with a multitude of central cloud platforms. This partnership and the integrations we’re building with Microsoft are another investment in support of our customer-first philosophy, empowering businesses to get the most value out of their operations at the edge.”

Summary

Ease-of-implementation will need to be weighed against lock-in concerns when picking a big data solution. If you are using open source solutions that run in Kubernetes clusters, you will need portability across public clouds and into corporate data centers. However, when using purely open source solutions, time to market/value and project success factors can be more challenging.

The Fastly/Microsoft Azure offering is a customer-facing solution that offers real-time analysis on high-volume big data that comes in an easy-to-use package.