Cloudera and Intel recently announced that they are donating their jointly developed open source project, Apache Spot, to the Apache Software Foundation. Originally created by Intel and launched in February as the Open Network Insight (ONI) project, the effort has now been accepted into the Foundation’s Incubator.
The original ONI project drew on multiple open-source technologies, including the Hadoop big data platform, the Wireshark packet analyzer, nfdump for collecting and processing NetFlow data, and the Jupyter project for reporting. Over the last several months, the project has gained additional capabilities, which are now integrated into Apache Spot. The project focuses on detecting advanced threats and cyberattacks using big data analytics and machine learning.
Based on Cloudera’s big data platform, Apache Spot studies network traffic to characterize its typical behavior. It relies on Apache Hadoop for highly scalable log management and data storage, and on Apache Spark to process data from network connections, deep packet inspection of domain name system (DNS) traffic, and proxy log files. From there, Apache Spot uses machine learning as a filter to separate bad traffic from benign traffic. The software can analyze billions of events to detect unknown and insider threats and to provide new visibility into the network. Additional features include context enrichment, noise filtering, whitelisting and heuristics, which together produce a shortlist of the most likely security threats.
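The filter-then-shortlist idea described above can be illustrated with a minimal Python sketch. It is not Spot's actual pipeline or model: a simple frequency-based rarity score stands in for the machine learning step, and the function name, the event fields (`client`, `domain`) and the sample data are all hypothetical.

```python
from collections import Counter


def shortlist_suspicious(events, whitelist, top_n=3):
    """Return the top_n events whose domain is rarest among non-whitelisted traffic.

    Hypothetical stand-in for a learned model: rarity is just the empirical
    frequency of each domain in the remaining events.
    """
    # Whitelisting / noise filtering: drop events for trusted domains.
    candidates = [e for e in events if e["domain"] not in whitelist]
    if not candidates:
        return []

    # Estimate how common each domain is among the remaining events.
    counts = Counter(e["domain"] for e in candidates)
    total = len(candidates)

    # Lower probability means rarer, which this sketch treats as more suspicious.
    scored = [(counts[e["domain"]] / total, e) for e in candidates]
    scored.sort(key=lambda pair: pair[0])
    return [event for _, event in scored[:top_n]]


# Illustrative DNS-style events (made-up data).
events = (
    [{"client": f"10.0.0.{i}", "domain": "intranet.example"} for i in range(5)]
    + [{"client": f"10.0.0.{i}", "domain": "news.example"} for i in range(3)]
    + [{"client": "10.0.0.9", "domain": "x9f3k.example"}]
)
whitelist = {"intranet.example"}

for event in shortlist_suspicious(events, whitelist, top_n=1):
    print(event["client"], event["domain"])
```

A real deployment would replace the frequency count with a trained model and run over far larger volumes on a platform such as Spark. The sketch only shows the overall shape: filter out known-good traffic, score what remains, and keep a short ranked list.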
The goal of Apache Spot is to “approach hard security problems – detecting events such as lateral movement, side-channel data escapes, insider issues, or stealthy behavior in general,” according to the project’s GitHub overview page. “It can be deployed incrementally to realize immediate ROI, but is also meant to support an organization’s growth and maturity to achieve complete threat visibility as part of its protection strategy.”
By providing common open data models, the Spot project hopes to encourage more organizations to adopt these models and to build their own analytics and visualizations on top of them, helping to prevent and weaken cyberattacks rather than relying on numerous separate systems to store security data.