Spell MLOps Platform Automates AI Experiment Reproduction

Spell, a New York-based startup, has developed an MLOps (machine learning operations) platform for the SMB market. Organizations can run ML experiments and workloads cost-effectively using features that automate the process end to end. The platform draws on the latest GPU hardware from AWS, GCP, and Azure. Users can get started in minutes: work in a Jupyter notebook, if that’s your cup of tea, then launch and train models, automate workflows, and create APIs.

Spell founder Serkan Piantino worked at Facebook for a decade before starting the company. Although he didn’t have much experience in AI initially, he got the opportunity to work with the world-renowned AI scientist Yann LeCun. Together, the duo helped launch Facebook AI Research. The experience gave him deep insight into AI hardware, workflows, algorithms, frameworks, and tools. After accumulating a small “fortune” at Facebook, Serkan left to start his own company and self-funded the venture in its early stages. Fast forward to the present, and Spell has raised a total of $15M.

Facebook AI

Digging deeper into the inner workings of Facebook AI Research, the group is a powerhouse in both AI software and hardware. For starters, it developed and open-sourced PyTorch, the popular framework that now rivals TensorFlow. On the hardware front, FB worked with suppliers like Nvidia, Graphcore, Habana Labs (Intel), AMD, and others to develop custom ASICs, hardware components, and servers dedicated to running experiments and inference workloads. Its algorithms of choice: CNN, SparseNN, and LSTM. Best of all, FB has open-sourced its AI hardware designs under the Open Compute Project (OCP) umbrella.

Initially, one of the issues FB encountered in running experiments was that “traditional pipeline” tools were inadequate for their workloads because they couldn’t “rerun pipelines with different inputs and mechanisms to explicitly capture outputs and/or side effects.” Therefore, they created their own workflow platform called FBLearner Flow, a game-changer for creating, managing, and maintaining AI-specific pipelines.

FBLearner enables every FB engineer, regardless of ML expertise, to run experiments with ease. Algorithms and processes can be reused, and experiments can be replicated using different datasets. In addition, the entire process is automated, which comes in handy because engineers sometimes need to run hundreds of experiments before they strike gold. FBLearner drastically improved productivity, allowing thousands of experiments to run simultaneously. The one great insight the FB team drew from running millions of experiments: “the largest improvements in accuracy often came from quick experiments, feature engineering, and model tuning rather than applying fundamentally different algorithms.” Unfortunately, the tool is internal and has not yet been open-sourced.
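
The idea of reusing one workflow across datasets and settings, while explicitly capturing each run's outputs, can be illustrated with a short Python sketch. This is not FBLearner Flow's API, since the tool is internal. The names `RunRecord`, `run_experiment`, `sweep`, and `best_per_dataset`, the toy datasets, and the one-parameter threshold "model" are all hypothetical, chosen only to keep the example self-contained.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class RunRecord:
    """Explicitly captured inputs and outputs of one experiment run."""
    dataset: str
    threshold: float
    accuracy: float


def run_experiment(dataset_name, rows, threshold):
    """Hypothetical workflow: predict positive when x >= threshold, then score."""
    correct = [(x >= threshold) == label for x, label in rows]
    return RunRecord(dataset_name, threshold, sum(correct) / len(correct))


def sweep(datasets, thresholds):
    """Rerun the same workflow for every (dataset, threshold) combination."""
    return [
        run_experiment(name, rows, t)
        for (name, rows), t in product(datasets.items(), thresholds)
    ]


def best_per_dataset(records):
    """Keep the highest-accuracy record for each dataset."""
    best = {}
    for record in records:
        current = best.get(record.dataset)
        if current is None or record.accuracy > current.accuracy:
            best[record.dataset] = record
    return best


# Toy data (assumed for illustration): (feature value, true label) pairs.
DATASETS = {
    "train_a": [(0.1, False), (0.4, False), (0.6, True), (0.9, True)],
    "train_b": [(0.2, False), (0.3, True), (0.7, True), (0.8, True)],
}
THRESHOLDS = [0.25, 0.5, 0.75]

if __name__ == "__main__":
    for name, record in best_per_dataset(sweep(DATASETS, THRESHOLDS)).items():
        print(name, record.threshold, record.accuracy)
```

Because every run returns a record of its inputs and its result, any run can be repeated or compared later, which is the property the quoted "traditional pipeline" tools were said to lack. Here the "quick experiments" are simply different thresholds, a small-scale stand-in for the model tuning the FB team found so valuable.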

MLOps

The term MLOps is relatively new, but it follows in the footsteps of DevOps. DevOps is the practice of automating the application lifecycle, which requires collaboration between teams and CI/CD (continuous integration and continuous delivery) for testing and delivery. The one caveat: MLOps is more difficult because “data is always changing and models are constantly learning, thus making the process more complex.” By using an MLOps platform, companies can avoid building an internal MLOps team. For an SMB, that’s ideal because it means spending thousands, not millions, on testing and deploying AI.
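
To make the CI/CD analogy concrete, here is a minimal Python sketch of the kind of automated check an MLOps pipeline might run after retraining on fresh data. It is an assumption for illustration, not a description of Spell's platform: the function name `should_promote`, the accuracy metric, and the default `floor` and `min_gain` values are all hypothetical.

```python
def should_promote(candidate_accuracy, production_accuracy,
                   min_gain=0.0, floor=0.8):
    """Hypothetical delivery gate for a retrained model.

    The candidate is promoted only if it clears an absolute quality floor
    and is at least `min_gain` better than the model currently serving.
    """
    if candidate_accuracy < floor:
        return False
    return candidate_accuracy - production_accuracy >= min_gain


if __name__ == "__main__":
    # A retrained model that beats production and clears the floor.
    print(should_promote(0.91, 0.90))
    # A retrained model that regressed after the data shifted.
    print(should_promote(0.85, 0.90))
```

In ordinary DevOps the equivalent gate is a test suite that passes or fails on fixed code. Here the result can change between runs even when the code does not, because the data and the retrained model do.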

Background

Company: Spell Inc.
Started: 2017
HQ: New York
# of Employees: 28
Raised: $15M
Founders: Serkan Piantino (CEO) and Trey Lawrence (CTO)
Product: MLOps Platform