Welcome to the feature store newsletter, brought to you by Logical Clocks. Each month we highlight the latest news, events, and insights as we help companies succeed in their machine learning transformation and become applied-AI, model-driven businesses.
In this month’s edition, we highlight new blog posts, articles, and events related to feature stores for ML.
Editor’s Pick
A Key Missing Part of the Machine Learning Stack
Malo Marrec from cloudnative.fr wrote this long-form post motivating the emergence of feature stores. The post includes a nice motivating example of how a feature such as “the number of empty rooms in a hotel” can be reused across ML pipelines with the help of a feature store. He also interviewed some key new players in the feature store space and experienced practitioners. Worth a read!
Read more
A Feature Store for Databricks
This blog post by Fabio Buso at Logical Clocks describes how the managed Hopsworks platform (www.hopsworks.ai) can be seamlessly used from Databricks. It includes links to how to get started and the architectural issues that need to be addressed when connecting the Hopsworks Feature Store to Databricks (virtual network injection or peering).
Read here
ACID Data Lakes with Time-Travel Queries
Apache Hudi and Databricks Delta are used to build Feature Stores. In this Medium article, Eric Sun provides a comprehensive comparison of three new data lake frameworks that add ACID properties and time-travel queries to data lakes. Time-travel queries are important for feature stores because they enable reproducible creation of training data for models.
Read here
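To illustrate why time-travel queries make training data reproducible, here is a minimal sketch in plain Python. It is a toy commit log, not the API of Hudi or Delta; the class `VersionedFeatureTable`, its methods, and the hotel values are hypothetical names and data chosen for illustration only.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class VersionedFeatureTable:
    """Toy commit log standing in for a data lake table (hypothetical, not Hudi or Delta)."""

    _commit_times: list = field(default_factory=list)
    _snapshots: list = field(default_factory=list)

    def commit(self, commit_time, upserts):
        """Record a new snapshot that applies `upserts` on top of the latest one."""
        if self._commit_times and commit_time <= self._commit_times[-1]:
            raise ValueError("commit times must be strictly increasing")
        snapshot = dict(self._snapshots[-1]) if self._snapshots else {}
        snapshot.update(upserts)
        self._commit_times.append(commit_time)
        self._snapshots.append(snapshot)

    def read_as_of(self, query_time):
        """Return the table as it looked at the last commit at or before `query_time`."""
        idx = bisect_right(self._commit_times, query_time)
        if idx == 0:
            return {}  # nothing had been committed yet
        return dict(self._snapshots[idx - 1])


# Made-up feature values: number of empty rooms per hotel.
table = VersionedFeatureTable()
table.commit(1, {"hotel_a": 12, "hotel_b": 3})
table.commit(2, {"hotel_a": 7})

training_data_v1 = table.read_as_of(1)  # what a model trained at time 1 saw
latest = table.read_as_of(2)            # current state after the update
```

Reading as of commit time 1 keeps returning the same rows even after later updates, which is the property that lets a training dataset be recreated exactly.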
Microservices Suck for Machine Learning (and what we did about it)
StreamSQL wrote a blog post arguing that a Feature Store is a better architecture than microservices for serving features at low latency. It’s a great read for architects who can entertain the notion that microservices are not a panacea!
Read here
What Are Feature Stores and Why Are They Critical for Scaling Data Science?
In Towards Data Science, Adi Hirschstein of Iguazio describes the main tenets behind feature stores and how they help in MLOps.
Read here
Feature Store: A better way to implement Data Science and AI in and across your organization
In a concise and leader-oriented post in Towards Data Science, Chan Naseeb describes the motivation behind Feature Stores for ML and the architecture of Feature Stores.
Read here
Looking for a Feature Store? This Google Project Might Help
Jesus Rodriguez of Invector Labs wrote an overview of the GO-JEK Feature Store (which is actually not a Google project - they just helped GO-JEK bootstrap the project over 1 year ago).
Read here
The Architecture Used at LinkedIn to Improve Feature Management in Machine Learning Models
Jesus kept busy researching the space and wrote an overview of LinkedIn’s feature store for ML.
Read here
Logical Clocks Launches Hopsworks.ai: The World's First AI Cloud Platform with a Feature Store
We are thrilled to announce the launch of Hopsworks.ai, the world’s first managed cloud platform for AI with a feature store. With Hopsworks.ai, we bring to the cloud our open-source and award-winning Hopsworks platform. For companies, this translates into reduced time and costs to bring new machine learning models to production.
Until now, feature stores have been the privilege of only a small number of hyperscale AI companies, like Uber, Facebook, and Twitter. For enterprises who missed the first wave and have not yet built a feature store, Hopsworks.ai enables them to make the jump to becoming data-driven by providing them with a ready-made, secure and governed data infrastructure for AI.
The Hopsworks.ai platform enables machine learning teams to develop, train and deploy AI applications at scale. Enterprises may use Hopsworks.ai either as a Feature Store for existing data science platforms, such as AWS Sagemaker and Databricks, or as a stand-alone platform for designing and operating machine learning models at scale.
The platform offers two product tiers: a free Community version and an Enterprise version. The Community version will help individuals or organizations to get started with Hopsworks and the Feature Store, while the Enterprise version provides advanced features to support organizations in building production machine learning applications at scale.
Try Hopsworks.ai
Events
Every month we bring to you the most important events related to Feature Stores. See below the top event taking place in the upcoming weeks.
Data Engineering Melbourne Meetup April 2020
Thursday, April 30, 2020
Speaker: Jim Dowling
Topic: Hopsworks Data-Intensive Machine Learning with a Feature Store
Read more
Feature Store Webinars
We are running a series of webinars that introduce the concept of a Feature Store and how it helps manage data for AI. During these webinars, we will walk through the Hopsworks Feature Store, introducing its concepts and showing how you can use it from Hopsworks, Databricks, Sagemaker, Kubeflow, and on-premises clusters (Hadoop). The webinars cover four uses: feature engineering, a feature registry, creating train/test datasets for ML, and an online Feature Store that builds feature vectors for online applications with low latency.
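As a rough picture of what "building a feature vector" from an online Feature Store means, the sketch below looks up the latest feature values for one entity and arranges them in the order a model expects. It uses plain Python dictionaries rather than the Hopsworks API; the store layout, the function `build_feature_vector`, and all feature names and values are hypothetical.

```python
# Hypothetical online store: feature group -> entity id -> latest feature values.
ONLINE_STORE = {
    "hotel_occupancy": {"hotel_a": {"empty_rooms": 12, "total_rooms": 80}},
    "hotel_reviews": {"hotel_a": {"avg_rating": 4.4}},
}

# The (feature group, feature) pairs a model expects, in its input order.
MODEL_FEATURES = [
    ("hotel_occupancy", "empty_rooms"),
    ("hotel_reviews", "avg_rating"),
]


def build_feature_vector(store, features, entity_id):
    """Look up each feature for `entity_id` and return the values in model order."""
    vector = []
    for group_name, feature_name in features:
        row = store[group_name].get(entity_id)
        if row is None or feature_name not in row:
            raise KeyError(f"{group_name}.{feature_name} missing for {entity_id}")
        vector.append(row[feature_name])
    return vector


vector = build_feature_vector(ONLINE_STORE, MODEL_FEATURES, "hotel_a")
```

In a real deployment the dictionary lookups would be key-value reads against a low-latency database, but the shape of the operation is the same: one key per entity, one ordered list of values per prediction request.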
Blog Article
Latest Video
The Hopsworks Feature Store integrates seamlessly with popular platforms for Data Science and cloud-based data lakes based on S3. Databricks users can write Python or Scala programs to compute features and register them in Hopsworks, browse and inspect features and create train/test datasets.
In this video, we explain how to integrate Hopsworks with Databricks for feature engineering, as a feature registry, to create train/test datasets for ML, and to build feature vectors for online applications with low latency.