Feature Stores Going Mainstream

Welcome to the feature store newsletter from Logical Clocks. Each month we highlight the latest news, events, and insights as we help companies succeed in their machine learning transformation and become applied-AI, model-driven businesses.

In this month’s edition, we highlight new blog posts, articles, and events related to feature stores for ML.

Editor’s Pick

A Key Missing Part of the Machine Learning Stack

Malo Marrec from cloudnative.fr wrote this long-form post motivating the emergence of feature stores. The post includes a nice motivating example of how a feature such as “the number of empty rooms in a hotel” can be reused across ML pipelines with the help of a feature store. He also interviewed some key new players in the feature store space and experienced practitioners. Worth a read!
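To make the reuse idea concrete, here is a minimal sketch of a feature defined once and consumed by two pipelines. It is a toy in-memory registry, not the API of any real feature store: the names FeatureRegistry, empty_rooms, pricing_pipeline and demand_pipeline are hypothetical, and so are the hotel numbers.

```python
class FeatureRegistry:
    """Toy in-memory registry: maps a feature name to the function that computes it."""

    def __init__(self):
        self._features = {}

    def register(self, name, compute_fn):
        """Store the feature definition once, under a shared name."""
        self._features[name] = compute_fn

    def compute(self, name, raw_record):
        """Compute a registered feature from a raw record."""
        return self._features[name](raw_record)


def empty_rooms(hotel):
    # Hypothetical definition: rooms not currently booked.
    return hotel["total_rooms"] - hotel["booked_rooms"]


registry = FeatureRegistry()
registry.register("empty_rooms", empty_rooms)


def pricing_pipeline(hotel):
    # First consumer: builds inputs for a (hypothetical) pricing model.
    return {"empty_rooms": registry.compute("empty_rooms", hotel)}


def demand_pipeline(hotel):
    # Second consumer: reuses the same definition instead of re-implementing it.
    occupancy_gap = registry.compute("empty_rooms", hotel) / hotel["total_rooms"]
    return {"occupancy_gap": occupancy_gap}


hotel = {"total_rooms": 100, "booked_rooms": 80}
pricing_features = pricing_pipeline(hotel)
demand_features = demand_pipeline(hotel)
```

Both pipelines read the same definition, so a fix to empty_rooms reaches every consumer at once.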

Read more

A Feature Store for Databricks

This blog post by Fabio Buso at Logical Clocks describes how the managed Hopsworks platform (www.hopsworks.ai) can be used seamlessly from Databricks. It links to getting-started material and covers the architectural issues that need to be addressed when connecting the Hopsworks Feature Store to Databricks (virtual network injection or peering).

Read here

ACID Data Lakes with Time-Travel Queries

Apache Hudi and Databricks Delta are used to build Feature Stores. In this Medium article, Eric Sun provides a comprehensive comparison of three new data lake frameworks that bring ACID properties and time-travel queries to data lakes. Time-travel queries are important for feature stores, as they enable reproducible creation of training data for models.
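As a rough illustration of why time travel helps reproducibility, the sketch below keeps every committed snapshot of a table and lets a reader ask for the data as it was at a given time. It is a toy in-memory model and does not use the Hudi or Delta APIs: VersionedTable, commit and read_as_of are hypothetical names, and the timestamps and rows are made up.

```python
from bisect import bisect_right


class VersionedTable:
    """Toy append-only table that keeps every committed snapshot."""

    def __init__(self):
        self._commit_times = []  # commit timestamps, strictly increasing
        self._snapshots = []     # snapshot i is the full table state after commit i

    def commit(self, timestamp, rows):
        """Commit a new snapshot: all previously committed rows plus `rows`."""
        if self._commit_times and timestamp <= self._commit_times[-1]:
            raise ValueError("commit timestamps must be strictly increasing")
        previous = self._snapshots[-1] if self._snapshots else []
        self._commit_times.append(timestamp)
        self._snapshots.append(previous + list(rows))

    def read_as_of(self, timestamp):
        """Time-travel read: return the rows visible at `timestamp`."""
        index = bisect_right(self._commit_times, timestamp)
        return list(self._snapshots[index - 1]) if index else []


table = VersionedTable()
table.commit(1, [{"hotel_id": 7, "empty_rooms": 12}])
table.commit(5, [{"hotel_id": 7, "empty_rooms": 3}])

# Training data built "as of" time 3 ignores the later commit, so
# re-running this query later yields the same rows.
training_rows = table.read_as_of(3)
```

Pinning a training dataset to a timestamp means later writes to the table cannot silently change what the model was trained on.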

Read here

Microservices Suck for Machine Learning (and what we did about it)

StreamSQL wrote a blog post arguing that a Feature Store is a better architecture for serving features at low latency than microservices. It’s a great read for architects who can entertain the notion that microservices are not a universal panacea!

Read here

What Are Feature Stores and Why Are They Critical for Scaling Data Science?

In Towards Data Science, Adi Hirschstein of Iguazio describes the main tenets behind feature stores and how they help in MLOps.

Read here

Feature Store: A better way to implement Data Science and AI in and across your organization

In a concise and leader-oriented post in Towards Data Science, Chan Naseeb describes the motivation behind Feature Stores for ML and the architecture of Feature Stores.

Read here

Looking for a Feature Store? This Google Project Might Help

Jesus Rodriguez of Invector Labs wrote an overview of the GO-JEK Feature Store (which is actually not a Google project - Google just helped GO-JEK bootstrap the project over a year ago).

Read here

The Architecture Used at LinkedIn to Improve Feature Management in Machine Learning Models

Jesus kept busy researching the space and wrote an overview of LinkedIn’s feature store for ML.

Read here

Logical Clocks Launches Hopsworks.ai: The World's First AI Cloud Platform with a Feature Store

We are thrilled to announce the launch of Hopsworks.ai, the world’s first managed cloud platform for AI with a feature store. With Hopsworks.ai, we bring to the cloud our open-source and award-winning Hopsworks platform. For companies, this translates into reduced time and costs to bring new machine learning models to production.

Until now, feature stores have been the privilege of only a small number of hyperscale AI companies, like Uber, Facebook, and Twitter. For enterprises who missed the first wave and have not yet built a feature store, Hopsworks.ai enables them to make the jump to becoming data-driven by providing them with a ready-made, secure and governed data infrastructure for AI.

The Hopsworks.ai platform enables machine learning teams to develop, train and deploy AI applications at scale. Enterprises may use Hopsworks.ai either as a Feature Store for existing data science platforms, such as AWS Sagemaker and Databricks, or as a stand-alone platform for designing and operating machine learning models at scale.

The platform offers two product tiers: a free Community version and an Enterprise version. The Community version will help individuals or organizations to get started with Hopsworks and the Feature Store, while the Enterprise version provides advanced features to support organizations in building production machine learning applications at scale.

Try Hopsworks.ai


Every month we bring you the most important events related to Feature Stores. Below is the top event taking place in the upcoming weeks.

Data Engineering Melbourne Meetup April 2020

Thursday, April 30, 2020

Speaker: Jim Dowling

Topic: Hopsworks Data-Intensive Machine Learning with a Feature Store

Read more

Feature Store Webinars

We are running a series of webinars that introduce the concept of a Feature Store and how it helps manage data for AI. During these webinars, we will walk through the Hopsworks Feature Store, introducing its concepts and showing how you can use it from Hopsworks, Databricks, Sagemaker, Kubeflow and on-premises clusters (Hadoop). We cover its use for feature engineering, as a feature registry, for creating train/test datasets for ML, and as an online Feature Store that builds feature vectors for online applications with low latency.

Blog Article

Latest Video

The Hopsworks Feature Store integrates seamlessly with popular platforms for Data Science and cloud-based data lakes based on S3. Databricks users can write Python or Scala programs to compute features and register them in Hopsworks, browse and inspect features and create train/test datasets.

In this video, we explain how to integrate Hopsworks with Databricks for feature engineering, as a feature registry, to create train/test datasets for ML, and to build feature vectors for online applications with low latency.

Get Started with Hopsworks