No items found.

Feature Store for ML
The solution for Data Engineers

The Hopsworks Feature Store makes model deployment & re-use faster,
Lowers model biases and increases the overall efficiency, speed and reliability of your ML pipeline.

In a Nutshell

A feature store is a central vault for storing documented, curated, and access-controlled features. Here we discuss the state-of-the-art in data management for deep learning and present the first open-source and cloud-native feature store, available in Hopsworks.

what is a Feature Store?

The concept of a feature store was introduced by Uber in 2017. The feature store is a central place to store curated features within an organization. A feature is a measurable property of some data-sample. It could be for example an image-pixel, a word from a piece of text, the age of a person, a coordinate emitted from a sensor, or an aggregate value like the average number of purchases within the last hour. Features can be extracted directly from files and database tables, or can be derived values, computed from one or more data sources.

Features are the fuel for AI systems, as we use them to train machine learning models so that we can make predictions for feature values that we have never seen before.

The need for a Feature Store

Machine learning is an extremely powerful method that has the potential to help us move from a historical understanding of the world to a predictive modeling of the world around us. However, building machine learning systems is hard and requires specialized platforms and tools.

Although ad-hoc feature engineering and training pipelines is a quick way for Data Scientists to experiment with machine learning models, such pipelines have a tendency to become complex over time. As the number of models increase, it quickly becomes a pipeline jungle that is hard to manage. This motivates the usage of standardized methods and tools for the feature engineering process, helping reduce the cost of developing new predictive models. The feature store is a service designed for this purpose.

Practical benefits & economy at scale

A frequent pitfall for organizations that apply machine learning is to think of data science teams as individual groups that work independently with limited collaboration.

Having this mindset results in machine learning workflows where there is no standardized way to share features across different teams and machine learning models. Not being able to share features across models and teams is limiting Data Scientist's productivity and makes it harder to build new models.

By using a shared feature store, organizations can achieve an economies-of-scale effect. When the feature store is built up with more features, it becomes easier and cheaper to build new models as the new models can re-use features that exist in the feature store.

Integrates with Sagemaker,
Databricks, Kubeflow, Cloudera

The Hopsworks Feature Store integrates seamlessly with popular platforms for Data Science, such as AWS Sagemaker and Databricks. It also integrates with backend datalakes, such as S3 and Hadoop. Whether you deploy on‑premises or at your preferred cloud provider, the Hopsworks Feature Store will provide consistent, dependable functionality.

Hopsworks at a glance

Efficiency & Performance

Development & Operations

Governance & Compliance

Feature Store
Data warehouse for ML
Distributed Deep Learning
Faster with more GPUs
HopsFS
NVMe Speed with Big Data
Horizontally Scalable
Ingestion, Dataprep, training, Serving
Notebooks For development
First-class Python Support
Version Everything
Code, Infrastructure, Data
Model Serving on Kubernetes
TF Serving, MLeap, SkLearn
End-to-End ML Pipelines
Orchestrated by Airflow
Secure Multi-tenancy
Project-based restricted Access
Encription At-rest, In-Motion
TLS/SSL everywhere
AI-Asset Governance
Models, Experiment, data, GPUs
Data/Model/Feature Lineage
Discover/track dependencies

Efficiency & Performance

Feature Store
Data warehouse for ML
Distributed Deep Learning
Faster with more GPUs
HopsFS
NVMe Speed with Big Data
Horizontally Scalable
Ingestion, Dataprep, training, Serving

Development & Operations

Notebooks For development
First-class Python Support
Version Everything
Code, Infrastructure, Data
Model Serving on Kubernetes
TF Serving, MLeap, SkLearn
End-to-End ML Pipelines
Orchestrated by Airflow

Governance & Compliance

Secure Multi-tenancy
Project-based restricted Access
Encription At-rest, In-Motion
TLS/SSL everywhere
AI-Asset Governance
Models, Experiment, data, GPUs
Data/Model/Feature Lineage
Discover/track dependencies

More informations

Get an introduction to Hopsworks and Hopsworks Feature Store for your Machine Learning projects together with one of our engineers.

A comprehensive walk-through
Let us know if your specific wishes and pre-requisites for your personal demonstration.