Hopsworks
The most advanced feature store, within the most complete MLOps platform: managed and open-source.

How to transform Amazon Redshift data into features with Hopsworks Feature Store

Connect the Hopsworks Feature Store to Amazon Redshift to transform your data into features to train models and make predictions.

How to Engineer and Use Features in Azure ML Studio with the Hopsworks Feature Store

Learn how to design and ingest features, browse existing features, and create training datasets as DataFrames or as files on Azure Blob storage.

Comparing RonDB on AWS, Azure and GCP using Sysbench

RonDB 21.04.0 ships with integrated benchmark scripts for running various benchmarks against RonDB. Here we present the results of benchmarking RonDB on AWS, Azure, and GCP.

Hopsworks is the first open source MLOps platform with an Enterprise-ready feature store. Its infrastructure has been developed from the ground up to create a cohesive ecosystem, in which all tools and services work together seamlessly.

It provides an open ecosystem that connects to the largest number of data storage, data pipeline, and data science platforms, enabling users to work with their preferred tools (notebooks, Kafka, Spark, Airflow, GitHub, Jenkins…) and languages (R, Python, Java, Scala) at every point of the ML journey.

Feature groups & training datasets

In order for features to be managed efficiently across projects and roles, they are stored in well-defined groups and training datasets.
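
As an illustration, here is a minimal sketch of registering a feature group and building a training dataset with the hsfs Python client. The names, versions, and DataFrame below are hypothetical, and the exact calls may differ between Hopsworks releases (for example, newer clients use insert() rather than save() for ingestion):

```python
import hsfs
import pandas as pd

# Example data for the sketch (hypothetical features).
customer_df = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "avg_basket_value": [42.0, 13.5, 7.2],
})

# Connect to the Hopsworks Feature Store (host/credentials come from the environment).
connection = hsfs.connection()
fs = connection.get_feature_store()

# Store engineered features in a well-defined feature group.
fg = fs.create_feature_group(
    name="customer_profile",        # hypothetical feature group name
    version=1,
    primary_key=["customer_id"],
    description="Aggregated customer features",
)
fg.save(customer_df)

# Assemble a training dataset from the feature group's features.
td = fs.create_training_dataset(
    name="churn_training",          # hypothetical training dataset name
    version=1,
    data_format="csv",
)
td.save(fg.select_all())
```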

Project Management

The system is fully compliant with GDPR and HIPAA and provides powerful tooling for managing projects and assigning data owners and users with access to data.

Expectation rules for data validation

Hopsworks' expectation rules include warnings and alerts as part of a comprehensive data management solution, improving data validation for both batch and real-time data.
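
As a sketch, in recent Hopsworks releases data validation builds on the Great Expectations integration in hsfs: an expectation suite is attached to a feature group so that every ingestion is validated. The feature group and column names below are hypothetical, and older releases used a built-in rule/expectation API with a different surface:

```python
import hsfs
import great_expectations as ge

fs = hsfs.connection().get_feature_store()
fg = fs.get_feature_group("transactions", version=1)   # hypothetical feature group

# Define an expectation: transaction amounts must fall within a sane range.
suite = ge.core.ExpectationSuite(expectation_suite_name="transactions_checks")
suite.add_expectation(
    ge.core.ExpectationConfiguration(
        expectation_type="expect_column_values_to_be_between",
        kwargs={"column": "amount", "min_value": 0, "max_value": 10_000},
    )
)

# Attach the suite to the feature group; subsequent inserts are validated
# and failed expectations surface as warnings/alerts.
fg.save_expectation_suite(suite)
```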

Exploratory Data Analysis (EDA)

Preview data and features in feature groups with just a few clicks, enabling fast exploration and analysis of the data stored in the feature store.

Provenance, Activity and more

The activity and provenance log lists events triggered by activities that affect your organization, including rules, alerts, ingestions, and jobs, so you can follow the lifecycle of a project and support auditing.

Feature Store: data warehouse for ML
Distributed Deep Learning: faster with more GPUs
HopsFS: world-leading HDFS filesystem
Horizontally Scalable: ingestion, data prep, training, serving
Data-Intensive AI

Hopsworks manages and processes your data at scale for AI. The Hopsworks Feature Store manages the features used in both training and serving models. It builds on HopsFS, the world's most scalable hierarchical HDFS filesystem.

You can scale out training and hyperparameter optimization with as many GPUs as you can put in your cluster. And we provide framework support (Maggy and PySpark) to make distributed ML as Pythonic as possible.
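
As a sketch of what this looks like with Maggy, the snippet below runs a distributed random search over a hypothetical training function on the cluster's Spark executors. The search space, metric, and names are illustrative, and newer Maggy releases use configuration objects rather than these keyword arguments:

```python
from maggy import experiment, Searchspace

# Hyperparameter search space: two integer-valued hyperparameters.
sp = Searchspace(kernel=('INTEGER', [2, 8]), pool=('INTEGER', [2, 8]))

def train_fn(kernel, pool, reporter):
    # Real code would build and train a model here and report its metric
    # back to the Maggy optimizer; we return a dummy score instead.
    score = 1.0 / (kernel + pool)
    reporter.broadcast(metric=score)
    return score

# Launch the trials in parallel on the PySpark executors of the cluster.
result = experiment.lagom(
    map_fun=train_fn,
    searchspace=sp,
    optimizer='randomsearch',
    direction='max',
    num_trials=15,
    name='hp_tuning_sketch',
)
```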

Notebooks for Development: run notebooks in ML pipelines
Version Everything: code, infrastructure, data
Model Serving on Kubernetes: TF Serving, SkLearn, PyTorch
End-to-End ML Pipelines: orchestrated by Airflow
Development & Operations

ML pipelines have become the de facto way to productionize ML models. Hopsworks uses Airflow to orchestrate pipelines consisting of anything from (Py)Spark jobs, to Python programs on K8s, to Jupyter notebooks, to TensorFlow Extended (Beam/Flink).

JupyterLab is provided as a collaborative development environment, while jobs can also be deployed as programs: Python, PySpark, Scala/Java Spark, Beam/Flink.
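
As a generic sketch, an Airflow DAG for such a pipeline can chain the feature engineering, training, and deployment steps. The tasks below only echo placeholders where a real pipeline would launch Hopsworks jobs (for example via the Hopsworks jobs API or the Airflow operators bundled with the platform):

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

# Daily end-to-end ML pipeline: feature engineering -> training -> deployment.
with DAG(
    dag_id="ml_pipeline_sketch",
    start_date=datetime(2021, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    feature_engineering = BashOperator(
        task_id="feature_engineering",
        bash_command="echo 'launch (Py)Spark feature engineering job'",
    )
    train_model = BashOperator(
        task_id="train_model",
        bash_command="echo 'launch training job or notebook'",
    )
    deploy_model = BashOperator(
        task_id="deploy_model",
        bash_command="echo 'deploy model for serving on Kubernetes'",
    )

    feature_engineering >> train_model >> deploy_model
```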

Project-based Multi-tenancy: secure collaboration in a shared cluster
Encryption At-Rest, In-Motion: TLS/SSL everywhere
AI-Asset Governance: models, experiments, features, GPUs
Data/Model/Feature Lineage: discover and track dependencies
Governance & Compliance

Hopsworks can version all ML artifacts in ML pipelines: features in the feature store, train/test datasets, programs and pipelines in GitHub, and models in the model repository. Hopsworks also provides industry-leading support for provenance in ML pipelines: debug and explore lineage between processing steps and ML artifacts.

Unlike MLflow and TFX, you do not need to rewrite your pipelines to add provenance: it is implicitly captured by our unique change-data-capture technology.
