The Hopsworks Blog

// Latest Posts

April 17, 2024

Job Scheduling & Orchestration using Hopsworks and Airflow

This article covers the different aspects of Job Scheduling in Hopsworks, including how simple jobs can be scheduled through the Hopsworks UI by non-technical users.

Ehsan Heydari

Build Your Own pdf.ai: Using both RAG and Fine-Tuning in one Platform

A summary from our LLM Makerspace event where we built our own pdf.ai using RAG and fine-tuning in one platform. Follow along on the journey to build an LLM application from scratch.

Jim Dowling

Build Vs Buy: For Machine Learning/AI Feature Stores

April 10, 2024

17 min read

The decision to build or buy a feature store has strategic and technical components to consider, as it impacts both cost and technical debt.

Rik Van Bruggen

// Feature Store

Feature Store: The missing data layer for Machine Learning pipelines?

In this blog, we discuss the state-of-the-art in data management and machine learning pipelines (within the wider field of MLOps) and present the first open-source feature store, Hopsworks.

Jim Dowling

ML Engineer Guide: Feature Store vs Data Warehouse

The feature store is a data warehouse of features for machine learning (ML). Architecturally, it differs from the traditional data warehouse in that it is a dual-database.

Jim Dowling

How to build your own Feature Store

We have many conversations with companies and organizations who are deciding between building their own feature store and buying one. We thought we would share our experience of building one.

Jim Dowling

// MLOps

MLOps

July 11, 2023

MLOps vs. DevOps: Best Practices, Challenges and Differences

Explore the convergence of MLOps and DevOps. Learn about their purposes, differences, and areas of integration and discover best practices, challenges, and their future potential.

Prithivee Ramalingam

MLOps Wars: Versioned Feature Data with a Lakehouse

With support for Apache Hudi, the Hopsworks Feature Store offers lakehouse capabilities to improve automated feature pipelines and training pipelines (MLOps).

Davit Bzhalava

MLOps with a Feature Store

This blog introduces platforms and methods for continuous integration (CI), delivery (CD), and training (CT) with ML platforms, with details on how to do CI/CD MLOps with a Feature Store.

Fabio Buso

// Data Engineering

Pandas2 and Polars for Feature Engineering

We review Python libraries, such as Pandas, Pandas2 and Polars, for Feature Engineering, evaluate their performance and explore how they power machine learning use cases.

Haziqa Sajid

Feature Engineering with Apache Airflow

Unlock the power of Apache Airflow in the context of feature engineering. We will delve into building a feature pipeline using Airflow, focusing on two tasks: feature binning and aggregations.

Prithivee Ramalingam

Building Feature Pipelines with Apache Flink

Find out how to use Flink to compute real-time features and make them available to online models within seconds using Hopsworks.

Fabio Buso

// Data Science

Facebook Prophet for Time-Series Machine Learning

Time-series data consists of records in chronological order and is crucial for forecasting trends and patterns. In this blog, we take a look at how to use Facebook Prophet for time-series forecasting.

Kais Laribi

Elasticsearch is dead, long live Open Distro for Elasticsearch

In this blog, we describe how we leverage the authentication and authorization support in Open Distro for Elasticsearch to make Elasticsearch a project-based multi-tenant service in Hopsworks.

Mahmoud Ismail

Deep Learning for Anti-Money Laundering with a feature store

Deep learning is now the state-of-the-art technique for identifying financial transactions suspected of money laundering. It delivers fewer false positives with higher accuracy.

Jim Dowling

// 5-minute Interviews

5-minute interview Abi Aryan

Meet Abi Aryan, ML engineer and founder of Abide AI. She tells us about her background in research but also why she decided to go back to working in the AI industry.

Hopsworks Team

5-minute interview Jiri Steuer

In this episode we meet Data Architect Jiri Steuer. This week we talk about why he thinks MLOps and feature stores are indispensable when it comes to building machine learning applications.

Hopsworks Team

5-minute interview Antonios Kouzoupis

Once again we get to meet a Hopsworks team member; this time we’re introduced to Software Engineer Antonios Kouzoupis. He talks about which part of his work with feature stores he finds most exciting.

Hopsworks Team

// All Categories

Job Scheduling & Orchestration using Hopsworks and Airflow

This article covers the different aspects of Job Scheduling in Hopsworks, including how simple jobs can be scheduled through the Hopsworks UI by non-technical users.

Ehsan Heydari

Build Your Own pdf.ai: Using both RAG and Fine-Tuning in one Platform

A summary from our LLM Makerspace event where we built our own pdf.ai using RAG and fine-tuning in one platform. Follow along on the journey to build an LLM application from scratch.

Jim Dowling

April 10, 2024

17 min read

Build Vs Buy: For Machine Learning/AI Feature Stores

The decision to build or buy a feature store has strategic and technical components to consider, as it impacts both cost and technical debt.

Rik Van Bruggen

Unlocking the Power of Function Calling with LLMs

This is a summary of our latest LLM Makerspace event where we pulled back the curtain on an exciting paradigm in AI: function calling with LLMs.

Jim Dowling

Doubling Down on Open Source: How RonDB Upholds the Principles Redis Left Behind

Redis will no longer be open source. Our own project, RonDB, will continue being open source in order to uphold the principles that keep the technology advancing.

Mikael Ronström

The Enterprise Journey to introducing a Software Factory for AI Systems

March 21, 2024

19 min read

In this article we describe the software factory approach to building and maintaining AI systems.

Jim Dowling

GenAI comes to Hopsworks with Vector Similarity Search

Hopsworks has added support for approximate nearest neighbor (ANN) indexing and vector similarity search for vector embeddings stored in its feature store.

Kenneth Mak

Delta Lake comes to Hopsworks

Hopsworks has added support for Delta Lake to accelerate our mission to build the Python-Native Data for AI platform.

Jim Dowling

Federated Data with the Hopsworks Feature Query Service

A tutorial on the Hopsworks Feature Query Service, which efficiently queries and joins features from multiple platforms such as Snowflake, BigQuery and Hopsworks without any data duplication.

Steffen Grohsschmiedt

5 Machine Learning Myths Debunked

The rapid development pace in AI is the cause for a lot of misconceptions surrounding ML and MLOps. In this post we debunk a few common myths about MLOps, LLMs and machine learning in production.

Carolin Svenberg

F.A.I.R. Principles in Data for AI

At Hopsworks, the F.A.I.R. principles have been a cornerstone of our approach in designing a platform for managing machine learning data and infrastructure.

Lex Avstreikh

Common Error Messages in Pandas

We go through the most common error messages in Pandas and offer solutions to these errors as well as provide efficiency tips for Pandas code.

Haziqa Sajid

Multi-Region High Availability Comes to Feature Stores

Following our previous blog, we expand on the architecture to fit a Tier 1 classification where all components of Hopsworks are replicated in a different geographical region.

Antonios Kouzoupis

Feature Store Benchmark Comparison: Hopsworks and Feast

A comparison of the online feature serving performance for Hopsworks and Feast feature stores, contrasting the approaches to building a feature store.

Dhananjay Mukhedkar

The Guardrails for High Risk AI Required by the EU AI Act

December 19, 2023

As a continuation to our last blog on the EU AI Act, this blog explores the implications of how Hopsworks machine learning systems can address the act's requirements for high risk AI applications.

Rik Van Bruggen

Multi-Region High Availability Comes to Feature Stores

Learn how Hopsworks achieves high availability for the Online and Offline Feature Store and a Metadata layer, making it an operational ML system both for on-premise and cloud infrastructure.

Antonios Kouzoupis

December 13, 2023

9 min read

High Risk AI in the EU AI Act

What is the corporate and societal significance of the EU AI Act and how does it impact organizations with high risk AI systems?

Rik Van Bruggen

December 8, 2023

8 min read

From BI to AI: A Data-Driven Journey

Data is evolving from traditional Business Intelligence to Artificial Intelligence and Machine Learning for predictive analytics, creating new requirements for how businesses operate.

Rik Van Bruggen

Feature Engineering with DBT for Data Warehouses

Read about the advantages of using DBT for data warehouses and how it's positioned as a preferred solution for many data analytics and engineering teams.

Kais Laribi

What is MLOps?

This blog explores MLOps principles, with a focus on versioning, and provides a practical example using Hopsworks for both data and model versioning.

Haziqa Sajid

Pandas2 and Polars for Feature Engineering

We review Python libraries, such as Pandas, Pandas2 and Polars, for Feature Engineering, evaluate their performance and explore how they power machine learning use cases.

Haziqa Sajid

How to Build a Python Environment with Custom Docker Commands

With the latest version of Hopsworks we introduce new capabilities such as running custom bash commands and an improved UI which shows you the history of the Python environment.

Gibson Chikafa

Machine Learning Embeddings as Features for Models

Delve into the profound implications of machine learning embeddings, their diverse applications, and their crucial role in reshaping the way we interact with data.

Prithivee Ramalingam

Facebook Prophet for Time-Series Machine Learning

Kais Laribi

Bring Your Own Kafka Cluster to Hopsworks

A tutorial on how to use our latest Bring Your Own Kafka (BYOK) capability in Hopsworks. It allows you to connect your existing Kafka clusters to your Hopsworks cluster.

Ralfs Zangis

Feature Engineering with Apache Airflow

Unlock the power of Apache Airflow in the context of feature engineering. We will delve into building a feature pipeline using Airflow, focusing on two tasks: feature binning and aggregations.

Prithivee Ramalingam

Automated Feature Engineering with FeatureTools

An ML model’s ability to learn and read data patterns largely depends on feature quality. With frameworks such as FeatureTools, ML practitioners can automate the feature engineering process.

Haziqa Sajid

MLOps

July 11, 2023

MLOps vs. DevOps: Best Practices, Challenges and Differences

Explore the convergence of MLOps and DevOps. Learn about their purposes, differences, and areas of integration and discover best practices, challenges, and their future potential.

Prithivee Ramalingam

Building Feature Pipelines with Apache Flink

Find out how to use Flink to compute real-time features and make them available to online models within seconds using Hopsworks.

Fabio Buso

Feature Engineering for Categorical Features with Pandas

Explore the power of feature engineering for categorical features using Pandas. Learn essential techniques for handling categorical variables, and creating new features.

Prithivee Ramalingam

Hopsworks meets SOC2 Type II standards for data security and privacy

April 25, 2023

2 min read

Hopsworks has successfully completed the AICPA Service Organization Control (SOC) 2 Type II audit.

Carolin Svenberg

Hopsworks receives ISO 27001 Certification for Data Security

April 4, 2023

2 min read

Hopsworks has received an ISO 27001 certification, the globally recognized standard for establishing, implementing, maintaining, and continually improving an information security management system.

Carolin Svenberg

ROI of Feature Stores

This blog analyses the cost-benefits of Feature Stores for Machine Learning and estimates your return on investment with our Feature Store ROI Calculator.

Jim Dowling

Hopsworks 3.1 Product Updates: Feature Store & UI Improvements

February 2, 2023

4 min read

Read about Hopsworks 3.1 and the new improvements in the feature store (time-series splits for training data, support for managing thousands of models), stability and user-interface.

Jim Dowling

Feature Store: The missing data layer for Machine Learning pipelines?

In this blog, we discuss the state-of-the-art in data management and machine learning pipelines (within the wider field of MLOps) and present the first open-source feature store, Hopsworks.

Jim Dowling

Optimize your MLOps Workflow with a Feature Store CI/CD and Github Actions

In this blog we present an end-to-end, Git-based workflow to test and deploy feature engineering, model training and inference pipelines.

Fabio Buso

Data Validation for Enterprise AI: Using Great Expectations with Hopsworks

Learn more about how Hopsworks stores both data and validation artifacts, enabling easy monitoring on the Feature Group UI page.

Victor Jouffrey

How to use external data stores as an offline feature store in Hopsworks with Connector API

In this blog, we introduce Hopsworks Connector API that is used to mount a table in an external data source as an external feature group in Hopsworks.

Dhananjay Mukhedkar

Great Models Require Great MLOps: Using Weights & Biases with Hopsworks

Discover how you can easily make the journey from ML models to putting prediction services in production by choosing best-of-breed technologies.

Moritz Meister

From Pandas to Features to Models to Predictions - A deep dive into the Hopsworks APIs

Learn how the Hopsworks feature store APIs work and what it takes to go from a Pandas DataFrame to features used by models for both training and inference.

Fabio Buso

Introducing the Serverless Feature Store

Hopsworks Serverless is the first serverless feature store for ML, allowing you to manage features and models seamlessly without worrying about scaling, configuration or management of servers.

Jim Dowling

Hopsworks 3.0: The Python-Centric Feature Store

Hopsworks is the first feature store to extend its support from the traditional Big Data platforms to the Pandas-sized data realm, where Python reigns supreme. A new Python API is also provided.

Jim Dowling

Hopsworks 3.0 - Connecting Python to the Modern Data Stack

Hopsworks 3.0 is a new release focused on best-in-class Python support, Feature Views unifying Offline and Online read APIs to the Feature Store, Great Expectations support, KServe and a Model serving

Jim Dowling

A Spark Join Operator for Point-in-Time Correct Joins

In this blog post we showcase the results of a study that examined point-in-time join optimization using Apache Spark in Hopsworks.

Axel Pettersson

The EU AI Act: Time to Govern your AI or Turn it off

June 23, 2022

10 min read

An introduction to the EU AI Act and how Feature Stores provide a great solution to the obligations imposed by the regulation.

Geoff Burne

Testing feature logic, transformations, and feature pipelines with pytest

Operational machine learning requires the offline and online testing of both features and models. In this article, we show you how to design, build, and run tests for features.

Jim Dowling

Hopsworks 2.5 Product Updates: Collaboration & Scalability

March 4, 2022

3 min read

We go through the new features and developments in Hopsworks 2.5 that will benefit open-source users and customers alike.

Fabio Buso

Model analysis on the What-If framework for TensorFlow on Hopsworks

We introduce how to use the What-If Tool as a Jupyter plugin on Hopsworks to build better machine learning models by making it easier to ask counterfactual questions about your model’s behaviour.

Anastasiia Andriievska

How to Transform Snowflake Data into Features with Hopsworks

Learn how to connect Hopsworks to Snowflake and create features and make them available both offline in Snowflake and online in Hopsworks.

Fabio Buso

Show me the code; how we linked notebooks to features

We are introducing a new feature in the Hopsworks UI, feature code preview: the ability to view the notebook used to create a Feature Group or Training Dataset.

Jim Dowling

Receiving Alerts in Slack/Email/PagerDuty from Hopsworks

Learn how to set up customized alerts in Hopsworks for different events that are triggered as part of the ingestion pipeline.

Ermias Gebremeskel

End-to-end Deep Learning Pipelines with Earth Observation Data in Hopsworks

In this blog post we demonstrate how to build an end-to-end deep learning pipeline with real-world data in order to develop an iceberg classification model.

Theofilos Kakantousis

Using an External Python Kafka Client to Interact with a Hopsworks Cluster

Learn how to publish (write) and subscribe to (read) streams of events and how to interact with the schema registry and use Avro for data serialization.

Ahmad Al-Shishtawy

MLOps Wars: Versioned Feature Data with a Lakehouse

With support for Apache Hudi, the Hopsworks Feature Store offers lakehouse capabilities to improve automated feature pipelines and training pipelines (MLOps).

Davit Bzhalava

AI Software Architecture for Copernicus Data with Hopsworks

Hopsworks brings support for scale-out AI with the ExtremeEarth project which focuses on the most concerning issues of food security and sea mapping.

Theofilos Kakantousis

Hopsworks Online Feature Store: Fast Access to Feature Data for AI Applications

Read about how the Hopsworks Feature Store abstracts away the complexity of a dual database system, unifying feature access for online and batch applications.

Moritz Meister

How to build ML models with fastai and Jupyter in Hopsworks

This tutorial gives an overview of how to work with Jupyter on the platform and train a state-of-the-art ML model using the fastai Python library.

Robin Andersson

Scalable metadata: the new breed of file systems (em)powering big data companies

HopsFS is an open-source scale-out metadata file system, but its primary use case is not exabyte storage; rather, it is customizable, consistent metadata.

Jim Dowling

Distributed ML Experiments on Databricks with Maggy

Learn how to train an ML model in a distributed fashion on Databricks, without reformatting your code, using Maggy, an open-source tool available on Hopsworks.

Riccardo Grigoletto

How to manage Python libraries in Hopsworks

This tutorial will show an overview of how to install and manage Python libraries in the platform.

Robin Andersson

Beyond Brainless AI with a Feature Store

Evolve your models from stateless AI to Total Recall AI with the help of a Feature Store.

Jim Dowling

From 100 to ZeRO: PyTorch and DeepSpeed ZeRO on any Spark Cluster with Maggy

Use open-source Maggy to write and debug PyTorch code on your local machine and run the code at scale without changing a single line in your program.

Moritz Meister

Detecting Financial Fraud Using GANs at Swedbank with Hopsworks and NVIDIA GPUs

Recently, one of Sweden’s largest banks trained generative adversarial neural networks (GANs) using NVIDIA GPUs as part of its fraud and money-laundering prevention strategy.

Jim Dowling

AI/ML needs a Key-Value store, and Redis is not up to it

Seeing how Redis is a popular open-source feature store with features significantly similar to RonDB, we compared the innards of RonDB’s multithreading architecture to the commercial Redis products.

Mikael Ronström

How to engineer and use Features in Azure ML Studio with the Hopsworks Feature Store

Learn how to design and ingest features, browse existing features, create training datasets as DataFrames or as files on Azure Blob storage.

Moritz Meister

How to transform Amazon Redshift data into features with Hopsworks Feature Store

Connect the Hopsworks Feature Store to Amazon Redshift to transform your data into features to train models and make predictions.

Ermias Gebremeskel

Elasticsearch is dead, long live Open Distro for Elasticsearch

In this blog, we describe how we leverage the authentication and authorization support in Open Distro for Elasticsearch to make Elasticsearch a project-based multi-tenant service in Hopsworks.

Mahmoud Ismail

HopsFS file system: 100X Times Faster than AWS S3

Many developers believe S3 is the "end of file system history". It is impossible to build a file/object storage system on AWS that can compete with S3 on cost. But what if you could build on top of S3

Mahmoud Ismail

Feature Store for MLOps? Feature reuse means JOINs

Use JOINs for feature reuse to save on infrastructure and the number of feature pipelines needed to maintain models in production.

Jim Dowling

ML Engineer Guide: Feature Store vs Data Warehouse

The feature store is a data warehouse of features for machine learning (ML). Architecturally, it differs from the traditional data warehouse in that it is a dual-database.

Jim Dowling

One Function is All you Need: Machine Learning Experiments with Hopsworks

Hopsworks supports machine learning experiments to track and distribute ML for free and with a built-in TensorBoard.

Robin Andersson

How we secure your data with Hopsworks

Integrate with third-party security standards and take advantage of our project-based multi-tenancy model to host data in one single shared cluster.

Jim Dowling

Beyond Self-Driving Cars

This blog introduces the feature store as a new element in automotive machine learning (ML) systems and as a new data science tool and process for building and deploying better machine learning models.

Remco Frijling

Unifying Single-host and Distributed Machine Learning with Maggy

Try out Maggy for hyperparameter optimization or ablation studies now on Hopsworks.ai to access a new way of writing machine learning applications.

Moritz Meister

Manage your own Feature Store on Kubeflow with Hopsworks

Learn how to integrate Kubeflow with Hopsworks and take advantage of its Feature Store and scale-out deep learning capabilities.

Jim Dowling

How to build your own Feature Store

We have many conversations with companies and organizations who are deciding between building their own feature store and buying one. We thought we would share our experience of building one.

Jim Dowling

Hopsworks Feature Store for AWS SageMaker

Integrate AWS SageMaker with Hopsworks to manage, discover and use features for creating training datasets and for serving features to operational models.

Fabio Buso

Hopsworks Feature Store for Databricks

This blog introduces the Hopsworks Feature Store for Databricks, and how it can accelerate and govern your model development and operations on Databricks.

Fabio Buso

ExtremeEarth scales AI to the Earth Observation Community with Hopsworks

Read how ExtremeEarth brings Large-scale AI to the Earth Observation Community with Hopsworks, the Data-intensive AI Platform.

Theofilos Kakantousis

Towards better AI-models in the betting industry with a Feature Store

Introducing the feature store which is a new data science tool for building and deploying better AI models in the gambling and casino business.

Jim Dowling

MLOps with a Feature Store

This blog introduces platforms and methods for continuous integration (CI), delivery (CD), and training (CT) with ML platforms, with details on how to do CI/CD MLOps with a Feature Store.

Fabio Buso

Deep Learning for Anti-Money Laundering with a feature store

Deep learning is now the state-of-the-art technique for identifying financial transactions suspected of money laundering. It delivers fewer false positives with higher accuracy.

Jim Dowling

Guide to File Formats for Machine Learning

This blog is a guide to the popular file formats used in open source frameworks for machine learning in Python, including TensorFlow/Keras, PyTorch, Scikit-Learn, and PySpark.

Jim Dowling

Hello Asynchronous Search for PySpark

Read how Hopsworks supports easy hyperparameter optimization (both synchronous and asynchronous search) and distributed training using PySpark.

Moritz Meister

Deep Learning: Use a Cluster Manager for GPUs

If you are employing a team of Data Scientists for Deep Learning, a cluster manager to share GPUs between your team will maximize utilization of your GPUs.

Jim Dowling

Goodbye Horovod, Hello TensorFlow

Hopsworks is replacing Horovod with Keras/TensorFlow’s new CollectiveAllReduceStrategy, part of the Keras/TensorFlow Estimator framework.

Jim Dowling

What is Distributed File Systems (DFS) and why you need it for Deep Learning

Why HopsFS is a great choice as a distributed file system (DFS) in a time when DFS is becoming increasingly indispensable as a central store for training data, logs, model serving, and checkpoints.

Jim Dowling

NVMe now in HopsFS

Read how HopsFS addresses the issue of storing and accessing datasets as image files stored in a distributed file system (DFS) at large scale.

Jim Dowling

From MLOps to ML Systems with Feature/Training/Inference Pipelines
