Paddy Power Betfair is the main brand of the Flutter Entertainment Group, the world’s largest operator of sports betting, poker, and casino across online, mobile, and retail channels.
Data warehouses are not feature stores and do not provide ready-made features that can be used directly for serving or training. We helped Paddy Power Betfair integrate Redshift with the Hopsworks Feature Store, enabling data engineers and data scientists to write feature engineering pipelines in Python and PySpark instead of SQL.
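To make this concrete, a feature engineering pipeline of this kind can be written in a few lines of Python. The sketch below uses pandas with entirely hypothetical table and column names; a production pipeline at this scale would more likely use PySpark reading from Redshift over JDBC, but the shape of the code is the same.

```python
import pandas as pd

# Hypothetical raw betting data, as it might be read from Redshift
# (in production: a PySpark JDBC read against the warehouse).
bets = pd.DataFrame({
    "customer_id": [1, 1, 2, 2, 2],
    "stake":       [10.0, 20.0, 5.0, 5.0, 40.0],
    "won":         [True, False, False, True, True],
})

# Feature engineering in Python instead of SQL: aggregate per-customer
# betting behaviour into reusable features for the feature store.
features = bets.groupby("customer_id").agg(
    total_stake=("stake", "sum"),
    avg_stake=("stake", "mean"),
    win_rate=("won", "mean"),
).reset_index()

print(features)
```

The resulting DataFrame would then be written to a feature group in the feature store, where it becomes discoverable and reusable by other teams.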
Paddy Power Betfair determines betting prices with the help of predictions generated from machine learning models.
Improved Feature Quality
Improved models that generate more revenue.
Faster Feature Engineering
Access to statistics and metadata decreases the time needed to generate training datasets.
Exploratory Data Analysis
Discover pre-computed features, their types, descriptive statistics, and the distribution of feature values.
Previously engineered and quality-assured features become available for reuse, ready for training.
We integrated the Hopsworks Feature Store as a repository of features ready to be used for training models with the existing AWS SageMaker architecture. Data scientists and analysts can now browse available features, inspect their metadata, investigate pre-computed statistics, and preview sample feature values.
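The kind of inspection described above can be sketched in a few lines. The example below computes descriptive statistics over a small, made-up sample of feature values (all names are hypothetical); the feature store surfaces equivalent statistics per feature group without the user writing this code.

```python
import pandas as pd

# Hypothetical sample of a feature group's values, as a data scientist
# might preview it before building a training dataset.
odds_features = pd.DataFrame({
    "avg_odds": [1.5, 2.0, 3.2, 1.8, 2.4],
    "num_bets": [120, 45, 10, 300, 75],
})

# Descriptive statistics of the kind surfaced in the feature store UI:
# count, mean, std, min, quartiles, and max for each feature.
stats = odds_features.describe()
print(stats)
```

Browsing these statistics up front catches problems (skewed distributions, missing values, implausible ranges) before any training time is spent.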
The Hopsworks Feature Store not only helps data scientists become more productive but also improves feature quality, which results in models that generate higher revenues. In addition, the Feature Store provides users with statistics and metadata, enabling faster development of new betting models for new types of bets.
Hopsworks platform’s key capabilities used:
Live betting is the process of placing a bet after a sporting or racing event has started. We help organisations to reduce operational costs, improve accuracy of prediction models, and increase revenue by implementing the Hopsworks Feature Store to serve low latency features.
Similar to many AI-backed Internet services, companies offering live betting can benefit from online models generating predictions that can be used to help determine live odds in races and sports betting.
Online models require many input features to make accurate predictions, including low latency access to features that are computed from historical data. These types of features are often too complex to compute inside the online applications themselves and are impossible to reuse if they are embedded in applications.
If you implement feature computation inside the online application, you must then keep the online feature implementation (in the app) consistent with the implementation used to generate the train/test data for the model (the training data pipeline).
Consolidated Feature Engineering Pipelines
Feature engineering code is not duplicated in applications; instead, a single pipeline computes features for both serving and training.
Faster Models to Production
Data scientists can concentrate on improving models, rather than on complex infrastructure for keeping training and serving pipelines in sync.
An online feature store can reuse features across training and serving, and provide low-latency access to features by online applications.
The Hopsworks Online Feature Store acts as an enterprise-wide marketplace for different teams with different remits. It serves pre-computed features to operational models used by online applications in single-digit milliseconds, using either Python/Scala/Java clients or language-independent JDBC.
It also enables features to be reused across different predictive betting models, as well as the creation of use-case-specific ML features.
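To make the serving path concrete, the sketch below emulates a low-latency online lookup: precomputed features are fetched by primary key and assembled into the vector the model expects. A plain dict stands in for the online store, and all names are hypothetical, not the Hopsworks client API.

```python
# A dict stands in for the online feature store: precomputed features,
# keyed by the entity's primary key (here, a hypothetical match id).
online_store = {
    "match_42": {"home_win_rate": 0.61, "avg_goals": 2.7},
    "match_43": {"home_win_rate": 0.48, "avg_goals": 3.1},
}

# Feature order must match what the model saw at training time.
FEATURE_ORDER = ["home_win_rate", "avg_goals"]

def get_serving_vector(match_id: str) -> list:
    """Fetch the precomputed feature vector for one prediction request."""
    row = online_store[match_id]
    return [row[name] for name in FEATURE_ORDER]

# The online model receives the same features that training used.
vector = get_serving_vector("match_42")
print(vector)
```

In production the lookup hits a low-latency key-value store rather than an in-process dict, but the contract is the same: a primary key in, a consistent feature vector out.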
Hopsworks’ key capabilities for developing and operating online models:
Hopsworks is the world’s first horizontally scalable data platform for machine learning to provide a feature store. It aids in the cleaning of data and preparation of features, and it makes features reusable by other teams.
The Hopsworks Feature Store acts as an effective API between team members working on data engineering (pulling data from backend data warehouses and data lakes) and those working on data science (model building, training, and evaluation).
Security by design: Data scientists can be given sandboxed access to sensitive data, complying with GDPR and stronger security requirements.
Scale-out deep learning: Distributed Deep Learning over 10s or 100s of GPUs for parallel experiments and distributed training.
Provenance support for ML pipelines: Enables fully reproducible models, easier debugging, and comprehensive data governance for pipelines.
Integration with third-party platforms: Seamless integration with data science platforms such as AWS SageMaker, Databricks, and Kubeflow. Hopsworks also integrates with data lakes such as S3, Hadoop, and Delta Lake, and supports single sign-on for ActiveDirectory, LDAP, and OAuth2.