Paddy Power Betfair is the main brand of the Flutter Entertainment Group, the world’s largest operator of sports betting, poker, and casino across online, mobile, and retail channels.
Data warehouses are not feature stores and do not provide ready-made features that can be used directly for serving or training. We helped Paddy Power Betfair integrate Redshift with the Hopsworks Feature Store, enabling data engineers and data scientists to write feature engineering pipelines in Python and PySpark instead of SQL.
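To make this concrete, a feature engineering pipeline of this kind can be written in a few lines of Python. The sketch below uses pandas with entirely hypothetical table and column names; a production pipeline at this scale would more likely use PySpark reading from Redshift over JDBC, but the shape of the code is the same.

```python
import pandas as pd

# Hypothetical raw betting data, as it might be read from Redshift
# (in production: a PySpark JDBC read against the warehouse).
bets = pd.DataFrame({
    "customer_id": [1, 1, 2, 2, 2],
    "stake":       [10.0, 20.0, 5.0, 5.0, 40.0],
    "won":         [True, False, False, True, True],
})

# Feature engineering in Python instead of SQL: aggregate per-customer
# betting behaviour into reusable features for the feature store.
features = bets.groupby("customer_id").agg(
    total_stake=("stake", "sum"),
    avg_stake=("stake", "mean"),
    win_rate=("won", "mean"),
).reset_index()

print(features)
```

The resulting DataFrame would then be written to a feature group in the feature store, where it becomes discoverable and reusable by other teams.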
Paddy Power Betfair determines betting prices with the help of predictions generated from machine learning models.
Improved Feature Quality
Improved models that generate more revenue.
Faster Feature Engineering
Access to statistics and metadata decreases the time needed to generate training datasets.
Exploratory Data Analysis
Discover pre-computed features, their types, descriptive statistics, and the distribution of feature values.
Previously engineered and quality-assured features become available for reuse, ready for training.
We integrated the Hopsworks Feature Store as a repository of features ready to be used for training models with the existing AWS SageMaker architecture. Data scientists and analysts can now browse available features, inspect their metadata, investigate pre-computed statistics, and preview sample feature values.
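The kind of inspection described above can be sketched in a few lines. The example below computes descriptive statistics over a small, made-up sample of feature values (all names are hypothetical); the feature store surfaces equivalent statistics per feature group without the user writing this code.

```python
import pandas as pd

# Hypothetical sample of a feature group's values, as a data scientist
# might preview it before building a training dataset.
odds_features = pd.DataFrame({
    "avg_odds": [1.5, 2.0, 3.2, 1.8, 2.4],
    "num_bets": [120, 45, 10, 300, 75],
})

# Descriptive statistics of the kind surfaced in the feature store UI:
# count, mean, std, min, quartiles, and max for each feature.
stats = odds_features.describe()
print(stats)
```

Browsing these statistics up front catches problems (skewed distributions, missing values, implausible ranges) before any training time is spent.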
The Hopsworks Feature Store not only helps data scientists become more productive but also improves feature quality, which results in models that generate higher revenues. In addition, the Feature Store provides users with statistics and metadata, enabling faster development of new betting models for new types of bets.
Hopsworks platform’s key capabilities used:
Live betting is the process of placing a bet after a sporting or racing event has started. We help organisations to reduce operational costs, improve accuracy of prediction models, and increase revenue by implementing the Hopsworks Feature Store to serve low latency features.
Similar to many AI-backed Internet services, companies offering live betting can benefit from online models generating predictions that can be used to help determine live odds in races and sports betting.
Online models require many input features to make accurate predictions, including low latency access to features that are computed from historical data. These types of features are often too complex to compute inside the online applications themselves and are impossible to reuse if they are embedded in applications.
If you implement feature computation inside the online application, you must then keep the online feature implementation (in the app) consistent with the implementation used to generate the train/test data for the model (the training data pipeline).
Consolidated Feature Engineering Pipelines
Feature engineering code is not duplicated in applications; instead, a single pipeline computes features for both serving and training.
Faster Models to Production
Data scientists can concentrate on improving models, rather than on complex infrastructure for keeping training and serving pipelines in sync.
An online feature store can reuse features across training and serving, and provide low-latency access to features by online applications.
The Hopsworks Online Feature Store acts as an enterprise-wide marketplace for different teams with different remits. It serves pre-computed features to operational models used by online applications in single-digit milliseconds, using either Python/Scala/Java clients or language-independent JDBC.
It also enables features to be reused across different predictive betting models, as well as the creation of use-case-specific ML features.
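To make the serving path concrete, the sketch below emulates a low-latency online lookup: precomputed features are fetched by primary key and assembled into the vector the model expects. A plain dict stands in for the online store, and all names are hypothetical, not the Hopsworks client API.

```python
# A dict stands in for the online feature store: precomputed features,
# keyed by the entity's primary key (here, a hypothetical match id).
online_store = {
    "match_42": {"home_win_rate": 0.61, "avg_goals": 2.7},
    "match_43": {"home_win_rate": 0.48, "avg_goals": 3.1},
}

# Feature order must match what the model saw at training time.
FEATURE_ORDER = ["home_win_rate", "avg_goals"]

def get_serving_vector(match_id: str) -> list:
    """Fetch the precomputed feature vector for one prediction request."""
    row = online_store[match_id]
    return [row[name] for name in FEATURE_ORDER]

# The online model receives the same features that training used.
vector = get_serving_vector("match_42")
print(vector)
```

In production the lookup hits a low-latency key-value store rather than an in-process dict, but the contract is the same: a primary key in, a consistent feature vector out.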
Hopsworks’ key capabilities for developing and operating online models:
Hopsworks is the world’s first horizontally scalable data platform for machine learning to provide a feature store. It aids in the cleaning of data and preparation of features, and it makes features reusable by other teams.
The Hopsworks Feature Store acts as an effective API between team members working on data engineering (pulling data from backend data warehouses and data lakes) and those working on data science (model building, training, and evaluation).
Security by design: Data scientists can be given sandboxed access to sensitive data, complying with GDPR and stronger security requirements.
Scale-out deep learning: Distributed Deep Learning over 10s or 100s of GPUs for parallel experiments and distributed training.
Provenance support for ML pipelines: Enables fully reproducible models, easier debugging, and comprehensive data governance for pipelines.
Integration with third-party platforms: Seamless integration with data science platforms such as AWS SageMaker, Databricks, and Kubeflow. Hopsworks also integrates with data lakes such as S3, Hadoop, and Delta Lake, and supports single sign-on for ActiveDirectory, LDAP, and OAuth2.