
Finance & Banking

Leverage Deep Learning to improve anomaly detection rates and reduce costs associated with financial crime with Hopsworks.

Semi-supervised deep learning to identify fraudulent transactions

Swedbank AB is the largest financial centre in Scandinavia offering retail banking, asset management, and other financial services for 7 million private customers and 546,000 companies.

Leading financial institutions are now investing in ML to detect and prevent fraud. We helped Swedbank introduce semi-supervised deep learning and Generative Adversarial Networks (GANs) on the Hopsworks platform, reducing the costs associated with false-positive fraud alerts.

THE PROMISE OF DEEP LEARNING FOR FIGHTING FRAUD

Challenge: Increase the detection rate and reduce the costs of transactions associated with financial crime.

Most banks employ rule-based systems, consisting of thousands of rules, that generate alerts when transactions are suspected of being fraudulent. However, fraud schemes are complex and continually evolve, so existing rule-based pattern-matching approaches fail to identify and stop financial crime, despite the best efforts of regulators and banks.

Rule-based systems generate a large number of false-positive alerts (alerts where the transaction did not involve fraud) that are very time-consuming and expensive to process. Rule-based systems also miss many fraudulent transactions (false-negative results), as fraudsters quickly learn new strategies that circumvent the slowly changing set of rules. Missing fraudulent transactions is very serious as it can lead to significant financial penalties and loss of reputation.

Key Results

Reduce the costs of alerts

Effective deep anomaly detection model that reduced the number of false-positives by 99% compared to a rule-based engine.

Faster Feature Engineering

Faster data preparation and exploration, and efficient feature pipelines that produced over 40 TB of feature data.

Faster Model Training

Distributed, multi-GPU training and parallel hyperparameter tuning.

Solution: Semi-supervised Deep Learning with Generative Adversarial Networks on Hopsworks

In Swedbank’s case, the rule-based system generates up to 99 false-positives for every 100 alerts. In pre-production evaluation, with Hopsworks’ semi-supervised deep learning model, Swedbank was able to reduce this to only 1 false-positive for every 2 alerts.
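To make the two ratios above concrete, the short sketch below converts them into alert precision (the share of alerts that are genuine) and into the average number of alerts an investigator reviews per genuine case. The function names are illustrative only, and the rule-based figure uses the worst case quoted above ("up to 99").

```python
from fractions import Fraction


def alert_precision(false_positives: int, alerts: int) -> Fraction:
    """Return the share of alerts that point at genuine fraud."""
    if alerts <= 0 or not 0 <= false_positives <= alerts:
        raise ValueError("need 0 <= false_positives <= alerts and alerts > 0")
    return Fraction(alerts - false_positives, alerts)


def alerts_per_genuine_case(precision: Fraction) -> Fraction:
    """Return how many alerts are reviewed, on average, per genuine case."""
    if precision == 0:
        raise ValueError("no genuine cases among the alerts")
    return 1 / precision


# Figures quoted in the text: 99 of 100 alerts false vs. 1 of 2 alerts false.
rule_based = alert_precision(false_positives=99, alerts=100)
deep_model = alert_precision(false_positives=1, alerts=2)

print(f"rule-based precision: {float(rule_based):.0%}")  # 1%
print(f"deep model precision: {float(deep_model):.0%}")  # 50%
print(alerts_per_genuine_case(rule_based))  # 100
print(alerts_per_genuine_case(deep_model))  # 2
```

On these figures, an investigator would review roughly 100 alerts per genuine case under the rule-based system and 2 under the model.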

The key insight of deep learning for fraud detection is that deep neural networks (DNNs) can generalize from training data to identify patterns in transactions that are indicative of fraud. Having been shown patterns from real fraud cases, a DNN can recognize modified variants that would slip past the static rules but remain close enough to the original pattern to be caught. This makes it harder for fraudsters to avoid detection: they can no longer make small adjustments to how they launder money to get around a relatively static set of rules.
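The difference between a static rule and a model that generalizes from examples can be shown with a toy sketch. This is not Swedbank's model and not a DNN or GAN: a one-nearest-neighbour classifier stands in for the learned model, and the threshold, the two features (amount, transfers per day), and every transaction below are invented purely for illustration.

```python
import math

# Hypothetical static rule: alert on any transfer at or above this amount.
THRESHOLD = 10_000


def rule_alert(tx: tuple[float, float]) -> bool:
    """Static rule: flag a transaction only if its amount reaches the threshold."""
    amount, _transfers_per_day = tx
    return amount >= THRESHOLD


# Invented labelled examples: ((amount, transfers_per_day), is_fraud).
TRAINING = [
    ((50, 1), 0),
    ((200, 2), 0),
    ((1_200, 1), 0),
    ((12_000, 1), 1),
    ((15_000, 2), 1),
]


def scale(tx: tuple[float, float]) -> tuple[float, float]:
    """Bring both features to a comparable range before measuring distance."""
    amount, transfers_per_day = tx
    return (amount / 10_000, transfers_per_day / 5)


def nearest_neighbour_alert(tx: tuple[float, float]) -> bool:
    """Flag a transaction if its closest labelled example is a fraud case."""
    point = scale(tx)
    _, label = min(TRAINING, key=lambda ex: math.dist(point, scale(ex[0])))
    return label == 1


evasive = (9_500, 1)   # amount adjusted to sit just under the rule's threshold
ordinary = (300, 2)

print(rule_alert(evasive), nearest_neighbour_alert(evasive))    # False True
print(rule_alert(ordinary), nearest_neighbour_alert(ordinary))  # False False
```

The adjusted transaction passes the rule but is still flagged because it sits close to known fraud examples, which is the behaviour the paragraph above attributes to DNNs at much larger scale.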


The key Hopsworks capabilities we used were:

  • Spark for feature engineering;
  • Hopsworks Feature Store to manage and re-use features;
  • TensorFlow/GPUs to train a GAN as a binary classifier, with distributed hyperparameter tuning and distributed training support in Hopsworks;
  • Hopsworks model registry to manage models;
  • Integration with existing security infrastructure (Active Directory) with single sign-on for users;
  • Integration with a Hadoop Data Lake.

Scaling ML with the Hopsworks Feature Store

Hopsworks is the world’s first horizontally scalable data platform for machine learning to provide a feature store. It helps with cleaning data and preparing features, and it makes features reusable by other teams.

The Hopsworks Feature Store acts as an effective API between team members working on data engineering (pulling data from backend data warehouses and data lakes) and those working on data science (model building, training, and evaluation).


Security by design: Data scientists can be given sandboxed access to sensitive data, complying with GDPR and stronger security requirements.


Scale-out deep learning: Distributed deep learning over tens or hundreds of GPUs for parallel experiments and distributed training.


Provenance support for ML pipelines: Enables fully reproducible models, easier debugging, and comprehensive data governance for pipelines.


Integration with third-party platforms: Seamless integration with data science platforms such as AWS SageMaker, Databricks, and Kubeflow. Hopsworks also integrates with data lakes, such as S3, Hadoop, and Delta Lake, and supports single sign-on for Active Directory, LDAP, and OAuth2.


Hopsworks at a glance

Efficiency & Performance

Feature Store: Data warehouse for ML
Distributed Deep Learning: Faster with more GPUs
HopsFS: NVMe speed with big data
Horizontally Scalable: Ingestion, data prep, training, serving

Development & Operations

Notebooks for Development: First-class Python support
Version Everything: Code, infrastructure, data
Model Serving on Kubernetes: TF Serving, MLeap, SkLearn
End-to-End ML Pipelines: Orchestrated by Airflow

Governance & Compliance

Secure Multi-tenancy: Project-based restricted access
Encryption At-Rest, In-Motion: TLS/SSL everywhere
AI-Asset Governance: Models, experiments, data, GPUs
Data/Model/Feature Lineage: Discover/track dependencies
