Enterprise Hops

The Data Science Platform for Hadoop

Multi-Tenancy with Hopsworks

Hopsworks is a platform, with both a UI and a REST API, for privacy-by-design Data Science on Hops Hadoop. Uniquely among Hadoop platforms, even sensitive data can be stored and processed in the Data Lake.

Batch, Streaming, SQL

Apache Spark support for batch analytics, SparkSQL/Parquet, Spark Streaming, GraphX.

TensorFlow/Keras

Train and deploy your models on clusters of GPUs with TensorFlow/Keras/PyTorch, and debug them with TensorBoard. One-click deployment of models to TensorFlow Serving.

Python-First

The only Hadoop stack with full Conda and Pip support. Hopsworks Projects have their own Conda environments in the data lake, so Data Scientists can choose their own libraries.

Notebooks

Jupyter and Zeppelin Notebooks. Jupyter supports Python, Hive, and Sparkmagic kernels, for TensorFlow/Python/PySpark/Scala/Hive.

Apache Hive LLAP

Petabyte-scale data warehousing with Apache Hive LLAP. Zeppelin Interpreter support for interactive analytics and visualizations. UI-driven starting and stopping of LLAP clusters.

Elastic/Logstash/Kibana

The ELK stack is integrated with Spark/TensorFlow applications for real-time logging, visualizations, and search.

InfluxDB/Telegraf/Grafana

Spark applications and Hops services are monitored, and the monitoring data is stored in the time-series database InfluxDB. Time-series data is graphed with Grafana.

TLS Security

Hops is the only Hadoop distribution with a TLS certificate-based security model. Certificate management is more scalable than Kerberos' KDC; certificates also make it easier to integrate external systems and devices, and they enable the multi-tenancy features in Hopsworks.

Why Hops?

The award-winning platform
Adrian Colyer

The Morning Paper

“If you’re working with big data and Hadoop, this could repay your investment ... many times over ... Think of the capital and operational costs of having to stand up a second large-scale Hadoop cluster because your existing one is capacity or throughput limited. HopsFS is a huge win if it eliminates your need to do that.”

IEEE Scale Prize Winner

HopsFS won the IEEE Scale Challenge for 2017. HopsFS is the world's most scalable hierarchical distributed filesystem.

Winner of AI Startup Battle at PAPIs 2018

Logical Clocks won the heavily contested AI Startup Battle at the 2018 PAPIs conference in London. Logical Clocks' scale-out platform for AI beat the competition, with four finalists competing on stage.

Blog

Big Data and Machine Learning
Hadoop

M is for Metadata

On the Importance of Metadata for a Usable Security Model in Hadoop: why Hadoop’s security model is wrong, and how to make an understandable and usable security model for Hadoop using Projects and TLS. tl;dr Hops…

Hops

Introducing Hops Hadoop

tl;dr Hops is a new distribution of Apache Hadoop that uncorks Hadoop’s metadata bottleneck by introducing a scale-out distributed metadata layer. Hops can be scaled out at runtime by adding new nodes…

Offices

Stockholm

Box 1263, Isafjordsgatan 22
Stockholm
Sweden SE-16429

Silicon Valley

470 Ramona St
Palo Alto
California 94301

Contact Us