Logical Clocks was founded by researchers, and we remain committed to research in our field. Our areas include large-scale distributed computer systems and artificial intelligence. Here, we publish some of the research papers authored by our founders and staff.
Time Travel and Provenance for Machine Learning Pipelines
Nov 19, 2020
An implicit model for provenance can be used alongside a feature store with versioned data to build reproducible ML pipelines that are easier to debug. We provide development tools and visualization support that help developers navigate and re-run pipelines.
Alexandru A. Ormenisan, Moritz Meister, Fabio Buso, Robin Andersson, Seif Haridi, Jim Dowling
Maggy: Scalable Asynchronous Parallel Hyperparameter Search
Nov 19, 2020
Maggy is an extension to Spark’s synchronous processing model to allow it to run asynchronous ML trials, enabling end-to-end state-of-the-art ML pipelines to be run fully on Spark. Maggy provides programming support for defining, optimizing, and running parallel ML trials.
Moritz Meister, Sina Sheikholeslami, Amir H. Payberah, Vladimir Vlassov, Jim Dowling
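As a rough illustration of what running trials asynchronously means, the sketch below starts a new trial as soon as any worker becomes free, instead of waiting for a whole batch of trials to finish. It is plain Python on a thread pool and does not use Maggy's or Spark's API; `run_trial`, `async_search`, and the toy loss are hypothetical names chosen for this example.

```python
from concurrent.futures import ThreadPoolExecutor, FIRST_COMPLETED, wait


def run_trial(lr):
    """Hypothetical trial: returns a toy loss for a given learning rate."""
    return (lr - 0.1) ** 2


def async_search(candidates, num_workers=3):
    """Run trials without a stage barrier: refill a worker as soon as it is free."""
    pending = list(candidates)
    results = {}
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        running = {}
        # Fill every worker with an initial trial.
        while pending and len(running) < num_workers:
            lr = pending.pop(0)
            running[pool.submit(run_trial, lr)] = lr
        # Whenever a trial finishes, record it and schedule the next one at once.
        while running:
            done, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for future in done:
                lr = running.pop(future)
                results[lr] = future.result()
                if pending:
                    nxt = pending.pop(0)
                    running[pool.submit(run_trial, nxt)] = nxt
    best = min(results, key=results.get)
    return best, results


if __name__ == "__main__":
    best, results = async_search([0.01, 0.05, 0.1, 0.5])
    print("best learning rate:", best)
```

In a synchronous model, every trial in a batch would have to finish before the next batch starts, so fast trials leave workers idle while waiting for slow ones.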
HopsFS-S3: Extending Object Stores with POSIX-like Semantics and more (industry track)
Nov 19, 2020
HopsFS-S3 is a hybrid cloud-native distributed hierarchical file system that is available across availability zones, has the same cost as S3, but has 100X the performance of S3 for file move/rename operations, and 3.4X the read throughput of S3 (EMRFS) for the DFSIO Benchmark.
Mahmoud Ismail, Salman Niazi, Gautier Berthou, Mikael Ronström, Seif Haridi, Jim Dowling
Distributed Hierarchical File Systems strike back in the Cloud
Nov 19, 2020
HopsFS-CL is a highly available distributed hierarchical file system with native support for AZ awareness using synchronous replication protocols.
Mahmoud Ismail, Salman Niazi, Mauritz Sundell, Mikael Ronström, Seif Haridi, Jim Dowling
Implicit Provenance for Machine Learning Artifacts
Feb 24, 2020
Implicit provenance lets us capture full lineage for ML programs by instrumenting only the distributed file system and APIs, with no changes to the ML code.
Alexandru A. Ormenisan, Mahmoud Ismail, Seif Haridi, Jim Dowling
Towards Distribution Transparency for Supervised ML With Oblivious Training Functions
Feb 24, 2020
The distribution-oblivious training function allows ML developers to reuse the same training function whether running in a single-host Jupyter notebook or performing scale-out hyperparameter search and distributed training on clusters.
Moritz Meister, Sina Sheikholeslami, Robin Andersson, Alexandru A. Ormenisan, Jim Dowling
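The idea of reusing one training function across execution contexts can be sketched as follows. This is a minimal illustration in plain Python, not the paper's API: `train`, `run_single_host`, and `run_search` are hypothetical names, and the "cluster" driver is simulated with a local loop.

```python
def train(lr, batch_size, reporter=None):
    """Hypothetical training function that does not know where it runs."""
    # Toy loss standing in for a real training loop.
    loss = (lr - 0.01) ** 2 + 1.0 / batch_size
    if reporter is not None:
        reporter(loss)  # optional hook a driver could use to collect metrics
    return loss


def run_single_host(train_fn, **params):
    """Single-host context (for example a notebook): call the function directly."""
    return train_fn(**params)


def run_search(train_fn, grid):
    """Stand-in for a scale-out driver: on a cluster each call would go to a worker."""
    results = [(params, train_fn(**params)) for params in grid]
    return min(results, key=lambda item: item[1])


if __name__ == "__main__":
    print(run_single_host(train, lr=0.01, batch_size=32))
    grid = [{"lr": 0.1, "batch_size": 32}, {"lr": 0.01, "batch_size": 64}]
    print(run_search(train, grid))
```

The point of the sketch is that `train` itself is unchanged between the two drivers; only the code that invokes it differs.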
Scalable Block Reporting for HopsFS - Best Student Paper award at IEEE BigDataCongress’19
Jul 5, 2019
A new version of the block reporting protocol for HopsFS that uses as little as 1/1000th of the resources of HDFS' block reporting protocol. IEEE BigDataCongress’19.
Mahmoud Ismail, August Bonds, Salman Niazi, Seif Haridi, Jim Dowling.
ePipe: Near Real-Time Polyglot Persistence of HopsFS Metadata
May 22, 2019
Change Data Capture paper for HopsFS (ePipe). CCGRID’19.
Mahmoud Ismail, Mikael Ronström, Seif Haridi, Jim Dowling.
Horizontally Scalable ML Pipelines with a Feature Store
Mar 20, 2019
Paper describing a demo of the Hopsworks ML pipeline given at SysML 2019.
Alexandru A. Ormenisan, Mahmoud Ismail, Kim Hammar, Robin Andersson, Ermias Gebremeskel, Theofilos Kakantousis, Antonios Kouzoupis, Fabio Buso, Gautier Berthou, Jim Dowling, Seif Haridi.
Size Matters: Improving the Performance of Small Files in Hadoop
Dec 18, 2018
Describes how HopsFS stores small files in its metadata layer, on NVMe disks. Middleware 2018.
Salman Niazi, Seif Haridi, Mikael Ronström, Jim Dowling.
Scaling HDFS to more than 1 million operations per second with HopsFS
May 24, 2017
IEEE Scale Prize-winning submission, May 2017. Focuses on database optimizations in HopsFS' metadata layer.
Salman Niazi, Mahmoud Ismail, Mikael Ronström, Seif Haridi, Jim Dowling.
HopsFS: Scaling Hierarchical File System Metadata Using NewSQL Databases
Feb 7, 2017
First main paper on HopsFS at USENIX FAST 2017.
Salman Niazi, Mahmoud Ismail, Mikael Ronström, Steffen Grohsschmiedt, Seif Haridi, Jim Dowling.
Leader Election Using NewSQL Database Systems
Jun 18, 2015
HopsFS' leader election protocol that uses NDB as a backend. DAIS 2015: 158-172.
Salman Niazi, Mahmoud Ismail, Gautier Berthou, Jim Dowling.
Logical Clocks AB are the makers of Hopsworks, a data-intensive AI platform with a Feature Store.
Copyright © 2021 Logical Clocks AB. All Rights Reserved. Apache, Hadoop, Sqoop, Kafka, Hive, Spark, Ranger, ZooKeeper, Zeppelin, Slider, MapReduce, HDFS, YARN, Databricks, SageMaker and the Hadoop elephant and Apache project logos are either registered trademarks or trademarks of the Apache Software Foundation in the United States or other countries.