Published by Theofilos Kakantousis on

Introducing Hopsworks 0.10

Hopsworks 0.10 brings new features, improvements and bug fixes. It is the biggest release so far, comprising 191 JIRAs, many of them new features. It is also the last release of the 0.x series: Hopsworks is gearing up for its 1.x series, starting with 1.0 at the end of Q3 2019.

Highlights

With HOPSWORKS-954, Hopsworks becomes the first Big Data & AI platform to support AMD’s open source ROCm platform for Deep Learning. Users of a Hopsworks cluster equipped with AMD GPUs can now run their Deep Learning applications (TensorFlow, PyTorch, etc.) on those GPUs from within the familiar environment of a Hopsworks project. You can find more information in the 4th HopsML meetup talk, AMD/ROCm for Hopsworks, and in the Spark Summit talk, ROCm and Distributed Deep Learning on Spark and TensorFlow.

Expanding support for ML model serving, HOPSWORKS-751 adds scikit-learn serving. Models are served by Flask servers that run either as local processes or in Kubernetes, scaling up and down dynamically based on load. Users can manage their scikit-learn servings from within the Model Serving service of their Hopsworks projects. Documentation for this new feature is available in our readthedocs pages.

HOPSWORKS-1089 introduces Maggy, a framework for efficient asynchronous optimization of expensive black-box functions on top of Apache Spark. More information is available at maggy.readthedocs.io, in Maggy’s GitHub repository and in the Maggy talk at the 4th HopsML meetup in Stockholm.
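To give a feel for what "asynchronous optimization of a black-box function" means, here is a minimal sketch in plain Python. It does not use Maggy or Spark, and it is not Maggy's API: the function names, the thread pool and the random-search strategy are all assumptions chosen for illustration. The point it shows is that a worker gets a new trial as soon as it finishes one, instead of waiting for a whole batch to complete.

```python
import random
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def objective(x):
    """Hypothetical stand-in for an expensive black-box function (minimum at x = 2)."""
    return (x - 2.0) ** 2


def async_random_search(fn, low, high, num_trials, num_workers=4, seed=0):
    """Minimize fn over [low, high] with asynchronous random search."""
    rng = random.Random(seed)
    best_x, best_y = None, float("inf")
    submitted = 0
    pending = {}  # maps each running future to the parameter it evaluates

    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # Give every worker an initial trial.
        while submitted < min(num_workers, num_trials):
            x = rng.uniform(low, high)
            pending[pool.submit(fn, x)] = x
            submitted += 1

        while pending:
            # Wake up as soon as any single trial finishes.
            done, _ = wait(pending, return_when=FIRST_COMPLETED)
            for future in done:
                x = pending.pop(future)
                y = future.result()
                if y < best_y:
                    best_x, best_y = x, y
                # Refill the freed worker right away rather than
                # waiting for the remaining trials of a batch.
                if submitted < num_trials:
                    new_x = rng.uniform(low, high)
                    pending[pool.submit(fn, new_x)] = new_x
                    submitted += 1

    return best_x, best_y


best_x, best_y = async_random_search(objective, -10.0, 10.0, num_trials=50)
print(f"best x = {best_x:.3f}, objective = {best_y:.4f}")
```

A real optimizer would replace the uniform sampling with a smarter suggestion strategy and could stop unpromising trials early; the scheduling loop is the part this sketch is meant to illustrate.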

HOPSWORKS-852 integrates Petastorm in Hopsworks and its Feature Store. Petastorm is an open source data access library that enables single-machine or distributed training and evaluation of deep learning models directly from datasets in Apache Parquet format. You can get started with these notebooks in your Hopsworks instance!

Further highlights include:

  • HOPSWORKS-18: Apache Hive and PyHive are now fully supported in Hopsworks. You can start by running this notebook or by following our Apache Hive readthedocs page.
  • HOPSWORKS-923: Availability Zone Awareness in HopsFS. HopsFS will be able to run across data centers within the same cloud region and will be made tolerant of data center failures. More information is available in the slides of our Berlin Buzzwords 2019 talk.
  • HOPSWORKS-1110: There is now a Hopsworks Airflow operator that can create, update, start and stop a model serving instance on localhost or on Kubernetes, depending on the deployment type.
  • HOPSWORKS-966: Hopsworks integration with Apache Sqoop. Users can transfer data from their relational databases into their Hopsworks datalake by using the HopsworksSqoop operation in an Airflow workflow.
  • HOPSWORKS-1019: Hopsworks now supports authenticating with OAuth2, an open standard for access delegation.
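For readers new to OAuth2 (HOPSWORKS-1019 above), the sketch below shows the first leg of the standard authorization code flow from RFC 6749, using only the Python standard library. It is a generic illustration, not Hopsworks configuration: this post does not say which grant type or endpoints Hopsworks uses, so the endpoint URL, client id, redirect URI and scope are hypothetical placeholders.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlsplit


def build_authorization_url(authorize_endpoint, client_id, redirect_uri, scope, state=None):
    """Build the URL the user's browser is sent to, plus the anti-CSRF state value."""
    state = state or secrets.token_urlsafe(16)
    query = urlencode({
        "response_type": "code",  # asks for an authorization code
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        "state": state,
    })
    return f"{authorize_endpoint}?{query}", state


def extract_code(callback_url, expected_state):
    """Validate the provider's redirect and return the authorization code."""
    params = parse_qs(urlsplit(callback_url).query)
    if params.get("state", [None])[0] != expected_state:
        raise ValueError("state mismatch, rejecting callback")
    if "error" in params:
        raise ValueError(f"authorization failed: {params['error'][0]}")
    return params["code"][0]


# All values below are hypothetical placeholders.
url, state = build_authorization_url(
    "https://idp.example.com/authorize",
    client_id="my-client-id",
    redirect_uri="https://app.example.com/callback",
    scope="openid email",
)
print(url)
```

The code returned by `extract_code` would then be exchanged for an access token at the provider's token endpoint, a step that needs the client's credentials and is left out here.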

You can get started with Hopsworks by visiting our getting started guide.

Release Notes

Breaking changes

Users looking to migrate from a previous Hopsworks version should first read this guide, which contains all the steps and configuration required for a smooth upgrade.

New Feature

Improvement

  • [HOPSWORKS-566] – Move Hive scratchdir cleaner from hiveCleaner to Hopsworks
  • [HOPSWORKS-716] – Unify Spark Configuration for Jobs service and Jupyter
  • [HOPSWORKS-759] – selenium integration test
  • [HOPSWORKS-875] – Add PyTorch, Torchvision and Matplotlib to python base environment
  • [HOPSWORKS-897] – Hopsworks-ca should set expiration date not validity in days
  • [HOPSWORKS-911] – [Featurestore] Cache metadata on client
  • [HOPSWORKS-912] – [Featurestore] Support image and avro data types for training datasets
  • [HOPSWORKS-916] – kagent reporting error message for failed Conda commands
  • [HOPSWORKS-926] – Airflow should install dependencies based on the platform_family not platform
  • [HOPSWORKS-942] – [Featurestore] Enable to specify decimal precision
  • [HOPSWORKS-950] – Make Pandas dataframes and numpy arrays easier to read/write from HopsFS
  • [HOPSWORKS-956] – Hops Python library: hops.hdfs.copy_to_local() returns immediately if files are already local and unchanged.
  • [HOPSWORKS-963] – Make tls module in hops-util-py usable from a python notebook
  • [HOPSWORKS-967] – hops artifacts access
  • [HOPSWORKS-991] – Add new window button to experiments
  • [HOPSWORKS-997] – Set envs_dirs and pkgs_dirs in .condarc to handle anaconda version upgrades
  • [HOPSWORKS-999] – AirflowJWTManager recovery
  • [HOPSWORKS-1001] – Experiments service refactor certificates cleanup
  • [HOPSWORKS-1002] – MirroredStrategy support for multi-worker
  • [HOPSWORKS-1006] – Membrane proxy should not rename threads
  • [HOPSWORKS-1008] – Add activated date to user administration
  • [HOPSWORKS-1024] – [featurestore] Optimize metadata for clients
  • [HOPSWORKS-1025] – add Hive Partitioning support for feature groups
  • [HOPSWORKS-1029] – [Featurestore] Revise project roles permissions to feature store
  • [HOPSWORKS-1033] – Redesign service JWTs
  • [HOPSWORKS-1039] – [Featurestore] Make featurestore/training datasets folder more visible
  • [HOPSWORKS-1064] – After HOPS-1215, small files are a storage type
  • [HOPSWORKS-1070] – Refactor HdfsLeDescriptorsFacade to return a random NN for bootstrapping the client
  • [HOPSWORKS-1074] – Bump up airborne dependency
  • [HOPSWORKS-1083] – Python3.6 environment should be created by default
  • [HOPSWORKS-1091] – Add systemd dependencies to unit files to enable clean host restarts
  • [HOPSWORKS-1097] – [Featurestore] Parameterize dataset-dir in tour job
  • [HOPSWORKS-1099] – Get leader NN http address automatically from hdfs_le_descriptors
  • [HOPSWORKS-1100] – [Featurestore] update hops-petastorm to track petastorm 0.7.4
  • [HOPSWORKS-1104] – [featurestore] visualization of feature stats in %%local
  • [HOPSWORKS-1115] – Remove redundant code from Hadoop clients factory
  • [HOPSWORKS-1129] – Use stereotypes instead of alternatives for serving integration
  • [HOPSWORKS-1131] – [featurestore] verify that sparkSQL session has hive enabled
  • [HOPSWORKS-1134] – Add extra check when initializing Hopsworks db
  • [HOPSWORKS-1153] – [sparkmagic] Progress bar for multiple experiments from same app_id
  • [HOPSWORKS-1157] – Increase flyway migrate timeout for hops migrations

Bug

  • [HOPSWORKS-197] – Turning on and off tours is a terrible user experience
  • [HOPSWORKS-715] – Hopsworks singletons are not single
  • [HOPSWORKS-770] – Disable http port and TLS 1.0 on glassfish
  • [HOPSWORKS-849] – kagent – conda can get stuck
  • [HOPSWORKS-903] – Jupyter cleanup and timer bugfixes
  • [HOPSWORKS-904] – Users should be able to specify version of Python library to install when behind PyPi proxy
  • [HOPSWORKS-905] – remote_material_references is not cleaned up during project deletion
  • [HOPSWORKS-913] – Feature store quota missing from admin UI in hopsworks
  • [HOPSWORKS-917] – Non HA admin UI when HA namenodes
  • [HOPSWORKS-920] – Job monitor thread gets stuck if job is killed while running
  • [HOPSWORKS-927] – TensorBoard could not bind to an unsupported address family
  • [HOPSWORKS-928] – [Featurestore] Bug: nested spark schemas are not automatically translatable to Hive
  • [HOPSWORKS-929] – [Featurestore] Don’t allow hyphens in featuregroup names because Hive does not allow them
  • [HOPSWORKS-930] – [Featurestore] bug, in API retrieval code in feature-registry featurestore is undefined
  • [HOPSWORKS-932] – Jobs UI date selector should persist user-selected values
  • [HOPSWORKS-937] – Adjust Cgroup configuration to account for longer Cgroup deletion and make CPU utilization configurable
  • [HOPSWORKS-938] – Adding pyFiles to Jupyter does not add it to python path
  • [HOPSWORKS-941] – Fix dataset access request and related tests
  • [HOPSWORKS-945] – Zipping and unzipping is broken with Hops TLS
  • [HOPSWORKS-948] – [hops-hadoop-chef] Copy hadoop logs utility is not removed from crontab
  • [HOPSWORKS-949] – [airflow-chef] Install recipe does not create root hops installation directory
  • [HOPSWORKS-951] – sqoop and airflow bugfixes
  • [HOPSWORKS-953] – [kagent-chef] kagent falsely assumes service is dead when it restarts
  • [HOPSWORKS-955] – [airflow-chef] Use assigned roles in JWT to determine if a user is admin
  • [HOPSWORKS-958] – AirflowJWTManager randomly does not renew tokens
  • [HOPSWORKS-962] – copy_to_hdfs is broken in hops-util-py
  • [HOPSWORKS-965] – Certificate enddate should always be UTC
  • [HOPSWORKS-968] – Dataset browser pagination and select all not working
  • [HOPSWORKS-970] – upgrade chef ulimit cookbook
  • [HOPSWORKS-972] – Missing sasl/sasl.h breaks PyHopsHive installation
  • [HOPSWORKS-973] – Dela install should create the hops group before adding members to it
  • [HOPSWORKS-974] – Blank padded day in rspec breaks tests
  • [HOPSWORKS-975] – [airflow-chef] Change restart policy of airflow scheduler
  • [HOPSWORKS-976] – Airflow file manager does not refresh secret directory when navigate to another project
  • [HOPSWORKS-977] – Pin tornado version to 5.1.1
  • [HOPSWORKS-978] – dela::install should create /srv/hops/ before creating sub directories
  • [HOPSWORKS-979] – Cannot change default ‘admin’ password of admin@kth.se
  • [HOPSWORKS-980] – HOPSWORKS-950 breaks hops-util-py in python 2.7 environments
  • [HOPSWORKS-983] – [ePipe] Wrong type for long values in query condition handling
  • [HOPSWORKS-987] – User admin too slow and user search returning duplicated results
  • [HOPSWORKS-988] – Launching Jupyter server gradually gets slower and times out
  • [HOPSWORKS-990] – uploading big files from web ui can cause ‘invalidated token’
  • [HOPSWORKS-992] – [kagent-chef] Invalid command in kagent’s service status script
  • [HOPSWORKS-993] – [Featurestore] tf record schema infer for float arrays is not working correctly
  • [HOPSWORKS-994] – TensorBoard in experiments service fails to start if tensorflow-gpu is in the project environment
  • [HOPSWORKS-996] – Jobs monitor threads may saturate the system
  • [HOPSWORKS-1000] – Parameter servers in ParameterServerStrategy do not return Spark task when all workers are finished
  • [HOPSWORKS-1003] – Jupyter convert notebook should inject SparkSession and handle spaces in notebook name
  • [HOPSWORKS-1004] – YarnJobMonitor does not have a full list of Yarn running states
  • [HOPSWORKS-1007] – Install version specific kernel packages, not generic ones
  • [HOPSWORKS-1009] – Non-websocket requests in Jupyter notebook are blocked due to cross-origin policy
  • [HOPSWORKS-1011] – hops-metadata-dal-impl-ndb schema update renames hdfs_inode_attributes to hdfs_directory_with_quota_feature
  • [HOPSWORKS-1012] – Metadata designer does not work
  • [HOPSWORKS-1013] – Decrease websocket_ping_interval to avoid proxied websocket connections being dropped
  • [HOPSWORKS-1014] – ui bugs
  • [HOPSWORKS-1017] – Date in field “feature store created” is NaN
  • [HOPSWORKS-1022] – NOT_START is not a final state for Log Aggregation
  • [HOPSWORKS-1030] – [hops-util-py] copy_to_local bugfixes and removal of localize
  • [HOPSWORKS-1031] – Show enable conda message for PySpark Jobs if conda is disabled
  • [HOPSWORKS-1036] – JobScheduler gets expunged when trying to start an execution for a job while the job is still running
  • [HOPSWORKS-1040] – assign unique ids to growl
  • [HOPSWORKS-1041] – Uploading template on Metadata designer does not work
  • [HOPSWORKS-1042] – Python kernels with capital letters are filtered out by Jupyter Notebook
  • [HOPSWORKS-1043] – addUserToGroup throws exception if user is added twice
  • [HOPSWORKS-1047] – Cannot see datasets if a HiveDb has been shared with the project
  • [HOPSWORKS-1048] – Kagent elastic logs fails to be created for projects with capital letters
  • [HOPSWORKS-1049] – io.hops.util.exceptions.FeaturestoreNotFound logged/thrown during the Kafka tour
  • [HOPSWORKS-1051] – [airflow-chef] Colliding JWT libraries
  • [HOPSWORKS-1056] – [ePipe] wrong key type for inode id
  • [HOPSWORKS-1058] – [ePipe] wrong event ordering during recovery
  • [HOPSWORKS-1060] – tensorflow and torch lib conflicts
  • [HOPSWORKS-1061] – SegFault when importing tensorflow with petastorm in the wrong order
  • [HOPSWORKS-1066] – [hops-hadoop-chef] Recipes should not try to template JWT tokens when Hopsworks is not available
  • [HOPSWORKS-1067] – Hopsworks does not build with cluster profile enabled
  • [HOPSWORKS-1069] – Fix chef attr for hopsworks/https/port
  • [HOPSWORKS-1071] – spark.executor.instances should default to spark.dynamicAllocation.minExecutors for Spark Dynamic configurations
  • [HOPSWORKS-1075] – [kagent-chef] Fail fast if FQDN is longer than 63 characters
  • [HOPSWORKS-1077] – Review overridable Spark configuration properties for SparkJobConfiguration
  • [HOPSWORKS-1086] – Fix nvidia install provider
  • [HOPSWORKS-1087] – Fix nested transaction exception when starting jupyter
  • [HOPSWORKS-1088] – FeaturegroupController doesn’t handle certificates correctly
  • [HOPSWORKS-1090] – TensorBoardController nested transaction exception
  • [HOPSWORKS-1093] – Using hops-util inside a transformation in spark job throws NoClassDefFoundError
  • [HOPSWORKS-1101] – Fix JWT tests after changing default user account
  • [HOPSWORKS-1106] – Conda search versions not ordered
  • [HOPSWORKS-1112] – Fix ssl-server.xml for client machines.
  • [HOPSWORKS-1113] – Remove PIP upgrade in tensorflow::install
  • [HOPSWORKS-1116] – hopsworks_port conf has been removed, use hopsworks_endpoint instead
  • [HOPSWORKS-1121] – Use pip binary in base environment for search
  • [HOPSWORKS-1122] – Invalid JobConfiguration when EXPERIMENT is set as default
  • [HOPSWORKS-1125] – Pip libraries with no release date
  • [HOPSWORKS-1126] – [hops-util-py] avro parsing only works in python 2.7
  • [HOPSWORKS-1127] – [hops-util-py] rest error messages are not parsed correctly
  • [HOPSWORKS-1136] – Off by one error when templating Hopsworks schema upgrade files
  • [HOPSWORKS-1138] – Transaction batchsizes should not be hardcoded in hdfs-site.xml
  • [HOPSWORKS-1147] – Search results for hive datasets are not clickable
  • [HOPSWORKS-1148] – Job duration increasing forever
  • [HOPSWORKS-1149] – CA module should not have its own persistence.xml file
  • [HOPSWORKS-1150] – Users should not be allowed to start jobs with no yarn quota.
  • [HOPSWORKS-1165] – JAXB Date serializer for JWTResource should include TZ info
  • [HOPSWORKS-1168] – GPU warning being shown in Spark Static and Spark Dynamic views
  • [HOPSWORKS-1169] – [hops-util-py] CollectiveAllReduceStrategy and MirroredStrategy should have a ‘chief’ executor in cluster spec
  • [HOPSWORKS-1173] – Expose tez and slider user in hive-chef’s metadata.rb

Sub-task

Task
