Hops Enterprise Scale-Out Deep Learning
Hops supports GPUs as managed resources and enables Data Scientists to easily parallelize experiments (such as hyperparameter optimization) and run distributed training. Once a satisfactory model has been trained, only a few clicks are needed to roll it out to production by deploying it on TensorFlow Serving. Client applications can then send inference requests to the deployed model over gRPC.
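The parallel-experiments idea can be illustrated with a minimal, framework-agnostic sketch in plain Python (this is not the Hops API; the train function and the parameter grid are hypothetical stand-ins for a real training job):

```python
from concurrent.futures import ProcessPoolExecutor
from itertools import product

def train(params):
    """Hypothetical training function: returns a validation score for one
    hyperparameter combination. A real experiment would train a model here,
    typically on a GPU allocated by the resource manager."""
    lr, batch_size = params
    # Toy scoring formula purely for illustration.
    return 1.0 / (lr * 100 + 1) + batch_size / 1000

def grid_search(learning_rates, batch_sizes):
    grid = list(product(learning_rates, batch_sizes))
    # Each combination is an independent experiment, so all of them can run
    # in parallel worker processes (on Hopsworks, on cluster executors).
    with ProcessPoolExecutor() as pool:
        scores = list(pool.map(train, grid))
    # Return the best-scoring combination.
    return max(zip(grid, scores), key=lambda gs: gs[1])

if __name__ == "__main__":
    best_params, best_score = grid_search([0.1, 0.01], [32, 64])
    print(best_params, best_score)
```

The point of the sketch is that hyperparameter trials are embarrassingly parallel: no trial depends on another, so a platform can fan them out across GPUs and collect the scores.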
Systems Support for Scale-Out Deep Learning
The Hopsworks platform is richer than the simplified model above suggests. We need to close the loop by collecting data from client applications to build a continuous learning platform. We need data management, both for client application feedback (typically ingested through Kafka) and for the long-term storage of training data and experiments/programs.
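The feedback path that closes the loop can be sketched as a message schema. All field names here are hypothetical; in Hopsworks such records would be serialized and produced to a Kafka topic rather than built by hand:

```python
import json
import time

def make_feedback_record(model, version, request_id, prediction, outcome):
    """Build a hypothetical inference-feedback record for Kafka ingestion.

    Pairing the model's prediction with the outcome observed later is what
    lets the platform assemble new labeled training data over time.
    """
    return {
        "model": model,             # name of the served model
        "version": version,         # model version that made the prediction
        "request_id": request_id,   # correlates with the original gRPC request
        "prediction": prediction,   # what the model answered
        "outcome": outcome,         # ground truth observed afterwards
        "timestamp": int(time.time()),
    }

record = make_feedback_record("mnist", 3, "req-42", prediction=7, outcome=7)
payload = json.dumps(record).encode("utf-8")  # bytes ready for a Kafka producer
```

Stored long-term, these records become the raw material for retraining: the continuous learning loop is simply inference, feedback, ingestion, and retraining repeated.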