SQL on Hops

SQL-on-Hadoop enables the efficient querying of potentially petabytes of data. Hops supports both Apache Hive (LLAP) and Spark SQL, and both can store their backing data in HopsFS. Hive also stores its metadata in the same database as HopsFS. Because the two share a database, transactions and foreign keys keep Hive's metadata consistent with HopsFS' metadata and ensure the integrity of both.
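
The consistency mechanism described above can be sketched in miniature. The example below is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for the shared metadata database, and the table names (fs_inodes, hive_tables) and helper functions are hypothetical, not the actual HopsFS or Hive schemas. It shows the general idea: a table's metadata row holds a foreign key to the file-system row for its directory, and both are written in one transaction.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database holding both kinds of metadata.

    Hypothetical schema: fs_inodes stands in for file-system metadata,
    hive_tables for SQL table metadata.
    """
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE fs_inodes (
            id   INTEGER PRIMARY KEY,
            path TEXT NOT NULL UNIQUE
        );
        CREATE TABLE hive_tables (
            name     TEXT PRIMARY KEY,
            inode_id INTEGER NOT NULL
                     REFERENCES fs_inodes(id) ON DELETE CASCADE
        );
        """
    )
    return conn


def create_table(conn, name, path):
    """Insert the directory row and the table row in one transaction."""
    with conn:  # commits on success, rolls back if either insert fails
        cur = conn.execute("INSERT INTO fs_inodes (path) VALUES (?)", (path,))
        conn.execute(
            "INSERT INTO hive_tables (name, inode_id) VALUES (?, ?)",
            (name, cur.lastrowid),
        )


def remove_directory(conn, path):
    """Delete the directory row; the foreign key cascades to the table row."""
    with conn:
        conn.execute("DELETE FROM fs_inodes WHERE path = ?", (path,))


def table_names(conn):
    """Return the names of all registered tables, sorted."""
    rows = conn.execute("SELECT name FROM hive_tables ORDER BY name")
    return [row[0] for row in rows]
```

In this sketch, calling remove_directory for a table's directory also removes the table's metadata row, and a failed create_table leaves no orphaned directory row behind, so the two sets of metadata cannot drift apart.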

Hops also supports Spark SQL with Parquet, a popular combination for SQL-on-Hadoop.

SQL-on-Hops with Hive and Spark SQL

SQL-on-Hadoop enables the efficient querying of potentially petabytes of data.