Apache Drill: Connecting Different Data Sources

By Pete Reilly


At AnswerRocket, we give people the power to get answers from data without technical skills, and we see a natural fit with Apache Drill. AnswerRocket enables search-driven analytics, while Apache Drill makes the data itself easier to access. Together they enable self-service data exploration and visualization.

With Apache Drill, users get seamless SQL-on-Hadoop, including the ability to write queries against any combination of JSON, CSV, and Parquet files stored on file systems such as MapR-FS and HDFS, and more. Apache Drill doesn’t require a schema and can query self-describing data formats like JSON and Parquet with no prior knowledge of the file structure. This helps democratize data by eliminating the work required to extract, transform, and load it. Finally, since Drill is optimized for in-memory columnar queries and will run on anything from a single machine to thousands of servers, it’s fast and highly scalable.
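As a rough illustration of schema-free querying, the sketch below sends a SQL statement to Drill's REST API, which accepts a JSON body with `queryType` and `query` fields. The host and port assume a local Drill instance with default settings, and the file path and field names (`name`, `stars`, `review_count`) are hypothetical stand-ins for a Yelp business file, not details taken from the demonstration.

```python
import json
import urllib.request

# Assumption: a local Drill instance exposing the REST API on its default port.
DRILL_URL = "http://localhost:8047/query.json"


def build_query_payload(sql: str) -> bytes:
    """Encode a SQL statement in the JSON body that Drill's REST API expects."""
    return json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")


def run_query(sql: str, url: str = DRILL_URL) -> list:
    """POST the query to Drill and return the result rows as a list of dicts."""
    request = urllib.request.Request(
        url,
        data=build_query_payload(sql),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response).get("rows", [])


# Hypothetical path and field names. No table or schema is declared first:
# the query points straight at the JSON file through the dfs storage plugin.
TOP_BUSINESSES_SQL = (
    "SELECT name, stars, review_count "
    "FROM dfs.`/data/yelp/business.json` "
    "ORDER BY review_count DESC LIMIT 5"
)
```

With a Drill instance running, `run_query(TOP_BUSINESSES_SQL)` would return the matching rows. The point of the sketch is that the file is queried in place, with no load step before the first SELECT.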

When combined with AnswerRocket’s ability to translate natural language to SQL, users now have the power of natural language query on Hadoop.

The demonstration above includes downloading JSON files from Yelp and a CSV file containing date information. We create a view on the data, configure some English terms, and demonstrate self-service big data exploration with natural language queries.
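The view step might look something like the sketch below, which builds a Drill `CREATE VIEW` statement joining a JSON file to a headerless CSV file. This is not the exact view from the demonstration: the view name, file paths, JSON field names (`business_id`, `stars`, `date`), and the CSV column layout are all assumptions made for illustration. It relies on two Drill behaviors: headerless CSV fields are read through the `columns` array, and the `dfs.tmp` workspace is writable in a default installation.

```python
def build_view_sql(view_name: str, json_path: str, csv_path: str) -> str:
    """Build a CREATE VIEW statement joining a JSON file to a headerless CSV.

    Assumed layout (hypothetical): the JSON records carry business_id, stars,
    and date fields; the CSV holds a date string in column 0 and a label for
    that date, such as a quarter, in column 1.
    """
    return (
        f"CREATE OR REPLACE VIEW dfs.tmp.`{view_name}` AS "
        "SELECT r.business_id, r.stars, "
        "d.columns[1] AS date_label "
        f"FROM dfs.`{json_path}` r "
        f"JOIN dfs.`{csv_path}` d "
        # `date` is a reserved word in Drill SQL, so it is quoted with backticks.
        "ON r.`date` = d.columns[0]"
    )


# Hypothetical names and paths, for illustration only.
VIEW_SQL = build_view_sql(
    "yelp_reviews_by_date", "/data/yelp/review.json", "/data/dates.csv"
)
```

Once a view like this exists, a front end only has to generate simple SELECT statements against it, since the join across the two file formats is handled inside the view.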

Photo by Ed Schipul / CC BY

