Presto (on-premise data warehouse)

Presto

The developers define Presto as follows: Presto is an open-source distributed SQL query engine for running interactive analytic queries against data sources of all sizes, ranging from gigabytes to petabytes.

Presto is designed to work with data of any size. It supports Hadoop (HDFS), Amazon S3, MongoDB, PostgreSQL, Teradata, and other sources.

Where does Presto fit?

Presto fits between distributed storage systems such as Hadoop or S3 and reporting applications such as Tableau and MicroStrategy. The Presto cluster takes on the load of query execution, and data transfer rates, especially for Tableau extracts, are much faster than with existing JDBC connectivity.

The specialty of Presto is that it queries data where it is stored, without first moving it into a separate analytical execution system. Its architecture is purely memory-based.
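To make "querying data where it is stored" concrete: Presto addresses every table as catalog.schema.table, where the catalog points at a configured data source. The short Python sketch below only builds such a query string. The catalog, schema, and table names (hive, default, web_logs) and both helper functions are hypothetical and chosen for illustration; they are not taken from any particular deployment.

```python
def qualified_table(catalog: str, schema: str, table: str) -> str:
    """Return a Presto-style fully qualified table name: catalog.schema.table."""
    parts = (catalog, schema, table)
    for part in parts:
        # Reject empty names and embedded double quotes to keep the identifier valid.
        if not part or '"' in part:
            raise ValueError(f"invalid identifier: {part!r}")
    # Double-quote each part so unusual characters remain valid SQL identifiers.
    return ".".join(f'"{part}"' for part in parts)


def count_rows_query(catalog: str, schema: str, table: str) -> str:
    """Build a SELECT that reads the table directly from its source system."""
    return f"SELECT count(*) AS row_count FROM {qualified_table(catalog, schema, table)}"


# Hypothetical names: a Hive catalog over HDFS or S3 holding a web_logs table.
print(count_rows_query("hive", "default", "web_logs"))
```

The resulting statement could then be run from the Presto CLI or a client library. No data is copied out of the source system beforehand.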

Sample view of Presto CLI


Combine data from multiple sources

  • A single Presto query can combine data from multiple sources.
  • Data can be joined across all data sources integrated in Presto.
  • One SQL query can join multiple data sources.
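As a rough illustration of the points above, the following Python sketch assembles one SQL statement that joins a table in one catalog with a table in another. Everything named here is an assumption made for the example: the catalogs hive and postgresql, the tables page_views and customers, the column names, and the helper function itself.

```python
from typing import Sequence


def federated_join_query(fact_table: str, dim_table: str, join_key: str,
                         columns: Sequence[str]) -> str:
    """Build one SQL statement joining two tables that live in different catalogs."""
    select_list = ", ".join(columns)
    return (
        f"SELECT {select_list} "
        f"FROM {fact_table} AS f "
        f"JOIN {dim_table} AS d ON f.{join_key} = d.{join_key}"
    )


# Hypothetical tables: clickstream data in Hive, customer records in PostgreSQL.
query = federated_join_query(
    fact_table="hive.web.page_views",
    dim_table="postgresql.public.customers",
    join_key="customer_id",
    columns=["d.customer_name", "f.page_url"],
)
print(query)
```

Because both tables are addressed through their catalogs, the join is expressed in a single statement. The sketch only builds the SQL text and does not connect to a cluster.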

