Trino

Trino is an open-source, distributed SQL query engine designed for low-latency ad-hoc querying and analytics over large-scale datasets. Formerly known as PrestoSQL, Trino is built on a Massively Parallel Processing (MPP) architecture.

Key Characteristics

Decoupled Storage: Unlike a traditional database, Trino does not have a native storage format. Instead, it queries data in-place from remote sources, including data lakes (with formats like Apache Iceberg or Delta Lake), relational databases (MySQL, PostgreSQL), and NoSQL stores (Elasticsearch, Cassandra).
Query Federation: Allows users to write a single SQL query that joins data across multiple distinct storage systems (e.g., joining an Iceberg table on AWS S3 with customer profile data stored in PostgreSQL).
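In-Memory Pipeline Execution: Designed for high performance, Trino by default processes queries in memory across a cluster of workers, streaming intermediate results between stages rather than writing them to disk. Spilling to disk and fault-tolerant execution are optional modes, not the default behavior.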
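
As a minimal sketch of query federation, the Python snippet below builds one SQL statement that joins an Iceberg table with a PostgreSQL table. Trino addresses tables as catalog.schema.table, so each side of the join names the catalog it lives in. The catalog names (iceberg, postgresql), the schemas, tables, and columns, and the qualified() helper are all hypothetical and chosen only for illustration; a real cluster would use whatever catalogs its administrator has configured.

```python
# All catalog, schema, table, and column names here are assumptions
# made up for illustration; they do not come from a real deployment.

def qualified(catalog: str, schema: str, table: str) -> str:
    """Return a fully qualified Trino table name: catalog.schema.table."""
    return f"{catalog}.{schema}.{table}"

# Order facts stored as an Iceberg table (for example on S3).
ORDERS = qualified("iceberg", "sales", "orders")
# Customer profiles stored in a PostgreSQL database.
CUSTOMERS = qualified("postgresql", "public", "customers")

# One statement spanning both systems: Trino reads from each source
# through its connector and performs the join on its own workers.
FEDERATED_QUERY = f"""
SELECT c.customer_id, c.segment, sum(o.total_price) AS revenue
FROM {ORDERS} AS o
JOIN {CUSTOMERS} AS c ON o.customer_id = c.customer_id
GROUP BY c.customer_id, c.segment
ORDER BY revenue DESC
LIMIT 10
""".strip()

print(FEDERATED_QUERY)
```

The resulting string could be submitted through any Trino client, such as the Trino CLI or a JDBC connection; the snippet stops at building the query so that it runs without a cluster.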

Trino is widely used to provide interactive BI and ad-hoc query capabilities on top of enterprise data lakes.

Part of the Data & AI Terms glossary.