Iceberg Extension
The iceberg extension is a loadable extension that implements support for the Apache Iceberg format.
Installing and Loading
To install and load the iceberg extension, run:
INSTALL iceberg;
LOAD iceberg;
Usage
To test the examples, download the iceberg_data.zip file and unzip it.
Querying Individual Tables
SELECT count(*) FROM iceberg_scan('data/iceberg/lineitem_iceberg', ALLOW_MOVED_PATHS=true);
51793
The ALLOW_MOVED_PATHS option makes the extension perform some path resolution, which allows scanning Iceberg tables that have been moved from their original location.
Access Iceberg Metadata
SELECT * FROM iceberg_metadata('data/iceberg/lineitem_iceberg', ALLOW_MOVED_PATHS=true);
│ manifest_path │ manifest_sequence_number │ manifest_content │ status │ content │ file_path │ file_format │ record_count │
│ varchar │ int64 │ varchar │ varchar │ varchar │ varchar │ varchar │ int64 │
├────────────────────────────────────────────────────────┼──────────────────────────┼──────────────────┼─────────┼──────────┼────────────────────────────────────────────────────────────────────────────────────┼─────────────┼──────────────┤
│ lineitem_iceberg/metadata/10eaca8a-1e1c-421e-ad6d-b2… │ 2 │ DATA │ ADDED │ EXISTING │ lineitem_iceberg/data/00041-414-f3c73457-bbd6-4b92-9c15-17b241171b16-00001.parquet │ PARQUET │ 51793 │
│ lineitem_iceberg/metadata/10eaca8a-1e1c-421e-ad6d-b2… │ 2 │ DATA │ DELETED │ EXISTING │ lineitem_iceberg/data/00000-411-0792dcfe-4e25-4ca3-8ada-175286069a47-00001.parquet │ PARQUET │ 60175 │
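Because iceberg_metadata returns an ordinary result set, it can be filtered like any other table. The following is a minimal sketch that reuses only the column names shown in the output above (status, file_path, record_count) and assumes the sample data has been unzipped to the same path as in the earlier examples:

```sql
-- List only the data files that the manifest marks as added,
-- together with the number of records each one holds.
-- Assumes the iceberg extension is already installed and loaded.
SELECT file_path, record_count
FROM iceberg_metadata('data/iceberg/lineitem_iceberg', ALLOW_MOVED_PATHS=true)
WHERE status = 'ADDED';
```

A filter like this can help check which Parquet files are recorded as added before scanning the table itself.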
Visualizing Snapshots
SELECT * FROM iceberg_snapshots('data/iceberg/lineitem_iceberg');
│ sequence_number │ snapshot_id │ timestamp_ms │ manifest_list │
│ uint64 │ uint64 │ timestamp │ varchar │
├─────────────────┼─────────────────────┼─────────────────────────┼────────────────────────────────────────────────────────────────────────────────────────────────┤
│ 1 │ 3776207205136740581 │ 2023-02-15 15:07:54.504 │ lineitem_iceberg/metadata/snap-3776207205136740581-1-cf3d0be5-cf70-453d-ad8f-48fdc412e608.avro │
│ 2 │ 7635660646343998149 │ 2023-02-15 15:08:14.73 │ lineitem_iceberg/metadata/snap-7635660646343998149-1-10eaca8a-1e1c-421e-ad6d-b232e5ee23d3.avro │
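The snapshot listing can also be queried further, for example to find the most recent snapshot. This is a sketch under the assumption that a higher sequence_number means a newer snapshot, as the timestamps in the sample output suggest; it uses only the columns shown above:

```sql
-- Return the snapshot with the highest sequence number.
-- Assumes the iceberg extension is already installed and loaded.
SELECT snapshot_id, timestamp_ms, manifest_list
FROM iceberg_snapshots('data/iceberg/lineitem_iceberg')
ORDER BY sequence_number DESC
LIMIT 1;
```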