Below is a collection of tips to help when attempting to process especially gnarly CSV files.
## Override the header flag if the header is not correctly detected

If a file contains only string columns, the header auto-detection might fail. Provide the `header` option to override this behavior.

```sql
SELECT * FROM read_csv_auto('flights.csv', header=True);
```
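The same option works in the other direction: if the first row is data but gets misread as a header, header detection can be disabled explicitly. A minimal sketch, assuming a hypothetical headerless `flights.csv`:

```sql
-- Treat the first row as data rather than as column names
SELECT * FROM read_csv_auto('flights.csv', header=False);
```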
## Provide names if the file does not contain a header

If the file does not contain a header, names will be auto-generated by default. You can provide your own names with the `names` option.

```sql
SELECT * FROM read_csv_auto('flights.csv', names=['FlightDate', 'UniqueCarrier']);
```
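The provided names carry through to downstream statements, so the result can be materialized directly. A short sketch, assuming a hypothetical `flights` table name:

```sql
-- Load the headerless file into a new table, keeping the provided names
CREATE TABLE flights AS
    SELECT * FROM read_csv_auto('flights.csv', names=['FlightDate', 'UniqueCarrier']);
```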
## Override the types of specific columns

The `types` flag can be used to override the types of only certain columns by providing a struct of `name` -> `type` mappings.

```sql
SELECT * FROM read_csv_auto('flights.csv', types={'FlightDate': 'DATE'});
```
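Several columns can be pinned down in a single struct while the remaining columns are still auto-detected. A sketch reusing the hypothetical `flights.csv`:

```sql
-- Fix the types of two columns; all other columns keep their
-- auto-detected types
SELECT * FROM read_csv_auto('flights.csv',
    types={'FlightDate': 'DATE', 'UniqueCarrier': 'VARCHAR'});
```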
## Use COPY when loading data into a table

The `COPY` statement copies data directly into a table. The CSV reader uses the schema of the table instead of auto-detecting types from the file. This speeds up loading and prevents mistakes from being made during type detection.

```sql
COPY tbl FROM 'test.csv' (AUTO_DETECT 1);
```
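A minimal end-to-end sketch, assuming a hypothetical `test.csv` whose columns match the table definition below:

```sql
-- The table schema fixes the column types; with AUTO_DETECT, the CSV
-- dialect (delimiter, quoting, header) is still detected automatically
CREATE TABLE tbl (FlightDate DATE, UniqueCarrier VARCHAR);
COPY tbl FROM 'test.csv' (AUTO_DETECT 1);
```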
## Use union_by_name when loading files with different schemas

The `union_by_name` option can be used to unify the schema of files that have different or missing columns. For files that do not have certain columns, `NULL` values are filled in.

```sql
SELECT * FROM read_csv_auto('flights*.csv', union_by_name=True);
```
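To check what the unified schema looks like before loading, the query can be wrapped in `DESCRIBE`. A sketch over the same hypothetical `flights*.csv` glob:

```sql
-- Show the combined column list; columns absent from some files still
-- appear, and those files contribute NULLs for them
DESCRIBE SELECT * FROM read_csv_auto('flights*.csv', union_by_name=True);
```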