duckdb.threadsafety bool

Indicates that this package is threadsafe

duckdb.apilevel int

Indicates which Python DBAPI version this package implements

duckdb.paramstyle str

Indicates which parameter style duckdb supports

duckdb.default_connection duckdb.DuckDBPyConnection

The connection that is used by default if you don't explicitly pass one to the root methods in this module

class duckdb.BinaryValue(object: Any)

Bases: Value

exception duckdb.BinderException

Bases: ProgrammingError

class duckdb.BitValue(object: Any)

Bases: Value

class duckdb.BlobValue(object: Any)

Bases: Value

class duckdb.BooleanValue(object: Any)

Bases: Value

duckdb.CaseExpression(condition: duckdb.duckdb.Expression, value: duckdb.duckdb.Expression) duckdb.duckdb.Expression

exception duckdb.CatalogException

Bases: ProgrammingError

duckdb.CoalesceOperator(*args) duckdb.duckdb.Expression

duckdb.ColumnExpression(name: str) duckdb.duckdb.Expression

Create a column reference from the provided column name

exception duckdb.ConnectionException

Bases: OperationalError

duckdb.ConstantExpression(value: object) duckdb.duckdb.Expression

Create a constant expression from the provided value

exception duckdb.ConstraintException

Bases: IntegrityError

exception duckdb.ConversionException

Bases: DataError

exception duckdb.DataError

Bases: DatabaseError

class duckdb.DateValue(object: Any)

Bases: Value

class duckdb.DecimalValue(object: Any, width: int, scale: int)

Bases: Value

class duckdb.DoubleValue(object: Any)

Bases: Value

class duckdb.DuckDBPyConnection

Bases: pybind11_object

append(self: duckdb.duckdb.DuckDBPyConnection, table_name: str, df: pandas.DataFrame, *, by_name: bool = False) duckdb.duckdb.DuckDBPyConnection

Append the passed DataFrame to the named table

array_type(self: duckdb.duckdb.DuckDBPyConnection, type: duckdb.duckdb.typing.DuckDBPyType, size: int) duckdb.duckdb.typing.DuckDBPyType

Create an array type object of 'type'

arrow(self: duckdb.duckdb.DuckDBPyConnection, rows_per_batch: int = 1000000) pyarrow.lib.Table

Fetch a result as an Arrow table following execute()

begin(self: duckdb.duckdb.DuckDBPyConnection) duckdb.duckdb.DuckDBPyConnection

Start a new transaction

checkpoint(self: duckdb.duckdb.DuckDBPyConnection) duckdb.duckdb.DuckDBPyConnection

Synchronize data in the write-ahead log (WAL) to the database data file (no-op for in-memory connections)

close(self: duckdb.duckdb.DuckDBPyConnection) None

Close the connection

commit(self: duckdb.duckdb.DuckDBPyConnection) duckdb.duckdb.DuckDBPyConnection

Commit changes performed within a transaction

create_function(self: duckdb.duckdb.DuckDBPyConnection, name: str, function: Callable, parameters: object = None, return_type: duckdb.duckdb.typing.DuckDBPyType = None, *, type: duckdb.duckdb.functional.PythonUDFType = <PythonUDFType.NATIVE: 0>, null_handling: duckdb.duckdb.functional.FunctionNullHandling = <FunctionNullHandling.DEFAULT: 0>, exception_handling: duckdb.duckdb.PythonExceptionHandling = <PythonExceptionHandling.DEFAULT: 0>, side_effects: bool = False) duckdb.duckdb.DuckDBPyConnection

Create a DuckDB function out of the passed Python function so it can be used in queries

cursor(self: duckdb.duckdb.DuckDBPyConnection) duckdb.duckdb.DuckDBPyConnection

Create a duplicate of the current connection

decimal_type(self: duckdb.duckdb.DuckDBPyConnection, width: int, scale: int) duckdb.duckdb.typing.DuckDBPyType

Create a decimal type with 'width' and 'scale'

property description

Get result set attributes, mainly column names

df(self: duckdb.duckdb.DuckDBPyConnection, *, date_as_object: bool = False) pandas.DataFrame

Fetch a result as a DataFrame following execute()

dtype(self: duckdb.duckdb.DuckDBPyConnection, type_str: str) duckdb.duckdb.typing.DuckDBPyType

Create a type object by parsing the 'type_str' string

duplicate(self: duckdb.duckdb.DuckDBPyConnection) duckdb.duckdb.DuckDBPyConnection

Create a duplicate of the current connection

enum_type(self: duckdb.duckdb.DuckDBPyConnection, name: str, type: duckdb.duckdb.typing.DuckDBPyType, values: list) duckdb.duckdb.typing.DuckDBPyType

Create an enum type of underlying 'type', consisting of the list of 'values'

execute(self: duckdb.duckdb.DuckDBPyConnection, query: object, parameters: object = None) duckdb.duckdb.DuckDBPyConnection

Execute the given SQL query, optionally using prepared statements with parameters set

executemany(self: duckdb.duckdb.DuckDBPyConnection, query: object, parameters: object = None) duckdb.duckdb.DuckDBPyConnection

Execute the given prepared statement multiple times using the list of parameter sets in parameters

extract_statements(self: duckdb.duckdb.DuckDBPyConnection, query: str) list

Parse the query string and extract the Statement object(s) produced

fetch_arrow_table(self: duckdb.duckdb.DuckDBPyConnection, rows_per_batch: int = 1000000) pyarrow.lib.Table

Fetch a result as an Arrow table following execute()

fetch_df(self: duckdb.duckdb.DuckDBPyConnection, *, date_as_object: bool = False) pandas.DataFrame

Fetch a result as a DataFrame following execute()

fetch_df_chunk(self: duckdb.duckdb.DuckDBPyConnection, vectors_per_chunk: int = 1, *, date_as_object: bool = False) pandas.DataFrame

Fetch a chunk of the result as a DataFrame following execute()

fetch_record_batch(self: duckdb.duckdb.DuckDBPyConnection, rows_per_batch: int = 1000000) pyarrow.lib.RecordBatchReader

Fetch an Arrow RecordBatchReader following execute()

fetchall(self: duckdb.duckdb.DuckDBPyConnection) list

Fetch all rows from a result following execute

fetchdf(self: duckdb.duckdb.DuckDBPyConnection, *, date_as_object: bool = False) pandas.DataFrame

Fetch a result as a DataFrame following execute()

fetchmany(self: duckdb.duckdb.DuckDBPyConnection, size: int = 1) list

Fetch the next set of rows from a result following execute

fetchnumpy(self: duckdb.duckdb.DuckDBPyConnection) dict

Fetch a result as a dictionary of NumPy arrays following execute

fetchone(self: duckdb.duckdb.DuckDBPyConnection) Optional[tuple]

Fetch a single row from a result following execute

filesystem_is_registered(self: duckdb.duckdb.DuckDBPyConnection, name: str) bool

Check if a filesystem with the provided name is currently registered

from_arrow(self: duckdb.duckdb.DuckDBPyConnection, arrow_object: object) duckdb.duckdb.DuckDBPyRelation

Create a relation object from an Arrow object

from_csv_auto(self: duckdb.duckdb.DuckDBPyConnection, path_or_buffer: object, **kwargs) duckdb.duckdb.DuckDBPyRelation

Create a relation object from the CSV file in 'path_or_buffer'

from_df(self: duckdb.duckdb.DuckDBPyConnection, df: pandas.DataFrame) duckdb.duckdb.DuckDBPyRelation

Create a relation object from the DataFrame in df

from_parquet(*args, **kwargs)

Overloaded function.

  1. from_parquet(self: duckdb.duckdb.DuckDBPyConnection, file_glob: str, binary_as_string: bool = False, *, file_row_number: bool = False, filename: bool = False, hive_partitioning: bool = False, union_by_name: bool = False, compression: object = None) -> duckdb.duckdb.DuckDBPyRelation

Create a relation object from the Parquet files in file_glob

  2. from_parquet(self: duckdb.duckdb.DuckDBPyConnection, file_globs: list[str], binary_as_string: bool = False, *, file_row_number: bool = False, filename: bool = False, hive_partitioning: bool = False, union_by_name: bool = False, compression: object = None) -> duckdb.duckdb.DuckDBPyRelation

Create a relation object from the Parquet files in file_globs

from_query(self: duckdb.duckdb.DuckDBPyConnection, query: object, *, alias: str = '', params: object = None) duckdb.duckdb.DuckDBPyRelation

Run a SQL query. If it is a SELECT statement, create a relation object from the given SQL query, otherwise run the query as-is.

from_substrait(self: duckdb.duckdb.DuckDBPyConnection, proto: bytes) duckdb.duckdb.DuckDBPyRelation

Create a query object from a protobuf plan

from_substrait_json(self: duckdb.duckdb.DuckDBPyConnection, json: str) duckdb.duckdb.DuckDBPyRelation

Create a query object from a JSON protobuf plan

get_substrait(self: duckdb.duckdb.DuckDBPyConnection, query: str, *, enable_optimizer: bool = True) duckdb.duckdb.DuckDBPyRelation

Serialize a query to protobuf

get_substrait_json(self: duckdb.duckdb.DuckDBPyConnection, query: str, *, enable_optimizer: bool = True) duckdb.duckdb.DuckDBPyRelation

Serialize a query to protobuf in JSON format

get_table_names(self: duckdb.duckdb.DuckDBPyConnection, query: str) set[str]

Extract the required table names from a query

install_extension(self: duckdb.duckdb.DuckDBPyConnection, extension: str, *, force_install: bool = False, repository: object = None, repository_url: object = None, version: object = None) None

Install an extension by name, with an optional version and/or repository to get the extension from

interrupt(self: duckdb.duckdb.DuckDBPyConnection) None

Interrupt pending operations

list_filesystems(self: duckdb.duckdb.DuckDBPyConnection) list

List registered filesystems, including builtin ones

list_type(self: duckdb.duckdb.DuckDBPyConnection, type: duckdb.duckdb.typing.DuckDBPyType) duckdb.duckdb.typing.DuckDBPyType

Create a list type object of 'type'

load_extension(self: duckdb.duckdb.DuckDBPyConnection, extension: str) None

Load an installed extension

map_type(self: duckdb.duckdb.DuckDBPyConnection, key: duckdb.duckdb.typing.DuckDBPyType, value: duckdb.duckdb.typing.DuckDBPyType) duckdb.duckdb.typing.DuckDBPyType

Create a map type object from the 'key' and 'value' types

pl(self: duckdb.duckdb.DuckDBPyConnection, rows_per_batch: int = 1000000) duckdb::PolarsDataFrame

Fetch a result as a Polars DataFrame following execute()

query(self: duckdb.duckdb.DuckDBPyConnection, query: object, *, alias: str = '', params: object = None) duckdb.duckdb.DuckDBPyRelation

Run a SQL query. If it is a SELECT statement, create a relation object from the given SQL query, otherwise run the query as-is.

read_csv(self: duckdb.duckdb.DuckDBPyConnection, path_or_buffer: object, **kwargs) duckdb.duckdb.DuckDBPyRelation

Create a relation object from the CSV file in 'path_or_buffer'

read_json(self: duckdb.duckdb.DuckDBPyConnection, path_or_buffer: object, *, columns: Optional[object] = None, sample_size: Optional[object] = None, maximum_depth: Optional[object] = None, records: Optional[str] = None, format: Optional[str] = None, date_format: Optional[object] = None, timestamp_format: Optional[object] = None, compression: Optional[object] = None, maximum_object_size: Optional[object] = None, ignore_errors: Optional[object] = None, convert_strings_to_integers: Optional[object] = None, field_appearance_threshold: Optional[object] = None, map_inference_threshold: Optional[object] = None, maximum_sample_files: Optional[object] = None, filename: Optional[object] = None, hive_partitioning: Optional[object] = None, union_by_name: Optional[object] = None, hive_types: Optional[object] = None, hive_types_autocast: Optional[object] = None) duckdb.duckdb.DuckDBPyRelation

Create a relation object from the JSON file in 'path_or_buffer'

read_parquet(*args, **kwargs)

Overloaded function.

  1. read_parquet(self: duckdb.duckdb.DuckDBPyConnection, file_glob: str, binary_as_string: bool = False, *, file_row_number: bool = False, filename: bool = False, hive_partitioning: bool = False, union_by_name: bool = False, compression: object = None) -> duckdb.duckdb.DuckDBPyRelation

Create a relation object from the Parquet files in file_glob

  2. read_parquet(self: duckdb.duckdb.DuckDBPyConnection, file_globs: list[str], binary_as_string: bool = False, *, file_row_number: bool = False, filename: bool = False, hive_partitioning: bool = False, union_by_name: bool = False, compression: object = None) -> duckdb.duckdb.DuckDBPyRelation

Create a relation object from the Parquet files in file_globs

register(self: duckdb.duckdb.DuckDBPyConnection, view_name: str, python_object: object) duckdb.duckdb.DuckDBPyConnection

Register the passed Python object value for querying with a view

register_filesystem(self: duckdb.duckdb.DuckDBPyConnection, filesystem: fsspec.AbstractFileSystem) None

Register an fsspec-compliant filesystem

remove_function(self: duckdb.duckdb.DuckDBPyConnection, name: str) duckdb.duckdb.DuckDBPyConnection

Remove a previously created function

rollback(self: duckdb.duckdb.DuckDBPyConnection) duckdb.duckdb.DuckDBPyConnection

Roll back changes performed within a transaction

row_type(self: duckdb.duckdb.DuckDBPyConnection, fields: object) duckdb.duckdb.typing.DuckDBPyType

Create a struct type object from 'fields'

property rowcount

Get the result set row count

sql(self: duckdb.duckdb.DuckDBPyConnection, query: object, *, alias: str = '', params: object = None) duckdb.duckdb.DuckDBPyRelation

Run a SQL query. If it is a SELECT statement, create a relation object from the given SQL query, otherwise run the query as-is.

sqltype(self: duckdb.duckdb.DuckDBPyConnection, type_str: str) duckdb.duckdb.typing.DuckDBPyType

Create a type object by parsing the 'type_str' string

string_type(self: duckdb.duckdb.DuckDBPyConnection, collation: str = '') duckdb.duckdb.typing.DuckDBPyType

Create a string type with an optional collation

struct_type(self: duckdb.duckdb.DuckDBPyConnection, fields: object) duckdb.duckdb.typing.DuckDBPyType

Create a struct type object from 'fields'

table(self: duckdb.duckdb.DuckDBPyConnection, table_name: str) duckdb.duckdb.DuckDBPyRelation

Create a relation object for the named table

table_function(self: duckdb.duckdb.DuckDBPyConnection, name: str, parameters: object = None) duckdb.duckdb.DuckDBPyRelation

Create a relation object from the named table function with the given parameters

tf(self: duckdb.duckdb.DuckDBPyConnection) dict

Fetch a result as a dict of TensorFlow Tensors following execute()

torch(self: duckdb.duckdb.DuckDBPyConnection) dict

Fetch a result as a dict of PyTorch Tensors following execute()

type(self: duckdb.duckdb.DuckDBPyConnection, type_str: str) duckdb.duckdb.typing.DuckDBPyType

Create a type object by parsing the 'type_str' string

union_type(self: duckdb.duckdb.DuckDBPyConnection, members: object) duckdb.duckdb.typing.DuckDBPyType

Create a union type object from 'members'

unregister(self: duckdb.duckdb.DuckDBPyConnection, view_name: str) duckdb.duckdb.DuckDBPyConnection

Unregister the view name

unregister_filesystem(self: duckdb.duckdb.DuckDBPyConnection, name: str) None

Unregister a filesystem

values(self: duckdb.duckdb.DuckDBPyConnection, values: object) duckdb.duckdb.DuckDBPyRelation

Create a relation object from the passed values

view(self: duckdb.duckdb.DuckDBPyConnection, view_name: str) duckdb.duckdb.DuckDBPyRelation

Create a relation object for the named view

class duckdb.DuckDBPyRelation

Bases: pybind11_object

aggregate(self: duckdb.duckdb.DuckDBPyRelation, aggr_expr: object, group_expr: str = '') duckdb.duckdb.DuckDBPyRelation

Compute the aggregate aggr_expr by the optional groups group_expr on the relation

property alias

Get the name of the current alias

any_value(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation

Returns the first non-null value from a given column

apply(self: duckdb.duckdb.DuckDBPyRelation, function_name: str, function_aggr: str, group_expr: str = '', function_parameter: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation

Compute the function of a single column or a list of columns by the optional groups on the relation

arg_max(self: duckdb.duckdb.DuckDBPyRelation, arg_column: str, value_column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation

Finds the row with the maximum value in the value column and returns that row's value in the argument column

arg_min(self: duckdb.duckdb.DuckDBPyRelation, arg_column: str, value_column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation

Finds the row with the minimum value in the value column and returns that row's value in the argument column

arrow(self: duckdb.duckdb.DuckDBPyRelation, batch_size: int = 1000000) pyarrow.lib.Table

Execute and fetch all rows as an Arrow table

avg(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation

Computes the average of a given column

bit_and(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation

Computes the bitwise AND of all bits present in a given column

bit_or(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation

Computes the bitwise OR of all bits present in a given column

bit_xor(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation

Computes the bitwise XOR of all bits present in a given column

bitstring_agg(self: duckdb.duckdb.DuckDBPyRelation, column: str, min: Optional[object] = None, max: Optional[object] = None, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation

Computes a bitstring with bits set for each distinct value in a given column

bool_and(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation

Computes the logical AND of all values present in a given column

bool_or(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation

Computes the logical OR of all values present in a given column

close(self: duckdb.duckdb.DuckDBPyRelation) None

Closes the result

property columns

Return a list containing the names of the columns of the relation.

count(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation

Computes the number of elements present in a given column

create(self: duckdb.duckdb.DuckDBPyRelation, table_name: str) None

Creates a new table named table_name with the contents of the relation object

create_view(self: duckdb.duckdb.DuckDBPyRelation, view_name: str, replace: bool = True) duckdb.duckdb.DuckDBPyRelation

Creates a view named view_name that refers to the relation object

cume_dist(self: duckdb.duckdb.DuckDBPyRelation, window_spec: str, projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation

Computes the cumulative distribution within the partition

dense_rank(self: duckdb.duckdb.DuckDBPyRelation, window_spec: str, projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation

Computes the dense rank within the partition

describe(self: duckdb.duckdb.DuckDBPyRelation) duckdb.duckdb.DuckDBPyRelation

Gives basic statistics (e.g., min, max) and whether `NULL` exists for each column of the relation.

property description

Return the description of the result

df(self: duckdb.duckdb.DuckDBPyRelation, *, date_as_object: bool = False) pandas.DataFrame

Execute and fetch all rows as a pandas DataFrame

distinct(self: duckdb.duckdb.DuckDBPyRelation) duckdb.duckdb.DuckDBPyRelation

Retrieve distinct rows from this relation object

property dtypes

Return a list containing the types of the columns of the relation.

except_(self: duckdb.duckdb.DuckDBPyRelation, other_rel: duckdb.duckdb.DuckDBPyRelation) duckdb.duckdb.DuckDBPyRelation

Create the set difference of this relation object and another relation object in other_rel

execute(self: duckdb.duckdb.DuckDBPyRelation) duckdb.duckdb.DuckDBPyRelation

Transform the relation into a result set

explain(self: duckdb.duckdb.DuckDBPyRelation, type: duckdb.duckdb.ExplainType = 'standard') str

favg(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation

Computes the average of all values present in a given column using a more accurate floating point summation (Kahan Sum)

fetch_arrow_reader(self: duckdb.duckdb.DuckDBPyRelation, batch_size: int = 1000000) pyarrow.lib.RecordBatchReader

Execute and return an Arrow Record Batch Reader that yields all rows

fetch_arrow_table(self: duckdb.duckdb.DuckDBPyRelation, batch_size: int = 1000000) pyarrow.lib.Table

Execute and fetch all rows as an Arrow table

fetch_df_chunk(self: duckdb.duckdb.DuckDBPyRelation, vectors_per_chunk: int = 1, *, date_as_object: bool = False) pandas.DataFrame

Execute and fetch a chunk of the rows

fetchall(self: duckdb.duckdb.DuckDBPyRelation) list

Execute and fetch all rows as a list of tuples

fetchdf(self: duckdb.duckdb.DuckDBPyRelation, *, date_as_object: bool = False) pandas.DataFrame

Execute and fetch all rows as a pandas DataFrame

fetchmany(self: duckdb.duckdb.DuckDBPyRelation, size: int = 1) list

Execute and fetch the next set of rows as a list of tuples

fetchnumpy(self: duckdb.duckdb.DuckDBPyRelation) dict

Execute and fetch all rows as a Python dict mapping each column to one numpy array

fetchone(self: duckdb.duckdb.DuckDBPyRelation) Optional[tuple]

Execute and fetch a single row as a tuple

filter(self: duckdb.duckdb.DuckDBPyRelation, filter_expr: object) duckdb.duckdb.DuckDBPyRelation

Filter the relation object by the filter in filter_expr

first(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation

Returns the first value of a given column

first_value(self: duckdb.duckdb.DuckDBPyRelation, column: str, window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation

Computes the first value within the group or partition

fsum(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation

Computes the sum of all values present in a given column using a more accurate floating point summation (Kahan Sum)

geomean(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation

Computes the geometric mean over all values present in a given column

histogram(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation

Computes the histogram over all values present in a given column

insert(self: duckdb.duckdb.DuckDBPyRelation, values: object) None

Inserts the given values into the relation

insert_into(self: duckdb.duckdb.DuckDBPyRelation, table_name: str) None

Inserts the relation object into an existing table named table_name

intersect(self: duckdb.duckdb.DuckDBPyRelation, other_rel: duckdb.duckdb.DuckDBPyRelation) duckdb.duckdb.DuckDBPyRelation

Create the set intersection of this relation object with another relation object in other_rel

join(self: duckdb.duckdb.DuckDBPyRelation, other_rel: duckdb.duckdb.DuckDBPyRelation, condition: object, how: str = 'inner') duckdb.duckdb.DuckDBPyRelation

Join the relation object with another relation object in other_rel using the join condition expression in condition. Types supported are 'inner' and 'left'.

lag(self: duckdb.duckdb.DuckDBPyRelation, column: str, window_spec: str, offset: int = 1, default_value: str = 'NULL', ignore_nulls: bool = False, projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation

Computes the lag within the partition

last(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation

Returns the last value of a given column

last_value(self: duckdb.duckdb.DuckDBPyRelation, column: str, window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation

Computes the last value within the group or partition

lead(self: duckdb.duckdb.DuckDBPyRelation, column: str, window_spec: str, offset: int = 1, default_value: str = 'NULL', ignore_nulls: bool = False, projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation

Computes the lead within the partition

limit(self: duckdb.duckdb.DuckDBPyRelation, n: int, offset: int = 0) duckdb.duckdb.DuckDBPyRelation

Only retrieve the first n rows from this relation object, starting at offset

list(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation

Returns a list containing all values present in a given column

map(self: duckdb.duckdb.DuckDBPyRelation, map_function: Callable, *, schema: Optional[object] = None) duckdb.duckdb.DuckDBPyRelation

Calls the passed function on the relation

max(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation

Returns the maximum value present in a given column

mean(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation

Computes the average of a given column

median(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation

Computes the median over all values present in a given column

min(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation

Returns the minimum value present in a given column

mode(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation

Computes the mode (most frequent value) of all values present in a given column

n_tile(self: duckdb.duckdb.DuckDBPyRelation, window_spec: str, num_buckets: int, projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation

Divides the partition as equally as possible into num_buckets

nth_value(self: duckdb.duckdb.DuckDBPyRelation, column: str, window_spec: str, offset: int, ignore_nulls: bool = False, projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation

Computes the nth value within the partition

order(self: duckdb.duckdb.DuckDBPyRelation, order_expr: str) duckdb.duckdb.DuckDBPyRelation

Reorder the relation object by order_expr

percent_rank(self: duckdb.duckdb.DuckDBPyRelation, window_spec: str, projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation

Computes the relative rank within the partition

pl(self: duckdb.duckdb.DuckDBPyRelation, batch_size: int = 1000000) duckdb::PolarsDataFrame

Execute and fetch all rows as a Polars DataFrame

product(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation

Returns the product of all values present in a given column

project(self: duckdb.duckdb.DuckDBPyRelation, *args, groups: str = '') duckdb.duckdb.DuckDBPyRelation

Project the relation object by the projection in project_expr

quantile(self: duckdb.duckdb.DuckDBPyRelation, column: str, q: object = 0.5, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation

Computes the exact quantile value for a given column

quantile_cont(self: duckdb.duckdb.DuckDBPyRelation, column: str, q: object = 0.5, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation

Computes the interpolated quantile value for a given column

quantile_disc(self: duckdb.duckdb.DuckDBPyRelation, column: str, q: object = 0.5, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation

Computes the exact quantile value for a given column

query(self: duckdb.duckdb.DuckDBPyRelation, virtual_table_name: str, sql_query: str) duckdb.duckdb.DuckDBPyRelation

Run the given SQL query in sql_query on the view named virtual_table_name that refers to the relation object

rank(self: duckdb.duckdb.DuckDBPyRelation, window_spec: str, projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation

Computes the rank within the partition

rank_dense(self: duckdb.duckdb.DuckDBPyRelation, window_spec: str, projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation

Computes the dense rank within the partition

record_batch(self: duckdb.duckdb.DuckDBPyRelation, batch_size: int = 1000000) pyarrow.lib.RecordBatchReader

Execute and return an Arrow Record Batch Reader that yields all rows

row_number(self: duckdb.duckdb.DuckDBPyRelation, window_spec: str, projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation

Computes the row number within the partition

select(self: duckdb.duckdb.DuckDBPyRelation, *args, groups: str = '') duckdb.duckdb.DuckDBPyRelation

Project the relation object by the projection in project_expr

select_dtypes(self: duckdb.duckdb.DuckDBPyRelation, types: object) duckdb.duckdb.DuckDBPyRelation

Select columns from the relation by filtering based on type(s)

select_types(self: duckdb.duckdb.DuckDBPyRelation, types: object) duckdb.duckdb.DuckDBPyRelation

Select columns from the relation by filtering based on type(s)

set_alias(self: duckdb.duckdb.DuckDBPyRelation, alias: str) duckdb.duckdb.DuckDBPyRelation

Rename the relation object to a new alias

property shape

Tuple of the number of rows and the number of columns in the relation.

show(self: duckdb.duckdb.DuckDBPyRelation, *, max_width: Optional[int] = None, max_rows: Optional[int] = None, max_col_width: Optional[int] = None, null_value: Optional[str] = None, render_mode: object = None) None

Display a summary of the data

sort(self: duckdb.duckdb.DuckDBPyRelation, *args) duckdb.duckdb.DuckDBPyRelation

Reorder the relation object by the provided expressions

sql_query(self: duckdb.duckdb.DuckDBPyRelation) str

Get the SQL query that is equivalent to the relation

std(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation

Computes the sample standard deviation for a given column

stddev(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation

Computes the sample standard deviation for a given column

stddev_pop(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation

Computes the population standard deviation for a given column

stddev_samp(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation

Computes the sample standard deviation for a given column

string_agg(self: duckdb.duckdb.DuckDBPyRelation, column: str, sep: str = ',', groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation

Concatenates the values present in a given column with a separator

sum(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation

Computes the sum of all values present in a given column

tf(self: duckdb.duckdb.DuckDBPyRelation) dict

Fetch a result as a dict of TensorFlow Tensors

to_arrow_table(self: duckdb.duckdb.DuckDBPyRelation, batch_size: int = 1000000) pyarrow.lib.Table

Execute and fetch all rows as an Arrow table

to_csv(self: duckdb.duckdb.DuckDBPyRelation, file_name: str, *, sep: object = None, na_rep: object = None, header: object = None, quotechar: object = None, escapechar: object = None, date_format: object = None, timestamp_format: object = None, quoting: object = None, encoding: object = None, compression: object = None, overwrite: object = None, per_thread_output: object = None, use_tmp_file: object = None, partition_by: object = None, write_partition_columns: object = None) None

Write the relation object to a CSV file in 'file_name'

to_df(self: duckdb.duckdb.DuckDBPyRelation, *, date_as_object: bool = False) pandas.DataFrame

Execute and fetch all rows as a pandas DataFrame

to_parquet(self: duckdb.duckdb.DuckDBPyRelation, file_name: str, *, compression: object = None, field_ids: object = None, row_group_size_bytes: object = None, row_group_size: object = None) None

Write the relation object to a Parquet file in 'file_name'

to_table(self: duckdb.duckdb.DuckDBPyRelation, table_name: str) None

Creates a new table named table_name with the contents of the relation object

to_view(self: duckdb.duckdb.DuckDBPyRelation, view_name: str, replace: bool = True) duckdb.duckdb.DuckDBPyRelation

Creates a view named view_name that refers to the relation object

torch(self: duckdb.duckdb.DuckDBPyRelation) dict

Fetch a result as a dict of PyTorch Tensors

property type

Get the type of the relation.

property types

Return a list containing the types of the columns of the relation.

union(self: duckdb.duckdb.DuckDBPyRelation, union_rel: duckdb.duckdb.DuckDBPyRelation) duckdb.duckdb.DuckDBPyRelation

Create the set union of this relation object with another relation object in union_rel

unique(self: duckdb.duckdb.DuckDBPyRelation, unique_aggr: str) duckdb.duckdb.DuckDBPyRelation

Number of distinct values in a column.

value_counts(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '') duckdb.duckdb.DuckDBPyRelation

Computes the number of elements present in a given column, also projecting the original column

var(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation

Computes the sample variance for a given column

var_pop(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation

Computes the population variance for a given column

var_samp(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation

Computes the sample variance for a given column

variance(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation

Computes the sample variance for a given column

write_csv(self: duckdb.duckdb.DuckDBPyRelation, file_name: str, *, sep: object = None, na_rep: object = None, header: object = None, quotechar: object = None, escapechar: object = None, date_format: object = None, timestamp_format: object = None, quoting: object = None, encoding: object = None, compression: object = None, overwrite: object = None, per_thread_output: object = None, use_tmp_file: object = None, partition_by: object = None, write_partition_columns: object = None) None

Write the relation object to a CSV file in 'file_name'

write_parquet(self: duckdb.duckdb.DuckDBPyRelation, file_name: str, *, compression: object = None, field_ids: object = None, row_group_size_bytes: object = None, row_group_size: object = None) None

Write the relation object to a Parquet file in 'file_name'

exception duckdb.Error

Bases: Exception

class duckdb.ExplainType

Bases: pybind11_object

Members:

STANDARD

ANALYZE

ANALYZE = <ExplainType.ANALYZE: 1>

STANDARD = <ExplainType.STANDARD: 0>

property name

property value

class duckdb.Expression

Bases: pybind11_object

alias(self: duckdb.duckdb.Expression, arg0: str) duckdb.duckdb.Expression

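Create a copy of this expression with the given alias.

Parameters:

name: The alias to use for the expression; this affects how it can be referenced.

Returns:

Expression: self with an alias.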

asc(self: duckdb.duckdb.Expression) duckdb.duckdb.Expression

Set the order by modifier to ASCENDING.

cast(self: duckdb.duckdb.Expression, type: duckdb.duckdb.typing.DuckDBPyType) duckdb.duckdb.Expression

Create a CastExpression to type from self

Parameters:

type: The type to cast to

Returns:

CastExpression: self::type

desc(self: duckdb.duckdb.Expression) duckdb.duckdb.Expression

Set the order by modifier to DESCENDING.

isin(self: duckdb.duckdb.Expression, *args) duckdb.duckdb.Expression

Return an IN expression comparing self to the input arguments.

Returns:

DuckDBPyExpression: The compare IN expression

isnotin(self: duckdb.duckdb.Expression, *args) duckdb.duckdb.Expression

Return a NOT IN expression comparing self to the input arguments.

Returns:

DuckDBPyExpression: The compare NOT IN expression

isnotnull(self: duckdb.duckdb.Expression) duckdb.duckdb.Expression

Create a binary IS NOT NULL expression from self

Returns:

DuckDBPyExpression: self IS NOT NULL

isnull(self: duckdb.duckdb.Expression) duckdb.duckdb.Expression

Create a binary IS NULL expression from self

Returns:

DuckDBPyExpression: self IS NULL

nulls_first(self: duckdb.duckdb.Expression) duckdb.duckdb.Expression

Set the NULL order by modifier to NULLS FIRST.

nulls_last(self: duckdb.duckdb.Expression) duckdb.duckdb.Expression

Set the NULL order by modifier to NULLS LAST.

otherwise(self: duckdb.duckdb.Expression, value: duckdb.duckdb.Expression) duckdb.duckdb.Expression

Add an ELSE clause to the CaseExpression.

Parameters:

value: The value to use if none of the WHEN conditions are met.

Returns:

CaseExpression: self with an ELSE clause.

show(self: duckdb.duckdb.Expression) None

Prints the stringified version of the expression.

when(self: duckdb.duckdb.Expression, condition: duckdb.duckdb.Expression, value: duckdb.duckdb.Expression) duckdb.duckdb.Expression

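Add an additional WHEN THEN clause to the CaseExpression.

Parameters:

condition: The condition that must be met. value: The value to use if the condition is met.

Returns:

CaseExpression: self with an additional WHEN clause.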

exception duckdb.FatalException

Bases: DatabaseError

class duckdb.FloatValue(object: Any)

Bases: Value

duckdb.FunctionExpression(function_name: str, *args) duckdb.duckdb.Expression

exception duckdb.HTTPException

Bases: IOException

Thrown when an error occurs in the httpfs extension, or whilst downloading an extension.

body: str

headers: Dict[str, str]

reason: str

status_code: int

class duckdb.HugeIntegerValue(object: Any)

Bases: Value

exception duckdb.IOException

Bases: OperationalError

class duckdb.IntegerValue(object: Any)

Bases: Value

exception duckdb.IntegrityError

Bases: DatabaseError

exception duckdb.InternalError

Bases: DatabaseError

exception duckdb.InternalException

Bases: InternalError

exception duckdb.InterruptException

Bases: DatabaseError

class duckdb.IntervalValue(object: Any)

Bases: Value

exception duckdb.InvalidInputException

Bases: ProgrammingError

exception duckdb.InvalidTypeException

Bases: ProgrammingError

class duckdb.LongValue(object: Any)

Bases: Value

exception duckdb.NotImplementedException

Bases: NotSupportedError

exception duckdb.NotSupportedError

Bases: DatabaseError

class duckdb.NullValue

Bases: Value

exception duckdb.OperationalError

Bases: DatabaseError

exception duckdb.OutOfMemoryException

Bases: OperationalError

exception duckdb.OutOfRangeException

Bases: DataError

exception duckdb.ParserException

Bases: ProgrammingError

exception duckdb.PermissionException

Bases: DatabaseError

exception duckdb.ProgrammingError

Bases: DatabaseError

class duckdb.PythonExceptionHandling

Bases: pybind11_object

Members:

DEFAULT

RETURN_NULL

DEFAULT = <PythonExceptionHandling.DEFAULT: 0>

RETURN_NULL = <PythonExceptionHandling.RETURN_NULL: 1>

property name

property value

exception duckdb.SequenceException

Bases: DatabaseError

exception duckdb.SerializationException

Bases: OperationalError

class duckdb.ShortValue(object: Any)

Bases: Value

duckdb.StarExpression(*args, **kwargs)

Overloaded function.

  1. StarExpression(*, exclude: object = None) -> duckdb.duckdb.Expression

  2. StarExpression() -> duckdb.duckdb.Expression

class duckdb.StringValue(object: Any)

Bases: Value

exception duckdb.SyntaxException

Bases: ProgrammingError

class duckdb.TimeTimeZoneValue(object: Any)

Bases: Value

class duckdb.TimeValue(object: Any)

Bases: Value

class duckdb.TimestampMilisecondValue(object: Any)

Bases: Value

class duckdb.TimestampNanosecondValue(object: Any)

Bases: Value

class duckdb.TimestampSecondValue(object: Any)

Bases: Value

class duckdb.TimestampTimeZoneValue(object: Any)

Bases: Value

class duckdb.TimestampValue(object: Any)

Bases: Value

exception duckdb.TransactionException

Bases: OperationalError

exception duckdb.TypeMismatchException

Bases: DataError

class duckdb.UUIDValue(object: Any)

Bases: Value

class duckdb.UnsignedBinaryValue(object: Any)

Bases: Value

class duckdb.UnsignedIntegerValue(object: Any)

Bases: Value

class duckdb.UnsignedLongValue(object: Any)

Bases: Value

class duckdb.UnsignedShortValue(object: Any)

Bases: Value

class duckdb.Value(object: Any, type: DuckDBPyType)

Bases: object

exception duckdb.Warning

Bases: Exception

duckdb.aggregate(df: pandas.DataFrame, aggr_expr: object, group_expr: str = '', *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.DuckDBPyRelation

Compute the aggregate aggr_expr by the optional groups group_expr on the relation

duckdb.alias(df: pandas.DataFrame, alias: str, *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.DuckDBPyRelation

Rename the relation object to the new alias

duckdb.append(table_name: str, df: pandas.DataFrame, *, by_name: bool = False, connection: duckdb.DuckDBPyConnection = None) duckdb.DuckDBPyConnection

Append the passed DataFrame to the named table

duckdb.array_type(type: duckdb.duckdb.typing.DuckDBPyType, size: int, *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.typing.DuckDBPyType

Create an array type object of 'type'

duckdb.arrow(*args, **kwargs)

Overloaded function.

  1. arrow(rows_per_batch: int = 1000000, *, connection: duckdb.DuckDBPyConnection = None) -> pyarrow.lib.Table

Fetch a result as Arrow table following execute()

  2. arrow(arrow_object: object, *, connection: duckdb.DuckDBPyConnection = None) -> duckdb.duckdb.DuckDBPyRelation

Create a relation object from an Arrow object

duckdb.begin(*, connection: duckdb.DuckDBPyConnection = None) duckdb.DuckDBPyConnection

Start a new transaction

duckdb.checkpoint(*, connection: duckdb.DuckDBPyConnection = None) duckdb.DuckDBPyConnection

Synchronizes data in the write-ahead log (WAL) to the database data file (no-op for in-memory connections)

duckdb.close(*, connection: duckdb.DuckDBPyConnection = None) None

Close the connection

duckdb.commit(*, connection: duckdb.DuckDBPyConnection = None) duckdb.DuckDBPyConnection

Commit changes performed within a transaction

duckdb.connect(database: object = ':memory:', read_only: bool = False, config: dict = None) duckdb.DuckDBPyConnection

Create a DuckDB database instance. Can take a database file name to read/write persistent data, and a read_only flag if no changes are desired.

duckdb.create_function(name: str, function: Callable, parameters: object = None, return_type: duckdb.duckdb.typing.DuckDBPyType = None, *, type: duckdb.duckdb.functional.PythonUDFType = <PythonUDFType.NATIVE: 0>, null_handling: duckdb.duckdb.functional.FunctionNullHandling = <FunctionNullHandling.DEFAULT: 0>, exception_handling: duckdb.duckdb.PythonExceptionHandling = <PythonExceptionHandling.DEFAULT: 0>, side_effects: bool = False, connection: duckdb.DuckDBPyConnection = None) duckdb.DuckDBPyConnection

Create a DuckDB function out of the passed-in Python function so it can be used in queries

duckdb.cursor(*, connection: duckdb.DuckDBPyConnection = None) duckdb.DuckDBPyConnection

Create a duplicate of the current connection

duckdb.decimal_type(width: int, scale: int, *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.typing.DuckDBPyType

Create a decimal type with 'width' and 'scale'

duckdb.description(*, connection: duckdb.DuckDBPyConnection = None) Optional[list]

Get result set attributes, mainly column names

duckdb.df(*args, **kwargs)

Overloaded function.

  1. df(*, date_as_object: bool = False, connection: duckdb.DuckDBPyConnection = None) -> pandas.DataFrame

Fetch a result as DataFrame following execute()

  2. df(df: pandas.DataFrame, *, connection: duckdb.DuckDBPyConnection = None) -> duckdb.duckdb.DuckDBPyRelation

Create a relation object from the DataFrame df

duckdb.distinct(df: pandas.DataFrame, *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.DuckDBPyRelation

Retrieve distinct rows from this relation object

duckdb.dtype(type_str: str, *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.typing.DuckDBPyType

Create a type object by parsing the 'type_str' string

duckdb.duplicate(*, connection: duckdb.DuckDBPyConnection = None) duckdb.DuckDBPyConnection

Create a duplicate of the current connection

duckdb.enum_type(name: str, type: duckdb.duckdb.typing.DuckDBPyType, values: list, *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.typing.DuckDBPyType

Create an enum type of underlying 'type', consisting of the list of 'values'

duckdb.execute(query: object, parameters: object = None, *, connection: duckdb.DuckDBPyConnection = None) duckdb.DuckDBPyConnection

Execute the given SQL query, optionally using prepared statements with parameters set

duckdb.executemany(query: object, parameters: object = None, *, connection: duckdb.DuckDBPyConnection = None) duckdb.DuckDBPyConnection

Execute the given prepared statement multiple times using the list of parameter sets in parameters

duckdb.extract_statements(query: str, *, connection: duckdb.DuckDBPyConnection = None) list

Parse the query string and extract the Statement object(s) produced

duckdb.fetch_arrow_table(rows_per_batch: int = 1000000, *, connection: duckdb.DuckDBPyConnection = None) pyarrow.lib.Table

Fetch a result as Arrow table following execute()

duckdb.fetch_df(*, date_as_object: bool = False, connection: duckdb.DuckDBPyConnection = None) pandas.DataFrame

Fetch a result as DataFrame following execute()

duckdb.fetch_df_chunk(vectors_per_chunk: int = 1, *, date_as_object: bool = False, connection: duckdb.DuckDBPyConnection = None) pandas.DataFrame

Fetch a chunk of the result as DataFrame following execute()

duckdb.fetch_record_batch(rows_per_batch: int = 1000000, *, connection: duckdb.DuckDBPyConnection = None) pyarrow.lib.RecordBatchReader

Fetch an Arrow RecordBatchReader following execute()

duckdb.fetchall(*, connection: duckdb.DuckDBPyConnection = None) list

Fetch all rows from a result following execute

duckdb.fetchdf(*, date_as_object: bool = False, connection: duckdb.DuckDBPyConnection = None) pandas.DataFrame

Fetch a result as DataFrame following execute()

duckdb.fetchmany(size: int = 1, *, connection: duckdb.DuckDBPyConnection = None) list

Fetch the next set of rows from a result following execute

duckdb.fetchnumpy(*, connection: duckdb.DuckDBPyConnection = None) dict

Fetch a result as a dict of NumPy arrays (one per column) following execute

duckdb.fetchone(*, connection: duckdb.DuckDBPyConnection = None) Optional[tuple]

Fetch a single row from a result following execute

duckdb.filesystem_is_registered(name: str, *, connection: duckdb.DuckDBPyConnection = None) bool

Check if a filesystem with the provided name is currently registered

duckdb.filter(df: pandas.DataFrame, filter_expr: object, *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.DuckDBPyRelation

Filter the relation object by the filter in filter_expr

duckdb.from_arrow(arrow_object: object, *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.DuckDBPyRelation

Create a relation object from an Arrow object

duckdb.from_csv_auto(path_or_buffer: object, **kwargs) duckdb.duckdb.DuckDBPyRelation

Create a relation object from the CSV file in 'path_or_buffer'

duckdb.from_df(df: pandas.DataFrame, *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.DuckDBPyRelation

Create a relation object from the DataFrame df

duckdb.from_parquet(*args, **kwargs)

Overloaded function.

  1. from_parquet(file_glob: str, binary_as_string: bool = False, *, file_row_number: bool = False, filename: bool = False, hive_partitioning: bool = False, union_by_name: bool = False, compression: object = None, connection: duckdb.DuckDBPyConnection = None) -> duckdb.duckdb.DuckDBPyRelation

Create a relation object from the Parquet files in file_glob

  2. from_parquet(file_globs: list[str], binary_as_string: bool = False, *, file_row_number: bool = False, filename: bool = False, hive_partitioning: bool = False, union_by_name: bool = False, compression: object = None, connection: duckdb.DuckDBPyConnection = None) -> duckdb.duckdb.DuckDBPyRelation

Create a relation object from the Parquet files in file_globs

duckdb.from_query(query: object, *, alias: str = '', params: object = None, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.DuckDBPyRelation

Run a SQL query. If it is a SELECT statement, create a relation object from the given SQL query; otherwise run the query as-is.

duckdb.from_substrait(proto: bytes, *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.DuckDBPyRelation

Create a query object from a protobuf plan

duckdb.from_substrait_json(json: str, *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.DuckDBPyRelation

Create a query object from a JSON protobuf plan

duckdb.get_substrait(query: str, *, enable_optimizer: bool = True, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.DuckDBPyRelation

Serialize a query to protobuf

duckdb.get_substrait_json(query: str, *, enable_optimizer: bool = True, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.DuckDBPyRelation

Serialize a query to protobuf in JSON format

duckdb.get_table_names(query: str, *, connection: duckdb.DuckDBPyConnection = None) set[str]

Extract the required table names from a query

duckdb.install_extension(extension: str, *, force_install: bool = False, repository: object = None, repository_url: object = None, version: object = None, connection: duckdb.DuckDBPyConnection = None) None

Install an extension by name, with an optional version and/or repository to get the extension from

duckdb.interrupt(*, connection: duckdb.DuckDBPyConnection = None) None

Interrupt pending operations

duckdb.limit(df: pandas.DataFrame, n: int, offset: int = 0, *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.DuckDBPyRelation

Retrieve only the first n rows from this relation object, starting at offset

duckdb.list_filesystems(*, connection: duckdb.DuckDBPyConnection = None) list

List registered filesystems, including builtin ones

duckdb.list_type(type: duckdb.duckdb.typing.DuckDBPyType, *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.typing.DuckDBPyType

Create a list type object of 'type'

duckdb.load_extension(extension: str, *, connection: duckdb.DuckDBPyConnection = None) None

Load an installed extension

duckdb.map_type(key: duckdb.duckdb.typing.DuckDBPyType, value: duckdb.duckdb.typing.DuckDBPyType, *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.typing.DuckDBPyType

Create a map type object from the 'key' and 'value' types

duckdb.order(df: pandas.DataFrame, order_expr: str, *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.DuckDBPyRelation

Reorder the relation object by order_expr

duckdb.pl(rows_per_batch: int = 1000000, *, connection: duckdb.DuckDBPyConnection = None) duckdb::PolarsDataFrame

Fetch a result as Polars DataFrame following execute()

duckdb.project(df: pandas.DataFrame, *args, groups: str = '', connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.DuckDBPyRelation

Project the relation object by the projection in project_expr

duckdb.query(query: object, *, alias: str = '', params: object = None, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.DuckDBPyRelation

Run a SQL query. If it is a SELECT statement, create a relation object from the given SQL query; otherwise run the query as-is.

duckdb.query_df(df: pandas.DataFrame, virtual_table_name: str, sql_query: str, *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.DuckDBPyRelation

Run the SQL query given in sql_query on the view named virtual_table_name, which refers to the relation object

duckdb.read_csv(path_or_buffer: object, **kwargs) duckdb.duckdb.DuckDBPyRelation

Create a relation object from the CSV file in 'path_or_buffer'

duckdb.read_json(path_or_buffer: object, *, columns: Optional[object] = None, sample_size: Optional[object] = None, maximum_depth: Optional[object] = None, records: Optional[str] = None, format: Optional[str] = None, date_format: Optional[object] = None, timestamp_format: Optional[object] = None, compression: Optional[object] = None, maximum_object_size: Optional[object] = None, ignore_errors: Optional[object] = None, convert_strings_to_integers: Optional[object] = None, field_appearance_threshold: Optional[object] = None, map_inference_threshold: Optional[object] = None, maximum_sample_files: Optional[object] = None, filename: Optional[object] = None, hive_partitioning: Optional[object] = None, union_by_name: Optional[object] = None, hive_types: Optional[object] = None, hive_types_autocast: Optional[object] = None, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.DuckDBPyRelation

Create a relation object from the JSON file in 'path_or_buffer'

duckdb.read_parquet(*args, **kwargs)

Overloaded function.

  1. read_parquet(file_glob: str, binary_as_string: bool = False, *, file_row_number: bool = False, filename: bool = False, hive_partitioning: bool = False, union_by_name: bool = False, compression: object = None, connection: duckdb.DuckDBPyConnection = None) -> duckdb.duckdb.DuckDBPyRelation

Create a relation object from the Parquet files in file_glob

  2. read_parquet(file_globs: list[str], binary_as_string: bool = False, *, file_row_number: bool = False, filename: bool = False, hive_partitioning: bool = False, union_by_name: bool = False, compression: object = None, connection: duckdb.DuckDBPyConnection = None) -> duckdb.duckdb.DuckDBPyRelation

Create a relation object from the Parquet files in file_globs

duckdb.register(view_name: str, python_object: object, *, connection: duckdb.DuckDBPyConnection = None) duckdb.DuckDBPyConnection

Register the passed Python object value for querying with a view

duckdb.register_filesystem(filesystem: fsspec.AbstractFileSystem, *, connection: duckdb.DuckDBPyConnection = None) None

Register an fsspec-compliant filesystem

duckdb.remove_function(name: str, *, connection: duckdb.DuckDBPyConnection = None) duckdb.DuckDBPyConnection

Remove a previously created function

duckdb.rollback(*, connection: duckdb.DuckDBPyConnection = None) duckdb.DuckDBPyConnection

Roll back changes performed within a transaction

duckdb.row_type(fields: object, *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.typing.DuckDBPyType

Create a struct type object from 'fields'

duckdb.rowcount(*, connection: duckdb.DuckDBPyConnection = None) int

Get result set row count

duckdb.sql(query: object, *, alias: str = '', params: object = None, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.DuckDBPyRelation

Run a SQL query. If it is a SELECT statement, create a relation object from the given SQL query; otherwise run the query as-is.

duckdb.sqltype(type_str: str, *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.typing.DuckDBPyType

Create a type object by parsing the 'type_str' string

duckdb.string_type(collation: str = '', *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.typing.DuckDBPyType

Create a string type with an optional collation

duckdb.struct_type(fields: object, *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.typing.DuckDBPyType

Create a struct type object from 'fields'

duckdb.table(table_name: str, *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.DuckDBPyRelation

Create a relation object for the named table

duckdb.table_function(name: str, parameters: object = None, *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.DuckDBPyRelation

Create a relation object from the named table function with the given parameters

duckdb.tf(*, connection: duckdb.DuckDBPyConnection = None) dict

Fetch a result as dict of TensorFlow Tensors following execute()

class duckdb.token_type

Bases: pybind11_object

Members:

identifier

numeric_const

string_const

operator

keyword

comment

comment = <token_type.comment: 5>
identifier = <token_type.identifier: 0>
keyword = <token_type.keyword: 4>
property name
numeric_const = <token_type.numeric_const: 1>
operator = <token_type.operator: 3>
string_const = <token_type.string_const: 2>
property value
duckdb.tokenize(query: str) list

Tokenizes a SQL string, returning a list of (position, type) tuples that can be used for, e.g., syntax highlighting

duckdb.torch(*, connection: duckdb.DuckDBPyConnection = None) dict

Fetch a result as dict of PyTorch Tensors following execute()

duckdb.type(type_str: str, *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.typing.DuckDBPyType

Create a type object by parsing the 'type_str' string

duckdb.union_type(members: object, *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.typing.DuckDBPyType

Create a union type object from 'members'

duckdb.unregister(view_name: str, *, connection: duckdb.DuckDBPyConnection = None) duckdb.DuckDBPyConnection

Unregister the view name

duckdb.unregister_filesystem(name: str, *, connection: duckdb.DuckDBPyConnection = None) None

Unregister a filesystem

duckdb.values(values: object, *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.DuckDBPyRelation

Create a relation object from the passed values

duckdb.view(view_name: str, *, connection: duckdb.DuckDBPyConnection = None) duckdb.duckdb.DuckDBPyRelation

Create a relation object for the named view

duckdb.write_csv(df: pandas.DataFrame, filename: str, *, sep: object = None, na_rep: object = None, header: object = None, quotechar: object = None, escapechar: object = None, date_format: object = None, timestamp_format: object = None, quoting: object = None, encoding: object = None, compression: object = None, overwrite: object = None, per_thread_output: object = None, use_tmp_file: object = None, partition_by: object = None, write_partition_columns: object = None, connection: duckdb.DuckDBPyConnection = None) None

Write the relation object to a CSV file named 'filename'