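DuckDB provides a Swift API. See the announcement post for details.
Instantiating DuckDB
DuckDB supports both in-memory and persistent databases. To work with an in-memory database, run: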
let database = try Database(store: .inMemory)
To work with a persistent database, run:
// The file store takes a file URL, so Foundation must be imported for URL
let database = try Database(store: .file(at: URL(fileURLWithPath: "test.db")))
Queries can be issued through a database connection.
let connection = try database.connect()
DuckDB supports multiple connections per database.
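As a rough illustration of several connections sharing one database, the sketch below opens two connections to the same in-memory database, writes through one, and reads through the other. The function name, the `planets` table, and its two rows are made up for this example. Only `Database(store:)`, `connect()`, `execute`, `query`, and `cast(to:)` are taken from this page.

```swift
import DuckDB

// Sketch only: the function, table name, and rows are illustrative.
func demonstrateSharedConnections() throws {
  let database = try Database(store: .inMemory)

  // Two independent connections to the same database
  let writer = try database.connect()
  let reader = try database.connect()

  // Create and fill a small table through the first connection
  try writer.execute("CREATE TABLE planets (name VARCHAR, disc_year INTEGER);")
  try writer.execute("INSERT INTO planets VALUES ('Kepler-22 b', 2011), ('51 Peg b', 1995);")

  // Read the data back through the second connection
  let result = try reader.query("SELECT count(*) FROM planets;")
  let countColumn = result[0].cast(to: Int.self)
  print(countColumn)
}
```

Both connections see the same table because they belong to the same `Database` instance.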
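Application Example
The rest of this page is based on the example from our announcement post, which uses raw data loaded directly into DuckDB from NASA's Exoplanet Archive.
Creating an Application-Specific Type
We first create an application-specific type to house our database and connection. We will eventually define our application-specific queries through this type.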
import DuckDB

final class ExoplanetStore {

  let database: Database
  let connection: Connection

  init(database: Database, connection: Connection) {
    self.database = database
    self.connection = connection
  }
}
Loading a CSV File
We load the data from NASA's Exoplanet Archive:
wget "https://exoplanetarchive.ipac.caltech.edu/TAP/sync?query=select+pl_name+,+disc_year+from+pscomppars&format=csv" -O downloaded_exoplanets.csv
Once the CSV file has been downloaded locally, we can load it into DuckDB as a new table with the following SQL command:
CREATE TABLE exoplanets AS
SELECT * FROM read_csv('downloaded_exoplanets.csv');
Let's package this up as a new asynchronous factory method on our ExoplanetStore type:
import DuckDB
import Foundation

final class ExoplanetStore {

  // Factory method to create and prepare a new ExoplanetStore
  static func create() async throws -> ExoplanetStore {

    // Create our database and connection as described above
    let database = try Database(store: .inMemory)
    let connection = try database.connect()

    // Download the CSV from the exoplanet archive
    let (csvFileURL, _) = try await URLSession.shared.download(
      from: URL(string: "https://exoplanetarchive.ipac.caltech.edu/TAP/sync?query=select+pl_name+,+disc_year+from+pscomppars&format=csv")!)

    // Issue our first query to DuckDB
    try connection.execute("""
      CREATE TABLE exoplanets AS
        SELECT * FROM read_csv('\(csvFileURL.path)');
      """)

    // Create our pre-populated ExoplanetStore instance
    return ExoplanetStore(
      database: database,
      connection: connection
    )
  }

  // Let's make the initializer we defined previously
  // private. This prevents anyone accidentally instantiating
  // the store without having pre-loaded our Exoplanet CSV
  // into the database
  private init(database: Database, connection: Connection) {
    ...
  }
}
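The factory method interpolates the downloaded file's path directly into the SQL text. A path containing a single quote would end the SQL string literal early and break the statement. One possible guard, sketched below, doubles any embedded single quotes, which is the standard SQL escape. The helper `sqlStringLiteral` and the example path are hypothetical and not part of the DuckDB Swift API.

```swift
// Hypothetical helper, not part of DuckDB: wraps a value in single quotes
// and doubles any embedded single quotes so it forms a valid SQL string literal.
func sqlStringLiteral(_ value: String) -> String {
  let escaped = value
    .split(separator: "'", omittingEmptySubsequences: false)
    .joined(separator: "''")
  return "'\(escaped)'"
}

// Illustrative path containing a single quote
let pathLiteral = sqlStringLiteral("/tmp/o'brien/exoplanets.csv")

// The escaped literal can then be interpolated into the statement text
let statement = """
  CREATE TABLE exoplanets AS
    SELECT * FROM read_csv(\(pathLiteral));
  """
print(statement)
```

In the factory method, this would amount to replacing `'\(csvFileURL.path)'` with `\(sqlStringLiteral(csvFileURL.path))`.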
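Querying the Database
The following example queries DuckDB from within Swift via an async function. This means the caller is not blocked while the query executes. We then cast the result columns to native Swift types using DuckDB's ResultSet cast(to:) family of methods, and finally wrap them in a DataFrame from the TabularData framework.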
...
import TabularData

extension ExoplanetStore {

  // Retrieves the number of exoplanets discovered by year
  func groupedByDiscoveryYear() async throws -> DataFrame {

    // Issue the query we described above
    let result = try connection.query("""
      SELECT disc_year, count(disc_year) AS Count
        FROM exoplanets
        GROUP BY disc_year
        ORDER BY disc_year
      """)

    // Cast our DuckDB columns to their native Swift
    // equivalent types
    let discoveryYearColumn = result[0].cast(to: Int.self)
    let countColumn = result[1].cast(to: Int.self)

    // Use our DuckDB columns to instantiate TabularData
    // columns and populate a TabularData DataFrame
    return DataFrame(columns: [
      TabularData.Column(discoveryYearColumn).eraseToAnyColumn(),
      TabularData.Column(countColumn).eraseToAnyColumn(),
    ])
  }
}
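To show how the pieces fit together, here is a minimal usage sketch. It assumes the `ExoplanetStore` type and its extension from this page are in scope and that the archive is reachable over the network. The function name `printDiscoveriesByYear` is made up for this example.

```swift
import DuckDB
import TabularData

// Sketch only: relies on ExoplanetStore.create() and
// groupedByDiscoveryYear() as defined on this page.
func printDiscoveriesByYear() async {
  do {
    // Download the CSV and load it into an in-memory database
    let store = try await ExoplanetStore.create()

    // Run the grouped query and receive a TabularData DataFrame
    let discoveries = try await store.groupedByDiscoveryYear()
    print(discoveries)
  } catch {
    print("Failed to load exoplanet data: \(error)")
  }
}
```

Catching the error in one place keeps download failures and query failures on the same path, which may be enough for a small app.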
Complete Project
To see the complete example project, clone the DuckDB Swift repository and open the runnable app project located in Examples/SwiftUI/ExoplanetExplorer.xcodeproj.