luasql-duckdb和luadbi-duckdb两个模块都能访问duckdb数据库,而luadbi-duckdb只对有限个数据类型做了处理,如DECIMAL就没有处理,不满足要求,所以用luasql-duckdb实现。
先用简单的例子分析,仿照已有的示例,对duckdb 生成的tpch数据集的lineitem表查询
D load tpch;
D call dbgen(sf=0.3);
D describe lineitem;
┌─────────────────┬───────────────┬─────────┬─────────┬─────────┬─────────┐
│ column_name │ column_type │ null │ key │ default │ extra │
│ varchar │ varchar │ varchar │ varchar │ varchar │ varchar │
├─────────────────┼───────────────┼─────────┼─────────┼─────────┼─────────┤
│ l_orderkey │ BIGINT │ NO │ NULL │ NULL │ NULL │
│ l_partkey │ BIGINT │ NO │ NULL │ NULL │ NULL │
│ l_suppkey │ BIGINT │ NO │ NULL │ NULL │ NULL │
│ l_linenumber │ BIGINT │ NO │ NULL │ NULL │ NULL │
│ l_quantity │ DECIMAL(15,2) │ NO │ NULL │ NULL │ NULL │
│ l_extendedprice │ DECIMAL(15,2) │ NO │ NULL │ NULL │ NULL │
│ l_discount │ DECIMAL(15,2) │ NO │ NULL │ NULL │ NULL │
│ l_tax │ DECIMAL(15,2) │ NO │ NULL │ NULL │ NULL │
│ l_returnflag │ VARCHAR │ NO │ NULL │ NULL │ NULL │
│ l_linestatus │ VARCHAR │ NO │ NULL │ NULL │ NULL │
│ l_shipdate │ DATE │ NO │ NULL │ NULL │ NULL │
│ l_commitdate │ DATE │ NO │ NULL │ NULL │ NULL │
│ l_receiptdate │ DATE │ NO │ NULL │ NULL │ NULL │
│ l_shipinstruct │ VARCHAR │ NO │ NULL │ NULL │ NULL │
│ l_shipmode │ VARCHAR │ NO │ NULL │ NULL │ NULL │
│ l_comment │ VARCHAR │ NO │ NULL │ NULL │ NULL │
├─────────────────┴───────────────┴─────────┴─────────┴─────────┴─────────┤
│ 16 rows 6 columns │
└─────────────────────────────────────────────────────────────────────────┘
上面有4种数据类型,lua代码如下
driver = require"luasql.duckdb"
-- create environment object
env = assert (driver.duckdb())
-- connect to data source
con = assert (env:connect("tpch_duckdb"))
cur = assert (con:execute"SELECT * from lineitem limit 3")
-- print all rows
row = cur:fetch ({}, "a") -- the rows will be indexed by field names
while row do
print(string.format("l_orderkey: %d, l_quantity: %d l_shipdate: %s", row.l_orderkey, row.l_quantity, row.l_shipdate))
row = cur:fetch (row, "a") -- reusing the table of results
end
输出如下
root@6ae32a5ffcde:/par# ./duckdb141 tpch_duckdb -c "SELECT l_orderkey,l_quantity,l_shipdate from lineitem limit 3"
┌────────────┬───────────────┬────────────┐
│ l_orderkey │ l_quantity │ l_shipdate │
│ int64 │ decimal(15,2) │ date │
├────────────┼───────────────┼────────────┤
│ 1 │ 17.00 │ 1996-03-13 │
│ 1 │ 36.00 │ 1996-04-12 │
│ 1 │ 8.00 │ 1996-01-29 │
└────────────┴───────────────┴────────────┘
root@6ae32a5ffcde:/par# lua ducktpch2.lua
l_orderkey: 1, l_quantity: 17 l_shipdate: 1996-03-13
l_orderkey: 1, l_quantity: 36 l_shipdate: 1996-04-12
l_orderkey: 1, l_quantity: 8 l_shipdate: 1996-01-29
可见l_orderkey和l_shipdate完全正确,而 l_quantity缺少小数点。
把print语句l_quantity的格式符修改为%15.2f,l_quantity的输出格式就对了
print(string.format("l_orderkey: %d, l_quantity: %15.2f l_shipdate: %s", row.l_orderkey, row.l_quantity, row.l_shipdate))
l_orderkey: 1, l_quantity: 17.00 l_shipdate: 1996-03-13
l_orderkey: 1, l_quantity: 36.00 l_shipdate: 1996-04-12
l_orderkey: 1, l_quantity: 8.00 l_shipdate: 1996-01-29
可以看到,row是fetch的结果,它自动关联了duckdb查询结果中的列,而对于任意的查询,怎么知道列名,可以通过遍历lua表row来查看
我们用交互式命令行
root@6ae32a5ffcde:/par# lua
Lua 5.4.8 Copyright (C) 1994-2025 Lua.org, PUC-Rio
> driver = require"luasql.duckdb"
> env = assert (driver.duckdb())
> con = assert (env:connect("tpch_duckdb"))
> cur = assert (con:execute"SELECT * from lineitem limit 3")
> row.l_quantity
17.00
> for key, value in pairs(row) do print(key, value, type(value)) end
l_comment to beans x-ray carefull string
l_shipdate 1996-03-13 string
l_linenumber 1 string
l_suppkey 2323 string
l_quantity 17.00 string
l_returnflag N string
l_shipmode TRUCK string
l_tax 0.02 string
l_orderkey 1 string
l_receiptdate 1996-03-22 string
l_commitdate 1996-02-12 string
l_linestatus O string
l_extendedprice 25560.35 string
l_partkey 46557 string
l_discount 0.04 string
l_shipinstruct DELIVER IN PERSON string
可见,所有duckdb查询结果中的列都被格式化为字符串了,所以不用考虑列类型,统一用%s即可。
可以定义一个lua表来保存列名,
> tabcol={}
> i=1
> for key, value in pairs(row) do tabcol[i]=key i=i+1 end
> for key, value in pairs(tabcol)do print(key, value) end
1 l_linestatus
2 l_discount
3 l_orderkey
4 l_shipdate
5 l_comment
6 l_partkey
7 l_shipmode
8 l_extendedprice
9 l_tax
10 l_returnflag
11 l_receiptdate
12 l_suppkey
13 l_shipinstruct
14 l_linenumber
15 l_commitdate
16 l_quantity
> print(row[tabcol[1]])
O
再定义一个lua表来保存一行的数据
tabval={}
> for key, value in pairs(tabcol)do tabval[key]=row[tabcol[key]] end
> tabval[1]
O
> rowstring=table.concat(tabval, ",")
> print(rowstring)
O,0.04,1,1996-03-13,to beans x-ray carefull,46557,TRUCK,25560.35,0.02,N,1996-03-22,2323,DELIVER IN PERSON,1,1996-02-12,17.00
> headerstring=table.concat(tabcol, ",")
> print(headerstring)
l_linestatus,l_discount,l_orderkey,l_shipdate,l_comment,l_partkey,l_shipmode,l_extendedprice,l_tax,l_returnflag,l_receiptdate,l_suppkey,l_shipinstruct,l_linenumber,l_commitdate,l_quantity
依次输出即可,完整代码如下
driver = require"luasql.duckdb"
env = assert (driver.duckdb())
con = assert (env:connect("tpch_duckdb"))
cur = assert (con:execute"SELECT * from lineitem limit 3")
row = cur:fetch ({}, "a") -- the rows will be indexed by field names
tabcol={}
i=1
for key, value in pairs(row) do tabcol[i]=key i=i+1 end
headerstring=table.concat(tabcol, ",")
print(headerstring)
tabval={}
while row do
for key, value in pairs(tabcol)do tabval[key]=row[tabcol[key]] end
rowstring=table.concat(tabval, ",")
print(rowstring)
row = cur:fetch (row, "a") -- reusing the table of results
end
执行结果如下
root@6ae32a5ffcde:/par# lua ducktpch2.lua
l_tax,l_returnflag,l_discount,l_orderkey,l_linenumber,l_shipmode,l_partkey,l_receiptdate,l_quantity,l_commitdate,l_suppkey,l_extendedprice,l_shipdate,l_comment,l_shipinstruct,l_linestatus
0.02,N,0.04,1,1,TRUCK,46557,1996-03-22,17.00,1996-02-12,2323,25560.35,1996-03-13,to beans x-ray carefull,DELIVER IN PERSON,O
0.06,N,0.09,1,2,MAIL,20193,1996-04-20,36.00,1996-02-28,2194,40074.84,1996-04-12, according to the final foxes. qui,TAKE BACK RETURN,O
0.02,N,0.10,1,3,REG AIR,19110,1996-01-31,8.00,1996-03-05,1111,8232.88,1996-01-29,ourts cajole above the furiou,TAKE BACK RETURN,O


被折叠的 条评论
为什么被折叠?



