利用luasql-duckdb输出任意duckdb查询结果到csv文件

luasql-duckdb和luadbi-duckdb两个模块都能访问duckdb数据库,而luadbi-duckdb只对有限个数据类型做了处理,如DECIMAL就没有处理,不满足要求,所以用luasql-duckdb实现。

先用简单的例子分析,仿照已有的示例,对duckdb 生成的tpch数据集的lineitem表查询

D load tpch;
D call dbgen(sf=0.3);
D describe lineitem;
┌─────────────────┬───────────────┬─────────┬─────────┬─────────┬─────────┐
│   column_name   │  column_type  │  nullkeydefault │  extra  │
│     varcharvarcharvarcharvarcharvarcharvarchar │
├─────────────────┼───────────────┼─────────┼─────────┼─────────┼─────────┤
│ l_orderkey      │ BIGINTNONULLNULLNULL    │
│ l_partkey       │ BIGINTNONULLNULLNULL    │
│ l_suppkey       │ BIGINTNONULLNULLNULL    │
│ l_linenumber    │ BIGINTNONULLNULLNULL    │
│ l_quantity      │ DECIMAL(15,2)NONULLNULLNULL    │
│ l_extendedprice │ DECIMAL(15,2)NONULLNULLNULL    │
│ l_discount      │ DECIMAL(15,2)NONULLNULLNULL    │
│ l_tax           │ DECIMAL(15,2)NONULLNULLNULL    │
│ l_returnflag    │ VARCHARNONULLNULLNULL    │
│ l_linestatus    │ VARCHARNONULLNULLNULL    │
│ l_shipdate      │ DATENONULLNULLNULL    │
│ l_commitdate    │ DATENONULLNULLNULL    │
│ l_receiptdate   │ DATENONULLNULLNULL    │
│ l_shipinstruct  │ VARCHARNONULLNULLNULL    │
│ l_shipmode      │ VARCHARNONULLNULLNULL    │
│ l_comment       │ VARCHARNONULLNULLNULL    │
├─────────────────┴───────────────┴─────────┴─────────┴─────────┴─────────┤
│ 16 rows                                                       6 columns │
└─────────────────────────────────────────────────────────────────────────┘

上面有4种数据类型,lua代码如下

driver = require"luasql.duckdb"
-- create environment object
env = assert (driver.duckdb())
-- connect to data source
con = assert (env:connect("tpch_duckdb"))

cur = assert (con:execute"SELECT * from lineitem limit 3")
-- print all rows
row = cur:fetch ({}, "a")	-- the rows will be indexed by field names
while row do
  print(string.format("l_orderkey: %d, l_quantity: %d l_shipdate: %s", row.l_orderkey, row.l_quantity, row.l_shipdate))
  row = cur:fetch (row, "a")	-- reusing the table of results
end

输出如下

root@6ae32a5ffcde:/par# ./duckdb141 tpch_duckdb -c "SELECT l_orderkey,l_quantity,l_shipdate from lineitem limit 3"
┌────────────┬───────────────┬────────────┐
│ l_orderkey │  l_quantity   │ l_shipdate │
│   int64    │ decimal(15,2) │    date    │
├────────────┼───────────────┼────────────┤
│          1 │         17.00 │ 1996-03-13 │
│          1 │         36.00 │ 1996-04-12 │
│          1 │          8.00 │ 1996-01-29 │
└────────────┴───────────────┴────────────┘
root@6ae32a5ffcde:/par# lua ducktpch2.lua
l_orderkey: 1, l_quantity: 17 l_shipdate: 1996-03-13
l_orderkey: 1, l_quantity: 36 l_shipdate: 1996-04-12
l_orderkey: 1, l_quantity: 8 l_shipdate: 1996-01-29

可见l_orderkey和l_shipdate完全正确,而 l_quantity缺少小数点。
把print语句l_quantity的格式符修改为%15.2f,l_quantity的输出格式就对了

print(string.format("l_orderkey: %d, l_quantity: %15.2f l_shipdate: %s", row.l_orderkey, row.l_quantity, row.l_shipdate))

l_orderkey: 1, l_quantity:           17.00 l_shipdate: 1996-03-13
l_orderkey: 1, l_quantity:           36.00 l_shipdate: 1996-04-12
l_orderkey: 1, l_quantity:            8.00 l_shipdate: 1996-01-29

可以看到,row是fetch的结果,它自动关联了duckdb查询结果中的列,而对于任意的查询,怎么知道列名,可以通过遍历lua表row来查看
我们用交互式命令行

root@6ae32a5ffcde:/par# lua
Lua 5.4.8  Copyright (C) 1994-2025 Lua.org, PUC-Rio
> driver = require"luasql.duckdb"
> env = assert (driver.duckdb())
> con = assert (env:connect("tpch_duckdb"))
> cur = assert (con:execute"SELECT * from lineitem limit 3")
> row.l_quantity
17.00

> for key, value in pairs(row) do print(key, value, type(value)) end
l_comment       to beans x-ray carefull string
l_shipdate      1996-03-13      string
l_linenumber    1       string
l_suppkey       2323    string
l_quantity      17.00   string
l_returnflag    N       string
l_shipmode      TRUCK   string
l_tax   0.02    string
l_orderkey      1       string
l_receiptdate   1996-03-22      string
l_commitdate    1996-02-12      string
l_linestatus    O       string
l_extendedprice 25560.35        string
l_partkey       46557   string
l_discount      0.04    string
l_shipinstruct  DELIVER IN PERSON       string

可见,所有duckdb查询结果中的列都被格式化为字符串了,所以不用考虑列类型,统一用%s即可。

可以定义一个lua表来保存列名,

> tabcol={}
> i=1
> for key, value in pairs(row) do tabcol[i]=key i=i+1 end
> for key, value in pairs(tabcol)do print(key, value) end
1       l_linestatus
2       l_discount
3       l_orderkey
4       l_shipdate
5       l_comment
6       l_partkey
7       l_shipmode
8       l_extendedprice
9       l_tax
10      l_returnflag
11      l_receiptdate
12      l_suppkey
13      l_shipinstruct
14      l_linenumber
15      l_commitdate
16      l_quantity
> print(row[tabcol[1]])
O

再定义一个lua表来保存一行的数据

tabval={}
> for key, value in pairs(tabcol)do tabval[key]=row[tabcol[key]] end
> tabval[1]
O
> rowstring=table.concat(tabval, ",")
> print(rowstring)
O,0.04,1,1996-03-13,to beans x-ray carefull,46557,TRUCK,25560.35,0.02,N,1996-03-22,2323,DELIVER IN PERSON,1,1996-02-12,17.00

> headerstring=table.concat(tabcol, ",")
> print(headerstring)
l_linestatus,l_discount,l_orderkey,l_shipdate,l_comment,l_partkey,l_shipmode,l_extendedprice,l_tax,l_returnflag,l_receiptdate,l_suppkey,l_shipinstruct,l_linenumber,l_commitdate,l_quantity

依次输出即可,完整代码如下


driver = require"luasql.duckdb"

env = assert (driver.duckdb())

con = assert (env:connect("tpch_duckdb"))

cur = assert (con:execute"SELECT * from lineitem limit 3")

row = cur:fetch ({}, "a")	-- the rows will be indexed by field names

tabcol={}
i=1
for key, value in pairs(row) do tabcol[i]=key i=i+1 end
headerstring=table.concat(tabcol, ",")
print(headerstring)

tabval={}

while row do
for key, value in pairs(tabcol)do tabval[key]=row[tabcol[key]] end
rowstring=table.concat(tabval, ",")
print(rowstring)
  row = cur:fetch (row, "a")	-- reusing the table of results
end

执行结果如下

root@6ae32a5ffcde:/par# lua ducktpch2.lua
l_tax,l_returnflag,l_discount,l_orderkey,l_linenumber,l_shipmode,l_partkey,l_receiptdate,l_quantity,l_commitdate,l_suppkey,l_extendedprice,l_shipdate,l_comment,l_shipinstruct,l_linestatus
0.02,N,0.04,1,1,TRUCK,46557,1996-03-22,17.00,1996-02-12,2323,25560.35,1996-03-13,to beans x-ray carefull,DELIVER IN PERSON,O
0.06,N,0.09,1,2,MAIL,20193,1996-04-20,36.00,1996-02-28,2194,40074.84,1996-04-12, according to the final foxes. qui,TAKE BACK RETURN,O
0.02,N,0.10,1,3,REG AIR,19110,1996-01-31,8.00,1996-03-05,1111,8232.88,1996-01-29,ourts cajole above the furiou,TAKE BACK RETURN,O
评论
成就一亿技术人!
拼手气红包6.0元
还能输入1000个字符
 
红包 添加红包
表情包 插入表情
 条评论被折叠 查看
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值