2025年6月在几种单机版数据库上测试TPCH

我们选择DuckDB、GlareDB、clickhouse、datafusion-cli来测试。
本来还想测试国产的Databend
https://repo.databend.com/databend/v1.2.755-nightly/databend-v1.2.755-nightly-aarch64-unknown-linux-gnu.tar.gz
https://repo.databend.com/bendsql/v0.27.5/bendsql-aarch64-unknown-linux-musl.tar.gz
,它需要建stage,步骤比较复杂。
和闭源的cedardb
https://download.cedardb.com/latest/cedar-current-arm64.tar.xz
但还不支持对parquet文件直接查询,暂时没测。
1.利用tpchgen-rs工具生成parquet格式TPCH数据
首先需要一个rust语言编译环境,可参考这篇文章
再下载源代码包,然后在解压缩后的目录下,运行cargo build --release编译。

选择一个目录存放文件,我选的是/par/tpch/,然后运行如下命令行生成sf4规模的数据集。

time /par/tpchgen-rs-main/target/release/tpchgen-cli -s4 --output-dir sf4-parquet --format=parquet

real	0m15.582s
user	1m2.044s
sys	0m2.060s
ls -l /par/tpch/sf4-parquet
total 1456872
-rw-r--r-- 1 root root  53985688 Jun 27 05:54 customer.parquet
-rw-r--r-- 1 root root 975758540 Jun 27 05:54 lineitem.parquet
-rw-r--r-- 1 root root      2966 Jun 27 05:54 nation.parquet
-rw-r--r-- 1 root root 262169281 Jun 27 05:54 orders.parquet
-rw-r--r-- 1 root root  27412078 Jun 27 05:54 part.parquet
-rw-r--r-- 1 root root 168886901 Jun 27 05:54 partsupp.parquet
-rw-r--r-- 1 root root      1474 Jun 27 05:54 region.parquet
-rw-r--r-- 1 root root   3581714 Jun 27 05:54 supplier.parquet

2.进入数据库各自的CLI界面,都打开计时功能,分别复制粘贴如下两个查询语句。

SELECT
    --Query01
    l_returnflag,
    l_linestatus,
    SUM(l_quantity) AS sum_qty,
    SUM(l_extendedprice) AS sum_base_price,
    SUM(l_extendedprice * (1 - l_discount)) AS sum_disc_price,
    SUM(l_extendedprice * (1 - l_discount) * (1 + l_tax)) AS sum_charge,
    AVG(l_quantity) AS avg_qty,
    AVG(l_extendedprice) AS avg_price,
    AVG(l_discount) AS avg_disc,
    COUNT(*) AS count_order
FROM
    '/par/tpch/sf4-parquet/lineitem.parquet'
WHERE
    l_shipdate <= CAST('1998-09-02' AS date)
GROUP BY
    l_returnflag,
    l_linestatus
ORDER BY
    l_returnflag,
    l_linestatus;


select 
--Query03
l_orderkey, sum(l_extendedprice * (1 - l_discount)) as revenue, o_orderdate, o_shippriority from 
'/par/tpch/sf4-parquet/customer.parquet' c, 
'/par/tpch/sf4-parquet/orders.parquet' o, 
'/par/tpch/sf4-parquet/lineitem.parquet' l
 where c_mktsegment = 'BUILDING' and c_custkey = o_custkey and l_orderkey = o_orderkey and o_orderdate < date'1995-03-15' and l_shipdate > date'1995-03-15' group by l_orderkey, o_orderdate, o_shippriority order by revenue desc, o_orderdate limit 10;

3.计算结果比较
查询1

--duckdb
┌──────────────┬──────────────┬───────────────┬─────────────────┬───────────────────┬─────────────────────┬────────────────────┬────────────────────┬──────────────────────┬─────────────┐
│ l_returnflag │ l_linestatus │    sum_qty    │ sum_base_price  │  sum_disc_price   │     sum_charge      │      avg_qty       │     avg_price      │       avg_disc       │ count_order │
│   varcharvarchardecimal(38,2)decimal(38,2)decimal(38,4)decimal(38,6)doubledoubledouble        │    int64    │
├──────────────┼──────────────┼───────────────┼─────────────────┼───────────────────┼─────────────────────┼────────────────────┼────────────────────┼──────────────────────┼─────────────┤
│ A            │ F            │  151026307.00226509091035.52215184111589.9063223794237456.67979025.50338578685517638249.950274615180.04999739437949255921814 │
│ N            │ F            │    3931226.005887988103.965593393984.51605817216592.90647425.538386581262138250.08187898710.05005768706068835153934 │
│ N            │ O            │  297323932.00445900037882.87423604954474.6505440557228781.02920525.50020785327745538242.9479232014060.04999676234321271511659667 │
│ R            │ F            │  151132549.00226643425460.80215313759563.1301223924615047.43351925.51290386238896338260.003970580220.049994429222341385923769 │
└──────────────┴──────────────┴───────────────┴─────────────────┴───────────────────┴─────────────────────┴────────────────────┴────────────────────┴──────────────────────┴─────────────┘
Run Time (s): real 1.596 user 3.024000 sys 0.228000
--glaredb
┌──────────────┬──────────────┬──────────────────┬──────────────────┬───────────────────┬─────────────────────┬────────────────────┬────────────────────┬──────────────────────┬─────────────┐
│ l_returnflag │ l_linestatus │ sum_qty          │ sum_base_price   │ sum_disc_price    │ sum_charge          │ avg_qty            │ avg_price          │ avg_disc             │ count_order │
│ Utf8         │ Utf8         │ Decimal128(38,2) │ Decimal128(38,2) │ Decimal128(38,4)  │ Decimal128(38,6)    │ Float64            │ Float64            │ Float64              │ Int64       │
├──────────────┼──────────────┼──────────────────┼──────────────────┼───────────────────┼─────────────────────┼────────────────────┼────────────────────┼──────────────────────┼─────────────┤
│ A            │ F            │     151026307.00226509091035.52215184111589.9063223794237456.67979025.50338578685517638249.950274615180.04999739437949255921814 │
│ N            │ F            │       3931226.005887988103.965593393984.51605817216592.90647425.538386581262138250.08187898710.05005768706068835153934 │
│ N            │ O            │     297323932.00445900037882.87423604954474.6505440557228781.02920525.50020785327745538242.9479232014060.04999676234321271511659667 │
│ R            │ F            │     151132549.00226643425460.80215313759563.1301223924615047.43351925.51290386238896338260.003970580220.049994429222341385923769 │
└──────────────┴──────────────┴──────────────────┴──────────────────┴───────────────────┴─────────────────────┴────────────────────┴────────────────────┴──────────────────────┴─────────────┘
Execution duration (s): 2.75739
--clickhouse
   ┌─l_returnflag─┬─l_linestatus─┬───sum_qty─┬──sum_base_price─┬────sum_disc_price─┬──────────sum_charge─┬────────────avg_qty─┬──────────avg_price─┬─────────────avg_disc─┬─count_order─┐
1. │ A            │ F            │ 151026307226509091035.52215184111589.9063223794237456.6797925.50338578685517638249.950274615180.049997394379492559218142. │ N            │ F            │   39312265887988103.965593393984.5165817216592.90647425.538386581262138250.08187898710.050057687060688351539343. │ N            │ O            │ 297323932445900037882.87423604954474.6505440557228781.02920525.50020785327745538242.9479232014060.049996762343212715116596674. │ R            │ F            │ 151132549226643425460.8215313759563.1301223924615047.43351925.51290386238896338260.003970580210.049994429222341395923769 │
   └──────────────┴──────────────┴───────────┴─────────────────┴───────────────────┴─────────────────────┴────────────────────┴────────────────────┴──────────────────────┴─────────────┘

4 rows in set. Elapsed: 2.432 sec. Processed 24.00 million rows, 974.66 MB (9.87 million rows/s., 400.81 MB/s.)
Peak memory usage: 147.13 MiB.
--datafusion-cli
+--------------+--------------+--------------+-----------------+-------------------+---------------------+-----------+--------------+----------+-------------+
| l_returnflag | l_linestatus | sum_qty      | sum_base_price  | sum_disc_price    | sum_charge          | avg_qty   | avg_price    | avg_disc | count_order |
+--------------+--------------+--------------+-----------------+-------------------+---------------------+-----------+--------------+----------+-------------+
| A            | F            | 151026307.00 | 226509091035.52 | 215184111589.9063 | 223794237456.679790 | 25.503385 | 38249.950274 | 0.049997 | 5921814     |
| N            | F            | 3931226.00   | 5887988103.96   | 5593393984.5160   | 5817216592.906474   | 25.538386 | 38250.081878 | 0.050057 | 153934      |
| N            | O            | 297323932.00 | 445900037882.87 | 423604954474.6505 | 440557228781.029205 | 25.500207 | 38242.947923 | 0.049996 | 11659667    |
| R            | F            | 151132549.00 | 226643425460.80 | 215313759563.1301 | 223924615047.433519 | 25.512903 | 38260.003970 | 0.049994 | 5923769     |
+--------------+--------------+--------------+-----------------+-------------------+---------------------+-----------+--------------+----------+-------------+
4 row(s) fetched. 
Elapsed 2.364 seconds.

查询3

--duckdb
┌────────────┬───────────────┬─────────────┬────────────────┐
│ l_orderkey │    revenue    │ o_orderdate │ o_shippriority │
│   int64    │ decimal(38,4)date     │     int32      │
├────────────┼───────────────┼─────────────┼────────────────┤
│   21679558442394.39661995-01-170 │
│    8443557429378.65811995-03-120 │
│   10631491423595.95021995-03-140 │
│   20841824421258.04241995-03-080 │
│   11568359418888.55911995-03-130 │
│   10725895399668.42271995-03-120 │
│   20811973399662.47101995-02-130 │
│   17937728396726.32891995-03-120 │
│    6128295393717.72311995-03-050 │
│   21236614392019.32251995-03-140 │
├────────────┴───────────────┴─────────────┴────────────────┤
│ 10 rows                                         4 columns │
└───────────────────────────────────────────────────────────┘
Run Time (s): real 0.425 user 1.832000 sys 0.296000
--glaredb
┌────────────┬──────────────────┬─────────────┬────────────────┐
│ l_orderkey │ revenue          │ o_orderdate │ o_shippriority │
│ Int64      │ Decimal128(38,4) │ Date32      │ Int32          │
├────────────┼──────────────────┼─────────────┼────────────────┤
│   21679558442394.39661995-01-170 │
│    8443557429378.65811995-03-120 │
│   10631491423595.95021995-03-140 │
│   20841824421258.04241995-03-080 │
│   11568359418888.55911995-03-130 │
│   10725895399668.42271995-03-120 │
│   20811973399662.47101995-02-130 │
│   17937728396726.32891995-03-120 │
│    6128295393717.72311995-03-050 │
│   21236614392019.32251995-03-140 │
└────────────┴──────────────────┴─────────────┴────────────────┘
Execution duration (s): 1.08639
--clickhouse
    ┌─l_orderkey─┬─────revenue─┬─o_orderdate─┬─o_shippriority─┐
 1.21679558442394.39661995-01-1702.8443557429378.65811995-03-1203.10631491423595.95021995-03-1404.20841824421258.04241995-03-0805.11568359418888.55911995-03-1306.10725895399668.42271995-03-1207.20811973399662.4711995-02-1308.17937728396726.32891995-03-1209.6128295393717.72311995-03-05010.21236614392019.32251995-03-140 │
    └────────────┴─────────────┴─────────────┴────────────────┘

10 rows in set. Elapsed: 3.736 sec. Processed 30.60 million rows, 1.29 GB (8.19 million rows/s., 345.50 MB/s.)
Peak memory usage: 1.33 GiB.
--datafusion-cli
+------------+-------------+-------------+----------------+
| l_orderkey | revenue     | o_orderdate | o_shippriority |
+------------+-------------+-------------+----------------+
| 21679558   | 442394.3966 | 1995-01-17  | 0              |
| 8443557    | 429378.6581 | 1995-03-12  | 0              |
| 10631491   | 423595.9502 | 1995-03-14  | 0              |
| 20841824   | 421258.0424 | 1995-03-08  | 0              |
| 11568359   | 418888.5591 | 1995-03-13  | 0              |
| 10725895   | 399668.4227 | 1995-03-12  | 0              |
| 20811973   | 399662.4710 | 1995-02-13  | 0              |
| 17937728   | 396726.3289 | 1995-03-12  | 0              |
| 6128295    | 393717.7231 | 1995-03-05  | 0              |
| 21236614   | 392019.3225 | 1995-03-14  | 0              |
+------------+-------------+-------------+----------------+
10 row(s) fetched. 
Elapsed 1.233 seconds.

4.总结
从SQL兼容性看,DuckDB最好,多个表连接时不需要别名也能解析成功。为了其他数据库也能执行,我统一加上了表别名。
从表格显示看,DuckDB、GlareDB、clickhouse都能做到文本类型左对齐,数字类型右对齐,前两个还能显示数据类型。datafusion-cli统一左对齐。clickhouse自动删除末尾的0。
从执行速度看,duckdb最快,其他3种数据库都在同一水平。

评论
成就一亿技术人!
拼手气红包6.0元
还能输入1000个字符
 
红包 添加红包
表情包 插入表情
 条评论被折叠 查看
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值