[TPC-H] A Bulk Data Generation Tool for Database Performance Testing

Originally published 2024-12-05 11:12:46

Licensed under CC 4.0 BY-SA

Using TPC

TPC-H is used as the example here.

Deploying TPC-H

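Download page: https://www.tpc.org/tpc_documents_current_versions/current_specifications5.asp
Download the package from the Source Code column. It is normally used on Linux, although it also works on Windows. Taking Linux as the example:
Extract the archive, which creates the TPC-H_Tools_v3.0.0 folder, then run cd TPC-H_Tools_v3.0.0/dbgen, cp makefile.suite Makefile, and vim Makefile.
The list of values offered for DATABASE is short, but that is not a problem: the data types TPC-H generates are simple, and other databases generally support them as well.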

CC      = gcc
# Current values for DATABASE are: INFORMIX, DB2, TDAT (Teradata)
#                                  SQLSERVER, SYBASE, ORACLE, VECTORWISE
# Current values for MACHINE are:  ATT, DOS, HP, IBM, ICL, MVS, 
#                                  SGI, SUN, U2200, VMS, LINUX, WIN32 
# Current values for WORKLOAD are:  TPCH
DATABASE=SQLSERVER
MACHINE = LINUX
WORKLOAD = TPCH

Run make in the dbgen folder to compile; this produces the dbgen and qgen executables.
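The edit-and-build steps above can also be done without opening an editor. The following is a minimal sketch, assuming the stock makefile.suite assigns CC, DATABASE, MACHINE and WORKLOAD each on its own line and that GNU sed is available; configure_makefile is a hypothetical helper name, and the values mirror the Makefile excerpt shown above.

```shell
#!/usr/bin/env bash
# configure_makefile FILE: set the four build variables in place with sed.
# Hypothetical helper; the values mirror the Makefile excerpt in this article.
configure_makefile() {
  local file="$1"
  sed -i \
    -e 's/^CC[[:space:]]*=.*/CC = gcc/' \
    -e 's/^DATABASE[[:space:]]*=.*/DATABASE = SQLSERVER/' \
    -e 's/^MACHINE[[:space:]]*=.*/MACHINE = LINUX/' \
    -e 's/^WORKLOAD[[:space:]]*=.*/WORKLOAD = TPCH/' \
    "$file"
}

# Typical use, run inside TPC-H_Tools_v3.0.0/dbgen:
#   cp makefile.suite Makefile
#   configure_makefile Makefile
#   make
```

Lines that do not start with one of the four variable names, such as CFLAGS, are left untouched.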

Generating data

Run ./dbgen -h to see the help:

[root@some dbgen]# ./dbgen -h
TPC-H Population Generator (Version 3.0.0 build 0)
Copyright Transaction Processing Performance Council 1994 - 2010
USAGE:
dbgen [-{vf}][-T {pcsoPSOL}]
        [-s <scale>][-C <procs>][-S <step>]
dbgen [-v] [-O m] [-s <scale>] [-U <updates>]

Basic Options
===========================
-C <n> -- separate data set into <n> chunks (requires -S, default: 1)
-f     -- force. Overwrite existing files
-h     -- display this message
-q     -- enable QUIET mode
-s <n> -- set Scale Factor (SF) to  <n> (default: 1) 
-S <n> -- build the <n>th step of the data/update set (used with -C or -U)
-U <n> -- generate <n> update sets
-v     -- enable VERBOSE mode

Advanced Options
===========================
-b <s> -- load distributions for <s> (default: dists.dss)
-d <n> -- split deletes between <n> files (requires -U)
-i <n> -- split inserts between <n> files (requires -U)
-T c   -- generate cutomers ONLY
-T l   -- generate nation/region ONLY
-T L   -- generate lineitem ONLY
-T n   -- generate nation ONLY
-T o   -- generate orders/lineitem ONLY
-T O   -- generate orders ONLY
-T p   -- generate parts/partsupp ONLY
-T P   -- generate parts ONLY
-T r   -- generate region ONLY
-T s   -- generate suppliers ONLY
-T S   -- generate partsupp ONLY

To generate the SF=1 (1GB), validation database population, use:
        dbgen -vf -s 1

To generate updates for a SF=1 (1GB), use:
        dbgen -v -U 1 -s 1

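The end of the help output gives an example: dbgen -vf -s 1
So the commonly used options are:
-f -- force: overwrite existing files.
-s -- set the scale factor, which multiplies the amount of data generated; 1 is about 1 GB and 0.01 is about 10 MB.
-v -- enable VERBOSE mode: print detailed logs.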
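For larger scale factors, the -C and -S options from the help text let the data set be built in pieces, one dbgen process per piece. The following is a minimal sketch, assuming dbgen sits in the current directory; the DBGEN override variable and the generate_chunks helper are names introduced here for illustration only.

```shell
#!/usr/bin/env bash
# generate_chunks SCALE CHUNKS: start one dbgen process per chunk in parallel.
# DBGEN (hypothetical variable) overrides the path to the binary.
generate_chunks() {
  local scale="$1" chunks="$2" dbgen="${DBGEN:-./dbgen}"
  local step
  for step in $(seq 1 "$chunks"); do
    # -C = total number of chunks, -S = which chunk this process builds
    "$dbgen" -f -s "$scale" -C "$chunks" -S "$step" &
  done
  wait   # block until every background dbgen has finished
}

# Example: build SF=10 in 4 pieces
#   generate_chunks 10 4
```

In this mode dbgen is expected to write each piece to its own file with the step number appended (for example lineitem.tbl.1); check the actual file names on your build before loading.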

After the command finishes, *.tbl files appear in the dbgen folder; these are the data files. For example:

Contents of nation.tbl:

0|ALGERIA|0| haggle. carefully final deposits detect slyly agai|
1|ARGENTINA|1|al foxes promise slyly according to the regular accounts. bold requests alon|
2|BRAZIL|1|y alongside of the pending deposits. carefully special packages are about the ironic forges. slyly special |

The files can then be moved into a separate folder: mkdir ../tbls, mv *.tbl ../tbls
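Each row in a .tbl file ends with a trailing | delimiter, as the nation.tbl sample shows, and some bulk loaders read that as an extra empty column. The following is a minimal sketch that copies the files into another folder while dropping the trailing delimiter; collect_tbls is a hypothetical helper, and whether the stripping is needed at all depends on the target database.

```shell
#!/usr/bin/env bash
# collect_tbls SRC_DIR DEST_DIR: copy every *.tbl file from SRC_DIR to
# DEST_DIR, removing the single trailing '|' from each line on the way.
collect_tbls() {
  local src="$1" dest="$2" f
  mkdir -p "$dest"
  for f in "$src"/*.tbl; do
    [ -e "$f" ] || continue   # no .tbl files found: the glob stayed literal
    sed 's/|$//' "$f" > "$dest/$(basename "$f")"
  done
}

# Example, run inside the dbgen folder:
#   collect_tbls . ../tbls
```

The originals are left in place, so the copy can be repeated if the first attempt turns out to be wrong for the loader in use.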

Generating query statements

Run ./qgen -h to see the help:

[root@some dbgen]# ./qgen -h
TPC-H Parameter Substitution (v. 3.0.0 build 0)
Copyright Transaction Processing Performance Council 1994 - 2010
USAGE: ./qgen <options> [ queries ]
Options:
        -a              -- use ANSI semantics.
        -b <str>        -- load distributions from <str>
        -c              -- retain comments found in template.
        -d              -- use default substitution values.
        -h              -- print this usage summary.
        -i <str>        -- use the contents of file <str> to begin a query.
        -l <str>        -- log parameters to <str>.
        -n <str>        -- connect to database <str>.
        -N              -- use default rowcounts and ignore :n directive.
        -o <str>        -- set the output file base path to <str>.
        -p <n>          -- use the query permutation for stream <n>
        -r <n>          -- seed the random number generator with <n>
        -s <n>          -- base substitutions on an SF of <n>
        -v              -- verbose.
        -t <str>        -- use the contents of file <str> to complete a query
        -x              -- enable SET EXPLAIN in each query.

Create a folder with mkdir queries, then write a script that runs qgen in batch:

cat > gen-sql.sh << 'EOF'
#!/bin/bash
# Generate all 22 TPC-H queries with the default substitution values (-d).
for i in {1..22}
do
  name="d$i.sql"
  echo "$name"
  ./qgen -d "$i" > "$name"
done
EOF

Running the script (bash gen-sql.sh) generates the query files d1.sql through d22.sql.
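The script above uses -d, so every run produces the same default substitution values. The -s and -r options from the help text can instead base the substitutions on a scale factor and a fixed seed. The following is a minimal sketch, assuming qgen is in the current directory and can find its query templates and dists.dss from there; the QGEN override variable, the generate_queries helper and the q file prefix are illustrative names, not part of the toolkit.

```shell
#!/usr/bin/env bash
# generate_queries OUT_DIR SCALE SEED: write q1.sql .. q22.sql into OUT_DIR,
# with substitutions based on SCALE (-s) and a fixed random seed (-r).
# QGEN (hypothetical variable) overrides the path to the binary.
generate_queries() {
  local out="$1" scale="$2" seed="$3" qgen="${QGEN:-./qgen}" i
  mkdir -p "$out"
  for i in $(seq 1 22); do
    "$qgen" -s "$scale" -r "$seed" "$i" > "$out/q$i.sql" || return 1
  done
}

# Example: queries for an SF=1 data set, reproducible with seed 42
#   generate_queries ../generated-sql 1 42
```

Writing to a separate output folder keeps the generated files apart from the numbered query templates that qgen reads.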

DDL file

dbgen/dss.ddl is the DDL file containing the table definitions.
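As an illustration of how the DDL and the data files fit together, the sketch below prints the commands that would create the tables and load the eight .tbl files. It assumes PostgreSQL with the psql client, which this article does not otherwise cover, and it assumes the trailing | at the end of each row has already been removed from the data files. print_load_commands is a hypothetical helper; it only prints the commands so they can be reviewed before anything is run.

```shell
#!/usr/bin/env bash
# print_load_commands DB TBL_DIR: print (not run) psql commands that create
# the TPC-H tables from dss.ddl and load each .tbl file from TBL_DIR.
# Assumes PostgreSQL; other databases need their own bulk-load syntax.
print_load_commands() {
  local db="$1" dir="$2" t
  printf '%s\n' "psql -d $db -f dss.ddl"
  for t in nation region part supplier partsupp customer orders lineitem; do
    printf '%s\n' "psql -d $db -c \"\\copy $t FROM '$dir/$t.tbl' WITH (FORMAT csv, DELIMITER '|')\""
  done
}

# Example, run inside the dbgen folder:
#   print_load_commands tpch ../tbls
```

The eight table names match the names of the .tbl files that dbgen writes.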