1、模型介绍
SSB(Star Schema Benchmark)是麻省州立大学波士顿校区的研究人员定义的基于现实商业应用的数据模型,业界公认用来模拟决策支持类应用,比较公正和中立。
学术界和工业界普遍采用它来评价决策支持技术方面应用的性能。
全方位评测系统的整体商业计算综合能力,对厂商的要求更高。
在银行信贷分析和信用卡分析、电信运营分析、税收分析、烟草行业决策分析中都有广泛的应用。
SSB基准测试包括:
1个事实表:lineorder
4个维度表:customer,part,dwdate,supplier
13条标准SQL查询测试语句:统计查询、多表关联、sum、复杂条件、group by、order by等组合方式。

SSB数据自动生成器下载地址:地址
2、生成数据步骤
生成步骤:
1、将下载的dbgen.zip文件上传到服务器并解压unzip dbgen.zip

2、编译
编译之前需要安装依赖包
yum -y install gcc gcc-c++ make cmake
root@gbase01:/opt# cd dbgen/
root@gbase01:/opt/dbgen# make


最后的告警可以忽略,执行完成之后会生成dbgen文件,查看 dbgen 命令帮助:./dbgen -h
root@gbase01:/opt/dbgen# ./dbgen -h
SSBM (Star Schema Benchmark) Population Generator (Version 1.0.0)
Copyright Transaction Processing Performance Council 1994 - 2000
USAGE:
dbgen [-{vfFD}] [-O {fhmsv}][-T {pcsdla}]
[-s <scale>][-C <procs>][-S <step>]
dbgen [-v] [-O {dfhmr}] [-s <scale>] [-U <updates>] [-r <percent>]
-b <s> -- load distributions for <s>
-C <n> -- use <n> processes to generate data
[Under DOS, must be used with -S]
-D -- do database load in line
-d <n> -- split deletes between <n> files
-f -- force. Overwrite existing files
-F -- generate flat files output
-h -- display this message
-i <n> -- split inserts between <n> files
-n <s> -- inline load into database <s>
-O d -- generate SQL syntax for deletes
-O f -- over-ride default output file names
-O h -- output files with headers
-O m -- produce columnar output
-O r -- generate key ranges for deletes.
-O v -- Verify data set without generating it.
-q -- enable QUIET mode
-r <n> -- updates refresh (n/100)% of the
data set
-s <n> -- set Scale Factor (SF) to <n>
-S <n> -- build the <n>th step of the data/update set
-T c -- generate cutomers dimension table ONLY
-T p -- generate parts dimension table ONLY
-T s -- generate suppliers dimension table ONLY
-T d -- generate date dimension table ONLY
-T l -- generate lineorder fact table ONLY
-U <s> -- generate <s> update sets
-v -- enable VERBOSE mode
To generate the SF=1 (1GB), validation database population, use:
dbgen -vfF -s 1
To generate updates for a SF=1 (1GB), use:
dbgen -v -U 1 -s 1
root@gbase01:/opt/dbgen#
生成表数据
生成customer.tbl
./dbgen -s 1 -T c
生成part.tbl
./dbgen -s 1 -T p
生成supplier.tbl
./dbgen -s 1 -T s
生成date.tbl
./dbgen -s 1 -T d
生成lineorder.tbl
./dbgen -s 1 -T l
生成所有的表for all SSBM tables 是参数a
./dbgen -s 1 -T a
-s<n> -- 比例因子(SF)设置为<n> 比如生成100倍的数据就是 ./dbgen -s 100 -T a

3、数据库建表
创建数据库: create database ssbm;
创建数据库表:
CREATE TABLE CUSTOMER ( C_CUSTKEY INTEGER,
C_NAME VARCHAR(25) NOT NULL,
C_ADDRESS VARCHAR(40) NOT NULL,
C_CITY VARCHAR(10) NOT NULL,
C_NATION VARCHAR(15) NOT NULL,
C_REGION VARCHAR(12) NOT NULL,
C_PHONE VARCHAR(15) NOT NULL,
C_MKTSEGMENT VARCHAR(10) NOT NULL);
CREATE TABLE DATES ( D_DATEKEY INTEGER,
D_DATE VARCHAR(18) NOT NULL,
D_DAYOFWEEK VARCHAR(18) NOT NULL,
D_MONTH VARCHAR(9) NOT NULL,
D_YEAR INTEGER NOT NULL,
D_YEARMONTHNUM INTEGER,
D_YEARMONTH VARCHAR(7) NOT NULL,
D_DAYNUMINWEEK INTEGER,
D_DAYNUMINMONTH INTEGER,
D_DAYNUMINYEAR INTEGER,
D_MONTHNUMINYEAR INTEGER,
D_WEEKNUMINYEAR INTEGER,
D_SELLINGSEASON VARCHAR(12) NOT NULL,
D_LASTDAYINWEEKFL INTEGER,
D_LASTDAYINMONTHFL INTEGER,
D_HOLIDAYFL INTEGER,
D_WEEKDAYFL INTEGER);
CREATE TABLE PART ( P_PARTKEY INTEGER,
P_NAME VARCHAR(22) NOT NULL,
P_MFGR VARCHAR(6) NOT NULL,
P_CATEGORY VARCHAR(7) NOT NULL,
P_BRAND VARCHAR(9) NOT NULL,
P_COLOR VARCHAR(11) NOT NULL,
P_TYPE VARCHAR(25) NOT NULL,
P_SIZE INTEGER NOT NULL,
P_CONTAINER VARCHAR(10) NOT NULL);
CREATE TABLE SUPPLIER ( S_SUPPKEY INTEGER,
S_NAME VARCHAR(25) NOT NULL,
S_ADDRESS VARCHAR(25) NOT NULL,
S_CITY VARCHAR(10) NOT NULL,
S_NATION VARCHAR(15) NOT NULL,
S_REGION VARCHAR(12) NOT NULL,
S_PHONE VARCHAR(15) NOT NULL);
CREATE TABLE LINEORDER ( LO_ORDERKEY BIGINT,
LO_LINENUMBER BIGINT,
LO_CUSTKEY INTEGER NOT NULL,
LO_PARTKEY INTEGER NOT NULL,
LO_SUPPKEY INTEGER NOT NULL,
LO_ORDERDATE INTEGER NOT NULL,
LO_ORDERPRIOTITY VARCHAR(15) NOT NULL,
LO_SHIPPRIOTITY INTEGER,
LO_QUANTITY BIGINT,
LO_EXTENDEDPRICE BIGINT,
LO_ORDTOTALPRICE BIGINT,
LO_DISCOUNT BIGINT,
LO_REVENUE BIGINT,
LO_SUPPLYCOST BIGINT,
LO_TAX BIGINT,
LO_COMMITDATE INTEGER NOT NULL,
LO_SHIPMODE VARCHAR(10) NOT NULL);
加载数据
mysql
load data infile '/dbgen/lineorder.tbl' into table LINEORDER fields terminated by '|' lines terminated by '|\n';
load data infile '/dbgen/supplier.tbl' into table SUPPLIER fields terminated by '|' lines terminated by '|\n';
load data infile '/dbgen/customer.tbl' into table CUSTOMER fields terminated by '|' lines terminated by '|\n';
load data infile '/dbgen/part.tbl' into table PART fields terminated by '|' lines terminated by '|\n';
load data infile '/dbgen/date.tbl' into table DATES fields terminated by '|' lines terminated by '|\n';
gbase
load data infile ' sftp://gbase:gbase@node1ip//opt/data/lineorder.tbl' into table lineorder data_format 3 fields terminated by '|';
load data infile ' sftp://gbase:gbase@node1ip//opt/data/part.tbl' into table part data_format 3 fields terminated by '|';
load data infile ' sftp://gbase:gbase@node1ip//opt/data/supplier.tbl' into table supplier data_format 3 fields terminated by '|';
load data infile ' sftp://gbase:gbase@node1ip//opt/data/customer.tbl' into table customer data_format 3 fields terminated by '|';
load data infile ' sftp://gbase:gbase@node1ip//opt/data/dwdate.tbl' into table dwdate data_format 3 fields terminated by '|';
ClickHouse参考:
https://www.it610.com/article/1314439393146150912.htm
2538

被折叠的 条评论
为什么被折叠?



