脚本背景
在日常工作中,会经常遇到需要把某个环境的数据迁移至另一个环境的库中去。作为测试人员,最怕的是测试的时候没有数据。如果别的环境存在数据,并且能够有权限去获取数据,这个时候我通常会从其他环境(例如:开发环境)中拷贝一份数据过来。
用navicate其实很容易进行导出整个库的数据,如果库中只有某几张表数据会用到,那么也可以只导出某几张表。但有时会遇到这么一个情况,表中的数据只有少量是我们需要的,可能表中有近千万的数据,我们只要近千条数据就能满足测试。如果表字段较多,导出、导入就很耗费时间,而且手动操作也很麻烦。于是,准备用脚本去代替人工这一过程。
过程分析
有了想法,接下来是分析想法的可行性。1、配置两个数据库,一个为源数据库,一个为目标测试库。2、从源数据库获取数据。3、将获取的数据插入到目标测试库中。
针对第1步,我的做法是将数据库配置放在一个json文件中(可查阅之前的博客:https://blog.youkuaiyun.com/ck3207/article/details/78031505)如下:
写一个类,连接数据库,以及读取json配置文件:
class Connect_mysql:
"""Get Configuration and Connect to Mysql!"""
def __init__(self):
setup_logging()
def get_config(self, file_name="config"):
"""Get Configuration!"""
with open(file_name, "r", encoding="utf-8") as f:
config = json.load(f)
return config
def conn_mysql(self, host, port, user, password, database, charset="utf8"):
"""Connetct to Mysql."""
logger = logging.getLogger(self.__class__.__name__)
try:
conn = pymysql.connect(host=host, port=port, user=user, password=password, database=database, charset=charset)
cur = conn.cursor()
return conn, cur
except Exception as e:
logger.info('Connect to mysql Error!')
logger.error(e)
def disconnect(self, conn, cur):
cur.close()
conn.close()
调用如下:
connect_mysql = Connect_mysql()
mysql_config = connect_mysql.get_config("mysql_config.json")
conn, cur = connect_mysql.conn_mysql(host=mysql_config["localhost_cf_test"]["host"], port=mysql_config["localhost_cf_test"]["port"],\
user=mysql_config["localhost_cf_test"]["user"], password=mysql_config["localhost_cf_test"]["password"], \
database=mysql_config["localhost_cf_test"]["database"], charset=mysql_config["localhost_cf_test"]["charset"])
针对第2~3步,我是这么考虑的:维护一个表配置项(哪些需要进行处理的表,放在一个json配置文件中),根据配置会拼接sql,通过拼接的sql去源数据库查询数据,而后将插入结果转化成insert语句,插入到目标测试库中。表配置如下:
{
"smart_stock_notice":{"hasWhere":false,"isPass":true,"cols":{"CreateTime":">='2018-04-01' and CreateTime<='2018-04-11'"},
"limit":"100"},
"smart_stock_pbpe":{"hasWhere":false,"isPass":true,"cols":{"CreateTime":">='2018-04-01' and CreateTime<='2018-04-11'"},
"limit":"100"},
"smart_stock_performance":{"hasWhere":true,"isPass":false,"cols":{"UpdateTime":">='2018-04-01' and UpdateTime<='2018-04-20'"},
"orderby":"UpdateTime desc","limit":"10"},
"smart_stock_platebelong":{"hasWhere":false,"isPass":true,"cols":{"CreateTime":">='2018-04-01' and CreateTime<='2018-04-11'"},
"limit":"100"}
}
hasWhere判断拼接的sql是否有where条件;isPass 是过滤条件,如果为真则表示不对这张表进行数据迁移,反之,则对这张表进行数据迁移;cols 是where 后的条件,其键为字段名,值为字段条件(此条件仅在hasWhere条件成立情况下生效);orderby 的值为排序规则;limit 的值是查询条数。sql拼接代码如下:
def insert_data_sql(self, table, table_info):
"""Query from database."""
where_condition = ""
# 判断是否有where条件
if table_info.get("hasWhere") == True:
where_condition = "where "
# 判断是否有条件过滤
if table_info.get("cols"):
for key,value in table_info["cols"].items():
where_condition += key+value+" and "
where_condition = where_condition[:-4]
# 判断是否有排序
if table_info.get("orderby"):
where_condition += " order by " + table_info.get("orderby")
# 判断是否查询条数限制
if table_info.get("limit"):
where_condition += " limit " + table_info.get("limit")
condition = where_condition
# 查询源数据库sql
query_sql = """select * from {0} {1} ;""".format(table, condition)
# 创表语句
create_table = self.get_sql_of_drop_and_create_table(table)
# 插入数据的sql
insert_sql = "insert into {0} values ".format(table)
return create_table,query_sql,insert_sql
方法get_sql_of_drop_and_create_table 是获取源数据库的创表语句。为了防止数据库脚本不一致而报错。如下:
def get_sql_of_drop_and_create_table(self,table):
"""Get the sql of drop table and create table."""
sql = """show create table {0}""".format(table)
self.cur_from.execute(sql)
create_table = self.cur_from.fetchone()
sql = "drop table if EXISTS {0};\n".format(table) + create_table[1]
return sql
通过query_data_sql 可以得到查询sql,下一步是将查询的结果拼接成insert 语句。通过fetchmany 控制每次插入的数据量,当该方法无返回数据时,表示数据已经批量处理完毕。如下:
def insert_data(self):
"""Insert Data which query from database to the target database"""
# 表配置项
for table,table_info in self.table_dic.items():
# 是否过滤此表
if table_info["isPass"] == True:
continue
# 创表语句,查询源数据库语句,插入语句
create_sql, query_sql,insert_sql = self.query_data_sql(table,table_info)
# 目标库重新建表
self.cur_target.execute(create_sql)
# 源数据库获取数据
self.cur_from.execute(query_sql)
while True:
# 每次读取指定数量数据
data = self.cur_from.fetchmany(self.size)
# 数据是否读取完毕
if data == ():
break
temp_sql = ""
# 拼接获取的每一条数据值
for each_value in data:
temp_sql = "{0}(".format(temp_sql)
# 拼接一条数据的每个字段值
for each_col in each_value:
# 如果查询表字段为NULL,查询返回为None,需处理成NULL
if each_col == None:
temp_sql = "{0}NULL,".format(temp_sql)
else:
temp_sql = "{0}'{1}',".format(temp_sql,each_col)
temp_sql = "{0}),".format(temp_sql[:-1])
# 拼接后的插入sql语句
sql = insert_sql + temp_sql[:-1] + ";"
# execute_sql
print("Dealing table:{0}".format(table))
# 执行插入sql语句
self.query_data_to_insert_data(sql)
# 关闭数据连接
self.__end()
query_data_to_insert_data 为一个执行插入sql语句的方法,如下:
def query_data_to_insert_data(self,insert_sql):
"""Execute SQL."""
try:
self.cur_target.execute(insert_sql)
self.conn_target.commit()
except Exception as e:
print(str(e))
print("Insert Data SQL:", insert_sql)
__end() 是关闭数据连接的方法,如下:
def __end(self):
"""Close Connection!"""
self.cur_from.close()
self.cur_target.close()
self.conn_from.close()
self.conn_target.close()
上述所有方法都是在类Transfer_Data 中,它的_init_ 函数如下:
class Transfer_Data:
"""从A数据库筛选特定数据插入至指定数据库B中"""
def __init__(self, size=2000):
"""Connect to mysql.
table_dict is a dict. For example:
{"table":{"smart_stock_search":{"hasWhere":True,"cols":{"CreateTime":">='2018-04-01' \
and CreateTime<='2018-04-11'"},"orderby":"CreateTime asc", "limit":"100"}}}"""
# 定义每组数据的插入条数
self.size = size
# 获取表配置项
self.table_dic = self.get_config()
connect_mysql = Connect_mysql()
# 获取数据库配置
mysql_config = connect_mysql.get_config("mysql_config.json")
self.conn_from, self.cur_from = connect_mysql.conn_mysql(host=mysql_config["small_tools_ifs8"]["host"],
port=mysql_config["small_tools_ifs8"]["port"],
user=mysql_config["small_tools_ifs8"]["user"],
password=mysql_config["small_tools_ifs8"]["password"],
database=mysql_config["small_tools_ifs8"]["database"],
charset=mysql_config["small_tools_ifs8"]["charset"])
self.conn_target, self.cur_target = connect_mysql.conn_mysql(host=mysql_config["localhost_cf_test"]["host"],
port=mysql_config["localhost_cf_test"]["port"],
user=mysql_config["localhost_cf_test"]["user"],
password=mysql_config["localhost_cf_test"]["password"],
database=mysql_config["localhost_cf_test"]["database"],
charset=mysql_config["localhost_cf_test"]["charset"])
脚本调用方法为:
if __name__ == "__main__":
transfer = Transfer_Data(size=2000)
transfer.insert_data()