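On platforms such as integration-build systems, scripts, logs, and temporary data files are often saved in per-date directories. Over time the disk fills up, so a cleanup script has to run periodically. The example below shows such a cleanup on Linux.
Scenario: the directory /data_shared contains the following entries: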
dmcs@/data_shared>ls
dmcs_list damm_check 2352 2690 20191210 20191211 20191212 20191215 20191216
tyyh zmsn f3dj3lix_12.zip
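Requirement: find the date directories under the current directory and delete the 8-digit date directories that are more than 7 days old.
Process:
First, list all date directories with a regular expression. The pattern is not very strict (it is unanchored, so it matches any name that contains eight consecutive digits), but it does pick out the 8-digit directories here.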
dmcs@/data_shared>ls mypath | grep -E "[0-9]{4}[0-9]{2}[0-9]{2}"
20191210
20191211
20191212
20191215
20191216
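The loose match above can be tightened by anchoring the pattern so that the whole name must be exactly eight digits. A minimal sketch, assuming a POSIX shell and grep -E; the helper name list_date_names is hypothetical and not part of the original commands:

```shell
#!/bin/sh
# list_date_names is a hypothetical helper name used only for illustration.
# It prints the entries of a directory whose whole name is exactly 8 digits.
list_date_names() {
    # $1: directory to scan (defaults to the current directory)
    # ^ and $ anchor the match, so names such as x20191210y or 201912100
    # are rejected even though they contain eight consecutive digits.
    ls -1 "${1:-.}" | grep -E '^[0-9]{8}$'
}

# Usage (prints nothing if no name matches):
#   list_date_names /data_shared
```

Note that this only filters names; it does not check whether an entry is a directory or whether the digits form a real date.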
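Since the directories also have to be deleted, I switched to the find command to locate and delete them.
1. First, look for the date directories with the same regular expression as above. No date directory is matched. There are two reasons: -regex has to match the whole path (for example ./20191210, not just 20191210), and find's default Emacs syntax does not treat {4} as a repetition count.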
dmcs@/data_shared>find ./ -regex '[0-9]{4}[0-9]{2}[0-9]{2}' -type d
2. Try another expression. It matches the all-digit directories, but it cannot single out the 8-digit date directories.
dmcs@/data_shared>find ./ -regex '\./[0-9]+' -type d
./2352
./2690
./20191210
./20191211
./20191212
./20191215
./20191216
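One way to get an exact 8-digit match without relying on the {n} interval syntax is to spell out [0-9] eight times, which reads the same in every regular expression type. A minimal sketch, assuming a find that supports -regex and -maxdepth (GNU find does); the helper name find_date_dirs_default is hypothetical:

```shell
#!/bin/sh
# find_date_dirs_default is a hypothetical helper name used only for illustration.
# It lists top-level directories whose name is exactly 8 digits, using only
# bracket expressions, so no -regextype option is needed.
find_date_dirs_default() {
    d='[0-9]'
    # Run in a subshell so the caller's working directory is unchanged.
    # -regex must match the whole path, hence the leading "\./".
    ( cd "${1:-.}" && find . -maxdepth 1 -type d -regex "\./$d$d$d$d$d$d$d$d" )
}

# Usage:
#   find_date_dirs_default /data_shared
```

The pattern is verbose, which is why switching the regular expression type is the more readable choice for longer patterns.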
3. Following the reference below, specify the regular expression type with -regextype. The date directories are then found.
dmcs@/data_shared>find ./ -regextype 'posix-egrep' -regex '\./[0-9]{4}[0-9]{2}[0-9]{2}' -type d
./20191210
./20191211
./20191212
./20191215
./20191216
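With an extended regular expression type available, the pattern can also be made a little stricter than "any eight digits" by constraining the month to 01-12 and the day to 01-31. A sketch under the assumption of GNU find (the -regextype option is a GNU extension); the helper name find_valid_date_dirs is hypothetical, and the pattern still accepts impossible dates such as 20190231:

```shell
#!/bin/sh
# find_valid_date_dirs is a hypothetical helper name used only for illustration.
# It lists top-level directories named YYYYMMDD with MM in 01-12 and DD in 01-31.
find_valid_date_dirs() {
    # -regextype has to appear before the -regex test it should affect.
    ( cd "${1:-.}" && find . -maxdepth 1 -regextype posix-extended -type d \
        -regex '\./[0-9]{4}(0[1-9]|1[0-2])(0[1-9]|[12][0-9]|3[01])' )
}

# Usage:
#   find_valid_date_dirs /data_shared
```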
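4. Delete the date directories older than 7 days. Without a time test the command would remove every 8-digit directory, so -mtime +7 is added; it selects directories last modified more than 7 days ago, which is not necessarily the date in the name.
dmcs@/data_shared>find ./ -regextype 'posix-egrep' -regex '\./[0-9]{4}[0-9]{2}[0-9]{2}' -type d -mtime +7 -exec rm -rf {} \;
Deletion succeeded!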
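If the cutoff should follow the date encoded in the directory name rather than the modification time, the names can be compared against a YYYYMMDD cutoff directly. A minimal sketch, assuming GNU date (the -d option is a GNU extension); the function name purge_old_date_dirs, its arguments, and the dry-run default are all hypothetical choices for illustration:

```shell
#!/bin/sh
# purge_old_date_dirs is a hypothetical helper name used only for illustration.
# Usage: purge_old_date_dirs DIR [DAYS] [--delete]
# Without --delete it only prints what it would remove (dry run).
purge_old_date_dirs() {
    dir=$1
    days=${2:-7}
    mode=${3:-dry-run}
    # Cutoff date as YYYYMMDD; "date -d" is assumed to be GNU date.
    cutoff=$(date -d "$days days ago" +%Y%m%d) || return 1
    for path in "$dir"/[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]; do
        # Skip regular files and the literal pattern left when nothing matches.
        [ -d "$path" ] || continue
        name=${path##*/}
        # YYYYMMDD values order chronologically, so a numeric comparison works.
        # A directory dated exactly DAYS days ago is kept.
        if [ "$name" -lt "$cutoff" ]; then
            if [ "$mode" = "--delete" ]; then
                rm -rf -- "$path"
            else
                echo "would delete: $path"
            fi
        fi
    done
}

# Usage:
#   purge_old_date_dirs /data_shared 7            # dry run
#   purge_old_date_dirs /data_shared 7 --delete   # actually remove
```

Running the dry run first and checking its output is a cheap safeguard before anything is passed to rm -rf.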
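Addendum: another post (https://blog.youkuaiyun.com/dubiousway/article/details/8165121) pointed out that the regular expression type has to be specified. The relevant part of the find manual reads: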
-regex pattern
File name matches regular expression pattern. This is a match on the whole path, not a search. For example, to match a file named './fubar3', you can use the regular expression '.*bar.' or '.*b.*3', but not 'f.*r3'. The regular expressions understood by find are by default Emacs Regular Expressions, but this can be changed with the -regextype option.
-regextype type
Changes the regular expression syntax understood by -regex and -iregex tests which occur later on the command line. Currently-implemented types are emacs (this is the default), posix-awk, posix-basic, posix-egrep and posix-extended.
Querying on the command line shows that -regextype accepts the following types: "findutils-default", "awk", "egrep", "ed", "emacs", "gnu-awk", "grep", "posix-awk", "posix-egrep", "posix-extended", "posix-minimal-basic", "sed".