For example, one folder on my computer has the following contents; its two subfolders contain only files and no further folders:
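Traversing all subdirectories and files with os.walk
First, it helps to understand the three values that os.walk produces. os.walk(filepath) returns a generator, and each iteration yields one (root, dirnames, filenames) tuple:
for root, dirnames, filenames in os.walk(filepath):
    ...
root is the path of the folder currently being visited.
dirnames is a list of the names of the directories directly inside that folder (deeper levels are not included).
filenames is likewise a list, containing the names of the files directly inside that folder (files in subdirectories are not included).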
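To make the three values concrete, here is a minimal sketch that builds a small throwaway tree and records what os.walk yields at each step. The helper name `walk_snapshot` and the file names (`top.txt`, `sub_a`, `sub_b`) are made up for illustration only.

```python
import os
import tempfile

def walk_snapshot(top):
    """Return one (relative root, dirnames, filenames) tuple per visited folder."""
    snapshot = []
    for root, dirnames, filenames in os.walk(top):
        # Sorting in place also fixes the order in which subfolders are visited.
        dirnames.sort()
        rel_root = os.path.relpath(root, top)
        snapshot.append((rel_root, list(dirnames), sorted(filenames)))
    return snapshot

# Hypothetical tree: top.txt, sub_a/a.txt and an empty sub_b/
with tempfile.TemporaryDirectory() as top:
    os.makedirs(os.path.join(top, "sub_a"))
    os.makedirs(os.path.join(top, "sub_b"))
    for rel in ("top.txt", os.path.join("sub_a", "a.txt")):
        with open(os.path.join(top, rel), "w") as f:
            f.write("x")
    result = walk_snapshot(top)

for entry in result:
    print(entry)
```

Each printed tuple corresponds to one folder: the first belongs to the top folder itself (shown as `.`), the following ones to its subfolders.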
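If that feels a bit abstract, running the code makes it clearer:
import os
filepath = r'D:\9_Code\Tensorflow_tutorials'
for root, dirnames, filenames in os.walk(filepath):
    print(root)
D:\9_Code\Tensorflow_tutorials
D:\9_Code\Tensorflow_tutorials\.ipynb_checkpoints
D:\9_Code\Tensorflow_tutorials\Convnet_Visualization
These are the paths of the three directories; they are absolute because filepath is absolute.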
for root, dirnames, filenames in os.walk(filepath):
    print(dirnames)
['.ipynb_checkpoints', 'Convnet_Visualization']
[]
[]
The subdirectories inside each of the three directories.
for root, dirnames, filenames in os.walk(filepath):
    print(filenames)
['cache.tf-data.index', 'Cat.jpg', 'Convnet_Visualization_CAM - 副本.ipynb', 'Convnet_Visualization_CAM.ipynb', 'Egyptian_cat.jpg', 'Load images.ipynb']
['Load images-checkpoint.ipynb']
['cat_egyptian.jpg', 'Convnet_Visualization.ipynb', 'Convnet_Visualization.png', 'Convnet_Visualization1.png', 'Convnet_Visualization2.png', 'Convnet_Visualization3.png', 'Convnet_Visualization4.png', 'Egptian_cat.jpg', 'sedan.jpg', 'tabby.jpg', 'Tabby1.jpg', 'tiger_cat.jpg', 'tigger_cat.jpg']
The files inside each of the three directories.
Traversing all subdirectories
dirlist = []
for root, dirnames, filenames in os.walk(filepath):
    for dirname in dirnames:
        dirlist.append(os.path.join(root, dirname))
        print(os.path.join(root, dirname))
D:\9_Code\Tensorflow_tutorials\.ipynb_checkpoints
D:\9_Code\Tensorflow_tutorials\Convnet_Visualization
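Folders such as .ipynb_checkpoints are often unwanted in the result. With the default top-down traversal, os.walk lets you edit dirnames in place to prune folders before it descends into them. The sketch below assumes that "hidden" simply means a name starting with a dot; the helper name `walk_skipping_hidden` is hypothetical.

```python
import os
import tempfile

def walk_skipping_hidden(top):
    """Return the relative paths of visited folders, skipping names that start with '.'."""
    visited = []
    for root, dirnames, filenames in os.walk(top):
        # Slice assignment changes the list in place; os.walk then
        # does not descend into the names that were removed.
        dirnames[:] = sorted(d for d in dirnames if not d.startswith("."))
        visited.append(os.path.relpath(root, top))
    return visited

# Throwaway tree that mirrors the two subfolder names used in this post
with tempfile.TemporaryDirectory() as top:
    os.makedirs(os.path.join(top, ".ipynb_checkpoints"))
    os.makedirs(os.path.join(top, "Convnet_Visualization"))
    visited = walk_skipping_hidden(top)

print(visited)
```

Rebinding the name (`dirnames = [...]`) would not work here, because os.walk only sees changes made to the list object it handed out.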
Traversing all files
filelist = []
for root, dirnames, filenames in os.walk(filepath):
    for filename in filenames:
        filelist.append(os.path.join(root, filename))
        print(os.path.join(root, filename))
D:\9_Code\Tensorflow_tutorials\cache.tf-data.index
D:\9_Code\Tensorflow_tutorials\Cat.jpg
D:\9_Code\Tensorflow_tutorials\Convnet_Visualization_CAM - 副本.ipynb
D:\9_Code\Tensorflow_tutorials\Convnet_Visualization_CAM.ipynb
D:\9_Code\Tensorflow_tutorials\Egyptian_cat.jpg
D:\9_Code\Tensorflow_tutorials\Load images.ipynb
D:\9_Code\Tensorflow_tutorials\.ipynb_checkpoints\Load images-checkpoint.ipynb
D:\9_Code\Tensorflow_tutorials\Convnet_Visualization\cat_egyptian.jpg
D:\9_Code\Tensorflow_tutorials\Convnet_Visualization\Convnet_Visualization.ipynb
D:\9_Code\Tensorflow_tutorials\Convnet_Visualization\Convnet_Visualization.png
D:\9_Code\Tensorflow_tutorials\Convnet_Visualization\Convnet_Visualization1.png
D:\9_Code\Tensorflow_tutorials\Convnet_Visualization\Convnet_Visualization2.png
D:\9_Code\Tensorflow_tutorials\Convnet_Visualization\Convnet_Visualization3.png
D:\9_Code\Tensorflow_tutorials\Convnet_Visualization\Convnet_Visualization4.png
D:\9_Code\Tensorflow_tutorials\Convnet_Visualization\Egptian_cat.jpg
D:\9_Code\Tensorflow_tutorials\Convnet_Visualization\sedan.jpg
D:\9_Code\Tensorflow_tutorials\Convnet_Visualization\tabby.jpg
D:\9_Code\Tensorflow_tutorials\Convnet_Visualization\Tabby1.jpg
D:\9_Code\Tensorflow_tutorials\Convnet_Visualization\tiger_cat.jpg
D:\9_Code\Tensorflow_tutorials\Convnet_Visualization\tigger_cat.jpg
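A common follow-up is to keep only certain file types while walking the tree. The following is a minimal sketch, not the only way to do it; the function name `find_files` and the sample file names are assumptions made for the example, and the comparison is deliberately case-insensitive.

```python
import os
import tempfile

def find_files(top, extensions):
    """Recursively collect paths under `top` whose names end with one of `extensions`."""
    extensions = tuple(ext.lower() for ext in extensions)
    matches = []
    for root, dirnames, filenames in os.walk(top):
        for filename in filenames:
            # str.endswith accepts a tuple of suffixes
            if filename.lower().endswith(extensions):
                matches.append(os.path.join(root, filename))
    return sorted(matches)

# Hypothetical tree: Cat.JPG, notes.txt and sub/tabby.png
with tempfile.TemporaryDirectory() as top:
    os.makedirs(os.path.join(top, "sub"))
    for rel in ("Cat.JPG", "notes.txt", os.path.join("sub", "tabby.png")):
        open(os.path.join(top, rel), "w").close()
    found = [os.path.relpath(p, top) for p in find_files(top, (".jpg", ".png"))]

print(found)
```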
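os.path
We can also write the loop ourselves with os.listdir and os.path to achieve the same thing.
Traversing all subdirectories
import os
filepath = r'D:\9_Code\Tensorflow_tutorials'
dirlist = []
list1 = os.listdir(filepath)
for file1 in list1:
    path = os.path.join(filepath, file1)
    # keep only entries that are directories (first level only)
    if os.path.isdir(path):
        dirlist.append(path)
print(dirlist)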
Traversing all files
import os
filepath = r'D:\9_Code\Tensorflow_tutorials'
filelist = []
list1 = os.listdir(filepath)
for file1 in list1:
    path = os.path.join(filepath, file1)
    if os.path.isfile(path):
        filelist.append(path)
    else:
        # one level down; anything nested deeper than this is not visited
        list2 = os.listdir(path)
        for file2 in list2:
            path2 = os.path.join(path, file2)
            if os.path.isfile(path2):
                filelist.append(path2)
print(filelist)
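The hand-written loop above stops after two levels. If the tree can be deeper, one option is to let the function call itself for every subfolder. This is a minimal sketch under that assumption; `list_files_recursive` and the sample tree are hypothetical names, and very deep trees are better served by os.walk.

```python
import os
import tempfile

def list_files_recursive(folder):
    """Return all file paths under `folder`, descending to any depth."""
    files = []
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if os.path.isfile(path):
            files.append(path)
        elif os.path.isdir(path):
            # Recurse into the subfolder and append whatever it contains
            files.extend(list_files_recursive(path))
    return files

# Hypothetical three-level tree: top.txt, a/mid.txt, a/b/deep.txt
with tempfile.TemporaryDirectory() as top:
    os.makedirs(os.path.join(top, "a", "b"))
    for rel in ("top.txt", os.path.join("a", "mid.txt"), os.path.join("a", "b", "deep.txt")):
        open(os.path.join(top, rel), "w").close()
    all_files = [os.path.relpath(p, top) for p in list_files_recursive(top)]

print(all_files)
```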
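glob.glob for a specific file type
If the search target is specific, for example every .jpg file in this directory, glob.glob does the job:
import glob
import os
filepath = r'D:\9_Code\Tensorflow_tutorials'
# build the pattern with os.path.join rather than a '\*' string, which is an invalid escape sequence
jpg_img = glob.glob(os.path.join(filepath, '*.jpg'))
print(jpg_img)
['D:\\9_Code\\Tensorflow_tutorials\\Cat.jpg', 'D:\\9_Code\\Tensorflow_tutorials\\Egyptian_cat.jpg']
This returns the matching files in the directory as a list.
Note: if the input is a relative path, the returned paths are relative as well.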
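The note about relative and absolute paths can be checked with a small experiment. The sketch below creates a temporary folder, switches the working directory into it, and runs the same search once with a relative pattern and once with an absolute one. The file names are invented for the example.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as top:
    for name in ("cat.jpg", "notes.txt"):
        open(os.path.join(top, name), "w").close()
    previous_cwd = os.getcwd()
    os.chdir(top)
    try:
        # Relative pattern: results are relative to the working directory
        relative_hits = glob.glob("*.jpg")
        # Absolute pattern: results carry the full path
        absolute_hits = glob.glob(os.path.join(top, "*.jpg"))
    finally:
        # Restore the working directory before the temporary folder is removed
        os.chdir(previous_cwd)
    absolute_ok = all(os.path.isabs(p) for p in absolute_hits)

print(relative_hits)
print(absolute_ok)
```

If absolute paths are needed regardless of the input, wrapping each result in os.path.abspath is one way to normalize them.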
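glob.glob for multiple file types
A small trick: the pattern '*.[jp][pn]g' picks up .jpg and .png files in a single call. Each bracket matches exactly one character, so the order of the brackets matters, and the pattern also matches the extensions .jng and .ppg.
import glob
import os
filepath = r'D:\9_Code\Tensorflow_tutorials'
jpg_img = glob.glob(os.path.join(filepath, '*.[jp][pn]g'))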
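Because each bracket in a glob pattern stands for a single character, it is worth checking what a combined pattern really accepts. The sketch below uses fnmatch, which implements the same shell-style matching rules, on a made-up list of names, and compares two bracket orders with an explicit list of patterns.

```python
import fnmatch

# Hypothetical file names used only to probe the patterns
names = ["a.jpg", "b.png", "c.jng", "d.ppg", "e.gif", "f.jpeg"]

# First character j or p, second character p or n, then g
loose = [n for n in names if fnmatch.fnmatch(n, "*.[jp][pn]g")]

# Swapped brackets: first character p or n, second character j or p
swapped = [n for n in names if fnmatch.fnmatch(n, "*.[pn][jp]g")]

# Explicit alternatives: only the two extensions that are wanted
strict = [n for n in names
          if any(fnmatch.fnmatch(n, p) for p in ("*.jpg", "*.png"))]

print(loose)
print(swapped)
print(strict)
```

When stray extensions could appear in the folder, running one glob.glob call per extension and concatenating the results is the more predictable choice.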
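For example, the following code walks two levels of subdirectories, counts the image files in each second-level folder, and writes the counts to a CSV file:
import os
import glob
import csv

filepath = './'
filelist1 = sorted(os.listdir(filepath))
sum_frames = 0
sum_file = 0
with open('num_frame.csv', mode='w', newline='') as f:
    writer = csv.writer(f)
    for file1 in filelist1:
        dir1 = os.path.join(filepath, file1)
        # only folders whose names are shorter than 4 characters are processed
        if len(file1) < 4 and os.path.isdir(dir1):
            filelist2 = sorted(os.listdir(dir1))
            for file2 in filelist2:
                path = os.path.join(filepath, file1, file2)
                # image files directly inside the second-level folder
                frames = glob.glob(path + '/*.[jp][pn]g')
                num_frame = len(frames)
                sum_frames += num_frame
                writer.writerow([num_frame])
                sum_file += 1
                # print(path, num_frame)
print('total number of frames:', sum_frames)
print('total number of folders:', sum_file)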
Checking whether a file is an image
# videos, i and frames_path come from the surrounding code
images_names = [f for f in os.listdir(videos[i] + frames_path)
                if f.endswith(('.jpg', '.jpeg', '.png'))]
images_names.sort()
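Note that str.endswith is case-sensitive, so a file named `b.PNG` would be skipped by the check above. A minimal sketch of a case-insensitive variant follows; the helper name `is_image_name`, the constant `IMAGE_EXTENSIONS`, and the sample names are assumptions for illustration.

```python
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")

def is_image_name(filename):
    """Return True if the name ends with a known image extension, ignoring case."""
    return filename.lower().endswith(IMAGE_EXTENSIONS)

# Hypothetical directory listing
names = ["b.PNG", "a.jpg", "notes.txt", "c.jpeg.bak"]
image_names = sorted(n for n in names if is_image_name(n))

print(image_names)
```

This only inspects the name; it does not verify that the file content is a valid image.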
Checking whether a folder exists
When creating folders, checking with os.path.exists first avoids trying to create a folder that is already there:
if not os.path.exists(filepath):
    os.makedirs(filepath)
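On Python 3, os.makedirs also accepts `exist_ok=True`, which folds the existence check into the call. A short sketch, using a temporary folder and a made-up `output/frames` subpath:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as top:
    target = os.path.join(top, "output", "frames")  # hypothetical nested folder
    os.makedirs(target, exist_ok=True)   # creates output/ and output/frames/
    os.makedirs(target, exist_ok=True)   # second call is a no-op, no exception
    created = os.path.isdir(target)

print(created)
```

Without `exist_ok=True`, the second call would raise FileExistsError.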
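Checking whether a path is a folder
for ....:
    # example_folder is a path object with an is_dir() method, such as pathlib.Path
    if not example_folder.is_dir():
        continue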
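One way such a loop might look in full is sketched below with pathlib. The loop source (`Path.iterdir`), the helper name `subfolders`, and the folder names are assumptions, since the snippet above leaves the loop header open.

```python
import tempfile
from pathlib import Path

def subfolders(parent):
    """Return the sorted names of the entries in `parent` that are folders."""
    names = []
    for example_folder in Path(parent).iterdir():
        if not example_folder.is_dir():
            continue  # skip plain files
        names.append(example_folder.name)
    return sorted(names)

# Hypothetical layout: two folders and one file
with tempfile.TemporaryDirectory() as top:
    (Path(top) / "video_01").mkdir()
    (Path(top) / "video_02").mkdir()
    (Path(top) / "readme.txt").write_text("x")
    folder_names = subfolders(top)

print(folder_names)
```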
Getting the absolute path of the current file
from pathlib import Path
default_data_dir = Path(__file__).resolve().parent.parent / "data"
Deleting all files and folders under a path
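A minimal sketch of one way to empty a folder while keeping the folder itself, using os.listdir together with shutil.rmtree. The function name `clear_directory` and the sample tree are hypothetical. Deletion is permanent, so it is worth double-checking the path before running anything like this on real data.

```python
import os
import shutil
import tempfile

def clear_directory(folder):
    """Delete everything inside `folder` but keep `folder` itself."""
    for name in os.listdir(folder):
        path = os.path.join(folder, name)
        if os.path.isdir(path) and not os.path.islink(path):
            shutil.rmtree(path)   # real subfolder: remove it with its contents
        else:
            os.remove(path)       # regular file, or a link that is not followed

# Hypothetical tree: a.txt and sub/b.txt
with tempfile.TemporaryDirectory() as top:
    os.makedirs(os.path.join(top, "sub"))
    for rel in ("a.txt", os.path.join("sub", "b.txt")):
        open(os.path.join(top, rel), "w").close()
    clear_directory(top)
    remaining = os.listdir(top)
    still_exists = os.path.isdir(top)

print(remaining, still_exists)
```

If the folder itself may go as well, a single shutil.rmtree(folder) call is enough.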