A Look at Some Python Source Code

License: CC 4.0 BY-SA

Original link: http://www.cnblogs.com/nannanITeye/p/3283487.html

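I happened to read through part of Python's source code and found it genuinely beautiful: concise and clean. It is well worth learning from.

Here are a few examples to get the discussion started: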

def removedirs(name):
    """removedirs(path)

    Super-rmdir; remove a leaf directory and all empty intermediate
    ones.  Works like rmdir except that, if the leaf directory is
    successfully removed, directories corresponding to rightmost path
    segments will be pruned away until either the whole path is
    consumed or an error occurs.  Errors during this latter phase are
    ignored -- they generally mean that a directory was not empty.

    """
    rmdir(name)
    head, tail = path.split(name)
    if not tail:
        head, tail = path.split(head)
    while head and tail:
        try:
            rmdir(head)
        except error:
            break
        head, tail = path.split(head)

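What the docstring says: the function removes an empty directory. The directory really must be empty, with no subdirectories and no files, otherwise an error is raised. It is an enhanced version of os.rmdir(): os.rmdir() removes a single empty directory and nothing more, whereas os.removedirs(), after removing the given directory, goes on to try removing its parent. If the parent is empty it is removed too and the process continues upward; otherwise it stops, which means that parent is not empty.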

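Source analysis: rmdir(name) first removes the given empty directory (an error at this step propagates to the caller), and path.split(name) then gives its parent directory as head. If name ends with a path separator, tail comes back empty, so the path is split once more. The core is the while loop: each pass tries to remove head, the parent directory, and then moves one level up, until a removal fails (usually because that directory is not empty) or the path is used up.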
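The behaviour described above can be checked with a small experiment. This is only a sketch, written for Python 3 and run inside a temporary directory; the function name demo_removedirs and the names keep.txt, a, b, c are made up for illustration.

```python
import os
import tempfile


def demo_removedirs():
    """Show that os.removedirs() prunes empty parents and stops at a non-empty one."""
    base = tempfile.mkdtemp()
    # A file directly inside base keeps base non-empty.
    keep = os.path.join(base, "keep.txt")
    with open(keep, "w") as f:
        f.write("x")

    # Create base/a/b/c, a chain of otherwise empty directories.
    leaf = os.path.join(base, "a", "b", "c")
    os.makedirs(leaf)

    # Removes c, then b, then a; rmdir(base) fails and the loop stops quietly.
    os.removedirs(leaf)

    a_exists = os.path.exists(os.path.join(base, "a"))
    base_exists = os.path.exists(base)

    # Clean up what is left of the experiment.
    os.remove(keep)
    os.rmdir(base)
    return a_exists, base_exists


if __name__ == "__main__":
    print(demo_removedirs())
```

The first value is False because the whole a/b/c chain was pruned, and the second is True because base still contained keep.txt when removedirs reached it.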

def walk(top, topdown=True, onerror=None, followlinks=False):
    """Directory tree generator.

    For each directory in the directory tree rooted at top (including top
    itself, but excluding '.' and '..'), yields a 3-tuple

        dirpath, dirnames, filenames

    dirpath is a string, the path to the directory.  dirnames is a list of
    the names of the subdirectories in dirpath (excluding '.' and '..').
    filenames is a list of the names of the non-directory files in dirpath.
    Note that the names in the lists are just names, with no path components.
    To get a full path (which begins with top) to a file or directory in
    dirpath, do os.path.join(dirpath, name).

    If optional arg 'topdown' is true or not specified, the triple for a
    directory is generated before the triples for any of its subdirectories
    (directories are generated top down).  If topdown is false, the triple
    for a directory is generated after the triples for all of its
    subdirectories (directories are generated bottom up).

    When topdown is true, the caller can modify the dirnames list in-place
    (e.g., via del or slice assignment), and walk will only recurse into the
    subdirectories whose names remain in dirnames; this can be used to prune
    the search, or to impose a specific order of visiting.  Modifying
    dirnames when topdown is false is ineffective, since the directories in
    dirnames have already been generated by the time dirnames itself is
    generated.

    By default errors from the os.listdir() call are ignored.  If
    optional arg 'onerror' is specified, it should be a function; it
    will be called with one argument, an os.error instance.  It can
    report the error to continue with the walk, or raise the exception
    to abort the walk.  Note that the filename is available as the
    filename attribute of the exception object.

    By default, os.walk does not follow symbolic links to subdirectories on
    systems that support them.  In order to get this functionality, set the
    optional argument 'followlinks' to true.

    Caution:  if you pass a relative pathname for top, don't change the
    current working directory between resumptions of walk.  walk never
    changes the current directory, and assumes that the client doesn't
    either.

    Example:

    import os
    from os.path import join, getsize
    for root, dirs, files in os.walk('python/Lib/email'):
        print root, "consumes",
        print sum([getsize(join(root, name)) for name in files]),
        print "bytes in", len(files), "non-directory files"
        if 'CVS' in dirs:
            dirs.remove('CVS')  # don't visit CVS directories
    """

    islink, join, isdir = path.islink, path.join, path.isdir

    # We may not have read permission for top, in which case we can't
    # get a list of the files the directory contains.  os.path.walk
    # always suppressed the exception then, rather than blow up for a
    # minor reason when (say) a thousand readable directories are still
    # left to visit.  That logic is copied here.
    try:
        # Note that listdir and error are globals in this module due
        # to earlier import-*.
        names = listdir(top)
    except error, err:
        if onerror is not None:
            onerror(err)
        return

    dirs, nondirs = [], []
    for name in names:
        if isdir(join(top, name)):
            dirs.append(name)
        else:
            nondirs.append(name)

    if topdown:
        yield top, dirs, nondirs
    for name in dirs:
        new_path = join(top, name)
        if followlinks or not islink(new_path):
            for x in walk(new_path, topdown, onerror, followlinks):
                yield x
    if not topdown:
        yield top, dirs, nondirs

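This is the source of the widely used os.walk() function (the Python 2 version, as the except and print syntax shows). It is implemented recursively, and the key to understanding it is yield, which I cover in a separate post. There are also a few things to watch out for when yield is used recursively.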
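To see the topdown and pruning behaviour from the docstring in action, here is a small sketch written for Python 3. It builds a throwaway tree in a temporary directory; the function name walk_demo and the names pkg, sub, CVS, hidden and mod.py are made up for illustration.

```python
import os
import shutil
import tempfile


def walk_demo():
    """Walk a small temporary tree top down (with pruning) and bottom up."""
    root = tempfile.mkdtemp()
    try:
        os.makedirs(os.path.join(root, "pkg", "sub"))
        os.makedirs(os.path.join(root, "CVS", "hidden"))
        open(os.path.join(root, "pkg", "mod.py"), "w").close()

        # Top down: editing dirnames in place controls where walk recurses.
        pruned = []
        for dirpath, dirnames, filenames in os.walk(root):
            if "CVS" in dirnames:
                dirnames.remove("CVS")   # never descend into CVS
            dirnames.sort()              # impose a fixed visiting order
            pruned.append(os.path.relpath(dirpath, root))

        # Bottom up: every subdirectory is yielded before its parent.
        bottom_up = [os.path.relpath(dirpath, root)
                     for dirpath, _, _ in os.walk(root, topdown=False)]
        return pruned, bottom_up
    finally:
        shutil.rmtree(root)


if __name__ == "__main__":
    pruned, bottom_up = walk_demo()
    print(pruned)
    print(bottom_up)
```

In the top-down pass, CVS and everything under it is skipped because its name was removed from dirnames before walk recursed. In the bottom-up pass the root comes last, and removing names from dirnames at that point would have no effect.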

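def fab(max):
    """Yield the first `max` Fibonacci numbers."""
    n, a, b = 0, 0, 1
    while n < max:
        yield b       # A
        a, b = b, a + b
        n = n + 1

def ff(max):
    # Re-yield every value produced by the inner generator.
    for x in fab(max):
        yield x       # B

for i in ff(5):
    print(i)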

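These two functions are enough to explain os.walk(). Here is how nested yield executes: ff() starts running and iterates over fab(), which runs up to point A, yields a value b, and pauses. ff() receives that value, yields it as x at point B, and pauses in turn. Because we drive ff() with a for loop, which amounts to calling next() repeatedly, ff() then resumes; its own for loop resumes fab(), which yields the next b and pauses; ff() receives it, yields x, and pauses again. The value is printed, and the cycle repeats until fab() is exhausted.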
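The interleaving can be made visible by recording each step in a list. This is a sketch for illustration only: the extra log parameter and the trace helper are additions that are not in the original functions.

```python
def fab(count, log):
    """Fibonacci generator that records each yield in `log`."""
    n, a, b = 0, 0, 1
    while n < count:
        log.append("fab yields %d" % b)
        yield b
        a, b = b, a + b
        n += 1


def ff(count, log):
    """Outer generator: re-yields fab's values and records each hand-off."""
    for x in fab(count, log):
        log.append("ff yields %d" % x)
        yield x


def trace(count):
    """Drive ff() with a for loop and return the recorded order of events."""
    log = []
    for i in ff(count, log):
        log.append("caller got %d" % i)
    return log


if __name__ == "__main__":
    for line in trace(3):
        print(line)
```

Each value passes through three steps in strict order (fab yields, ff yields, the caller receives) before fab() is resumed for the next one.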
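Conclusion: calling a generator function only creates a generator object; none of its body runs until it is iterated. So to nest generator functions, the outer function has to iterate over the inner generator, typically with a for loop, and re-yield each value it receives (Python 3.3 and later can use yield from for the same purpose).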
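The sketch below, for Python 3.3 or later, compares the for-loop form with yield from, shows a common mistake, and applies the same idea to a recursive generator shaped like os.walk(). The names ff_loop, ff_delegate, ff_broken and walk_tree are made up for illustration, and the nested dict merely stands in for a directory tree.

```python
def fab(count):
    """Yield the first `count` Fibonacci numbers."""
    n, a, b = 0, 0, 1
    while n < count:
        yield b
        a, b = b, a + b
        n += 1


def ff_loop(count):
    # Explicit loop: works in Python 2 and Python 3.
    for x in fab(count):
        yield x


def ff_delegate(count):
    # Delegation: same values as ff_loop, Python 3.3+ only.
    yield from fab(count)


def ff_broken(count):
    # Mistake: yields the inner generator object itself, not its values.
    yield fab(count)


def walk_tree(name, tree):
    """Yield the path of every node in a nested-dict tree, top down."""
    yield name
    for child, subtree in tree.items():
        # Recursive call: every path from the subtree is passed upward.
        yield from walk_tree(name + "/" + child, subtree)


if __name__ == "__main__":
    print(list(ff_loop(5)))
    print(list(ff_delegate(5)))
    print(list(ff_broken(5)))
    print(list(walk_tree("root", {"a": {"b": {}}, "c": {}})))
```

ff_loop and ff_delegate produce the same sequence, while ff_broken produces a single item that is a generator object. In walk_tree, every level of recursion has to pass its child's values upward, which is the job done by the for x in walk(...): yield x loop in os.walk().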

To be continued...