Tornado "debug=True" Dynamic Compilation

License: CC 4.0 BY-SA

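This article walks through how automatic reloading is implemented in a Tornado application: what the debug setting does, how the functions in the autoreload module work together, and how file modifications are detected so that the code is reloaded automatically.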

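I had a rare free Sunday. A while ago I used debug-mode dynamic compilation in a web assignment my teacher set, without ever understanding how it works, so I googled my way to Tornado's official English site and read the source code behind the feature. Luckily the source is well commented, so I got the general idea, and I'd like to share my understanding here. If I've got anything wrong or missed something, please point it out.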

First, let's look at the constructor of the Application object:

def __init__(self, handlers=None, default_host="", transforms=None,**settings):

Through **settings we can pass debug=True to turn on dynamic compilation (automatic reloading), like this:

    __app__ = tornado.web.Application(
        handlers=[(r'/', MainHandler)],
        debug=True
    )

Once it is passed in, the Application constructor runs the following code:

 if self.settings.get('debug'):
     self.settings.setdefault('autoreload', True)
     self.settings.setdefault('compiled_template_cache', False)
     self.settings.setdefault('static_hash_cache', False)
     self.settings.setdefault('serve_traceback', True)

 # Automatically reload modified modules
 if self.settings.get('autoreload'):
     from tornado import autoreload
     autoreload.start()

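As you can see, once debug is set to True, the settings autoreload, compiled_template_cache, static_hash_cache and serve_traceback each receive a value. Because setdefault is used, any of them that you set explicitly keeps your own value. compiled_template_cache defaults to True; when it is False, templates are recompiled on every request. static_hash_cache defaults to True; when it is False, the content hashes of static files (used for the version string in static URLs) are recomputed on every request instead of being cached. serve_traceback controls whether the default error page includes the traceback of an uncaught exception.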
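
The effect of setdefault can be tried without Tornado at all. The sketch below copies the four setdefault calls into a stand-alone function; the name apply_debug_defaults is made up for this illustration and is not part of Tornado.

```python
def apply_debug_defaults(settings):
    """Fill in the debug-mode defaults the way the constructor snippet does.

    Hypothetical helper for illustration only; not a Tornado function.
    """
    if settings.get('debug'):
        # setdefault only writes a key that is not already present.
        settings.setdefault('autoreload', True)
        settings.setdefault('compiled_template_cache', False)
        settings.setdefault('static_hash_cache', False)
        settings.setdefault('serve_traceback', True)
    return settings


# debug=True, but the caller has explicitly switched autoreload off.
result = apply_debug_defaults({'debug': True, 'autoreload': False})
print(result['autoreload'])               # False: the explicit value is kept
print(result['compiled_template_cache'])  # False: filled in because debug is on
```

So debug=True is a shortcut for four separate settings, and each of them can still be overridden one by one.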

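With autoreload set to True, the next if statement runs and calls start(), a function of autoreload. Note that no autoreload object is created: autoreload is not a class but a module, that is, a collection of helper functions.
So let's follow this lead and see what on earth start() does that makes dynamic compilation possible.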

So I dug out the source code of tornado.autoreload. First, the start() function:

_watched_files = set()
_reload_hooks = []
_reload_attempted = False
_io_loops = weakref.WeakKeyDictionary()

def start(io_loop=None, check_time=500):
    """Begins watching source files for changes using the given `.IOLoop`. """
    io_loop = io_loop or ioloop.IOLoop.current()
    if io_loop in _io_loops:
        return
    _io_loops[io_loop] = True
    if len(_io_loops) > 1:
        gen_log.warning("tornado.autoreload started more than once in the same process")
    add_reload_hook(functools.partial(io_loop.close, all_fds=True))
    modify_times = {}
    callback = functools.partial(_reload_on_update, modify_times)
    scheduler = ioloop.PeriodicCallback(callback, check_time, io_loop=io_loop)
    scheduler.start()

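Both parameters of start have default values, which is why it can be called simply as autoreload.start() above. io_loop is set to the current IOLoop object (the IOLoop is the event loop Tornado uses to handle socket read and write events with high performance and high concurrency), and check_time is the polling interval in milliseconds. The first few statements guard against starting twice: if autoreload is already running on this io_loop, start returns immediately, and if it ends up running on more than one io_loop in the same process, a warning is logged. Next, add_reload_hook is called with a partial function as its argument.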

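You're probably asking what a partial function is. Put simply, you hand some of a function's arguments over in advance and get back a new callable that takes the rest. For example: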

from operator import add
import functools
print(add(1, 2))  # 3
add1 = functools.partial(add, 1)  # fix the first argument to 1
print(add1(10))  # 11

Next, the scene cuts to the add_reload_hook function:

def add_reload_hook(fn):
    """Add a function to be called before reloading the process.
    """
    _reload_hooks.append(fn)

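Wow, a single append statement and that's all. But the docstring is the key: the function is stored in the _reload_hooks list so that it can be taken out and called later, just before the process reloads.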

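OK, back to start(). Next it initializes a dictionary called modify_times, which, as the name suggests (and as it later turns out), records file modification times. Then another partial function, callback, is defined; calling callback() is equivalent to calling _reload_on_update(modify_times). So the scene cuts again, this time to _reload_on_update: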

def _reload_on_update(modify_times):
    if _reload_attempted:
        # We already tried to reload and it didn't work, so don't try again.
        return
    if process.task_id() is not None:
        # We're in a child process created by fork_processes.  If child
        # processes restarted themselves, they'd all restart and then
        # all call fork_processes again.
        return
    for module in sys.modules.values():
        # Some modules play games with sys.modules (e.g. email/__init__.py
        # in the standard library), and occasionally this can cause strange
        # failures in getattr.  Just ignore anything that's not an ordinary
        # module.
        if not isinstance(module, types.ModuleType):
            continue
        path = getattr(module, "__file__", None)
        if not path:
            continue
        if path.endswith(".pyc") or path.endswith(".pyo"):
            path = path[:-1]
        _check_file(modify_times, path)
    for path in _watched_files:
        _check_file(modify_times, path)
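
To get a feel for what the loop over sys.modules collects, the scan can be run on its own. This is a minimal sketch that reuses the same filtering steps as the function above; the name imported_source_paths is made up for the illustration.

```python
import sys
import types


def imported_source_paths():
    """Collect the source file of every ordinary imported module."""
    paths = []
    # Copy the values first so imports during the loop cannot break iteration.
    for module in list(sys.modules.values()):
        if not isinstance(module, types.ModuleType):
            continue  # skip odd objects placed in sys.modules
        path = getattr(module, "__file__", None)
        if not path:
            continue  # built-in modules have no file
        if path.endswith(".pyc") or path.endswith(".pyo"):
            path = path[:-1]  # watch the .py source, not the bytecode
        paths.append(path)
    return paths


paths = imported_source_paths()
print(len(paths), "files would be watched")
```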

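As usual, the first few checks guard against repeated or mistaken operations: nothing is done if a reload has already been attempted, or if we are in a child process created by fork_processes. sys.modules.values() returns every module that has been imported so far, and path is the file each module was loaded from, with a .pyc or .pyo path trimmed back to the .py source. Objects that are not ordinary modules, and modules without a file, are skipped with continue (which moves on to the next module rather than leaving the loop); for everything else _check_file is called. The files in _watched_files are then checked the same way. Through the portal once more: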

def _check_file(modify_times, path):
    try:
        modified = os.stat(path).st_mtime
    except Exception:
        return
    if path not in modify_times:
        modify_times[path] = modified
        return
    if modify_times[path] != modified:
        gen_log.info("%s modified; restarting server", path)
        _reload()
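
The two checks in _check_file can be tried out on a temporary file. In this sketch the function returns True instead of calling _reload(), and the edit is simulated by pushing the modification time forward with os.utime; the names file_changed and handler.py are made up for the illustration.

```python
import os
import tempfile


def file_changed(modify_times, path):
    """Return True if path's mtime differs from the one recorded earlier."""
    try:
        modified = os.stat(path).st_mtime
    except OSError:
        return False  # unreadable or missing file: ignore it
    if path not in modify_times:
        modify_times[path] = modified  # first sighting: just record it
        return False
    return modify_times[path] != modified


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "handler.py")
    with open(path, "w") as f:
        f.write("x = 1\n")
    seen = {}
    first = file_changed(seen, path)   # first check only records the time
    second = file_changed(seen, path)  # nothing has changed since
    # Simulate saving the file 10 seconds later.
    st = os.stat(path)
    os.utime(path, (st.st_atime, st.st_mtime + 10))
    third = file_changed(seen, path)   # mtime differs: a reload would start

print(first, second, third)  # False False True
```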

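Aha! modified holds the last modification time of the file at the given path (if it cannot be read, the function simply returns). The next two checks are crucial. The first asks whether the path is already in the modify_times dictionary; if not, the file is being seen for the first time, so its modification time is recorded and nothing else happens. The second asks whether a file that is already known has been modified since. For example, suppose a .py file had a modification time of 3:00:00 when it was first checked, which takes the first branch. If I then save the file at 3:00:10, the next check takes the second branch, because the modification time no longer matches the recorded one. The message "<path> modified; restarting server" is logged and _reload() is executed. Cut to: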

def _reload():
    global _reload_attempted
    _reload_attempted = True
    for fn in _reload_hooks:
        fn()
    if hasattr(signal, "setitimer"):
        # Clear the alarm signal set by
        # ioloop.set_blocking_log_threshold so it doesn't fire
        # after the exec.
        signal.setitimer(signal.ITIMER_REAL, 0, 0)
    # sys.path fixes: see comments at top of file.  If sys.path[0] is an empty
    # string, we were (probably) invoked with -m and the effective path
    # is about to change on re-exec.  Add the current directory to $PYTHONPATH
    # to ensure that the new process sees the same path we did.
    path_prefix = '.' + os.pathsep
    if (sys.path[0] == '' and
            not os.environ.get("PYTHONPATH", "").startswith(path_prefix)):
        os.environ["PYTHONPATH"] = (path_prefix +
                                    os.environ.get("PYTHONPATH", ""))
    if sys.platform == 'win32':
        # os.execv is broken on Windows and can't properly parse command line
        # arguments and executable name if they contain whitespaces. subprocess
        # fixes that behavior.
        subprocess.Popen([sys.executable] + sys.argv)
        sys.exit(0)
    else:
        try:
            os.execv(sys.executable, [sys.executable] + sys.argv)
        except OSError:
            # Mac OS X versions prior to 10.6 do not support execv in
            # a process that contains multiple threads.  Instead of
            # re-executing in the current process, start a new one
            # and cause the current process to exit.  This isn't
            # ideal since the new process is detached from the parent
            # terminal and thus cannot easily be killed with ctrl-C,
            # but it's better than not being able to autoreload at
            # all.
            # Unfortunately the errno returned in this case does not
            # appear to be consistent, so we can't easily check for
            # this error specifically.
            os.spawnv(os.P_NOWAIT, sys.executable,
                      [sys.executable] + sys.argv)
            sys.exit(0)

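Don't be put off by the length; most of it is comments. First _reload_attempted is set to True, announcing that a reload has already been attempted. Then the partial function that was stored in _reload_hooks earlier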

functools.partial(io_loop.close, all_fds=True)

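is taken out and called, closing the old io_loop. According to the comments, the code that follows cancels any pending alarm timer so that it cannot fire after the restart and, when the program appears to have been started with -m (sys.path[0] is an empty string), adds the current directory to PYTHONPATH so that the new process sees the same import path as the old one. Finally it checks the platform and starts the program again with the same interpreter and command-line arguments: os.execv replaces the current process where that works, while on Windows a new process is launched and the current one exits. So the "recompilation" is really a restart: the new process imports the modified source files, and Python compiles them afresh.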
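
The path fix and the restart command are easier to follow when pulled out into pure functions. This is a sketch under the assumption that the logic is exactly what the listing above shows; the function names fixed_pythonpath and restart_command are invented here, and nothing is actually re-executed.

```python
import os
import sys


def fixed_pythonpath(first_sys_path_entry, current_pythonpath):
    """Return the PYTHONPATH value the restarted process should see."""
    path_prefix = '.' + os.pathsep
    # An empty sys.path[0] suggests the program was started with -m.
    if (first_sys_path_entry == '' and
            not current_pythonpath.startswith(path_prefix)):
        return path_prefix + current_pythonpath
    return current_pythonpath


def restart_command():
    """Return the argument list that would be handed to os.execv."""
    return [sys.executable] + sys.argv


# Started with -m: the current directory is put in front.
print(fixed_pythonpath('', 'libs') == '.' + os.pathsep + 'libs')  # True
# Started as a script: PYTHONPATH is left alone.
print(fixed_pythonpath('/srv/app', 'libs'))  # libs
```

The restart command is simply the interpreter followed by the original arguments, which is why the new process behaves like a fresh launch of the same program.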

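Ahem, you thought that was it? All of this is just the chain of functions hanging off one partial function in start(), so we have to go back to start() once more. It ends with these two statements: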

scheduler = ioloop.PeriodicCallback(callback, check_time, io_loop=io_loop)
scheduler.start()

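This one is easy to understand: a scheduler keeps a timetable, and PeriodicCallback, as the name says, calls something periodically. After scheduler.start(), the program calls callback() once every check_time milliseconds, which runs the whole chain of functions above. As soon as a relevant file is found to have changed, everything stops and the program restarts with the new code. And there you have it: dynamic, automatic recompilation!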
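
The idea of a periodic callback can be imitated with the standard library. The sketch below uses asyncio instead of Tornado's PeriodicCallback, so it only shows the concept, not Tornado's implementation; call_periodically and its rounds parameter are made up so that the example terminates.

```python
import asyncio


async def call_periodically(callback, check_time_ms, rounds):
    """Call callback every check_time_ms milliseconds, rounds times."""
    for _ in range(rounds):
        await asyncio.sleep(check_time_ms / 1000.0)
        callback()


ticks = []
# In autoreload the callback would be the file check; here it just counts.
asyncio.run(call_periodically(lambda: ticks.append(len(ticks)), 10, 3))
print(ticks)  # [0, 1, 2]
```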

Of course, anyone who wants the details can follow these links to the official Tornado site and read the original source code:

http://www.tornadoweb.org/en/stable/_modules/tornado/web.html#Application
http://www.tornadoweb.org/en/stable/_modules/tornado/autoreload.html#add_reload_hook

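Once again, this article is only my personal understanding; if anything in it is inappropriate or wrong, please don't hesitate to point it out.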

OK, it's almost midnight, so I'm off to bed~