urllib2 - The Missing Manual

This document describes how to use Python's urllib2 module: fetching URL content with the urlopen function, handling errors raised during requests such as URLError and HTTPError, and implementing more advanced features such as basic authentication, proxy configuration, and timeouts.

From: http://www.voidspace.org.uk/python/articles/urllib2.shtml


Fetching URLs With Python

Introduction

urllib2 is a Python module for fetching URLs. It offers a very simple interface, in the form of the urlopen function. This is capable of fetching URLs using a variety of different protocols. It also offers a slightly more complex interface for handling common situations - like basic authentication, cookies, proxies, and so on. These are provided by objects called handlers and openers.

For straightforward situations urlopen is very easy to use. But as soon as you encounter errors, or non-trivial cases, you will need some understanding of the HTTP protocol. The most comprehensive reference to HTTP is RFC 2616. This is a technical document and not intended to be easy to read. This tutorial aims to illustrate using urllib2, with enough detail about HTTP to help you through. It is not intended to replace the urllib2 docs [1], but is supplementary to them.

Fetching URLs

HTTP is based on requests and responses - the client makes requests and servers send responses. Python mirrors this by having you form a Request object which represents the request you are making. In its simplest form you create a Request object that specifies the URL you want to fetch [2]. Calling urlopen with this Request object returns a handle on the page requested. This handle is a file-like object:

import urllib2

the_url = 'http://www.voidspace.org.uk'
req = urllib2.Request(the_url)
handle = urllib2.urlopen(req)
the_page = handle.read()

There are two extra things that Request objects allow you to do. Sometimes you want to POST data to a CGI [3] or other web application. This is what your browser does when you fill in a FORM on the web. You may be mimicking a FORM submission, or transmitting data to your own application. In either case the data needs to be encoded for safe transmission over HTTP, and then passed to the Request object as the data argument. The encoding is done using a function from the urllib library, not from urllib2.

import urllib
import urllib2

the_url = 'http://www.someserver.com/cgi-bin/register.cgi'
values = {'name': 'Michael Foord',
          'location': 'Northampton',
          'language': 'Python'}

data = urllib.urlencode(values)
req = urllib2.Request(the_url, data)
handle = urllib2.urlopen(req)
the_page = handle.read()

Some websites [4] dislike being browsed by programs, or send different versions to different browsers [5] . By default urllib2 identifies itself as Python-urllib/2.4, which may confuse the site, or just plain not work. The way a browser identifies itself is through the User-Agent header [6]. When you create a Request object you can pass a dictionary of headers in. The following example makes the same request as above, but identifies itself as a version of IE [7].

import urllib
import urllib2

the_url = 'http://www.someserver.com/cgi-bin/register.cgi'
user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'
values = {'name': 'Michael Foord',
          'location': 'Northampton',
          'language': 'Python'}
headers = {'User-Agent': user_agent}

data = urllib.urlencode(values)
req = urllib2.Request(the_url, data, headers)
handle = urllib2.urlopen(req)
the_page = handle.read()

The handle also has two useful methods, info and geturl. These are covered in their own section below, after we have a look at what happens when things go wrong.

Coping With Errors

urlopen raises URLError or HTTPError in the event of an error. HTTPError is a subclass of URLError, which is a subclass of IOError. This means you can trap for IOError if you want.

req = urllib2.Request(some_url)
try:
    handle = urllib2.urlopen(req)
except IOError:
    print 'Something went wrong'
else:
    print handle.read()

URLError

If the request fails to reach a server then urlopen will raise a URLError. This will usually be because there is no network connection (no route to the specified server), or the specified server doesn't exist.

In this case, the exception raised will have a 'reason' attribute, which is a tuple containing an error code and a text error message.

e.g.

>>> req = urllib2.Request('http://www.pretend_server.org')
>>> try:
...     urllib2.urlopen(req)
... except IOError, e:
...     print e.reason
...
(4, 'getaddrinfo failed')

HTTPError

If the request reaches a server, but the server is unable to fulfil the request, it returns an error code. The default handlers will handle some of these errors for you. For those it can't handle, urlopen will raise an HTTPError. Typical errors include '404' (page not found), '403' (request forbidden), and '401' (authentication required).

See http://www.w3.org/Protocols/HTTP/HTRESP.html for a reference on all the http error codes.

The HTTPError instance raised will have an integer 'code' attribute, which corresponds to the error sent by the server.

There is a useful dictionary of response codes in BaseHTTPServer (BaseHTTPRequestHandler.responses) that shows all the defined response codes. Because the default handlers handle redirects (codes in the 300 range), and codes in the 100-299 range indicate success, you will usually only see error codes in the 400-599 range.

Error Codes

Note

As of Python 2.5 a dictionary like this one has become part of the standard library (as httplib.responses).

# Table mapping response codes to messages; entries have the
# form {code: (shortmessage, longmessage)}.
httpresponses = {
    100: ('Continue', 'Request received, please continue'),
    101: ('Switching Protocols',
          'Switching to new protocol; obey Upgrade header'),

    200: ('OK', 'Request fulfilled, document follows'),
    201: ('Created', 'Document created, URL follows'),
    202: ('Accepted',
          'Request accepted, processing continues off-line'),
    203: ('Non-Authoritative Information',
          'Request fulfilled from cache'),
    204: ('No response', 'Request fulfilled, nothing follows'),
    205: ('Reset Content', 'Clear input form for further input.'),
    206: ('Partial Content', 'Partial content follows.'),

    300: ('Multiple Choices',
          'Object has several resources -- see URI list'),
    301: ('Moved Permanently',
          'Object moved permanently -- see URI list'),
    302: ('Found', 'Object moved temporarily -- see URI list'),
    303: ('See Other', 'Object moved -- see Method and URL list'),
    304: ('Not modified',
          'Document has not changed since given time'),
    305: ('Use Proxy',
          'You must use proxy specified in Location'
          ' to access this resource.'),
    307: ('Temporary Redirect',
          'Object moved temporarily -- see URI list'),

    400: ('Bad request',
          'Bad request syntax or unsupported method'),
    401: ('Unauthorized',
          'No permission -- see authorization schemes'),
    402: ('Payment required',
          'No payment -- see charging schemes'),
    403: ('Forbidden',
          'Request forbidden -- authorization will not help'),
    404: ('Not Found', 'Nothing matches the given URI'),
    405: ('Method Not Allowed',
          'Specified method is invalid for this server.'),
    406: ('Not Acceptable',
          'URI not available in preferred format.'),
    407: ('Proxy Authentication Required',
          'You must authenticate with'
          ' this proxy before proceeding.'),
    408: ('Request Time-out',
          'Request timed out; try again later.'),
    409: ('Conflict', 'Request conflict.'),
    410: ('Gone',
          'URI no longer exists and has been permanently removed.'),
    411: ('Length Required', 'Client must specify Content-Length.'),
    412: ('Precondition Failed',
          'Precondition in headers is false.'),
    413: ('Request Entity Too Large', 'Entity is too large.'),
    414: ('Request-URI Too Long', 'URI is too long.'),
    415: ('Unsupported Media Type',
          'Entity body in unsupported format.'),
    416: ('Requested Range Not Satisfiable',
          'Cannot satisfy request range.'),
    417: ('Expectation Failed',
          'Expect condition could not be satisfied.'),

    500: ('Internal error', 'Server got itself in trouble'),
    501: ('Not Implemented',
          'Server does not support this operation'),
    502: ('Bad Gateway',
          'Invalid responses from another server/proxy.'),
    503: ('Service temporarily overloaded',
          'The server cannot'
          ' process the request due to a high load'),
    504: ('Gateway timeout',
          'The gateway server did not receive a timely response'),
    505: ('HTTP Version not supported', 'Cannot fulfill request.'),
    }
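A quick way to look up these messages is the table that already ships in the standard library. The sketch below uses BaseHTTPServer from Python 2; the try/except import is a forward-compatibility shim (in Python 3 the module became http.server) and is not part of the original article:

```python
# Look up the standard short and long message for a response code
# using the table shipped in the standard library.
try:
    from BaseHTTPServer import BaseHTTPRequestHandler   # Python 2
except ImportError:
    from http.server import BaseHTTPRequestHandler      # Python 3 location

short, long_msg = BaseHTTPRequestHandler.responses[404]
print(short)    # Not Found
```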

When the server returns an error code it also sends an error page. You can use the HTTPError instance raised as a handle on that page. This means that as well as the code attribute, it also has read, geturl, and info methods.

>>> req = urllib2.Request('http://www.python.org/fish.html')
>>> try:
...     urllib2.urlopen(req)
... except IOError, e:
...     print e.code
...     print e.read()
...
404
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
    "http://www.w3.org/TR/html4/loose.dtd">
<?xml-stylesheet href="./css/ht2html.css"
    type="text/css"?>
<html><head><title>Error 404: File Not Found</title>
...... etc...

Wrapping it Up

So if you want to be prepared for HTTPError or URLError there are two basic approaches. I prefer the second approach.

Number 1

from urllib2 import Request, urlopen, URLError, HTTPError

req = Request(someurl)
try:
    handle = urlopen(req)
except HTTPError, e:
    # HTTPError must be caught first, because it is
    # a subclass of URLError
    print 'The server couldn\'t fulfill the request.'
    print 'Error code:', e.code
except URLError, e:
    print 'We failed to reach a server.'
    print 'Reason:', e.reason
else:
    # everything is fine
    the_page = handle.read()

Number 2

from urllib2 import Request, urlopen

req = Request(someurl)
try:
    handle = urlopen(req)
except IOError, e:
    if hasattr(e, 'code'):
        # check 'code' first: an HTTPError may carry
        # a 'reason' attribute as well
        print 'The server couldn\'t fulfill the request.'
        print 'Error code:', e.code
    elif hasattr(e, 'reason'):
        print 'We failed to reach a server.'
        print 'Reason:', e.reason
else:
    # everything is fine
    the_page = handle.read()

info and geturl

The handle returned by urlopen (or the HTTPError instance) has two useful methods: info and geturl.

geturl - this returns the real URL of the page fetched. This is useful because urlopen (or the opener object used) may have followed a redirect, so the URL of the page fetched may not be the same as the URL requested.

info - this returns a dictionary-like object that describes the page fetched, particularly the headers sent by the server. It is actually an httplib.HTTPMessage instance. In versions of Python prior to 2.3.4 it wasn't safe to iterate over the object directly, so you should iterate over the list returned by msg.keys() instead.

Typical headers include 'content-length', 'content-type', and so on. See the Quick Reference to HTTP Headers for a useful reference on the different sort of headers.
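A minimal sketch of both methods. It fetches a file: URL (handled by urllib2's FileHandler) so it runs without a network connection; with an http URL the calls are identical. The try/except import is a forward-compatibility shim for Python 3, where urllib2 became urllib.request, and is not part of the original article:

```python
import os
import tempfile

try:
    import urllib2                      # Python 2
except ImportError:
    import urllib.request as urllib2    # Python 3 location of the same API

# write a small local file to fetch via a file: URL
fd, path = tempfile.mkstemp(suffix='.txt')
os.write(fd, b'hello')
os.close(fd)

handle = urllib2.urlopen('file:' + path.replace(os.sep, '/'))

print(handle.geturl())                   # the URL actually fetched
print(handle.info()['Content-Length'])   # headers, dictionary-like access
the_page = handle.read()
```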

Openers and Handlers

Openers and handlers are slightly esoteric parts of urllib2. When you fetch a URL you use an opener. Normally we have been using the default opener - via urlopen - but you can create custom openers. Openers use handlers.

build_opener is used to create opener objects - for fetching URLs with specific handlers installed. Handlers can handle cookies, authentication, and other common but slightly specialised situations. Opener objects have an open method, which can be called directly to fetch urls in the same way as the urlopen function.

install_opener can be used to make an opener object the default opener. This means that calls to urlopen will use the opener you have installed.
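The flow above can be sketched in a few lines. This only wires an opener together (no request is made); the HTTPBasicAuthHandler is used purely as an example of an extra handler that build_opener chains in with the defaults, and the try/except import is a forward-compatibility shim for Python 3:

```python
try:
    import urllib2                      # Python 2
except ImportError:
    import urllib.request as urllib2    # Python 3 location of the same API

# build_opener chains the handler(s) you pass in with the default ones
auth_handler = urllib2.HTTPBasicAuthHandler()
opener = urllib2.build_opener(auth_handler)

# the opener can be used directly, just like urlopen:
#     handle = opener.open('http://www.voidspace.org.uk')

# ...or installed as the default, so plain urllib2.urlopen uses it
urllib2.install_opener(opener)
```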

Basic Authentication

To illustrate creating and installing a handler we will use the HTTPBasicAuthHandler. For a more detailed discussion of this subject - including an explanation of how Basic Authentication works - see the Basic Authentication Tutorial.

When authentication is required, the server sends a header (as well as the 401 error code) requesting authentication. This specifies the authentication scheme and a 'realm'. The header looks like: www-authenticate: SCHEME realm="REALM".

e.g.

www-authenticate: Basic realm="cPanel"

The client should then retry the request with the appropriate name and password for the realm included as a header in the request. This is 'basic authentication'. In order to simplify this process we can create an instance of HTTPBasicAuthHandler and an opener to use this handler.

The HTTPBasicAuthHandler uses an object called a password manager to handle the mapping of URIs and realms to passwords and usernames. If you know what the realm is (from the authentication header sent by the server), then you can use an HTTPPasswordMgr. Generally there is only one realm per URI, so it is possible to use HTTPPasswordMgrWithDefaultRealm. This allows you to specify a default username and password for a URI, which will be supplied in the absence of you providing an alternative combination for a specific realm. We signify this by providing None as the realm argument to the add_password method.

top_level_url is the first URL that requires authentication. This is usually a 'super-url' of any others in the same realm.

# create a password manager
password_mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()

# add the username and password
# if we knew the realm, we could use it instead of ``None``
password_mgr.add_password(None, top_level_url, username, password)

# create the handler
handler = urllib2.HTTPBasicAuthHandler(password_mgr)

# from handler to opener
opener = urllib2.build_opener(handler)

# use the opener to fetch a URL
opener.open(a_url)

# install the opener
# now all calls to urllib2.urlopen use our opener
urllib2.install_opener(opener)

Note

In the above example we only supplied our HTTPBasicAuthHandler to build_opener. By default openers have the handlers for normal situations - ProxyHandler, UnknownHandler, HTTPHandler, HTTPDefaultErrorHandler, HTTPRedirectHandler, FTPHandler, FileHandler, HTTPErrorProcessor. The only reason to explicitly supply these to build_opener (which chains handlers provided as a list) would be to change the order they appear in the chain.

One thing not to get bitten by is that the top_level_url in the code above must not contain the protocol - the http:// part. So if the URL we are trying to access is http://www.someserver.com/path/page.html, then we set:

top_level_url = "www.someserver.com/path/page.html"
# *no* http:// !!

It took me a long time to track that down the first time I tried to use handlers.

Proxies

urllib2 will auto-detect your proxy settings and use those. This is through the ProxyHandler which is part of the normal handler chain. Normally that's a good thing, but there are occasions when it may not be helpful [8]. To disable automatic proxy handling we need to set up our own ProxyHandler, with no proxies defined. This is done using similar steps to setting up a Basic Authentication handler:

>>> proxy_support = urllib2.ProxyHandler({})
>>> opener = urllib2.build_opener(proxy_support)
>>> urllib2.install_opener(opener)

Caution!

Currently urllib2 does not support fetching of https locations through a proxy. This can be a problem.

Sockets and Layers

The Python support for fetching resources from the web is layered. urllib2 uses the httplib library, which in turn uses the socket library.

As of Python 2.3 you can specify how long a socket should wait for a response before timing out. This can be useful in applications which have to fetch web pages. By default the socket module has no timeout and can hang. To set the timeout use:

import socket
import urllib2

# timeout in seconds
timeout = 10
socket.setdefaulttimeout(timeout)

# this call to urllib2.urlopen now uses the default
# timeout we have set in the socket module
req = urllib2.Request('http://www.voidspace.org.uk')
handle = urllib2.urlopen(req)

Footnotes

[1] Possibly some of this tutorial will make it into the standard library docs for versions of Python after 2.4.1.
[2] You can fetch URLs directly with urlopen, without using a Request object. It's more explicit, and therefore more Pythonic, to use urllib2.Request though. It also makes it easier to add headers to your request.
[3] For an introduction to the CGI protocol see Writing Web Applications in Python.
[4] Like Google, for example. The proper way to use Google from a program is to use PyGoogle, of course. See Voidspace Google for some examples of using the Google API.
[5] Browser sniffing is a very bad practice for website design - building sites using web standards is much more sensible. Unfortunately a lot of sites still send different versions to different browsers.
[6] The user agent for MSIE 6 is 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)'
[7] For details of more HTTP request headers, see Quick Reference to HTTP Headers.
[8] In my case I have to use a proxy to access the internet at work. If you attempt to fetch localhost URLs through this proxy it blocks them. IE is set to use the proxy, which urllib2 picks up on. In order to test scripts with a localhost server, I have to prevent urllib2 from using the proxy.
https://pypi.tuna.tsinghua.edu.cn/simple/networkx/) (requires-python:>=3.11) Collecting networkx>=2.5.1 (from torch==2.9.0a0+git2d31c3d) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/b9/54/dd730b32ea14ea797530a4479b2ed46a6fb250f682a9cfb997e968bf0261/networkx-3.4.2-py3-none-any.whl (1.7 MB) Collecting jinja2 (from torch==2.9.0a0+git2d31c3d) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/62/a1/3d680cbfd5f4b8f15abc1d571870c5fc3e594bb582bc3b64ea099db13e56/jinja2-3.1.6-py3-none-any.whl (134 kB) Collecting fsspec>=0.8.5 (from torch==2.9.0a0+git2d31c3d) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/47/71/70db47e4f6ce3e5c37a607355f80da8860a33226be640226ac52cb05ef2e/fsspec-2025.9.0-py3-none-any.whl (199 kB) Collecting mpmath<1.4,>=1.1.0 (from sympy>=1.13.3->torch==2.9.0a0+git2d31c3d) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/43/e3/7d92a15f894aa0c9c4b49b8ee9ac9850d6e63b03c9c32c0367a13ae62209/mpmath-1.3.0-py3-none-any.whl (536 kB) Collecting MarkupSafe>=2.0 (from jinja2->torch==2.9.0a0+git2d31c3d) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/44/06/e7175d06dd6e9172d4a69a72592cb3f7a996a9c396eee29082826449bbc3/MarkupSafe-3.0.2-cp310-cp310-win_amd64.whl (15 kB) Building wheels for collected packages: torch Running command Building wheel for torch (pyproject.toml) Building wheel torch-2.9.0a0+git2d31c3d -- Building version 2.9.0a0+git2d31c3d E:\PyTorch_Build\pytorch\rtx5070_env\lib\site-packages\setuptools\_distutils\_msvccompiler.py:12: UserWarning: _get_vc_env is private; find an alternative (pypa/distutils#340) warnings.warn( Cloning into 'nccl'... Note: switching to '3ea7eedf3b9b94f1d9f99f4e55536dfcbd23c1ca'. You are in 'detached HEAD' state. You can look around, make experimental changes and commit them, and you can discard any commits you make in this state without impacting any branches by switching back to a branch. 
If you want to create a new branch to retain commits you create, you may do so (now or later) by using -c with the switch command. Example: git switch -c <new-branch-name> Or undo this operation with: git switch - Turn off this advice by setting config variable advice.detachedHead to false cmake -GNinja -DBUILD_PYTHON=True -DBUILD_TEST=True -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=E:\PyTorch_Build\pytorch\torch -DCMAKE_PREFIX_PATH=E:\PyTorch_Build\pytorch\rtx5070_env\Lib\site-packages -DCUDNN_INCLUDE_DIR=E:\Program Files\NVIDIA\CUNND\v9.12\include\12.9 -DCUDNN_LIBRARY=E:\Program Files\NVIDIA\CUNND\v9.12\lib\12.9\x64 -DCUDNN_ROOT=E:\Program Files\NVIDIA\CUNND\v9.12 -DPython_EXECUTABLE=E:\PyTorch_Build\pytorch\rtx5070_env\Scripts\python.exe -DPython_NumPy_INCLUDE_DIR=E:\PyTorch_Build\pytorch\rtx5070_env\lib\site-packages\numpy\_core\include -DTORCH_BUILD_VERSION=2.9.0a0+git2d31c3d -DTORCH_CUDA_ARCH_LIST=8.9 -DUSE_NUMPY=True -DUSE_OPENBLAS=1 E:\PyTorch_Build\pytorch CMake Deprecation Warning at CMakeLists.txt:9 (cmake_policy): The OLD behavior for policy CMP0126 will be removed from a future version of CMake. The cmake-policies(7) manual explains that the OLD behaviors of all policies are deprecated and that a policy should be set to OLD only under specific short-term circumstances. Projects should be ported to the NEW behavior and not rely on setting a policy to OLD. 
-- The CXX compiler identification is MSVC 19.44.35215.0 -- The C compiler identification is MSVC 19.44.35215.0 -- Detecting CXX compiler ABI info -- Detecting CXX compiler ABI info - done -- Check for working CXX compiler: C:/Program Files (x86)/Microsoft Visual Studio/2022/BuildTools/VC/Tools/MSVC/14.44.35207/bin/Hostx64/x64/cl.exe - skipped -- Detecting CXX compile features -- Detecting CXX compile features - done -- Detecting C compiler ABI info -- Detecting C compiler ABI info - done -- Check for working C compiler: C:/Program Files (x86)/Microsoft Visual Studio/2022/BuildTools/VC/Tools/MSVC/14.44.35207/bin/Hostx64/x64/cl.exe - skipped -- Detecting C compile features -- Detecting C compile features - done -- Not forcing any particular BLAS to be found CMake Warning at CMakeLists.txt:421 (message): TensorPipe cannot be used on Windows. Set it to OFF CMake Warning at CMakeLists.txt:423 (message): KleidiAI cannot be used on Windows. Set it to OFF CMake Warning at CMakeLists.txt:435 (message): Libuv is not installed in current conda env. Set USE_DISTRIBUTED to OFF. Please run command 'conda install -c conda-forge libuv=1.39' to install libuv. -- Performing Test C_HAS_AVX_1 -- Performing Test C_HAS_AVX_1 - Success -- Performing Test C_HAS_AVX2_1 -- Performing Test C_HAS_AVX2_1 - Success -- Performing Test C_HAS_AVX512_1 -- Performing Test C_HAS_AVX512_1 - Success -- Performing Test CXX_HAS_AVX_1 -- Performing Test CXX_HAS_AVX_1 - Success -- Performing Test CXX_HAS_AVX2_1 -- Performing Test CXX_HAS_AVX2_1 - Success -- Performing Test CXX_HAS_AVX512_1 -- Performing Test CXX_HAS_AVX512_1 - Success -- Current compiler supports avx2 extension. Will build perfkernels. 
-- Performing Test COMPILER_SUPPORTS_HIDDEN_VISIBILITY -- Performing Test COMPILER_SUPPORTS_HIDDEN_VISIBILITY - Failed -- Performing Test COMPILER_SUPPORTS_HIDDEN_INLINE_VISIBILITY -- Performing Test COMPILER_SUPPORTS_HIDDEN_INLINE_VISIBILITY - Failed -- Could not find hardware support for NEON on this machine. -- No OMAP3 processor on this machine. -- No OMAP4 processor on this machine. -- Compiler does not support SVE extension. Will not build perfkernels. CMake Warning at CMakeLists.txt:841 (message): x64 operating system is required for FBGEMM. Not compiling with FBGEMM. Turn this warning off by USE_FBGEMM=OFF. -- Performing Test HAS/UTF_8 -- Performing Test HAS/UTF_8 - Success -- Found CUDA: E:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v13.0 (found version "13.0") -- The CUDA compiler identification is NVIDIA 13.0.48 with host compiler MSVC 19.44.35215.0 -- Detecting CUDA compiler ABI info -- Detecting CUDA compiler ABI info - done -- Check for working CUDA compiler: E:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v13.0/bin/nvcc.exe - skipped -- Detecting CUDA compile features -- Detecting CUDA compile features - done -- Found CUDAToolkit: E:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v13.0/include (found version "13.0.48") -- PyTorch: CUDA detected: 13.0 -- PyTorch: CUDA nvcc is: E:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v13.0/bin/nvcc.exe -- PyTorch: CUDA toolkit directory: E:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v13.0 -- PyTorch: Header version is: 13.0 -- Found Python: E:\PyTorch_Build\pytorch\rtx5070_env\Scripts\python.exe (found version "3.10.10") found components: Interpreter CMake Warning at cmake/public/cuda.cmake:140 (message): Failed to compute shorthash for libnvrtc.so Call Stack (most recent call first): cmake/Dependencies.cmake:44 (include) CMakeLists.txt:869 (include) -- Found CUDNN: E:/Program Files/NVIDIA/CUNND/v9.12/lib/13.0/x64/cudnn.lib -- Could NOT find CUSPARSELT (missing: CUSPARSELT_LIBRARY_PATH 
CUSPARSELT_INCLUDE_PATH) CMake Warning at cmake/public/cuda.cmake:226 (message): Cannot find cuSPARSELt library. Turning the option off Call Stack (most recent call first): cmake/Dependencies.cmake:44 (include) CMakeLists.txt:869 (include) -- Could NOT find CUDSS (missing: CUDSS_LIBRARY_PATH CUDSS_INCLUDE_PATH) CMake Warning at cmake/public/cuda.cmake:242 (message): Cannot find CUDSS library. Turning the option off Call Stack (most recent call first): cmake/Dependencies.cmake:44 (include) CMakeLists.txt:869 (include) -- USE_CUFILE is set to 0. Compiling without cuFile support CMake Warning at cmake/public/cuda.cmake:317 (message): pytorch is not compatible with `CMAKE_CUDA_ARCHITECTURES` and will ignore its value. Please configure `TORCH_CUDA_ARCH_LIST` instead. Call Stack (most recent call first): cmake/Dependencies.cmake:44 (include) CMakeLists.txt:869 (include) -- Added CUDA NVCC flags for: -gencode;arch=compute_89,code=sm_89 CMake Warning at cmake/Dependencies.cmake:95 (message): Not compiling with XPU. Could NOT find SYCL. Suppress this warning with -DUSE_XPU=OFF. Call Stack (most recent call first): CMakeLists.txt:869 (include) -- Building using own protobuf under third_party per request. -- Use custom protobuf build. CMake Warning at cmake/ProtoBuf.cmake:37 (message): Ancient protobuf forces CMake compatibility Call Stack (most recent call first): cmake/ProtoBuf.cmake:87 (custom_protobuf_find) cmake/Dependencies.cmake:107 (include) CMakeLists.txt:869 (include) CMake Deprecation Warning at third_party/protobuf/cmake/CMakeLists.txt:2 (cmake_minimum_required): Compatibility with CMake < 3.10 will be removed from a future version of CMake. Update the VERSION argument <min> value. Or, use the <min>...<max> syntax to tell CMake that the project requires at least <min> but has been updated to work with policies introduced by <max> or earlier. 
-- -- 3.13.0.0 -- Performing Test CMAKE_HAVE_LIBC_PTHREAD -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed -- Looking for pthread_create in pthreads -- Looking for pthread_create in pthreads - not found -- Looking for pthread_create in pthread -- Looking for pthread_create in pthread - not found -- Found Threads: TRUE -- Caffe2 protobuf include directory: $<BUILD_INTERFACE:E:/PyTorch_Build/pytorch/third_party/protobuf/src>$<INSTALL_INTERFACE:include> -- Trying to find preferred BLAS backend of choice: MKL -- MKL_THREADING = OMP -- Looking for sys/types.h -- Looking for sys/types.h - found -- Looking for stdint.h -- Looking for stdint.h - found -- Looking for stddef.h -- Looking for stddef.h - found -- Check size of void* -- Check size of void* - done -- MKL_THREADING = OMP CMake Warning at cmake/Dependencies.cmake:213 (message): MKL could not be found. Defaulting to Eigen Call Stack (most recent call first): CMakeLists.txt:869 (include) CMake Warning at cmake/Dependencies.cmake:279 (message): Preferred BLAS (MKL) cannot be found, now searching for a general BLAS library Call Stack (most recent call first): CMakeLists.txt:869 (include) -- MKL_THREADING = OMP -- Checking for [mkl_intel_lp64 - mkl_intel_thread - mkl_core - libiomp5md] -- Library mkl_intel_lp64: not found -- Checking for [mkl_intel - mkl_intel_thread - mkl_core - libiomp5md] -- Library mkl_intel: not found -- Checking for [mkl_intel_lp64 - mkl_intel_thread - mkl_core] -- Library mkl_intel_lp64: not found -- Checking for [mkl_intel - mkl_intel_thread - mkl_core] -- Library mkl_intel: not found -- Checking for [mkl_intel_lp64 - mkl_sequential - mkl_core] -- Library mkl_intel_lp64: not found -- Checking for [mkl_intel - mkl_sequential - mkl_core] -- Library mkl_intel: not found -- Checking for [mkl_intel_lp64 - mkl_core - libiomp5md - pthread] -- Library mkl_intel_lp64: not found -- Checking for [mkl_intel - mkl_core - libiomp5md - pthread] -- Library mkl_intel: not found -- Checking for 
[mkl_intel_lp64 - mkl_core - pthread] -- Library mkl_intel_lp64: not found -- Checking for [mkl_intel - mkl_core - pthread] -- Library mkl_intel: not found -- Checking for [mkl - guide - pthread - m] -- Library mkl: not found -- MKL library not found -- Checking for [blis] -- Library blis: BLAS_blis_LIBRARY-NOTFOUND -- Checking for [Accelerate] -- Library Accelerate: BLAS_Accelerate_LIBRARY-NOTFOUND -- Checking for [vecLib] -- Library vecLib: BLAS_vecLib_LIBRARY-NOTFOUND -- Checking for [flexiblas] -- Library flexiblas: BLAS_flexiblas_LIBRARY-NOTFOUND -- Checking for [openblas] -- Library openblas: BLAS_openblas_LIBRARY-NOTFOUND -- Checking for [openblas - pthread - m] -- Library openblas: BLAS_openblas_LIBRARY-NOTFOUND -- Checking for [openblas - pthread - m - gomp] -- Library openblas: BLAS_openblas_LIBRARY-NOTFOUND -- Checking for [libopenblas] -- Library libopenblas: BLAS_libopenblas_LIBRARY-NOTFOUND -- Checking for [goto2 - gfortran] -- Library goto2: BLAS_goto2_LIBRARY-NOTFOUND -- Checking for [goto2 - gfortran - pthread] -- Library goto2: BLAS_goto2_LIBRARY-NOTFOUND -- Checking for [acml - gfortran] -- Library acml: BLAS_acml_LIBRARY-NOTFOUND -- Checking for [blis] -- Library blis: BLAS_blis_LIBRARY-NOTFOUND -- Could NOT find Atlas (missing: Atlas_CBLAS_INCLUDE_DIR Atlas_CLAPACK_INCLUDE_DIR Atlas_CBLAS_LIBRARY Atlas_BLAS_LIBRARY Atlas_LAPACK_LIBRARY) -- Checking for [ptf77blas - atlas - gfortran] -- Library ptf77blas: BLAS_ptf77blas_LIBRARY-NOTFOUND -- Checking for [] -- Looking for sgemm_ -- Looking for sgemm_ - not found -- Cannot find a library with BLAS API. Not using BLAS. -- Using pocketfft in directory: E:/PyTorch_Build/pytorch/third_party/pocketfft/ CMake Deprecation Warning at third_party/pthreadpool/CMakeLists.txt:1 (CMAKE_MINIMUM_REQUIRED): Compatibility with CMake < 3.10 will be removed from a future version of CMake. Update the VERSION argument <min> value. 
Or, use the <min>...<max> syntax to tell CMake that the project requires at least <min> but has been updated to work with policies introduced by <max> or earlier. CMake Deprecation Warning at third_party/FXdiv/CMakeLists.txt:1 (CMAKE_MINIMUM_REQUIRED): Compatibility with CMake < 3.10 will be removed from a future version of CMake. Update the VERSION argument <min> value. Or, use the <min>...<max> syntax to tell CMake that the project requires at least <min> but has been updated to work with policies introduced by <max> or earlier. CMake Deprecation Warning at third_party/cpuinfo/CMakeLists.txt:1 (CMAKE_MINIMUM_REQUIRED): Compatibility with CMake < 3.10 will be removed from a future version of CMake. Update the VERSION argument <min> value. Or, use the <min>...<max> syntax to tell CMake that the project requires at least <min> but has been updated to work with policies introduced by <max> or earlier. -- The ASM compiler identification is MSVC CMake Warning (dev) at rtx5070_env/Lib/site-packages/cmake/data/share/cmake-4.1/Modules/CMakeDetermineASMCompiler.cmake:234 (message): Policy CMP194 is not set: MSVC is not an assembler for language ASM. Run "cmake --help-policy CMP194" for policy details. Use the cmake_policy command to set the policy and suppress this warning. Call Stack (most recent call first): third_party/XNNPACK/CMakeLists.txt:18 (PROJECT) This warning is for project developers. Use -Wno-dev to suppress it. -- Found assembler: C:/Program Files (x86)/Microsoft Visual Studio/2022/BuildTools/VC/Tools/MSVC/14.44.35207/bin/Hostx64/x64/cl.exe -- Building for XNNPACK_TARGET_PROCESSOR: x86_64 -- Generating microkernels.cmake
09-04