Fixing the "'NoneType' object has no attribute '__getitem__'" and "argument of type 'NoneType' is not iterable" exceptions

This post resolves TypeError exceptions raised while collecting Sina Weibo GPS data with a multithreaded Python script and storing it in MongoDB, by adding conditional checks that avoid operating on NoneType objects.


A multithreaded Python script collects Sina Weibo GPS data and stores it in MongoDB. MongoDB requires coordinate pairs in the strict order [longitude, latitude], while Weibo returns them in the opposite order, so the pair has to be reversed during conversion. The reversal raised this exception: TypeError: 'NoneType' object has no attribute '__getitem__'. The code:

#extract one from queue
record = self.queue.get(True)
#import record into MongoDB
if record:
    record['geo']['coordinates'] = record['geo']['coordinates'][::-1]
    self.status.insert(record)
self.queue.task_done()

Evidently what is being indexed may be 'None' rather than a dictionary — record itself, or the value stored under record['geo'] — so taking a value from it with '[ ]' raises the has no attribute '__getitem__' exception.


The code was then changed to:

#extract one from queue
record = self.queue.get(True)
#import record into MongoDB
if record and ('geo' in record) and ('coordinates' in record['geo']):
    record['geo']['coordinates'] = record['geo']['coordinates'][::-1]
    self.status.insert(record)
self.queue.task_done()
That seemed to settle it, but the program threw another exception: TypeError: argument of type 'NoneType' is not iterable. On closer inspection, record and ('geo' in record) form a matched pair — record is guaranteed not to be 'None' before 'in' is applied to it — but ('coordinates' in record['geo']) has no matching guarantee that record['geo'] is not 'None'.


So the code was changed once more, after which the program ran correctly:

#extract one from queue
record = self.queue.get(True)
#import record into MongoDB
if record and ('geo' in record) and record['geo'] and ('coordinates' in record['geo']):
    record['geo']['coordinates'] = record['geo']['coordinates'][::-1]
    self.status.insert(record)
self.queue.task_done()

To confirm that this analysis is correct, the exceptions can be reproduced directly:
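Below is a minimal interpreter session standing in for the original screenshots; the messages are Python 2's (under Python 3 the first one reads "'NoneType' object is not subscriptable", while the second is unchanged):

>>> record = {'geo': None}
>>> record['geo']['coordinates']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'NoneType' object has no attribute '__getitem__'
>>> 'coordinates' in record['geo']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: argument of type 'NoneType' is not iterable
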
So before using operators such as 'in' and '[ ]' on a Python collection, make sure the collection is not of 'None' type.
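For nested lookups like this, the chain of guards can also be collapsed with dict.get. A sketch of an equivalent rewrite of the final snippet above — same self.queue/self.status context, not the original code (note that an empty coordinates list is also skipped here):

#extract one from queue
record = self.queue.get(True)
#import record into MongoDB
geo = (record or {}).get('geo') or {}   # yields {} whether record or record['geo'] is None
coordinates = geo.get('coordinates')
if coordinates:
    #safe: record, record['geo'] and coordinates are all non-None here
    record['geo']['coordinates'] = coordinates[::-1]
    self.status.insert(record)
self.queue.task_done()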