NOTE_python_usage for `super`

This note describes Python's super() in detail, covering both how it works and how to use it. super() is used to call methods of parent and sibling classes, following the MRO (Method Resolution Order). Examples show how super() finds the correct method along a class hierarchy, including a walkthrough of how calling fun on class D invokes each ancestor's fun in MRO order.

This is a condensed note.
Reference: https://www.cnblogs.com/chenhuabin/p/10058594.html

super is a class; calling it returns a proxy object, not the parent class itself, and we use that object to call methods of parent and sibling classes.
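A quick demonstration of this point (class names here are illustrative):

class A:
    def greet(self):
        print('A.greet')

class B(A):
    pass

b = B()
proxy = super(B, b)
print(type(proxy))  # <class 'super'>: a proxy object, not class A
proxy.greet()       # lookup is forwarded along B's MRO, so this prints A.greet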

Principle

Usage

super()
super(type, obj)
super(type_1, type_2)
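The super(type_1, type_2) form, where type_2 must be a subclass of type_1, is mainly used in classmethods, where there is no instance to pass. A minimal sketch (class names are illustrative):

class Base:
    @classmethod
    def make(cls):
        print(f'Base.make, cls={cls.__name__}')

class Child(Base):
    @classmethod
    def make(cls):
        # the second argument is a class, not an instance;
        # issubclass(cls, Child) must hold
        super(Child, cls).make()
        print('Child.make')

Child.make()
# prints:
# Base.make, cls=Child
# Child.make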

super(type, obj)

!!! Extremely important !!!
type: specifies the position in the MRO after which the search starts.
obj (an instance of some class): specifies which class's MRO is used.

Step 1. Work out the class's inheritance order via the __mro__ (method resolution order) mechanism (note: this is the resolution order, not simply the order in which the classes appear).
Step 2. Call the method on the class that comes right after type in that __mro__ order (the method being __init__ in super(type, obj).__init__(), or fun in super(type, obj).fun()).

Illustrating step 1 (each class defines a fun that prints its own name, so we can trace calls later):
class A(object):
    def fun(self):
        print('A.fun')

class B(object):
    def fun(self):
        print('B.fun')

class C(object):
    def fun(self):
        print('C.fun')

class D(A, B):
    def fun(self):
        print('D.fun')

class E(B, C):
    def fun(self):
        print('E.fun')

class F(D, E):
    def fun(self):
        print('F.fun')

print(F.__mro__)

The output order is: F -> D -> A -> E -> B -> C -> object.
Why this order?
1. Find the node with in-degree 0, preferring the leftmost one.
2. Remove that node and delete all arrows attached to it.
3. The order in which nodes are removed is the inheritance order (don't forget to append the common ancestor object at the end).
[Figure: inheritance graph of classes A through F]
To generate F's linearization, the C3 process goes like this. First take the node whose in-degree (the number of arrows pointing at it) is zero: that is F, so put F in the list and delete F and all arrows touching it from the graph. Next, D and E both have in-degree zero; the leftmost wins, so D goes into the list and is deleted from the graph, and the list is now F, D. Continuing, A and E both have in-degree zero; leftmost priority picks A, so the list becomes F, D, A. At this point only E has in-degree zero, so E is taken next. That leaves B and C, both with in-degree zero, and leftmost priority puts B in the list before C. Finally, every Python class shares the common ancestor object, so object goes at the end of the list.
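The procedure above is an intuitive, graph-based reading of C3; the actual definition is a recursive merge of the parents' linearizations. A minimal sketch for illustration only (CPython computes __mro__ natively in C):

def c3_merge(sequences):
    result = []
    seqs = [list(s) for s in sequences if s]
    while seqs:
        # take the first head that appears in no other sequence's tail
        for seq in seqs:
            head = seq[0]
            if not any(head in other[1:] for other in seqs if other is not seq):
                break
        else:
            raise TypeError('inconsistent hierarchy')
        result.append(head)
        for seq in seqs:
            if seq[0] is head:
                del seq[0]
        seqs = [s for s in seqs if s]
    return result

def c3_linearize(cls):
    return [cls] + c3_merge(
        [c3_linearize(base) for base in cls.__bases__] + [list(cls.__bases__)]
    )

print([c.__name__ for c in c3_linearize(F)])
# ['F', 'D', 'A', 'E', 'B', 'C', 'object']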

Illustrating step 2

Append the following to the code above:

super(E, F()).fun()  # start after E; F's MRO is F -> D -> A -> E -> B -> C -> object; target method fun(); prints B.fun
super(D, F()).fun()  # prints A.fun
super(F, F()).fun()  # prints D.fun
super(B, F()).fun()  # prints C.fun
super(B, E()).fun()  # start after B in E's MRO (E -> B -> C -> object); prints C.fun
super(B, B()).fun()  # error: B's MRO is B -> object, and object has no fun(), so this raises AttributeError

And consider this code:

class A:
    def fun(self):
        print('A.fun')

class B(A):
    def fun(self):
        super(B, self).fun()
        print('B.fun')

class C(A):
    def fun(self):
        super(C, self).fun()
        print('C.fun')

class D(B, C):
    def fun(self):
        super(D, self).fun()
        print('D.fun')

D().fun()
# Output:
# A.fun
# C.fun
# B.fun
# D.fun

[Figure: diamond inheritance diagram for D(B, C), B(A), C(A)]
D's __mro__ order is D -> B -> C -> A -> object. Calling fun on a D instance enters D.fun, which immediately hits super(D, self).fun(). Here self is the D instance, so D's __mro__ is used, and the starting point is the class after D, namely B, so B.fun runs next. Inside B.fun we hit super(B, self).fun(); note carefully that self is still the original D instance (not a B instance), so the walk continues along D's __mro__ and calls the next class, C's fun. By the same reasoning, C's super call moves on to the next class, A, and runs A.fun. Once A.fun finishes printing, control returns to C.fun, which prints, then to B.fun, and finally back to D.fun, which prints last.
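You can verify the order this trace relies on directly:

print([c.__name__ for c in D.__mro__])
# ['D', 'B', 'C', 'A', 'object']

Note that in Python 3, super(B, self) inside B.fun is usually written as the zero-argument form super(), which fills in the current class and self automatically.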
