Array.longest (a revised version of Array.which_long?)

This post describes a concise Ruby method for finding all of the longest elements in an array. By relying on Ruby's built-in methods, it can handle arrays containing mixed data types efficiently.
Thanks to the following people on ruby-talk:
Chris Carter
David A. Black
Harry
Robert Dober
James Edward
:)
The original code was too long; composing Ruby's built-in methods is all that is needed.
Also, the original program forcibly converted the array elements to String.

The new code is:
class Array
  def longest
    # Harry <http://www.kakueki.com/ruby/list.html>
    self.select{|r| r.to_s.size == self.max{|x, y| x.to_s.size <=> y.to_s.size}.to_s.size}
  end
end
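
A quick usage sketch (the expected-output comments are my own annotation, not part of the original post):

ary = [1, 12, 234, "456"]
p ary.longest              #=> [234, "456"]   both elements whose string form has length 3
p [7, "ab", :xyz].longest  #=> [:xyz]         works for any type, since comparison goes through to_s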
This program was written by Harry; his original message is quoted below:

On 4/29/07, Billy Hsu <ruby.maillist@gmail.com> wrote:
> Thanks for your reply, I learned more on this thread :P
> But I have a question:
> If I have an array containing:
>   ary = [1, 12, 234, "456"]
> there are two elements whose size is 3, but the longest method just returned
> one of them.
> I can't solve it :(
>

Is this what you are looking for?
Do you want all longest elements?

big = [1, 12, 234,45,978, "456"].max {|x,y| x.to_s.size <=> y.to_s.size}
p [1, 12, 234,45,978, "456"].select {|r| r.to_s.size == big.to_s.size}
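
Note how Harry's two-step version computes the longest length once (stored in big) instead of re-running max inside the select block for every element, as the one-liner above does. A minimal sketch of folding that idea back into the method, keeping the same to_s-based comparison (this is my own variation, not code from the thread):

class Array
  def longest
    # Find the longest string length once, then keep every element that matches it.
    max_size = map { |e| e.to_s.size }.max
    select { |e| e.to_s.size == max_size }
  end
end

p [1, 12, 234, 45, 978, "456"].longest  #=> [234, 978, "456"]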
Since Harry prefers not to have his email address posted, here is his website instead:
http://www.kakueki.com/ruby/list.html
A Look into Japanese Ruby List in English

Thanks once again to Harry for his help :)
And thanks to everyone else as well; I learned a lot from this thread :D
Thanks again and again!!

CFC --
