So Long To DIM(), ARRAY_SIZE(), and...

This post describes how, in CE6, the array-element-count macro dim was replaced with _countof to reduce run-time errors and improve code quality. _countof prevents mistakenly taking the element count of a pointer, and gives a clear compile-time error when that is attempted.

Original post

 

I’ve been doing some tidy-up work in the driver code, and would like to draw your attention to a little something you know all too well:

     #define dim(x) (sizeof(x)/sizeof(x[0]))

Sadly, its time has come to be deprecated. What did we replace it with?

     _countof(x)

Why?

a) It’s bad form to use Clipboard Inheritance (aka copy-n-paste)

b) _countof(x) is new and improved!

In CE6 we added _countof() to the standard headers. It is a drop-in replacement for getting the count of an array, with a nice little side effect. Notice the following:

     int y[3];
     int* x = y;
     const int count = dim(x);

Huge bug! The code tries to take the element count of a pointer; the compiler happily compiles it, and now there is a nasty run-time bug to find. On a 32-bit CE build, dim(x) silently evaluates to sizeof(int*)/sizeof(int), which is 1, not 3.

But if _countof() is used, and the file is compiled as C++, there will be a compiler error:

     error C2784: 'char (*__countof_helper(_CountofType (&)[_SizeOfArray]))[_SizeOfArray]' : could not deduce template argument for '_CountofType (&)[_SizeOfArray]' from 'int *'

It’s a bit of a nasty error, due to _countof() using some template tricks to verify that ‘x’ really is an array, not a pointer. But, as the saying goes, “Never put off to run-time what can be done at compile time”. The C++ gurus in the audience might be interested in the definition of _countof(), which lives in public/common/sdk/inc/stdlib.h (as well as some ATL/MFC headers).
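For the curious, the C++ side boils down to a function template that can only bind to a reference-to-array. Here is a minimal sketch; the template and parameter names match the error message above, though the real header adds UNALIGNED qualifiers and other plumbing:

     #ifdef __cplusplus
     extern "C++" {
     // Declared but never defined: it is only ever used inside sizeof(),
     // so no code is generated. The return type is a pointer to an array
     // of _SizeOfArray chars, so sizeof(*__countof_helper(x)) is the
     // element count. Template argument deduction fails for pointers,
     // which is what produces the C2784 error shown above.
     template <typename _CountofType, size_t _SizeOfArray>
     char (*__countof_helper(_CountofType (&_Array)[_SizeOfArray]))[_SizeOfArray];
     #define _countof(_Array) (sizeof(*__countof_helper(_Array)))
     }
     #else
     // Plain C has no templates, so it falls back to the classic division.
     #define _countof(_Array) (sizeof(_Array) / sizeof(_Array[0]))
     #endif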

So what happens in a C file when that macro is used?

The same behavior as before: it compiles, and you have a nasty bug to find. So the macro behaves exactly as it always did in every case but one: taking the count of a pointer in a C++ file, in which case it gives a “nice” loud compiler error.

So I implore you to start using _countof() in CE6 projects. 
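A quick illustration on a hypothetical buffer (the name and size here are made up) shows how mechanical the switch is:

     WCHAR szDeviceName[64];                     // hypothetical driver buffer
     const DWORD cch = _countof(szDeviceName);   // 64, exactly what dim() gave
     // If szDeviceName is ever changed to a WCHAR*, this line now stops
     // compiling in C++ instead of silently evaluating to
     // sizeof(WCHAR*) / sizeof(WCHAR).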

 

For those who wish to go further, ‘dim’ and ‘ARRAY_SIZE’ can be deprecated by adding some code to a common BSP header:

     #ifndef dim
     // Prefer to use _countof() directly, replacing dim()
     #pragma deprecated("dim")
     #define dim(x) _countof(x)
     #endif

The compiler will generate this warning if ‘dim’ is used:

     warning C4995: 'dim': name was marked as #pragma deprecated
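If you want the clean-up enforced rather than merely advisory, the warning can also be promoted to an error in the same header. This is a standard MSVC pragma, though it goes a step beyond the snippet above:

     #pragma warning(error: 4995)   // any remaining use of 'dim' breaks the build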

 

But for now, an ode to the macros fallen in my clean-ups. So long to:

     ARRAYSIZE, ARRAY_SIZE, ARRAYSIZEOF, ARRSIZE, SIZEOF_ARRAY, ARRAY_LENGTH

     NUM_ELEMENTS, NELEMS, NUM, NUMBER_OF_ARRAY

     TABLE_COUNT, COUNTOF, ItemCount

     Dim, DIMOF

     CCHSIZEOF

""" @wraps(func) def wrapper(self, *args, **kwargs): return_dict = self.config.return_dict if hasattr(self, "config") else True return_dict_passed = kwargs.pop("return_dict", return_dict) if return_dict_passed is not None: return_dict = return_dict_passed output = func(self, *args, **kwargs) if not return_dict and not isinstance(output, tuple): output = output.to_tuple() return output return wrapper # if is_torch_available(): # @torch._dynamo.disable @dataclass @requires(backends=("torch",)) class OutputRecorder: """ Configuration for recording outputs from a model via hooks. Attributes: target_class (Type): The class (e.g., nn.Module) to which the hook will be attached. index (Optional[int]): If the output is a tuple/list, optionally record only at a specific index. layer_name (Optional[str]): Name of the submodule to target (if needed), e.g., "transformer.layer.3.attn". class_name (Optional[str]): Name of the class to which the hook will be attached. Could be the suffix of class name in some cases. """ target_class: "type[torch.nn.Module]" index: Optional[int] = 0 layer_name: Optional[str] = None class_name: Optional[str] = None def check_model_inputs(func): """ Decorator to intercept specific layer outputs without using hooks. Compatible with torch.compile (Dynamo tracing). """ @wraps(func) def wrapper(self, *args, **kwargs): use_cache = kwargs.get("use_cache") if use_cache is None: use_cache = getattr(self.config, "use_cache", False) return_dict = kwargs.pop("return_dict", None) if return_dict is None: return_dict = getattr(self.config, "return_dict", True) if getattr(self, "gradient_checkpointing", False) and self.training and use_cache: logger.warning_once( "`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`." 
) use_cache = False kwargs["use_cache"] = use_cache all_args = kwargs.copy() if "kwargs" in all_args: for k, v in all_args["kwargs"].items(): all_args[k] = v capture_flags = _CAN_RECORD_REGISTRY.get(str(self.__class__), {}) # there is a weak ref for executorch recordable_keys = { f"output_{k}": all_args.get( f"output_{k}", getattr( self.config, f"output_{k}", all_args.get("output_attentions", getattr(self.config, "output_attentions", False)), ), ) for k in capture_flags } collected_outputs = defaultdict(tuple) monkey_patched_layers = [] def make_capture_wrapper(module, orig_forward, key, index): @wraps(orig_forward) def wrapped_forward(*args, **kwargs): if key == "hidden_states" and len(collected_outputs[key]) == 0: collected_outputs[key] += (args[0],) if kwargs.get("debug_io", False): with model_addition_debugger_context( module, kwargs.get("debug_io_dir", "~/model_debug"), kwargs.get("prune_layers") ): output = orig_forward(*args, **kwargs) else: output = orig_forward(*args, **kwargs) if not isinstance(output, tuple): collected_outputs[key] += (output,) elif output[index] is not None: if key not in collected_outputs: collected_outputs[key] = (output[index],) else: collected_outputs[key] += (output[index],) return output return wrapped_forward if any(recordable_keys.values()): capture_tasks = [] for key, layer_specs in capture_flags.items(): if not recordable_keys.get(f"output_{key}", False): continue if not isinstance(layer_specs, list): layer_specs = [layer_specs] for specs in layer_specs: if not isinstance(specs, OutputRecorder): index = 0 if "hidden_states" in key else 1 class_name = None if not isinstance(specs, str) else specs target_class = specs if not isinstance(specs, str) else None specs = OutputRecorder(target_class=target_class, index=index, class_name=class_name) capture_tasks.append((key, specs)) for name, module in self.named_modules(): for key, specs in capture_tasks: # The second check is for multimodals where only backbone layer suffix is available if (specs.target_class is not None and isinstance(module, specs.target_class)) or ( specs.class_name is not None and name.endswith(specs.class_name) ): if specs.layer_name is not None and specs.layer_name not in name: continue # Monkey patch forward original_forward = module.forward module.forward = make_capture_wrapper(module, original_forward, key, specs.index) monkey_patched_layers.append((module, original_forward)) outputs = func(self, *args, **kwargs) # Restore original forward methods for module, original_forward in monkey_patched_layers: module.forward = original_forward # Inject collected outputs into model output for key in collected_outputs: if key == "hidden_states": collected_outputs[key] = collected_outputs[key][:-1] if hasattr(outputs, "vision_hidden_states"): collected_outputs[key] += (outputs.vision_hidden_states,) elif hasattr(outputs, "last_hidden_state"): collected_outputs[key] += (outputs.last_hidden_state,) outputs[key] = collected_outputs[key] elif key == "attentions": if isinstance(capture_flags[key], list) and len(capture_flags[key]) == 2: outputs[key] = collected_outputs[key][0::2] outputs["cross_" + key] = collected_outputs[key][1::2] else: outputs[key] = collected_outputs[key] else: outputs[key] = collected_outputs[key] if return_dict is False: outputs = outputs.to_tuple() return outputs return wrapper class GeneralInterface(MutableMapping): """ Dict-like object keeping track of a class-wide mapping, as well as a local one. 
Allows to have library-wide modifications though the class mapping, as well as local modifications in a single file with the local mapping. """ # Class instance object, so that a call to `register` can be reflected into all other files correctly, even if # a new instance is created (in order to locally override a given function) _global_mapping = {} def __init__(self): self._local_mapping = {} def __getitem__(self, key): # First check if instance has a local override if key in self._local_mapping: return self._local_mapping[key] return self._global_mapping[key] def __setitem__(self, key, value): # Allow local update of the default functions without impacting other instances self._local_mapping.update({key: value}) def __delitem__(self, key): del self._local_mapping[key] def __iter__(self): # Ensure we use all keys, with the overwritten ones on top return iter({**self._global_mapping, **self._local_mapping}) def __len__(self): return len(self._global_mapping.keys() | self._local_mapping.keys()) @classmethod def register(cls, key: str, value: Callable): cls._global_mapping.update({key: value}) def valid_keys(self) -> list[str]: return list(self.keys())
08-08