728. Self Dividing Numbers

本文介绍了一种用于找出指定范围内所有自除数的算法。自除数是指能被其每一位数字整除的数,且不含0。文中提供了两种C++实现方法,并详细解释了判断逻辑。

A self-dividing number is a number that is divisible by every digit it contains.

For example, 128 is a self-dividing number because 128 % 1 == 0, 128 % 2 == 0, and 128 % 8 == 0.

Also, a self-dividing number is not allowed to contain the digit zero.

Given a lower and upper number bound, output a list of every possible self dividing number, including the bounds if possible.

Example 1:

Input: 
left = 1, right = 22
Output: [1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 12, 15, 22]

Note:

The boundaries of each input argument are 1 <= left <= right <= 10000.


思路:遍历每一个数,判断该数是否是 self-dividing number(核心)。需要注意:因为0不能为分母,所以含0的数一定不是 self-dividing number。

代码1:

class Solution {
public:
    bool isself(int no){
        if(no<10){return true;}
        else {
            string num=to_string(no);
            if(num.find('0')!=std::string::npos){return false;}
            else{
                for(int i=0;i<num.size();i++){
                    int divide=num[i]-'0';
                    if(no%divide!=0){return false;}
                }
            }
        }
        return true;
        }
    vector<int> selfDividingNumbers(int left, int right) {
        vector<int> result;
        for(int i=left;i<=right;i++){
            if(isself(i)){result.push_back(i);}
        }
        return result;
    }
};

代码2:

class Solution {
public:
    vector<int> selfDividingNumbers(int left, int right) {
        vector<int> res;
        for(int i=left;i<=right;i++){
            int temp=i;
            while(temp!=0 && (temp%10)!=0){
                if(i%(temp%10)){break;}
                temp = temp / 10;
            }
            if(temp==0){res.push_back(i);}
        }
        return res;
    }
};


### Linear Complexity Self-Attention Implementation and Optimization Self-attention mechanisms have been pivotal in advancing the capabilities of deep learning models, especially within natural language processing tasks. Traditional self-attention has a quadratic time complexity relative to input length due to its computation involving all pairs of positions in an input sequence[^1]. However, linear complexity self-attention aims at reducing this computational burden. #### Efficient Implementations One approach towards achieving linear complexity involves approximating or restructuring how attentions scores are computed between tokens. For instance, instead of computing full pairwise interactions, one could use locality-sensitive hashing (LSH), which groups similar items into buckets without explicitly comparing every item against each other. This method significantly reduces the number of required comparisons while maintaining performance quality[^3]. Another technique utilizes random projections where high-dimensional vectors representing token embeddings get projected onto lower dimensions through structured matrices like Fastfood transforms. Such transformations preserve distances well enough so that subsequent operations remain effective yet require fewer resources than standard methods do[^4]. ```python import torch from performer_pytorch import PerformerLM model = PerformerLM( num_tokens=20000, dim=512, depth=6, heads=8, causal=True, feature_redraw_interval=1000, generalized_attention=True, kernel_fn='relu' ) text = "The quick brown fox jumps over the lazy dog" tokens = tokenizer.encode(text).ids # assuming you've defined `tokenizer` elsewhere input_tensor = torch.tensor([tokens]) output = model(input_tensor) print(output.shape) # should output something like torch.Size([1, seq_len, vocab_size]) ``` This code snippet demonstrates implementing efficient self-attention via the Performer architecture from PyTorch library, leveraging fast Fourier transform-based kernels for reduced complexity computations during training phases. #### Optimizations Techniques Optimizing these implementations often revolves around exploiting hardware acceleration features such as GPU tensor cores optimized specifically for matrix multiplications involved in attention calculations. Additionally, mixed precision arithmetic can further enhance speed by performing some parts of forward/backward passes using half-precision floating-point numbers when possible without sacrificing much accuracy. Memory efficiency gains come not only from algorithmic improvements but also architectural choices like chunked processing schemes dividing long sequences into smaller manageable chunks processed independently before being recombined later on. These strategies help mitigate memory overhead associated with large-scale transformer architectures operating under constrained environments[^2]. --related questions-- 1. How does Locality-Sensitive Hashing contribute to making self-attention computationally feasible? 2. What role do random projections play in optimizing self-attention algorithms? 3. Can you explain how specific hardware optimizations impact the performance of linear-complexity self-attention models? 4. In what ways might chunked processing improve both runtime and resource utilization compared to traditional approaches?
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值