What's the difference between specially and especially?

This article explains in detail the difference in usage between the English words specially and especially, covering how they differ when used to mean "very," "particularly," "for a special purpose," and "in a special manner."

http://www.learnersdictionary.com/qa/what-s-the-difference-between-specially-and-especially

The following is derived from the above site.

What is the difference between specially and especially? – Mary, United States

The meanings and usage of these two similar-sounding words overlap quite a bit, so it can be hard to figure out which one to use when. If you are interested in the details, I encourage you to read their entries in Merriam-Webster's Learner's Dictionary. If that's more information than you need, here are simple rules to follow that will ensure that you are using these words correctly:

 

1. Use especially to mean “very” or “extremely,” as in these examples:

  • There is nothing especially radical about that idea. 
  • The food was not especially good. 

2. Use especially when something stands out from all the others, and you want the meaning of “particularly,” as in these examples: 

  • She can't be sure she will win, especially at this early stage of the campaign. 
  • The appetizers and especially the soup were delicious. 
  • She loves flowers, especially roses.

3. When you want to convey the meaning “for a special purpose,” or “specifically,” you can use either especially or specially. They are both correct. 

  • The speech was written especially/specially for the occasion. 

4. When you want to convey the meaning “in a special manner,” use specially, as in the examples below. In this context, especially would sound odd or wrong to most native speakers. 

  • I don't want to be treated specially. (correct)
  • I don't want to be treated especially. (sounds odd or wrong)

 

 

Related words and their dictionary definitions:

  • especially (adverb): used when mentioning conditions that make something more relevant, important, or true
  • specially (adverb): in a particular way, or for a particular purpose
  • uniquely (adverb): especially
  • particularly (adverb): used for emphasizing that something refers especially to one specific person, thing, or situation
  • notably (adverb, formal): especially; used for introducing a good example of something
  • in particular (phrase): especially
  • especially (adverb): used for showing that what you are saying applies to one person or thing more than others
  • especially (adverb): for a particular purpose or for a particular person
  • esp. (abbreviation): especially
  • above all (phrase): used for referring to something that is more important than any of the other things you could mention
