Hyper-Threading and the Pipeline

Using a laundry-day analogy, this article explains how the CPU pipeline works and how it improves throughput. It covers the difference in instruction execution between older, non-pipelined CPUs and modern pipelined ones, and explores how Hyper-Threading further improves multitasking.

To understand what HT is, you need to understand the pipeline. The pipeline is a system in the CPU that breaks each operation into smaller chunks and executes each of those chunks separately. Think of it like doing your laundry: you can wash one load in the washing machine while the previous load dries in the dryer. If you have a lot of laundry to do, it would be silly to leave the washer empty just because the dryer is in use, right? But that’s exactly how pre-486 Intel CPUs worked.

Imagine this hypothetical 4-stage pipeline for something like the 8080 CPU:

  1. Retrieve the instruction from memory
  2. Retrieve data from memory (if necessary)
  3. Execute the instruction
  4. Write data back to memory (if necessary)

Some instructions, like NOP, would take only 2 ticks to run: fetch the instruction, then execute it. An instruction like “store the value in the A register to the specified memory location” would take 5 ticks, since the CPU has to read two more bytes to get the memory address, then actually write the register value to memory.

Older CPUs like the 8080 execute all of the steps of an operation before advancing to the next program step. Starting with the 80486 processor, Intel added a pipeline, which allows these steps to operate in parallel. As in the laundry example, a 486 chip can actually operate on parts of more than one instruction at a time. This gives the CPU a higher effective throughput, even though each individual instruction still completes in series.
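A quick back-of-the-envelope model makes the gain concrete. This is a sketch, not real hardware timing: it assumes every stage takes exactly one tick and that the pipeline never stalls.

```python
STAGES = 4  # the hypothetical 4-stage pipeline described above

def serial_ticks(n):
    # Pre-pipeline style (8080): each instruction runs all four stages
    # to completion before the next instruction starts.
    return n * STAGES

def pipelined_ticks(n):
    # Pipelined style (486): the first instruction takes STAGES ticks
    # to flow through, then one instruction completes every tick.
    return STAGES + (n - 1)

print(serial_ticks(100))     # 400
print(pipelined_ticks(100))  # 103
```

With 100 instructions the pipelined chip finishes in roughly a quarter of the ticks, even though each individual instruction still takes four ticks from fetch to write-back.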

But pipelining has problems of its own: branch instructions can stall the pipeline. Imagine the following sequence of operations:

Code:
start:
1 load A from memory location 123
2 compare A and B
3 if they are equal, skip to equal
4 print “not equal”
5 jump to start
equal:
6 print “equal”

So in our 4-stage pipeline, steps 1-3 are running through the pipeline. By the time step 3 hits the “execute” stage, step 4 should be in the “retrieve data from memory” stage, and step 5 should be in the “retrieve instruction from memory” stage.

But wait… A and B are equal, so we can’t actually execute the ‘print “not equal”’ statement! Instead, we have to jump down to step 6. What happens to the stuff already loaded in the pipeline? Basically, it gets thrown away, and the CPU sits idle for 3 ticks while it loads the new code sequence.

So the way data moves through the pipeline is with the help of pipeline registers, which hold the output from one pipeline stage and feed it to the next stage. So what would happen if we split up the pipeline so that it was doing two different things at once? Essentially, we’d have this:

Code:
STAGE 1 - Task 2, Step 2
STAGE 2 - Task 1, Step 2
STAGE 3 - Task 2, Step 1
STAGE 4 - Task 1, Step 1

One of the benefits here is that since the stages are separated, we know the outcome of a test instruction before we fetch that task’s next instruction. So every time a branch gets taken, we gain some time over the single-threaded architecture. Further refinements allow HT to be scheduled dynamically, so things like reading data from main memory (rather than the local cache) can bump the other thread’s priority, letting that thread execute faster while the blocked thread waits for memory to return the data.

Things like this are why Hyperthreading can actually speed up execution of two tasks, compared to serializing them.
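Here is a best-case sketch of that effect, with hypothetical numbers: it models one shared execute unit and makes the optimistic assumption that one thread’s memory-stall ticks can always be filled by the other thread’s work.

```python
def serialized_ticks(tasks):
    # One task at a time: every tick an instruction spends waiting on
    # memory is a tick the execute unit sits idle.
    return sum(len(task) + sum(task) for task in tasks)

def ht_ticks_best_case(tasks):
    # Two hardware threads sharing one execute unit: the unit must run
    # every instruction (total work), and each thread still waits out
    # its own stalls, but stalls overlap with the other thread's work.
    work = sum(len(task) for task in tasks)
    per_thread = (len(task) + sum(task) for task in tasks)
    return max(work, *per_thread)

# Each list is one task; each entry is the memory-stall ticks an
# instruction waits before its 1-tick execution (0 = hits the cache).
a = [0, 0, 5, 0]   # one slow main-memory read
b = [0, 3, 0, 0]
print(serialized_ticks([a, b]))    # 16
print(ht_ticks_best_case([a, b]))  # 9
```

In this best case the two tasks finish in 9 ticks instead of 16. Real-world gains are far smaller, because the two threads also compete for cache, decode bandwidth, and execution ports.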

Finally, I tested this when the first HT CPUs came out. I ran benchmarks with Hyper-Threading enabled and disabled, and compared those results to a non-Hyper-Threaded CPU. Generally speaking, a single high-priority task executed at the same speed with or without HT. However, when running two or more tasks, the HT processor did a better job of multitasking, and the overall throughput of the HT CPU was better than that of the non-HT CPU. It was not double, however; it was probably closer to the 15-20% that other people have mentioned.

