Simultaneous CPU/GPU Debugging in Visual Studio 2013

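This article describes how to use Visual Studio 2013 to debug the CPU and GPU parts of a C++ AMP application simultaneously. First configure the project properties to select the WARP accelerator, set breakpoints, and press F5 to start debugging. You can inspect GPU thread state and evaluate expressions, and also check the CPU state.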

The debugging support introduced for C++ AMP in Visual Studio 2012 has been extended in Visual Studio 2013 when running on Windows 8.1 so that you are able to debug both the CPU and GPU parts of your C++ AMP application simultaneously. In this post I’ll show you how to get started.

1. Configure your project properties

The WARP accelerator on Windows 8.1 is the only accelerator that currently supports simultaneous debugging, so you will need to choose WARP when you want to debug your AMP code. If your application explicitly selects an accelerator then you will need to change your code to select WARP instead. If you are using the default accelerator then the C++ AMP runtime attempts to find the “best” accelerator available on your system to use as the default. Typically this is your DirectX 11 capable GPU, but you can have the C++ AMP runtime select WARP as the default accelerator instead via the project properties.
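For an application that selects its accelerator in code, the change might look roughly like the sketch below. It relies on accelerator::set_default and the accelerator::direct3d_warp device path from the C++ AMP runtime; the function name select_warp_as_default and the non-Visual C++ stub are assumptions added for illustration and are not part of the original sample.

```cpp
#ifdef _MSC_VER  // C++ AMP ships only with Visual C++
#include <amp.h>

// Ask the runtime to use WARP as the default accelerator.
// This has to run before anything else uses the default accelerator;
// set_default returns false if the default can no longer be changed.
bool select_warp_as_default() {
    return concurrency::accelerator::set_default(
        concurrency::accelerator::direct3d_warp);
}
#else
// Hypothetical stub so the sketch still compiles with other toolchains,
// where C++ AMP is assumed to be unavailable.
bool select_warp_as_default() { return false; }
#endif
```

A call to select_warp_as_default() would typically be placed at the very start of main, before the first parallel_for_each. The project property described in the steps below achieves the same effect without a code change.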

Take any C++ AMP project and set your properties as follows:

  1. Select Configuration Properties->Debugging
  2. Set “Debugger Type” to “Native Only” or “Auto”.
  3. Set “Amp Default Accelerator” to “WARP software accelerator”.

2. Set breakpoints in your C++ AMP code

I want to demonstrate both CPU and GPU debugging capabilities so we will start by setting a couple of breakpoints, one in CPU code and another in GPU code. Set the first breakpoint somewhere in your application before the first execution of a parallel_for_each. Set the second breakpoint in the body of a parallel_for_each loop. In my MatrixMultiplication example I have set breakpoints at line 126 (CPU code) and 147 (GPU code).

3. Hit F5 to test CPU debugging

Hit the F5 key (or Debug->Start Debugging menu) to start debugging as normal. You should see your breakpoint in the CPU code get hit. All the usual CPU debugging experience is available.

The breakpoint in the GPU code (at line 147) is displayed with the unbound breakpoint icon. This is because the AMP runtime has not yet created the compute shader corresponding to the loop body. It will do so at the first attempt to execute the parallel_for_each, at which time the breakpoint will bind and the icon will become the familiar red breakpoint icon.

The execution of the body of the parallel_for_each on the GPU is asynchronous with respect to the CPU execution. If you were to advance to the parallel_for_each (line 130) and step (F10), the debugger would remain in the CPU code and appear to have stepped over the parallel_for_each, but the loop body will likely not have executed yet. Since we want to debug into the GPU code, we set the second breakpoint on line 147 in the loop body. When the loop body eventually executes, the debugger will break with active GPU state available for examination.
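To make the "stepped over but not yet executed" behaviour concrete, here is a toy, CPU-only model in standard C++. It is not the C++ AMP runtime: ToyQueue, dispatch, flush, and run_demo are hypothetical names, and the real runtime decides for itself when queued work actually runs.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical toy model of a deferred command queue; not the C++ AMP runtime.
class ToyQueue {
public:
    // Record the kernel and return immediately, without running it.
    void dispatch(std::function<void()> kernel) {
        pending_.push_back(std::move(kernel));
    }

    // Run everything that was queued, in order; stands in for a
    // synchronization point such as reading results back on the CPU.
    void flush() {
        for (auto& kernel : pending_) kernel();
        pending_.clear();
    }

    std::size_t pending() const { return pending_.size(); }

private:
    std::vector<std::function<void()>> pending_;
};

// Values observed right after dispatch and after the flush.
struct Observed {
    int after_dispatch;
    int after_flush;
};

Observed run_demo() {
    ToyQueue queue;
    int result = 0;
    queue.dispatch([&result] { result = 42; });  // returns before the body runs
    Observed seen;
    seen.after_dispatch = result;  // still 0: the body has not executed
    queue.flush();                 // the body runs here
    seen.after_flush = result;     // now 42
    return seen;
}
```

In this model, stepping over the dispatch line leaves result unchanged, which mirrors why a breakpoint inside the loop body is the reliable way to stop in the GPU code.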

4. Hit F5 to break in the GPU code

Hit the F5 key again and you will advance execution into the GPU code (line 147).

5. Debug the GPU code

To see your GPU thread state open the GPU Threads window (Debug->Windows->GPU Threads). The entire debugger UI is directed at the GPU thread that hit the breakpoint (the current thread; in this example Tile: [0,0] Thread: [0,0]). The call stack window will show the GPU thread’s call stack, and expressions in the locals, watch, and autos windows will all be evaluated in the GPU context. The parallel watch window will allow you to evaluate expressions across all active GPU threads.
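The Tile and Thread coordinates can be related to a thread's global index with a small calculation. The sketch below assumes square tiles and assumes that the Thread value is the position within the tile; ThreadId and locate are hypothetical names. In kernel code, C++ AMP exposes the same information through the tile, local, and global members of tiled_index.

```cpp
// Hypothetical helper type for illustration only.
struct ThreadId {
    int tile_row;   // which tile the thread belongs to (row)
    int tile_col;   // which tile the thread belongs to (column)
    int local_row;  // position of the thread inside its tile (row)
    int local_col;  // position of the thread inside its tile (column)
};

// Split a global (row, col) index into tile and in-tile coordinates
// for square tiles of tile_size x tile_size threads.
ThreadId locate(int row, int col, int tile_size) {
    ThreadId id;
    id.tile_row = row / tile_size;
    id.tile_col = col / tile_size;
    id.local_row = row % tile_size;
    id.local_col = col % tile_size;
    return id;
}
```

For example, with 16x16 tiles the thread at global index (17, 35) belongs to tile [1,2] and is thread [1,3] within that tile, while global index (0, 0) is Tile: [0,0] Thread: [0,0].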

6. Check the CPU state

When stopped at a GPU breakpoint the CPU portion of the application is also stopped. You can easily view the CPU state without having to continue execution until you exit the GPU code. Just open the Threads window (Debug->Windows->Threads). Set your debugger context back to a CPU thread by double clicking on your main thread. The entire debugger experience is now focused on the CPU portion of your program. Once you are finished you can return to the GPU by double clicking on a line in the GPU Threads window.

Limitations

The simultaneous debugging experience is only available when your application executes on Windows 8.1 and uses the WARP software driver. WARP on earlier versions of Windows does not support debugging. Debugging GPU code on previous versions of Windows is still possible with VS 2013 using the “GPU Only” debugger type and remains unchanged from Visual Studio 2012.

When debugging simultaneously on WARP, a few GPU debugging features are unavailable: race detection, freezing and thawing of GPU threads, and “Run Current Tile to Cursor”. All of this functionality remains available with “GPU Only” debugging.

Note for Visual Studio 2013 Preview: In the preview release simultaneous debugging only works for Win32 processes, not x64 processes. 64-bit debugging will be enabled in the final product release.


http://blogs.msdn.com/b/nativeconcurrency/archive/2013/06/28/simultaneous-cpu-gpu-debugging-in-visual-studio-2013.aspx
