VC11, std functional bind/function, big performance hit under x64

In testing the following code with both VC10 and 11 (beta, Ultimate), I've noticed a remarkable performance difference. When compiled for Win32 they perform about the same, with 11 gaining a slight edge. But when compiled for x64, the VC11 version is several times slower (measured in performance counter ticks). In both cases I use the default project settings, adding a new config for x64 and not modifying the compiler command line.

*Edit: I should mention that VC10 is not patched to SP1.

#include <Windows.h>
#include <functional>
#include <vector>
#include <iostream>
#include <conio.h>


typedef std::function<void(void)> func; 
typedef std::vector<func> funcVector;
typedef std::vector<func>::iterator ifuncVector;


void Function1()
{
	int j = 0;
	for( int i = 0; i < 100; i++ )
		j += i;
}

void Function2()
{
	float j = 1.0f;
	for( int i = 0; i < 100; i++ )
		j *= (i+1);
}

void Function3()
{
	double j = 1.0;
	for( int i = 0; i < 100; i++ )
		j *= (i+1);
}


LARGE_INTEGER timingStart, timingEnd;
__int64 diff;
int nrCalls = 1;

int main()
{
	funcVector funcs;

	while ( true )
	{
		std::cout << "Enter number of calls or 0 to quit" << std::endl;
		std::cin >> nrCalls;
		if( !std::cin.good() || nrCalls == 0 )
			break;

		QueryPerformanceCounter(&timingStart);
		for( int i = 0; i < nrCalls; i++ )
		{
			funcs.push_back( std::bind( &Function1 ) );
			funcs.push_back( std::bind( &Function2 ) );
			funcs.push_back( std::bind( &Function3 ) );
			// Named "it" so the iterator does not shadow the outer loop counter i.
			for( ifuncVector it = funcs.begin(); it != funcs.end(); ++it )
			{
				(*it)();
			}
			funcs.clear();
		}
		QueryPerformanceCounter(&timingEnd);

		diff = timingEnd.QuadPart - timingStart.QuadPart;

		std::cout << "Timing for " << nrCalls << " calls:" << std::endl;
		std::cout <<  diff << " ticks " << std::endl;
		std::cout << "***************************" << std::endl;
	}

	return 0;
}


==============================================

Hi, I maintain VC's STL, and I was just alerted to this 5-month-old post. We're now tracking this as DevDiv#490878 in our internal database, and I've analyzed it to figure out what's going on here (and why it appeared to be x64-specific). To summarize: your function pointer is 8 bytes on x64. VC11 changed the representation of bound functors to store their bound arguments in tuples; you have no bound arguments, so we store a tuple<>. That's an empty 1-byte class, so the overall bound functor is 16 bytes (for alignment), when it used to be 8 with VC10. Then, you give it to std::function. That adds a vtable pointer (necessary for std::function's magic type erasure), another 8 bytes. It also stores a std::allocator (empty 1-byte class), which is a second regression from VC10 (where we avoided storing std::allocator here). The total size is 32 bytes, which is greater than our Small Functor Optimization limit on x64 of 24 bytes. (That's undocumented and we could change it at any time, but that's the current value.) Functors less than or equal to 24 bytes, when the necessary vtable pointer is included, are stored directly within the std::function. Larger functors must be dynamically allocated. This dynamic memory allocation (and deallocation) is responsible for the massive performance hit you've observed.
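The size arithmetic in the summary above can be replayed in a few lines. This is only a simplified model of the figures quoted in the reply (8-byte pointers, 1-byte empty classes, padding to 8 bytes, a 24-byte limit); it is not the actual layout code of VC's std::function, and every name in it is hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Figures quoted in the reply; they apply to VC11 on x64 only.
const std::size_t kPointerBytes      = 8;   // function pointer or vtable pointer
const std::size_t kEmptyClassBytes   = 1;   // tuple<> or std::allocator
const std::size_t kSmallFunctorLimit = 24;  // undocumented, may change at any time

// Rounds n up to the next multiple of 8 (assumed pointer alignment on x64).
inline std::size_t align8(std::size_t n) { return (n + 7) / 8 * 8; }

// A bind() result with no bound arguments: function pointer plus an empty tuple<>.
inline std::size_t bound_functor_bytes()
{
    return align8(kPointerBytes + kEmptyClassBytes);
}

// What std::function has to hold: vtable pointer, the functor, and an empty allocator.
inline std::size_t wrapped_bytes(std::size_t functor_bytes)
{
    return align8(kPointerBytes + functor_bytes + kEmptyClassBytes);
}

// True when the wrapped functor is small enough to avoid a heap allocation.
inline bool fits_small_buffer(std::size_t total_bytes)
{
    return total_bytes <= kSmallFunctorLimit;
}

// Checks the model against the two sizes stated in the reply.
inline void verify_reply_figures()
{
    assert(bound_functor_bytes() == 16);
    assert(wrapped_bytes(bound_functor_bytes()) == 32);
}
```

Under this model, wrapped_bytes(bound_functor_bytes()) exceeds the limit, while wrapped_bytes(kPointerBytes), a bare function pointer handed to std::function, comes to exactly 24 and would still fit. Treat that second result as a consequence of the model's assumptions rather than a measured value.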

Note that while you can't do anything about std::function storing std::allocator unnecessarily, you can avoid using bind(), which is responsible for half of the bloat here. Function pointers can be directly given to std::function. Otherwise, you can "bind" things with lambdas, which are more efficient and have more natural syntax even aside from this unnecessary bloat issue.

If you have any further questions, please e-mail me at stl@microsoft.com (I am unfortunately too busy to continuously monitor the forums).
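As a rough sketch of the two alternatives just mentioned, the test program's push_back calls could be written without bind() along these lines. The call counter, the offset value, and the helper name run_without_bind are made up for illustration, and the function bodies are shortened so the effect of each call is observable.

```cpp
#include <cassert>
#include <functional>
#include <vector>

typedef std::function<void(void)> func;

// Hypothetical counter so that each call has a visible effect.
static int g_calls = 0;

void Function1() { ++g_calls; }
void Function2() { ++g_calls; }

// Fills the vector without std::bind, invokes every entry once,
// and returns the resulting counter value.
int run_without_bind()
{
    std::vector<func> funcs;

    funcs.push_back(&Function1);                        // function pointer given directly
    funcs.push_back([] { Function2(); });               // captureless lambda instead of bind()

    int offset = 5;                                     // illustrative "bound" argument
    funcs.push_back([offset] { g_calls += offset; });   // lambda capturing a value

    assert(funcs.size() == 3);

    g_calls = 0;
    for (std::vector<func>::iterator it = funcs.begin(); it != funcs.end(); ++it)
        (*it)();

    return g_calls;
}
```

Whether each of these forms stays under the small functor limit depends on the compiler version and platform, so it is worth re-running the timing loop from the question after making the change.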


