Last night our service produced a core dump. This morning I opened the core file from the production environment in gdb to inspect the stack:
gdb ./appname --core=core.1234
(gdb) bt
The backtrace was:
#0 0x00007f5634262734 in std::_Rb_tree_rotate_right () from /usr/lib/libstdc++.so.6
#1 0x00007f56342628c1 in std::_Rb_tree_insert_and_rebalance () from /usr/lib/libstdc++.so.6
#2 0x00000000004b556c in std::_Rb_tree<unsigned int, std::pair<unsigned int const, unsigned int>, std::_Select1st<std::pair<unsigned int const, unsigned int> >, std::less<unsigned int>, std::allocator<std::pair<unsigned int const, unsigned int> > >::_M_insert_ (this=0x7fff3d253090, __x=0x0, __p=0x12e48f0, __v=@0x7fff3d251350)
at /usr/include/c++/4.3/bits/stl_tree.h:854
#3 0x00000000004b63d2 in std::_Rb_tree<unsigned int, std::pair<unsigned int const, unsigned int>, std::_Select1st<std::pair<unsigned int const, unsigned int> >, std::less<unsigned int>, std::allocator<std::pair<unsigned int const, unsigned int> > >::_M_insert_unique_ (this=0x7fff3d253090, __position={_M_node = 0x7f56182ee260}, __v=@0x7fff3d251350)
at /usr/include/c++/4.3/bits/stl_tree.h:1201
#4 0x00000000004b65da in std::map<unsigned int, unsigned int, std::less<unsigned int>, std::allocator<std::pair<unsigned int const, unsigned int> > >::insert (
this=0x7fff3d253090, __position={_M_node = 0x7f56182ee260}, __x=@0x7fff3d251350) at /usr/include/c++/4.3/bits/stl_map.h:496
#5 0x00000000004b6680 in std::map<unsigned int, unsigned int, std::less<unsigned int>, std::allocator<std::pair<unsigned int const, unsigned int> > >::operator[] (
this=0x7fff3d253090, __k=@0x7fff3d251508) at /usr/include/c++/4.3/bits/stl_map.h:419
The source corresponding to frame #5 is:
mapped_type& operator[](const key_type& __k)
{
    // concept requirements
    __glibcxx_function_requires(_DefaultConstructibleConcept<mapped_type>)
    iterator __i = lower_bound(__k);
    // __i->first is greater than or equivalent to __k.
    if (__i == end() || key_comp()(__k, (*__i).first))
    {
        __i = insert(__i, value_type(__k, mapped_type())); // the failing statement
    }
    return (*__i).second;
}
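The logic of operator[] can be reproduced with the public std::map interface, which makes it easier to see when the hinted insert is reached. This is a minimal sketch, not the library's code; the helper name get_or_insert is hypothetical.

```cpp
#include <cassert>
#include <map>

// Hypothetical helper mirroring the operator[] logic shown above,
// written against the public std::map interface only.
unsigned int& get_or_insert(std::map<unsigned int, unsigned int>& m,
                            unsigned int key)
{
    typedef std::map<unsigned int, unsigned int> Map;

    // First element whose key is not less than `key`, or end().
    Map::iterator it = m.lower_bound(key);

    // The key is absent if we reached end() or landed on a strictly greater key.
    if (it == m.end() || m.key_comp()(key, it->first)) {
        // Hinted insert: the same kind of call that shows up as frame #4.
        it = m.insert(it, Map::value_type(key, Map::mapped_type()));
    }

    assert(it->first == key);
    return it->second;
}
```

The hinted insert only runs when the key is not in the map, so reaching frame #4 through operator[] means lower_bound did not find the key.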
The source corresponding to frame #4 is:
iterator insert(iterator __position, const value_type& __x)
{
    return _M_t._M_insert_unique_(__position, __x); // the failing statement
}
The source corresponding to frame #3 (excerpt, up to the failing statement) is:
typename _Rb_tree<_Key, _Val, _KeyOfValue, _Compare, _Alloc>::iterator
_Rb_tree<_Key, _Val, _KeyOfValue, _Compare, _Alloc>::_M_insert_unique_(const_iterator __position, const _Val& __v)
{
    // end(): the hint is end(), so (coming from lower_bound) no existing key is >= the new one
    if (__position._M_node == _M_end())
    {
        if (size() > 0
            && _M_impl._M_key_compare(_S_key(_M_rightmost()),
                                      _KeyOfValue()(__v)))
            return _M_insert_(0, _M_rightmost(), __v);
        else
            return _M_insert_unique(__v).first;
    }
    else if (_M_impl._M_key_compare(_KeyOfValue()(__v), _S_key(__position._M_node)))
    {
        // First, try before...
        const_iterator __before = __position;
        if (__position._M_node == _M_leftmost()) // begin()
        {
            return _M_insert_(_M_leftmost(), _M_leftmost(), __v);
        }
        else if (_M_impl._M_key_compare(_S_key((--__before)._M_node), _KeyOfValue()(__v)))
        {
            if (_S_right(__before._M_node) == 0)
            {
                return _M_insert_(0, __before._M_node, __v); // the failing statement
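At this point we can see that _M_insert_unique_ did not take the if (__position._M_node == _M_end()) branch but went into the else if (...) branch. That branch is taken when the hint returned by lower_bound is an existing node whose key is greater than the new key, so the tree still appeared to hold a node at that position. The production log, however, shows that the key in question had already been dropped, in which case we expected the insert to take the first if branch.
The most plausible explanation is a data inconsistency caused by multithreading: std::map is not safe for unsynchronized concurrent modification, and a crash inside _Rb_tree_rotate_right is consistent with node links that were corrupted by a race. The fix is to add the following to every code section that maintains the map: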
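Which branch the hinted insert takes depends only on what lower_bound hands to it. The following is a minimal sketch of that decision using the public std::map interface; HintKind, classify_hint, and demo_classify_hint are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <map>

enum HintKind { HINT_IS_END, HINT_IS_GREATER_KEY, KEY_ALREADY_PRESENT };

// Hypothetical helper: reports what lower_bound would pass to the hinted insert.
HintKind classify_hint(const std::map<unsigned int, unsigned int>& m,
                       unsigned int key)
{
    std::map<unsigned int, unsigned int>::const_iterator it = m.lower_bound(key);
    if (it == m.end())
        return HINT_IS_END;           // first `if` branch of _M_insert_unique_
    if (key < it->first)
        return HINT_IS_GREATER_KEY;   // `else if` branch, the one seen in frame #3
    return KEY_ALREADY_PRESENT;       // operator[] returns without inserting
}

// Usage example on a map holding the keys 10 and 20.
void demo_classify_hint()
{
    std::map<unsigned int, unsigned int> m;
    m[10] = 1;
    m[20] = 2;
    assert(classify_hint(m, 30) == HINT_IS_END);
    assert(classify_hint(m, 15) == HINT_IS_GREATER_KEY);
    assert(classify_hint(m, 10) == KEY_ALREADY_PRESENT);
}
```

In a single-threaded program the else if branch is an ordinary path for any missing key that is smaller than some existing key. The branch alone does not prove corruption; the crash during rebalancing is what points at damaged node links.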
boost::recursive_mutex::scoped_lock lock(m_mutex);
where m_mutex is declared as:
boost::recursive_mutex m_mutex;
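As a sketch of this locking pattern, the class below wraps the map so that every access takes the same mutex. It assumes a C++11 compiler and swaps in std::recursive_mutex with std::lock_guard for the boost::recursive_mutex::scoped_lock used above. GuardedMap, run_writers, and demo_guarded_map are hypothetical names and are not part of the original service.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical wrapper: every access to the map goes through m_mutex.
class GuardedMap {
public:
    void set(unsigned int key, unsigned int value) {
        std::lock_guard<std::recursive_mutex> lock(m_mutex);
        m_map[key] = value;
    }

    // Returns false if the key is absent; never inserts.
    bool get(unsigned int key, unsigned int& out) const {
        std::lock_guard<std::recursive_mutex> lock(m_mutex);
        std::map<unsigned int, unsigned int>::const_iterator it = m_map.find(key);
        if (it == m_map.end()) return false;
        out = it->second;
        return true;
    }

    // Holds the lock across the read and the write, and locks again inside
    // get() and set(); this nesting is why a recursive mutex is used.
    void increment(unsigned int key) {
        std::lock_guard<std::recursive_mutex> lock(m_mutex);
        unsigned int current = 0;
        get(key, current);
        set(key, current + 1);
    }

    void drop(unsigned int key) {
        std::lock_guard<std::recursive_mutex> lock(m_mutex);
        m_map.erase(key);
    }

    std::size_t size() const {
        std::lock_guard<std::recursive_mutex> lock(m_mutex);
        return m_map.size();
    }

private:
    mutable std::recursive_mutex m_mutex;
    std::map<unsigned int, unsigned int> m_map;
};

// Starts `threads` writers that each call increment() on keys 0..keys-1.
void run_writers(GuardedMap& shared, unsigned int threads, unsigned int keys) {
    std::vector<std::thread> pool;
    for (unsigned int t = 0; t < threads; ++t) {
        pool.push_back(std::thread([&shared, keys]() {
            for (unsigned int k = 0; k < keys; ++k) shared.increment(k);
        }));
    }
    for (std::size_t i = 0; i < pool.size(); ++i) pool[i].join();
}

// Usage example: four writers touch the same 100 keys concurrently.
void demo_guarded_map() {
    GuardedMap shared;
    run_writers(shared, 4, 100);
    unsigned int value = 0;
    assert(shared.size() == 100);
    assert(shared.get(7, value) && value == 4);
}
```

The lock only helps if every reader and writer of the map goes through it; a single unguarded operator[] call elsewhere can still corrupt the tree. On some toolchains std::thread requires linking with -pthread.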
Original link: http://blog.youkuaiyun.com/poechant/article/details/6774160