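Recently, while sorting with std::sort, I found that once the number of equal elements in the input reaches a certain size, the call can crash with a segmentation fault.
Code to reproduce the bug: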
#include <stdio.h>
#include <vector>
#include <algorithm>
#include <new>

struct foo_t
{ int size; };

class cmp_t {
public:
    // Note: ">=" returns true when both sizes are equal.
    bool operator()(foo_t *a, foo_t *b)
    { return a->size >= b->size; }
};

int main(int argc, char *argv[]) {
    std::vector<foo_t *> vec;
    // Fill the vector with elements that all compare equal.
    for (int i = 0; i < 17; i++) {
        foo_t *x = new(std::nothrow) foo_t();
        if (NULL == x)
        { goto fail; }
        else
        { x->size = 1; }
        vec.push_back(x);
    } // for i
    std::sort(vec.begin(), vec.end(), cmp_t());
fail:
    // Release every element, whether or not the sort ran.
    std::vector<foo_t *>::iterator end_itr = vec.end();
    std::vector<foo_t *>::iterator iter = vec.begin();
    while (end_itr != iter)
    { delete *iter; *iter = NULL; ++iter; }
    return 0;
} // main
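Note: NUM_ELEM here refers to the number of elements, i.e. the loop bound (17 in the code above). With NUM_ELEM set to 10 the sort completes; with NUM_ELEM set to 20 the program dumps core.
Bug analysis
The std::sort implementation examined here (GCC's libstdc++) uses introsort: it first partitions the range with quicksort and, once the recursion depth reaches a threshold, switches to heapsort, which bounds the worst-case time complexity.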
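As a rough illustration of that scheme, here is a minimal introsort sketch for std::vector<int> (C++11). It is not the libstdc++ code: the names intro_sort_sketch, intro_loop, insertion_sort, floor_log2 and kSmall are hypothetical, the cutoff of 16 is an assumption, and it partitions with the bounds-checked std::partition instead of an unguarded loop.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<int>::iterator Iter;

// Assumed cutoff: ranges this small are left to the final insertion sort.
static const std::ptrdiff_t kSmall = 16;

// floor(log2(n)) for n >= 1.
static int floor_log2(std::size_t n) {
    int k = 0;
    while (n > 1) { n >>= 1; ++k; }
    return k;
}

// Quicksort-style loop with a depth budget; falls back to heapsort at depth 0.
static void intro_loop(Iter first, Iter last, int depth) {
    while (last - first > kSmall) {
        if (depth == 0) {
            std::make_heap(first, last);
            std::sort_heap(first, last);
            return;
        }
        --depth;
        const int pivot = *(first + (last - first) / 2);
        // Three-way split: [first, lo) < pivot, [lo, hi) == pivot, [hi, last) > pivot.
        Iter lo = std::partition(first, last,
                                 [pivot](int v) { return v < pivot; });
        Iter hi = std::partition(lo, last,
                                 [pivot](int v) { return !(pivot < v); });
        intro_loop(hi, last, depth);  // recurse on the right part
        last = lo;                    // keep looping on the left part
    }
}

// Simple insertion sort used as the final pass over the whole range.
static void insertion_sort(Iter first, Iter last) {
    for (Iter i = first; i != last; ++i)
        std::rotate(std::upper_bound(first, i, *i), i, i + 1);
}

void intro_sort_sketch(std::vector<int> &v) {
    if (v.empty()) return;
    intro_loop(v.begin(), v.end(), 2 * floor_log2(v.size()));
    insertion_sort(v.begin(), v.end());
}
```

Calling intro_sort_sketch(v) sorts v in ascending order. Because std::partition checks the range bounds itself, this sketch does not depend on the comparison to stop the scan, which is exactly where the real implementation differs.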
Source Code from /usr/include/c++/4.8/bits/stl_algo.h
/**
 * @brief Sort the elements of a sequence using a predicate for comparison.
 * @ingroup sorting_algorithms
 * @param __first An iterator.
 * @param __last Another iterator.
 * @param __comp A comparison functor.
 * @return Nothing.
 *
 * Sorts the elements in the range @p [__first,__last) in ascending order,
 * such that @p __comp(*(i+1),*i) is false for every iterator @e i in the
 * range @p [__first,__last-1).
 *
 * The relative ordering of equivalent elements is not preserved, use
 * @p stable_sort() if this is needed.
 */
template<typename _RandomAccessIterator, typename _Compare>
  inline void
  sort(_RandomAccessIterator __first, _RandomAccessIterator __last,
       _Compare __comp)
  {
    typedef typename iterator_traits<_RandomAccessIterator>::value_type
      _ValueType;

    // concept requirements
    __glibcxx_function_requires(_Mutable_RandomAccessIteratorConcept<
          _RandomAccessIterator>)
    __glibcxx_function_requires(_BinaryPredicateConcept<_Compare, _ValueType,
                                _ValueType>)
    __glibcxx_requires_valid_range(__first, __last);

    if (__first != __last)
      {
        std::__introsort_loop(__first, __last,
                              std::__lg(__last - __first) * 2, __comp);
        std::__final_insertion_sort(__first, __last, __comp);
      }
  }
The sort algorithm itself is implemented in __introsort_loop:
/// This is a helper function for the sort routine.
template<typename _RandomAccessIterator, typename _Size, typename _Compare>
  void
  __introsort_loop(_RandomAccessIterator __first,
                   _RandomAccessIterator __last,
                   _Size __depth_limit, _Compare __comp)
  {
    while (__last - __first > int(_S_threshold))
      {
        if (__depth_limit == 0)
          {
            _GLIBCXX_STD_A::partial_sort(__first, __last, __last, __comp);
            return;
          }
        --__depth_limit;
        _RandomAccessIterator __cut =
          std::__unguarded_partition_pivot(__first, __last, __comp);
        std::__introsort_loop(__cut, __last, __depth_limit, __comp);
        __last = __cut;
      }
  }
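A small sketch of the two numbers that drive this loop may help explain why 10 elements are harmless while 20 are not. It assumes _S_threshold is 16 (its value in libstdc++'s stl_algo.h) and that std::__lg computes floor(log2(n)); the names kThreshold, floor_log2, depth_limit and enters_partition are hypothetical.

```cpp
#include <cassert>

// Assumed to match `enum { _S_threshold = 16 };` in libstdc++'s stl_algo.h.
const long kThreshold = 16;

// Hypothetical stand-in for std::__lg: floor(log2(n)) for n >= 1.
int floor_log2(long n) {
    int k = 0;
    while (n > 1) { n >>= 1; ++k; }
    return k;
}

// Depth budget that sort() passes down: std::__lg(__last - __first) * 2.
int depth_limit(long n) { return floor_log2(n) * 2; }

// True when the `while (__last - __first > int(_S_threshold))` condition
// holds on entry, i.e. when the partition code is reached at all.
bool enters_partition(long n) { return n > kThreshold; }
```

Under these assumptions, enters_partition(10) is false, so the partition code is never reached for 10 elements, while 17 or 20 elements do reach it with a depth budget of 8.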
/// This is a helper function...
template<typename _RandomAccessIterator, typename _Compare>
  inline _RandomAccessIterator
  __unguarded_partition_pivot(_RandomAccessIterator __first,
                              _RandomAccessIterator __last, _Compare __comp)
  {
    _RandomAccessIterator __mid = __first + (__last - __first) / 2;
    std::__move_median_to_first(__first, __first + 1, __mid, __last - 1,
                                __comp);
    return std::__unguarded_partition(__first + 1, __last, *__first, __comp);
  }

/// This is a helper function...
template<typename _RandomAccessIterator, typename _Tp, typename _Compare>
  _RandomAccessIterator
  __unguarded_partition(_RandomAccessIterator __first,
                        _RandomAccessIterator __last,
                        const _Tp& __pivot, _Compare __comp)
  {
    while (true)
      {
        while (__comp(*__first, __pivot))
          ++__first;
        --__last;
        while (__comp(__pivot, *__last))
          --__last;
        if (!(__first < __last))
          return __first;
        std::iter_swap(__first, __last);
        ++__first;
      }
  }
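Note how *__first and *__last are compared against the pivot in the source above:
This is the partitioning step of quicksort. Each inner loop compares only against the pivot and never checks the range bounds; it relies on the comparison returning false at some element inside the range, and with a strict comparison an element equal to the pivot stops the scan. With a comparator such as >=, which returns true for equal elements, __comp(*__first, __pivot) stays true when *__first == pivot (and likewise __comp(__pivot, *__last) when *__last == pivot), so with many equal elements the iterator can run past the end of the range.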
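The effect can be reproduced safely with a bounds-checked imitation of the first inner loop, `while (__comp(*__first, __pivot)) ++__first;`. The sketch below is an illustration only: forward_scan_overruns, ge_cmp and gt_cmp are hypothetical names, and unlike the real loop it stops at the end of the vector and reports that it got there.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Non-strict comparison: true for equal values (the buggy kind).
struct ge_cmp { bool operator()(int a, int b) const { return a >= b; } };

// Strict comparison: false for equal values.
struct gt_cmp { bool operator()(int a, int b) const { return a > b; } };

// Imitates the forward scan of the partition step, but with an added
// bounds check. Returns true when the scan reached the end of the data,
// which is where the real, unguarded loop would read out of bounds.
template <typename Compare>
bool forward_scan_overruns(const std::vector<int> &v, int pivot, Compare comp) {
    std::size_t i = 0;
    while (i < v.size() && comp(v[i], pivot))
        ++i;
    return i == v.size();
}
```

For a vector of 17 equal values with the same value as pivot, forward_scan_overruns returns true with ge_cmp and false with gt_cmp: the strict comparison stops at the first element, the non-strict one never stops on its own.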
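Solution:
When comparing through a functor, make it return false when the two values are equal (for example, use > instead of >=), so that the scan stops at elements equal to the pivot and the pointer stays in bounds. In other words, the comparator passed to std::sort must be a strict weak ordering.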
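A minimal sketch of the fix, reusing the foo_t type from the reproduction code. The names strict_cmp_t and sorts_descending are hypothetical; the helper keeps the objects in a vector instead of allocating them with new, purely to keep the example short.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct foo_t { int size; };

// Strict comparison: returns false when the two sizes are equal.
struct strict_cmp_t {
    bool operator()(const foo_t *a, const foo_t *b) const
    { return a->size > b->size; }
};

// Hypothetical helper: sorts pointers to foo_t by descending size and
// reports whether the result is in non-increasing order.
bool sorts_descending(const std::vector<int> &sizes) {
    std::vector<foo_t> storage(sizes.size());
    std::vector<foo_t *> vec;
    for (std::size_t i = 0; i < sizes.size(); ++i) {
        storage[i].size = sizes[i];
        vec.push_back(&storage[i]);
    }
    std::sort(vec.begin(), vec.end(), strict_cmp_t());
    for (std::size_t i = 1; i < vec.size(); ++i)
        if (vec[i - 1]->size < vec[i]->size)
            return false;
    return true;
}
```

With this comparator, inputs made entirely of equal elements, such as sorts_descending(std::vector<int>(20, 1)), are sorted without leaving the range.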