What's wrong with arrays?

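This article examines some fundamental problems with arrays in C++, such as not storing their own size and decaying to pointers, and uses examples to show the advantages of the standard library vector in safety, ease of use, and efficiency.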

In terms of time and space, an array is just about the optimal construct for accessing a sequence of objects in memory. It is, however, also a very low-level data structure with a vast potential for misuse and errors, and in essentially all cases there are better alternatives. By "better" I mean easier to write, easier to read, less error prone, and as fast.

The two fundamental problems with arrays are that

an array doesn't know its own size
the name of an array converts to a pointer to its first element at the slightest provocation
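
Both points can be checked directly by the compiler. The following is a minimal sketch, not part of the original text: takes_array and element_count are hypothetical names chosen for illustration. It shows that an "array" parameter is really a pointer, so the bound is lost, and that the bound survives only when the array is passed by reference.

```cpp
#include <cstddef>
#include <type_traits>

// An "array" parameter is adjusted to a pointer: the bound is not part of the type.
void takes_array(int a[]);
static_assert(std::is_same<decltype(takes_array), void(int*)>::value,
              "int a[] in a parameter list means int* a");

// A reference to an array does not decay, so the bound N can be deduced.
template<std::size_t N>
std::size_t element_count(int (&)[N])
{
	return N;
}
```

Calling element_count() on an int[20] yields 20, but the same array passed to takes_array() arrives as a bare int* with no size attached.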

Consider some examples:

	void f(int a[], int s)
	{
		// do something with a; the size of a is s
		for (int i = 0; i<s; ++i) a[i] = i;
	}

	int arr1[20];
	int arr2[10];

	void g()
	{
		f(arr1,20);
		f(arr2,20);
	}

The second call will scribble all over memory that doesn't belong to arr2. Naturally, a programmer usually gets the size right, but it's extra work and every so often someone makes the mistake. I prefer the simpler and cleaner version using the standard library vector:

	void f(vector<int>& v)
	{
		// do something with v
		for (int i = 0; i<v.size(); ++i) v[i] = i;
	}

	vector<int> v1(20);
	vector<int> v2(10);

	void g()
	{
		f(v1);
		f(v2);
	}
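
Because a vector knows its size, an out-of-range index can also be detected at run time rather than silently corrupting memory. The sketch below assumes a hypothetical helper, checked_store, which is not in the original text; it relies only on the standard vector::at(), which throws std::out_of_range for a bad index.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: stores value at index i and reports failure
// instead of writing outside the vector when i is out of range.
bool checked_store(std::vector<int>& v, std::size_t i, int value)
{
	try {
		v.at(i) = value;	// at() throws std::out_of_range if i >= v.size()
		return true;
	}
	catch (const std::out_of_range&) {
		return false;
	}
}
```

With a vector of 10 elements, checked_store(v, 19, 1) returns false and leaves v untouched, whereas the equivalent write through a plain array would go unnoticed.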

Since an array doesn't know its size, there can be no array assignment:

	void f(int a[], int b[], int size)
	{
		a = b;	// not array assignment
		memcpy(a,b,size*sizeof(int));	// a = b; memcpy() counts bytes, not elements
		// ...
	}

Again, I prefer vector:

	void g(vector<int>& a, vector<int>& b, int size)
	{
		a = b;	
		// ...
	}

Another advantage of vector here is that memcpy() is not going to do the right thing for elements with copy constructors, such as strings.

	void f(string a[], string b[], int size)
	{
		a = b;	// not array assignment
		memcpy(a,b,size);	// disaster
		// ...
	}

	void g(vector<string>& a, vector<string>& b, int size)
	{
		a = b;	
		// ...
	}
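
If the data must stay in built-in arrays, an element-wise copy is the safe substitute for memcpy(). This is only a sketch under that assumption; copy_strings is a hypothetical helper, not something from the original text. It uses std::copy(), which invokes string's copy assignment for each element.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: copies size strings from b to a, element by element.
// Unlike memcpy(), std::copy() uses string's copy assignment, so each
// destination string owns its own characters afterwards.
void copy_strings(std::string a[], const std::string b[], int size)
{
	std::copy(b, b + size, a);
}
```

The caller still has to supply the right size by hand, which is exactly the burden that the vector assignment a = b removes.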

An array is of a fixed size determined at compile time:

	const int S = 10;

	void f(int s)
	{
		int a1[s];	// error
		int a2[S];	// ok

		// if I want to extend a2, I'll have to change to an array
		// allocated on the free store using malloc() and use realloc()
		// ...
	}

To contrast:

	const int S = 10;

	void g(int s)
	{
		vector<int> v1(s);	// ok
		vector<int> v2(S);	// ok
		v2.resize(v2.size()*2);
		// ...
	}

C99 allows variable array bounds for local arrays, but those VLAs have their own problems.
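
The growth step used in g() above can be looked at in isolation. The following sketch uses a hypothetical function name, double_size, and shows what resize() guarantees: existing elements keep their values and the added elements are value-initialized.

```cpp
#include <vector>

// Doubles the size of v. Existing elements keep their values and the
// new elements are value-initialized (0 for int).
void double_size(std::vector<int>& v)
{
	v.resize(v.size() * 2);
}
```

No malloc(), realloc(), or manual copying is involved; the vector manages its own storage.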

The way that array names "decay" into pointers is fundamental to their use in C and C++. However, array decay interacts very badly with inheritance. Consider:

	class Base { public: void fct(); /* ... */ };
	class Derived : public Base { /* ... */ };

	void f(Base* p, int sz)
	{
		for (int i=0; i<sz; ++i) p[i].fct();
	}

	Base ab[20];
	Derived ad[20];

	void g()
	{
		f(ab,20);
		f(ad,20);	// disaster!
	}

In the last call, the Derived[] is treated as a Base[] and the subscripting no longer works correctly when sizeof(Derived)!=sizeof(Base) -- as will be the case in most cases of interest. If we used vectors instead, the error would be caught at compile time:

	void f(vector<Base>& v)
	{
		for (int i=0; i<v.size(); ++i) v[i].fct();
	}

	vector<Base> ab(20);
	vector<Derived> ad(20);

	void g()
	{
		f(ab);
		f(ad);	// error: cannot convert a vector<Derived> to a vector<Base>
	}
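
The reason the array version fails comes down to element stride. As a rough illustration, assume two stand-in types (the Base and Derived below are simplified placeholders with one int member each, not the classes from the text) and two hypothetical helper functions that compute where element i is looked up versus where it actually lives.

```cpp
#include <cstddef>

// Simplified stand-ins, assumed for illustration only.
struct Base { int x; };
struct Derived : Base { int y; };	// adds a member, so it is larger than Base

static_assert(sizeof(Derived) > sizeof(Base),
              "Derived objects do not fit in Base-sized slots");

// Byte offset at which p[i] is looked up when p is a Base*.
std::size_t offset_via_base(std::size_t i)
{
	return i * sizeof(Base);
}

// Byte offset at which element i of a Derived[] actually starts.
std::size_t offset_in_derived_array(std::size_t i)
{
	return i * sizeof(Derived);
}
```

Only index 0 lines up. For every other index, subscripting through the Base* lands in the middle of some Derived object.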

I find that an astonishing number of novice programming errors in C and C++ relate to (mis)uses of arrays.