Item 6: Distinguish Between Value Types and Reference Types
Value types or reference types? Structs or classes? When should you use each? This isn't C++, in which you define all types as value types and can create references to them. This isn't Java, in which everything is a reference type. You must decide how all instances of your type will behave when you create it. It's an important decision to get right the first time. You must live with the consequences of your decision, because changing it later can cause quite a bit of code to break in subtle ways. Choosing the struct or class keyword when you create the type is simple, but updating all the clients that use your type if you change it later is much more work.
It's not as simple as preferring one over the other. The right choice depends on how you expect to use the new type. Value types are not polymorphic. They are better suited to storing the data that your application manipulates. Reference types can be polymorphic and should be used to define the behavior of your application. Consider the expected responsibilities of your new type, and from those responsibilities, decide which kind of type to create. Structs store data. Classes define behavior.
The distinction between value types and reference types was added to .NET and C# because of common problems that occurred in C++ and Java. In C++, parameters and return values are passed by value unless you say otherwise. Passing by value is very efficient, but it suffers from one problem: partial copying (sometimes called slicing the object). If you pass a derived object by value where a base object is expected, only the base portion of the object gets copied. You have effectively lost all knowledge that a derived object was ever there. Even calls to virtual functions are sent to the base class version.
The Java language responded by more or less removing value types from the language. All user-defined types are reference types. In the Java language, all parameters and return values of those types are passed by reference. This strategy has the advantage of being consistent, but it's a drain on performance. Let's face it, some types are not polymorphic; they were never intended to be. Java programmers pay a heap allocation and an eventual garbage collection for every variable. They also pay an extra time cost to dereference every variable, because all variables are references. In C#, you declare whether a new type should be a value type or a reference type using the struct or class keyword. Value types should be small, lightweight types. Reference types form your class hierarchy. This section examines different uses for a type so that you understand the distinctions between value types and reference types.
To start, consider a type that is used as the return value from a method:
- private MyData myData;
- public MyData Foo()
- {
- return myData;
- }
- //call it
- MyData v = Foo();
- TotalSum += v.Value;
If MyData is a value type, the return value gets copied into the storage for v. Furthermore, if v is a local variable, it lives on the stack. However, if MyData is a reference type, you've exported a reference to an internal variable. You've violated the principle of encapsulation (see Item 23).
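The difference between returning a copy and returning a reference can be sketched as follows. This is a minimal illustration, not code from the original text: DataValue, DataRef, Holder, and their members are hypothetical names chosen for the example.

```csharp
using System;

// Hypothetical types for illustration only: the same data as a struct and as a class.
public struct DataValue { public int Value; }
public class DataRef { public int Value; }

public class Holder
{
    private DataValue valueData = new DataValue { Value = 10 };
    private DataRef refData = new DataRef { Value = 10 };

    // Returns a copy of the struct; the caller cannot reach the private field.
    public DataValue GetValue() { return valueData; }

    // Returns a reference to the private object; the caller can now modify it.
    public DataRef GetRef() { return refData; }

    public int InternalValue { get { return valueData.Value; } }
    public int InternalRef { get { return refData.Value; } }
}

public static class ReturnDemo
{
    public static void Main()
    {
        Holder h = new Holder();

        DataValue v = h.GetValue();
        v.Value = 99; // changes only the caller's copy

        DataRef r = h.GetRef();
        r.Value = 99; // changes the object owned by Holder

        Console.WriteLine(h.InternalValue); // prints 10
        Console.WriteLine(h.InternalRef);   // prints 99
    }
}
```

The struct version keeps Holder's state intact, while the class version lets any caller change it from the outside.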
Or, consider this variant:
- private MyData myData;
- public MyData Foo()
- {
- return myData.Clone() as MyData;
- }
- //call it
- MyData v = Foo();
- TotalSum += v.Value;
Now, v is a copy of the original myData. If MyData is a reference type, two objects now exist on the heap. You don't have the problem of exposing internal data. Instead, you've created an extra object on the heap. If v is a local variable, it quickly becomes garbage, and Clone forces you to use runtime type checking because it returns object. All in all, it's inefficient.
Types that are used to export data through public methods and properties should be value types. But that's not to say that every type returned from a public member should be a value type. The earlier code snippets assumed that MyData stores values. Its responsibility is to store those values.
But consider this alternative code snippet:
- private MyType myType;
- public IMyInterface Foo()
- {
- return myType as IMyInterface;
- }
- //call it
- IMyInterface ime = Foo();
- ime.DoWork();
The myType variable is still returned from the Foo method. But this time, instead of accessing the data inside the returned value, the caller uses the object to invoke a method through a defined interface. You're accessing the MyType object not for its data contents, but for its behavior. That behavior is expressed through IMyInterface, which can be implemented by multiple different types. For this example, MyType should be a reference type, not a value type. MyType's responsibilities revolve around its behavior, not its data members.
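To make the "behavior through an interface" case concrete, here is a small sketch. IMyInterface comes from the snippet above, but its DoWork signature, the FileWorker and NetworkWorker classes, and the Service wrapper are assumptions made up for this example.

```csharp
using System;
using System.Collections.Generic;

// Assumed shape of the interface; the original text only shows a DoWork() call.
public interface IMyInterface
{
    string DoWork();
}

// Two hypothetical reference types that implement the same behavior differently.
public class FileWorker : IMyInterface
{
    public string DoWork() { return "file work"; }
}

public class NetworkWorker : IMyInterface
{
    public string DoWork() { return "network work"; }
}

public class Service
{
    private readonly IMyInterface worker;

    public Service(IMyInterface worker) { this.worker = worker; }

    // The caller receives the object for what it does, not for the data it holds.
    public IMyInterface Foo() { return worker; }
}

public static class BehaviorDemo
{
    public static void Main()
    {
        List<Service> services = new List<Service>
        {
            new Service(new FileWorker()),
            new Service(new NetworkWorker())
        };

        // The same call dispatches to a different implementation for each object.
        foreach (Service s in services)
        {
            Console.WriteLine(s.Foo().DoWork());
        }
    }
}
```

The calling code never needs to know which concrete class it holds, which is the kind of polymorphism that classes support and structs do not.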
That simple code snippet starts to show you the distinction: Value types store values, and reference types define behavior. Now look a little deeper at how those types are stored in memory and at the performance considerations related to the storage models. Consider this class:
- public class C
- {
- private MyType a = new MyType();
- private MyType b = new MyType();
- //Remaining implementation removed
- }
- C var = new C();
How many objects are created? How big are they? It depends. If MyType is a value type, you've made one allocation. The size of that allocation is twice the size of MyType. However, if MyType is a reference type, you've made three allocations: one for the C object, which holds 8 bytes of references (assuming 32-bit pointers), and one more for each of the two MyType objects that a C object refers to. The difference arises because value types are stored inline in an object, whereas reference types are not. Each variable of a reference type holds only a reference, and the object it refers to requires a separate allocation.
To drive this point home, consider this allocation:
- MyType[] var = new MyType[100];
If MyType is a value type, one allocation of 100 times the size of a MyType object occurs. However, if MyType is a reference type, only one allocation has occurred so far, and every element of the array is null. Once you initialize each element in the array, you will have performed 101 allocations, and 101 allocations take more time than 1 allocation. Allocating a large number of reference types fragments the heap and slows you down. If you are creating types that are meant to store data values, value types are the way to go.
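The array case is easy to observe directly. In this sketch, PointValue and PointRef are hypothetical stand-ins for MyType as a struct and as a class.

```csharp
using System;

// Hypothetical stand-ins for MyType: one value type, one reference type.
public struct PointValue { public int X; public int Y; }
public class PointRef { public int X; public int Y; }

public static class ArrayDemo
{
    public static void Main()
    {
        // One allocation: 100 PointValue elements stored inline, already usable.
        PointValue[] values = new PointValue[100];
        Console.WriteLine(values[0].X); // prints 0

        // One allocation so far: 100 references, all null.
        PointRef[] refs = new PointRef[100];
        Console.WriteLine(refs[0] == null); // prints True

        // Each element now needs its own heap allocation (100 more).
        for (int i = 0; i < refs.Length; i++)
        {
            refs[i] = new PointRef();
        }
        Console.WriteLine(refs[0].X); // prints 0
    }
}
```

The struct array is ready to use after a single allocation; the class array needs the extra loop before any element can be dereferenced.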
The decision to make a value type or a reference type is an important one. Turning a value type into a class type is a far-reaching change. Consider this type:
- public struct Employee
- {
- private String name;
- private Int32 ID;
- private Decimal salary;
- // Properties elided
- public void Pay(BankAccount b)
- {
- b.Balance += salary;
- }
- }
This fairly simple type contains one method to let you pay your employees. Time passes, and the system runs fairly well. Then you decide that there are different classes of employees: Salespeople get commissions, and managers get bonuses. You decide to change the Employee type into a class:
- public class Employee
- {
- private String name;
- private Int32 ID;
- private Decimal salary;
- // Properties elided
- public virtual void Pay(BankAccount b)
- {
- b.Balance += salary;
- }
- }
That breaks much of the existing code that uses your Employee struct. Return by value becomes return by reference. Parameters that were passed by value are now passed by reference. The behavior of this little snippet changes drastically:
- Employee e1 = Employee.Find("CEO");
- e1.Salary += Bonus; // Add one-time bonus
- e1.Pay( CEOBankAccount );
What was a one-time bump in pay to add a bonus just became a permanent raise. Where a copy by value had been used, a reference is now in place. The compiler happily makes the change for you. The CEO is probably happy, too. The CFO, on the other hand, will report the bug. You just can't change your mind about value and reference types after the fact: It changes behavior.
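The bonus bug can be reproduced in a few lines. This is a sketch under stated assumptions: EmployeeValue, EmployeeRef, and the dictionary-backed FindValue and FindRef lookups are hypothetical, since the original text does not show how Employee.Find is implemented.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical struct and class versions of the same employee data.
public struct EmployeeValue { public string Name; public decimal Salary; }
public class EmployeeRef { public string Name; public decimal Salary; }

public static class Payroll
{
    // Assumed storage behind the lookup methods; not part of the original text.
    private static readonly Dictionary<string, EmployeeValue> values =
        new Dictionary<string, EmployeeValue>
        {
            { "CEO", new EmployeeValue { Name = "CEO", Salary = 1000m } }
        };

    private static readonly Dictionary<string, EmployeeRef> refs =
        new Dictionary<string, EmployeeRef>
        {
            { "CEO", new EmployeeRef { Name = "CEO", Salary = 1000m } }
        };

    public static EmployeeValue FindValue(string title) { return values[title]; }
    public static EmployeeRef FindRef(string title) { return refs[title]; }

    public static void Main()
    {
        // Struct: the bonus is added to a copy, so the stored salary is unchanged.
        EmployeeValue e1 = FindValue("CEO");
        e1.Salary += 500m;
        Console.WriteLine(FindValue("CEO").Salary); // prints 1000

        // Class: the bonus is added to the stored object, so it becomes permanent.
        EmployeeRef e2 = FindRef("CEO");
        e2.Salary += 500m;
        Console.WriteLine(FindRef("CEO").Salary); // prints 1500
    }
}
```

The calling code is identical in both halves; only the struct or class keyword on the type differs, which is why the change compiles cleanly and still alters behavior.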
This problem occurred because the Employee type no longer follows the guidelines for a value type. In addition to storing the data elements that define an employee, you've added responsibilities: in this example, paying the employee. Responsibilities are the domain of class types. Classes can define polymorphic implementations of common responsibilities easily; structs cannot and should be limited to storing values.
The documentation for .NET recommends that you consider the size of a type as a determining factor between value types and reference types. In reality, a much better factor is the use of the type. Types that are simple structures or data carriers are excellent candidates for value types. It's true that value types are more efficient in terms of memory management: There is less heap fragmentation, less garbage, and less indirection. More important, value types are copied when they are returned from methods or properties. There is no danger of exposing references to internal structures. But you pay in terms of features. Value types have very limited support for common object-oriented techniques. You cannot create object hierarchies of value types. You should consider all value types as though they were sealed. You can create value types that implement interfaces, but using them through the interface requires boxing, which Item 17 shows causes performance degradation. Think of value types as storage containers, not objects in the OO sense.
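The boxing cost mentioned above also has a visible semantic effect, which this sketch tries to show. ICounter and Counter are hypothetical names invented for the example; the reflection properties used at the end are standard members of System.Type.

```csharp
using System;

// Hypothetical interface and struct used only to illustrate boxing.
public interface ICounter
{
    void Increment();
    int Count { get; }
}

public struct Counter : ICounter
{
    private int count;

    public void Increment() { count++; }
    public int Count { get { return count; } }
}

public static class BoxingDemo
{
    public static void Main()
    {
        Counter c = new Counter();

        // Converting the struct to its interface boxes it: the value is
        // copied into a new object on the heap.
        ICounter boxed = c;
        boxed.Increment(); // modifies the boxed copy, not c

        Console.WriteLine(c.Count);     // prints 0
        Console.WriteLine(boxed.Count); // prints 1

        // Structs are value types and are implicitly sealed.
        Console.WriteLine(typeof(Counter).IsValueType); // prints True
        Console.WriteLine(typeof(Counter).IsSealed);    // prints True
    }
}
```

Every conversion to the interface allocates a fresh box, so a struct that is mostly used through an interface gives up much of the efficiency that motivated making it a struct.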
You'll create more reference types than value types. If you answer yes to all of the following questions, you should create a value type. Compare them to the previous Employee example:
1. Is this type's principal responsibility data storage?
2. Is its public interface defined entirely by properties that access or modify its data members?
3. Am I confident that this type will never have subclasses?
4. Am I confident that this type will never be treated polymorphically?
Build low-level data storage types as value types. Build the behavior of your application using reference types. You get the safety of copying data that gets exported from your class objects. You get the memory usage benefits that come with stack-based and inline value storage, and you can use standard object-oriented techniques to create the logic of your application. When in doubt about the expected use, use a reference type.