Objects without reference cycles and cyclic GC

Every instance of a class created with the class statement in CPython takes part in the cyclic GC mechanism. This increases the memory footprint of each instance and can cause memory problems in heavily loaded systems.

Is it possible to rely only on the reference counting mechanism when necessary?

Let's analyze one approach, based on the recordclass library, for creating classes whose instances are deleted only by the reference counting mechanism.

Note: this is a translation of the original post (in Russian).

A little bit about garbage collection in CPython

The primary mechanism for garbage collection in Python is reference counting. Each object contains a field that holds the current number of references to it. An object is destroyed as soon as its reference counter drops to zero. However, reference counting alone cannot dispose of objects that contain cyclic references. For example:

lst = []
lst.append(lst)
del lst

In such cases, after the name is deleted, the object's reference counter remains greater than zero. To solve this problem, Python has an additional mechanism that tracks objects and breaks cycles in the graph of references between them. There is a good article on how the cyclic garbage collection mechanism works in CPython 3.
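The behaviour described here can be observed with the standard gc module. The following is a minimal sketch, assuming a standard CPython build: it creates the self-referencing list from the example, drops the only name bound to it, and then runs the cyclic collector by hand. The variable name found is chosen for illustration only.

```python
import gc

gc.collect()             # start from a clean state
gc.disable()             # make sure the collector runs only when asked

lst = []
lst.append(lst)          # the list now holds a reference to itself
del lst                  # the name is gone, but the reference count is still 1

found = gc.collect()     # run the cyclic collector by hand
gc.enable()

print(found)             # number of unreachable objects the collector found
```

gc.collect() returns the number of unreachable objects it found, so a non-zero value indicates that reference counting alone did not free the list.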

Memory overhead associated with the garbage collection mechanism

Typically, the garbage collection mechanism does not cause problems. But there is a certain overhead associated with it:

PyGC_Head is added to each instance of the class during memory allocation: at least 24 bytes in Python <= 3.7 and at least 16 bytes in Python 3.8 on a 64-bit platform.

This can lead to a memory shortage if you run many instances of the same process, each of which has to keep a very large number of objects with relatively few attributes alive at the same time, and the amount of memory is limited.
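One way to see this overhead from Python is to compare sys.getsizeof, which adds the garbage collector overhead for GC-managed objects, with the object's own __sizeof__. This is a rough sketch, assuming a standard (non free-threaded) CPython build; SlotPoint and gc_overhead are hypothetical names used only for this illustration.

```python
import sys


class SlotPoint:
    """Hypothetical small record type used only for this measurement."""
    __slots__ = ("x", "y")


def gc_overhead(obj):
    """Bytes that sys.getsizeof adds on top of the object's own size."""
    return sys.getsizeof(obj) - obj.__sizeof__()


overheads = {
    "int": gc_overhead(1),                   # not managed by the cyclic GC
    "str": gc_overhead("abc"),               # not managed by the cyclic GC
    "list": gc_overhead([]),                 # container, managed by the cyclic GC
    "SlotPoint": gc_overhead(SlotPoint()),   # user-defined class instance
}
print(overheads)
```

The exact numbers depend on the Python version and platform, which is why none are shown here; the point is that objects managed by the cyclic GC report a non-zero difference.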

Is it sometimes possible to limit oneself to the basic mechanism of reference counting?

The cyclic garbage collection mechanism may be redundant when a class represents a non-recursive data type, for example, a record containing values of simple types (numbers, strings, date/time). To illustrate, consider a simple class:

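from dataclasses import dataclass

@dataclass  # generates the __init__ that Point(0, 0) below relies on
class Point:
    x: int
    y: int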

If the class is used correctly, reference cycles are not possible. In Python, though, nothing prevents you from shooting yourself in the foot:

p = Point(0, 0)
p.x = p

That is, if cyclic GC is disabled, the object in this case will never be disposed of.
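This claim can be checked with a weak reference, which lets us ask whether an object is still alive without keeping it alive. The following is a minimal sketch, assuming Point is written as a dataclass so that it can be constructed with two arguments; the names probe, alive_without_gc and alive_after_collect are illustrative.

```python
import gc
import weakref
from dataclasses import dataclass


@dataclass
class Point:
    x: int
    y: int


gc.collect()
gc.disable()                 # only reference counting is active from here on

p = Point(0, 0)
p.x = p                      # the instance now refers to itself
probe = weakref.ref(p)       # a weak reference does not keep the object alive
del p

alive_without_gc = probe() is not None     # the cycle keeps the instance alive

gc.collect()                 # an explicit cyclic collection breaks the cycle
alive_after_collect = probe() is not None
gc.enable()

print(alive_without_gc, alive_after_collect)
```

With the collector disabled, the instance survives the del statement; it disappears only after an explicit gc.collect().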

However, for the Point class, reference counting alone would be enough, provided that no reference cycles are created while the program runs, that is, the x and y attributes take only integer values, as declared in the class definition. But there is not yet a standard way to opt a user-defined class out of cyclic GC.

Modern CPython is designed so that when a custom class is defined, the flag Py_TPFLAGS_HAVE_GC is always set in the type structure that describes the class. This flag determines that instances of the class take part in the garbage collection mechanism. Every such object gets the PyGC_Head header when it is created and is added to the list of tracked objects. If the flag Py_TPFLAGS_HAVE_GC is not set, only the basic reference counting mechanism works. However, simply clearing Py_TPFLAGS_HAVE_GC is not enough. You would also need to change the parts of the CPython core responsible for creating and destroying instances. This is still problematic because it is too big a change to the core of CPython.
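The flag can be inspected from Python through the __flags__ attribute of a type. This is a small sketch, assuming the value 1 << 14 that CPython's headers define for Py_TPFLAGS_HAVE_GC; the helper has_gc_flag and the two sample classes are hypothetical names introduced for this illustration.

```python
# Value of Py_TPFLAGS_HAVE_GC as defined in CPython's Include/object.h
# (copied here as an assumption; it is not exposed as a Python constant).
Py_TPFLAGS_HAVE_GC = 1 << 14


def has_gc_flag(tp):
    """Return True if the type's flags include Py_TPFLAGS_HAVE_GC."""
    return bool(tp.__flags__ & Py_TPFLAGS_HAVE_GC)


class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


class SlotPoint:
    __slots__ = ("x", "y")


report = {tp.__name__: has_gc_flag(tp) for tp in (int, str, list, Point, SlotPoint)}
print(report)
```

Built-in scalar types such as int and str do not carry the flag, while list and both user-defined classes here do, with or without __slots__.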

About one implementation

As an example of how this idea can be implemented, consider the base class dataobject from the recordclass project. With it, you can create classes whose instances do not take part in the cyclic GC mechanism (Py_TPFLAGS_HAVE_GC is not set and, accordingly, there is no additional PyGC_Head header). They have exactly the same structure in memory as instances of classes with __slots__, but without PyGC_Head:

import sys
from recordclass import dataobject

class Point(dataobject):
    x: int
    y: int

>>> p = Point(1, 2)
>>> print(p.__sizeof__(), sys.getsizeof(p))
32 32

For comparison, here is a similar class with __slots__:

class Point:
    __slots__ = 'x', 'y'
    x: int
    y: int

    def __init__(self, x, y):  # needed so that Point(1, 2) works
        self.x = x
        self.y = y

>>> p = Point(1, 2)
>>> print(p.__sizeof__(), sys.getsizeof(p))  # this is in Python 3.7
32 64

The size difference is exactly the size of the PyGC_Head header. For instances with only a few attributes, such an increase in memory footprint can be significant. For instances of the Point class, adding PyGC_Head doubles the size.
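To put these numbers in perspective, the arithmetic can be spelled out in a few lines. This is a back-of-the-envelope sketch only: the 32-byte instance size and the 32-byte header come from the Python 3.7 measurement above, the 16-byte header is the Python 3.8 figure quoted earlier, and the instance count is a hypothetical value chosen for illustration.

```python
BASE_SIZE = 32        # bytes per Point instance without PyGC_Head (measured above)
GC_HEAD_37 = 32       # 64 - 32, the difference measured on Python 3.7
GC_HEAD_38 = 16       # header size quoted for Python 3.8 on a 64-bit platform
COUNT = 10_000_000    # hypothetical number of live instances

ratio_37 = (BASE_SIZE + GC_HEAD_37) / BASE_SIZE
ratio_38 = (BASE_SIZE + GC_HEAD_38) / BASE_SIZE

# Extra memory spent on headers alone, in MiB
extra_mib_37 = GC_HEAD_37 * COUNT / 2**20
extra_mib_38 = GC_HEAD_38 * COUNT / 2**20

print(ratio_37, ratio_38)
print(round(extra_mib_37), round(extra_mib_38))
```

The relative cost of the header shrinks as instances carry more attributes, which is why small record-like classes are the ones that benefit most.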

To achieve this effect, a special metaclass, datatype, is used to configure subclasses of dataobject. As a result of this configuration, the flag Py_TPFLAGS_HAVE_GC is cleared and the base instance size tp_basicsize grows by the amount needed to store the additional slots for fields. The corresponding field names are listed when the class is declared (the class Point has two of them: x and y). The datatype metaclass also sets the values of the slots tp_alloc, tp_new, tp_dealloc and tp_free, which implement the correct algorithms for creating and destroying instances in memory. By default, instances lack __weakref__ and __dict__ (as with instances of classes with __slots__).
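The last point can be illustrated with an ordinary __slots__ class, which is the analogy the text draws. This sketch uses only the standard language and does not exercise recordclass itself; SlotPoint is a hypothetical name.

```python
class SlotPoint:
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = SlotPoint(1, 2)

has_dict = hasattr(p, "__dict__")        # no per-instance attribute dictionary
has_weakref = hasattr(p, "__weakref__")  # no slot for weak references

try:
    p.z = 3                              # not a declared field
except AttributeError:
    rejected = True
else:
    rejected = False

print(has_dict, has_weakref, rejected)
```

Without __dict__, only the declared fields can be assigned, and without __weakref__ such instances cannot be the target of a weak reference.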

Conclusion

As we have seen, in CPython it is possible, when necessary, to disable cyclic garbage collection for a particular class, provided you are confident that its instances will not form cyclic references. This also reduces the size of each instance in memory by the size of the PyGC_Head header.

In the next article we will try to demonstrate how classes based on dataobject can reduce memory usage.

Translated from: https://habr.com/en/post/475120/
