Pimpl idiom

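This article describes the Pimpl (pointer to implementation) technique in C++ in detail and discusses how it helps to separate interface from implementation, speed up compilation, and simplify maintenance. It also covers further benefits of Pimpl, such as support for transactional and exception-safe code, and how it can reduce dynamic memory allocations.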

Pimp My Pimpl


This is a translation of a two-part article that originally appeared in Heise Developer. You can find the originals here:

You can find Part Two here:


Much has been written about this funnily-named idiom, alternatively known as d-pointer, compiler firewall, or Cheshire Cat. Heise Developer highlights some angles of this practical construct that go beyond the classic technique.

Part One

This article first recapitulates the classic Pimpl idiom (pointer-to-implementation), points out its advantages, and goes on to develop further idioms on its basis. The second part will concentrate on how to mitigate the disadvantages that inevitably arise through Pimpl use.

The Classic Idiom

Every C++ programmer has probably stumbled across a class definition akin to the following:

class Class {
    // ...
private:
    class Private; // forward declaration
    Private * d;   // hide impl details
};

Here, the programmer has moved the data fields of the class Class into a nested class Class::Private. Instances of Class merely contain a pointer d to their Class::Private instance.

To understand why the class author used this indirection, we need to take a step back and look at the C++ module system. In contrast to many other languages, C++, as a language of C descent, has no built-in support for modules (such support was proposed for C++0x, but did not make it into the final standard). Instead, one factors module function declarations (but not usually their definitions) into header files, and makes them available to other modules using the #include preprocessor directive. This, however, leaves the header files filling a double role: On the one hand, they serve as the module interface. On the other, as a declaration site for potentially internal implementation details.

In times of C, this worked well: Implementation details of functions are encapsulated perfectly by the declaration/definition split; one could either merely forward-declare structs (in which case they were private), or define them directly in the header file (in which case they were public). In “object-oriented C”, class Class from above would maybe look like the following:

struct Class;                           // forward declaration
typedef struct Class * Class_t;         // -> only private members
void Class_new( Class_t * cls );        // Class::Class()
void Class_release( Class_t cls );      // Class::~Class()
int Class_f( Class_t cls, double num ); // int Class::f( double )
//...

Unfortunately, that doesn’t work in C++. Methods must be declared inside the class. Since classes without methods are rather boring, class definitions usually appear in C++ header files. Since classes, unlike namespaces, cannot be reopened, the header file must contain declarations for all methods, private ones included, as well as for all data fields:

class Class {
public:
    // ... public methods ... ok
private:
    // ... private data & methods ... don't want these here
};

The problem is evident: the module interface (header file) necessarily contains implementation details, which is always a bad idea. That is why one resorts to a rather ugly trick and factors all implementation details (data fields as well as private methods) into a separate class:

// --- class.h ---
class Class {
public:
    Class();
    ~Class();

    // ... public methods ...

    void f( double n );

private:
    class Private;
    Private * d;
};

// -- class.cpp --
#include <class.h>

class Class::Private {
public:
    // ... private data & methods ...
    bool canAcceptN( double num ) const { return num != 0 ; }
    double n;
};

Class::Class()
    : d( new Private ) {}

Class::~Class() {
    delete d;
}

void Class::f( double n ) {
    if ( d->canAcceptN( n ) )
        d->n = n;
}

Since Class::Private is used only in the declaration of a pointer variable, i.e. “in name only” (Lakos) and not “in size”, a forward declaration suffices, as in the C case. All methods of Class now access private methods and data members of Class::Private through d only.

In this way, one gains the convenience of a fully-encapsulating module system in C++, too. Because of the recourse to indirection, the developer pays for these benefits with an additional memory allocation (new Class::Private), an indirection on every access to data fields and private methods, and the loss of inline methods (at least public ones). As the second part will show, the semantics of const methods also change.

Before the second part of this article addresses the issue of how to rectify, or at least mitigate, the above downsides, the remainder of this article will first shed some light on the idiom’s benefits.

Benefits of the Pimpl Idiom

They are substantial. By encapsulating all implementation details, a slim and long-term stable interface (header file) arises. The former leads to more readable class definitions; the latter helps maintain binary compatibility even through extensive changes to the implementation.

For instance, Nokia’s “Qt Development Frameworks” department (formerly Trolltech) has carried out profound changes to the widget rendering at least twice during the development of their “Qt 4” class library without the need to so much as relink programs using Qt 4.

Particularly in larger projects, the tendency of the Pimpl Idiom to dramatically speed up builds should not be underestimated. This is accomplished both by a reduction of #include directives in header files and through the considerably reduced frequency of changes to header files of Pimpl classes in general. In “Exceptional C++”, Herb Sutter reports regular doubling of compilation speeds; John Lakos even claims build speed-ups of two orders of magnitude.

Another virtue of the design: classes with d-pointers are well suited for transaction-oriented and exception-safe code. For instance, the developer may use the Copy-Swap Idiom (Sutter/Alexandrescu: C++ Coding Standards, Item 56) to create a transactional (all-or-nothing) copy assignment operator:

class Class {
    // ...
    void swap( Class & other ) {
        std::swap( d, other.d );
    }
    Class & operator=( const Class & other ) {
        // this may fail, but doesn't change *this
        Class copy( other );
        // this cannot fail, commits to *this:
        swap( copy );
        return *this;
    }

Implementation of C++0x move operations is trivial as well (and, in particular, identical across all Pimpl classes):

    // C++0x move semantics:
    Class( Class && other ) : d( other.d ) { other.d = 0; }
    Class & operator=( Class && other ) {
        std::swap( d, other.d );
        return *this;
    }
    // ...
};

Both member swap and assignment operators may be implemented inline in this model, without compromising the class’ encapsulation; developers should make good use of this fact.

Extended Means of Composition

One last benefit deserves mention: the option to cut down on some of the extra dynamic memory allocations through direct aggregation of data fields. Without Pimpl, aggregation would customarily have been through a pointer in order to decouple classes from each other (a kind of Pimpl per data field). By “pimpling” the whole class once, the need to hold private data fields of complex type only through pointers can be dispensed with.

For instance, the idiomatic Qt dialog class

class QLineEdit;
class QLabel;
class MyDialog : public QDialog {
    // ...
private:
    // idiomatic Qt:
    QLabel    * m_loginLB;
    QLineEdit * m_loginLE;
    QLabel    * m_passwdLB;
    QLineEdit * m_passwdLE;
};

turns into

#include <QLabel>
#include <QLineEdit>
class MyDialog::Private {
    // ...
    // not idiomatic Qt, but fewer heap allocations:
    QLabel    loginLB;
    QLineEdit loginLE;
    QLabel    passwdLB;
    QLineEdit passwdLE;
};

Qt aficionados may argue that the QDialog destructor already destroys the child widgets; direct aggregation would therefore trigger a double-delete. Indeed, usage of this technique poses the threat of allocation sequence errors (double-delete, use-after-free, etc.), particularly if data fields are also owned by the class, and vice versa. The transformation shown, however, is safe here, since Qt always allows children to be deleted before their parents.

This approach is especially effective in case data fields aggregated this way are themselves instances of “pimpled” classes. This is the case in the example shown, and usage of the Pimpl Idiom saves four dynamic memory allocations of size sizeof(void*) while incurring only one additional (larger) allocation. This can lead to more efficient use of the heap, since small allocations regularly create especially high overhead in the allocator.

In addition, the compiler is much more likely to “de-virtualise” calls to virtual functions in this scenario, i.e. it removes the double indirection caused by the virtuality of the function call. This requires interprocedural optimisation when using aggregation by pointer. Whether or not this indeed constitutes a win in runtime performance against the background of an extra indirection through the d-pointer has to be checked as needed by profiling concrete classes.

In case profiling shows that the dynamic memory allocation turns into a bottleneck, the “Fast Pimpl” Idiom (Exceptional C++, Item 30) may provide relief. In this variant, a fast allocator, e.g. boost::singleton_pool, is used to create Private instances instead of global operator new().

Interim Findings

As a well-known C++ idiom, Pimpl allows class authors to separate class interface and implementation to an extent not directly provided for by C++. As a positive side-effect, the use of d-pointers speeds up compilation runs, eases implementation of transaction semantics, and allows, through extended means of composition, implementations that potentially are more runtime-efficient.

Not everything is shiny when using d-pointers, though: In addition to the extra Private class, and its dynamic memory allocation, modified const method semantics, as well as potential allocation sequence errors are cause for concern.

For some of these, the author will show solutions in the second part of the article. Complexity will increase further, though, so that for each concrete case one has to verify anew that the benefits of the idiom outweigh the downsides. If in doubt, this needs to be done per class in question. As usual, there can be no blanket judgements.

Outlook

The second and last part of this article will take a closer look under the hood of Pimpl, uncover the rust-streaked areas, and pimp the idiom using a whole array of accessories.

Reposted from: https://my.oschina.net/LsDimplex/blog/757063
