An Introduction to Blocks and GCD

License: CC 4.0 BY-SA

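This article looks at block objects and their advantages over traditional function pointers, then walks through how Grand Central Dispatch (GCD) works, including its memory management and multithreaded scheduling behavior.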

Block Objects

Advantages of block objects over function pointers:

Block objects (informally, “blocks”) are an extension to C, as well as Objective-C and C++, that makes it easy for programmers to define self-contained units of work. Blocks are similar to, but far more powerful than, traditional function pointers. The key differences are:

Blocks can be defined inline, as “anonymous functions.”
Blocks capture read-only copies of local variables, similar to “closures” in other languages.

Memory Management

Internally, a block object is implemented as a function pointer plus context data and optional support routines. It is allocated directly on the stack for efficiency, which means that you need to copy a block to the heap (and release it when you are done) if you want it to persist outside its initial scope.

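In short: a block object is implemented as a function pointer plus context data and optional support routines. It is allocated directly on the stack, so it must be copied to the heap if it needs to outlive its original scope.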
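The "function pointer plus context data" layout can be modeled in plain C. The sketch below is only an illustration: closure_t, make_adder, and closure_copy are hypothetical names, and this is not the compiler's real block layout. It does show the two properties described above: the captured local is a copy taken at creation time, and the object must be copied to the heap to outlive its scope (the role Block_copy and Block_release play for real blocks).

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a block: a code pointer plus captured context. */
typedef struct closure {
    int (*invoke)(const struct closure *self, int arg); /* the function pointer */
    int captured;                                       /* copy of a local variable */
} closure_t;

/* Body of the closure: adds the captured value to the argument. */
static int add_captured(const closure_t *self, int arg) {
    return self->captured + arg;
}

/* Builds a closure by value, copying the local it "captures". */
closure_t make_adder(int local) {
    closure_t c = { add_captured, local };
    return c;
}

/* Rough analogue of copying a block to the heap so that it can outlive
   the scope that created it. The caller releases it with free(). */
closure_t *closure_copy(const closure_t *src) {
    closure_t *dst = malloc(sizeof *dst);
    if (dst != NULL)
        memcpy(dst, src, sizeof *dst);
    return dst;
}
```

Calling c.invoke(&c, 5) on a closure built with make_adder(10) uses the value 10 even if the original local is modified afterwards, because only the copy is consulted.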

C++ Behaviors

How blocks behave in C++.

Block objects are available in both C++ and Objective-C++. They use the same block management API as C, but with additional considerations:

Unlike Objective-C, C++ doesn’t have a standard reference-counting mechanism. Thus, you need to manually ensure that any objects referenced by your block (including the implicit this pointer, if any) continue to exist for the lifecycle of the block or any of its copies.
Any stack-based C++ objects referenced by the block must have a const copy constructor, which will be invoked whenever the block is created or copied.
The block will invoke any appropriate destructors when objects it has constructed are released.

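Because C++ has no standard reference-counting mechanism, you must manually ensure that every object a block references stays alive for the lifetime of the block and of any of its copies.

Any stack-based C++ object referenced by a block must have a const copy constructor.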

Dispatch Queues

Global Concurrent Queues

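The "root level" of GCD is a set of global concurrent queues for every UNIX process, each of which is associated with a pool of threads. Most of the time you will simply use the default queue, obtained with dispatch_get_global_queue(0, 0) and referred to as q_default in the examples that follow.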

At its root, GCD is a set of global concurrent queues in every UNIX process, each associated with a thread pool.

For the common case of a “parallel for loop”, GCD provides an optimized “apply” function that submits a block for each iteration:

#define COUNT 128

static double result[COUNT];  // static storage: a block cannot capture an automatic array

dispatch_apply(COUNT, q_default, ^(size_t i){

 	result[i] = complex_calculation(i);

});

double sum = 0;

for (int i=0; i < COUNT; i++) sum += result[i];

Note that dispatch_apply is synchronous, so all the applied blocks will have completed by the time it returns.

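For running loop iterations concurrently, GCD provides an optimized apply function. Because that function is synchronous, every block is guaranteed to have finished by the time it returns.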

Private Serial Queues

In addition to the global concurrent queues, developers can create their own private serial queues. These are typically used to enforce mutually exclusive access to critical sections of code. One common use is to serialize access to shared data structures:

__block double sum = 0;

dispatch_queue_t q_sum = dispatch_queue_create("com.example.sum", NULL);

The first parameter should be a unique and descriptive reverse-DNS name to assist debugging. The second parameter is the queue attribute; passing NULL creates a serial queue.

We can use this queue to rewrite the previous example using a shared accumulator:

#define COUNT 128

dispatch_apply(COUNT, q_default, ^(size_t i){

 	double x = complex_calculation(i);

   dispatch_async(q_sum, ^{ sum += x; });

});

dispatch_release(q_sum);

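You can create your own private serial queues. A serial queue enforces mutually exclusive access to a critical section of code (in effect, it acts as a lock) and is commonly used to serialize access to shared data.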
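The ordering guarantee of a serial queue can be modeled as a FIFO of tasks that are run one at a time. The following plain C sketch is a single-threaded model only: serial_queue_t and its functions are hypothetical names, the fixed capacity is an assumption, and unlike a real GCD queue it is not safe to push from several threads.

```c
#include <stddef.h>

#define QUEUE_CAPACITY 16

typedef void (*task_fn)(void *ctx);

/* Hypothetical model of a serial queue: a ring buffer of pending tasks. */
typedef struct {
    task_fn fn[QUEUE_CAPACITY];
    void   *ctx[QUEUE_CAPACITY];
    size_t  head;   /* index of the next task to run */
    size_t  count;  /* number of tasks waiting */
} serial_queue_t;

/* Appends a task; returns 0 on success, -1 if the queue is full. */
int serial_queue_push(serial_queue_t *q, task_fn fn, void *ctx) {
    if (q->count == QUEUE_CAPACITY)
        return -1;
    size_t tail = (q->head + q->count) % QUEUE_CAPACITY;
    q->fn[tail] = fn;
    q->ctx[tail] = ctx;
    q->count++;
    return 0;
}

/* Runs every pending task in FIFO order, one at a time.
   Returns the number of tasks that ran. */
size_t serial_queue_drain(serial_queue_t *q) {
    size_t ran = 0;
    while (q->count > 0) {
        task_fn fn = q->fn[q->head];
        void *ctx = q->ctx[q->head];
        q->head = (q->head + 1) % QUEUE_CAPACITY;
        q->count--;
        fn(ctx);  /* never overlaps with another task from this queue */
        ran++;
    }
    return ran;
}

/* Example task: adds 1.5 to the double that ctx points to. */
void add_to_sum(void *ctx) { *(double *)ctx += 1.5; }
```

Because tasks never overlap, a task such as add_to_sum can update shared state without any further locking, which is the property the shared accumulator example relies on.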

Implementation

Atomic Operations

As the centerpiece of both serialization and concurrency, queues need to be extremely efficient yet thread-safe, so they can be quickly yet safely accessed from any thread. To achieve this, blocks are added and removed from queues using atomic operations available on modern processors, which are guaranteed to maintain data consistency even in the presence of multiple cores. These are the same primitives used to implement locks, and are inherently safe and fast.

Unlike locks, however, queues take up very few resources and don’t require calling into the kernel. This makes it safe and efficient to use as many as you need to describe the fine-grained structure of your code, rather than having to use larger chunks to minimize the overhead of manually managing locks and threads.

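As the core of both serialization and concurrency, a queue must be thread-safe so that it can be accessed safely from any thread. For that reason, adding a block to a queue and removing one from it are done with atomic operations.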
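To make "added using atomic operations" concrete, here is a minimal C11 sketch of a lock-free insert built on compare-and-swap. It is an assumption-laden illustration, not GCD's actual queue: lockfree_list_t, list_push, and list_take_all are hypothetical names, and the structure is a simple last-in/first-out list rather than a FIFO queue.

```c
#include <stdatomic.h>
#include <stddef.h>

typedef struct node {
    struct node *next;
    int value;
} node_t;

/* Hypothetical lock-free list: the head pointer is the only shared state. */
typedef struct {
    _Atomic(node_t *) head;
} lockfree_list_t;

/* Pushes a node with a compare-and-swap loop. No lock is taken, so
   several threads may call this concurrently. */
void list_push(lockfree_list_t *list, node_t *n) {
    node_t *old = atomic_load(&list->head);
    do {
        n->next = old;  /* a failed compare-exchange refreshes old */
    } while (!atomic_compare_exchange_weak(&list->head, &old, n));
}

/* Detaches the whole list in one atomic exchange (newest node first). */
node_t *list_take_all(lockfree_list_t *list) {
    return atomic_exchange(&list->head, (node_t *)NULL);
}
```

Taking the whole list with a single exchange keeps the removal side simple; popping nodes one by one with compare-and-swap would need extra care to avoid the ABA problem.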

Thread Pools

GCD will dequeue blocks and private queues from the global queues on a first-in/first-out (FIFO) basis as long as there are available threads in the thread pool, providing an easy way to achieve concurrency. If there is more work than available threads, GCD will ask the kernel for more, which are given if there are idle logical processors. Conversely, GCD will eventually retire threads from the pool if they are unused or the system is under excessive load. This all happens as a side effect of queuing and completing work, so that GCD itself doesn't require a separate thread.

This approach provides optimal thread allocation and CPU utilization across a wide range of loads, though it works best if threads aren’t forced to wait behind locks or I/O requests. Fortunately GCD provides mechanisms to help prevent that from happening, as discussed below.

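As long as the thread pool has threads available, GCD dequeues blocks and private queues from the global queues in FIFO order. When there are not enough threads to handle the work, GCD asks the kernel for more.

GCD retires threads from the pool when they are unused or when the system is under excessive load.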

Synchronization

Grand Central Dispatch provides four primary mechanisms for tracking completion of asynchronous work:

synchronous dispatch
callbacks
groups
semaphores

In short, GCD offers four main ways to find out when asynchronous work has completed.

Synchronous Dispatch

While asynchronous calls give GCD maximum scheduling flexibility, sometimes you do need to know when that block has finished execution. One option is to just add the block synchronously using dispatch_sync:

dispatch_sync(a_queue, ^{ wait_for_me(); });

However, this requires the parent thread (and queue) to idle until it completes, and thus shouldn’t be used for non-trivial operations. Instead, use one of the following options: callbacks, groups or semaphores.

Synchronous dispatch forces the parent thread (and queue) to sit idle until the block completes, so it is unsuitable for non-trivial (that is, heavy or time-consuming) operations. Consider callbacks, groups, or semaphores instead.

Extra:

On the meaning of “non-trivial”:

Not lightweight. Nontrivial is a favorite word among programmers and computer people for describing any task that is not quick and easy to accomplish. It may mean "extremely" difficult and time consuming. Therefore, since many programmers are unbridled optimists, take the word seriously! (http://www.pcmag.com/encyclopedia/term/63127/nontrivial)

Callbacks

The simplest way to resume work after a block completes is to nest another dispatch back to the original queue, using a completion callback:

dispatch_retain(first_queue);

dispatch_async(a_queue, ^{

   do_not_wait_for_me();

   dispatch_async(first_queue, ^{ i_am_done_now(); });

   dispatch_release(first_queue);

});

Note that since the queue is referenced by the block, it must be explicitly retained until it is invoked.

Groups

Another option is to use a dispatch group, which allows you to track block completion across multiple queues:

dispatch_group_t my_group = dispatch_group_create();

dispatch_group_async(my_group, a_queue, ^{ some_async_work(); });

dispatch_group_async(my_group, b_queue, ^{ some_other_async_work(); });

dispatch_group_notify(my_group, first_queue, ^{ do_this_when_all_done(); });

dispatch_release(my_group);

Note that since GCD calls always retain objects passed to them, it is safe to release my_group even while the “notify” is pending.

In this example, do_this_when_all_done() is executed only after every one of the blocks in the group has completed. It is also perfectly legal for a block to add additional work to a group during execution, allowing a potentially unbounded set of operations to be tracked.

Alternatively, you can instead halt execution until the group completes in a manner analogous to pthread_join(3):

dispatch_group_wait(my_group, DISPATCH_TIME_FOREVER);

do_this_when_all_done();

Note however that dispatch_group_wait pauses the current thread much like a dispatch_sync, and should therefore be used sparingly.

Groups can track block completion across multiple queues.

GCD calls retain the objects passed to them, so releasing my_group is safe even while the “notify” is still pending.
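The bookkeeping behind a group can be pictured as a counter of outstanding work plus a notification that fires when the counter returns to zero. The C11 sketch below is a simplified model under that assumption: work_group_t and its functions are hypothetical names (loosely mirroring the real dispatch_group_enter and dispatch_group_leave calls), and the notification runs directly on the calling thread instead of being scheduled on a queue.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical model of a group: a counter plus a completion callback. */
typedef struct {
    atomic_int outstanding;      /* work items not yet finished */
    void (*notify)(void *ctx);   /* called each time the count drops to zero */
    void *notify_ctx;
} work_group_t;

void group_init(work_group_t *g, void (*notify)(void *), void *ctx) {
    atomic_init(&g->outstanding, 0);
    g->notify = notify;
    g->notify_ctx = ctx;
}

/* Registers one more unit of work with the group. */
void group_enter(work_group_t *g) {
    atomic_fetch_add(&g->outstanding, 1);
}

/* Marks one unit as finished. The caller that brings the count down
   to zero runs the notification. */
void group_leave(work_group_t *g) {
    if (atomic_fetch_sub(&g->outstanding, 1) == 1 && g->notify != NULL)
        g->notify(g->notify_ctx);
}

/* Example notification: sets the int flag that ctx points to. */
void set_flag(void *ctx) { *(int *)ctx = 1; }
```

Since work can call group_enter while other work is still running, the counter can grow during execution, which matches the point that a block may add further work to a group.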

Semaphores

Finally, GCD has an efficient, general-purpose signaling mechanism known as dispatch semaphores. These are most commonly used to throttle usage of scarce resources, but can also help track completed work:

dispatch_semaphore_t sema = dispatch_semaphore_create(0);

dispatch_async(a_queue, ^{ some_work(); dispatch_semaphore_signal(sema); });

more_work();

dispatch_semaphore_wait(sema, DISPATCH_TIME_FOREVER);

dispatch_release(sema);

do_this_when_all_done();

Like other GCD objects, dispatch semaphores usually don’t need to call into the kernel, making them much faster than regular semaphores when there is no need to wait.
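The reason the uncontended case can stay out of the kernel is that taking a permit only needs an atomic update of a counter. The C11 sketch below illustrates that fast path only, under stated assumptions: fast_sema_t and its functions are hypothetical names, and the blocking slow path that a real semaphore needs is deliberately left out.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical counting semaphore that models only the user-space fast path. */
typedef struct {
    atomic_long value;  /* number of permits currently available */
} fast_sema_t;

void fast_sema_init(fast_sema_t *s, long initial) {
    atomic_init(&s->value, initial);
}

/* Tries to take a permit without blocking; returns true on success. */
bool fast_sema_try_wait(fast_sema_t *s) {
    long v = atomic_load(&s->value);
    while (v > 0) {
        /* On failure v is reloaded with the current value and we retry. */
        if (atomic_compare_exchange_weak(&s->value, &v, v - 1))
            return true;
    }
    return false;  /* a full implementation would block in the kernel here */
}

/* Returns a permit to the semaphore. */
void fast_sema_signal(fast_sema_t *s) {
    atomic_fetch_add(&s->value, 1);
}
```

Initializing the counter to the number of available resources gives the throttling behavior mentioned above: at most that many callers hold a permit at once.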

Event Sources

In addition to scheduling blocks directly, developers can set a block as the handler for event sources such as:

Timers
Signals
File descriptors and sockets
Process state changes
Mach ports
Custom application-specific events

When the source “fires,” GCD will schedule the handler on the specific queue if it is not currently running, or coalesce pending events if it is. This provides excellent responsiveness without the expense of either polling or binding a thread to the event source. Plus, since the handler is never run more than once at a time, the block doesn’t even need to be reentrant.

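Besides scheduling blocks directly, you can install a block as the handler for an event source.

When the source fires, GCD schedules the handler on the specified queue if the handler is not currently running. If it is running, the pending events are coalesced instead.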

Timer Example

For example, this is how you would create a timer that prints out the current time every 30 seconds -- plus 5 microseconds leeway, in case the system wants to align it with other events to minimize power consumption.

dispatch_source_t timer = dispatch_source_create(DISPATCH_SOURCE_TYPE_TIMER, 0, 0, q_default); //run event handler on the default global queue

dispatch_time_t now = dispatch_walltime(NULL, 0); // NULL means "the current wall-clock time"

dispatch_source_set_timer(timer, now, 30ull*NSEC_PER_SEC, 5000ull);

dispatch_source_set_event_handler(timer, ^{

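   time_t t = time(NULL);   // ctime() takes a pointer to a time_t

   printf("%s", ctime(&t)); // the string from ctime() already ends with a newline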

});

Sources are always created in a suspended state to allow configuration, so when you are all set they must be explicitly resumed to begin processing events.

dispatch_resume(timer);

You can suspend a source or dispatch queue at any time to prevent it from executing new blocks, though this will not affect blocks that are already being processed.

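Event sources are always created in a suspended state so that they can be configured, which means event processing must be started explicitly.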

Custom Events Example

GCD provides two different types of user events, which differ in how they coalesce the data passed to dispatch_source_merge_data:

DISPATCH_SOURCE_TYPE_DATA_ADD accumulates the sum of the event data (e.g., for numbers)
DISPATCH_SOURCE_TYPE_DATA_OR combines events using a bitwise OR (e.g., for booleans or bitmasks)

Though it is arguably overkill, we can even use events to rewrite our dispatch_apply example. Since the event handler is only ever called once at a time, we get automatic serialization over the "sum" variable without needing to worry about reentrancy or private queues:

__block unsigned long sum = 0;

dispatch_source_t adder = dispatch_source_create(DISPATCH_SOURCE_TYPE_DATA_ADD, 0, 0, q_default);

dispatch_source_set_event_handler(adder, ^{

   sum += dispatch_source_get_data(adder);

});

dispatch_resume(adder);

#define COUNT 128

dispatch_apply(COUNT, q_default, ^(size_t i){

   unsigned long x = integer_calculation(i);

   dispatch_source_merge_data(adder, x);

});

dispatch_release(adder);

Note that for this example we changed our calculation to use integers, as dispatch_source_merge_data expects an unsigned long parameter.

DISPATCH_SOURCE_TYPE_DATA_ADD: accumulates the sum of the event data, for example numbers.

DISPATCH_SOURCE_TYPE_DATA_OR: combines the event data with a bitwise OR, for example booleans and bitmasks.
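The difference between the two coalescing modes is easy to see in a small model. The plain C sketch below is an illustration only: user_source_t, source_merge, and source_take are hypothetical names standing in for what dispatch_source_merge_data and dispatch_source_get_data do conceptually, and the model is single-threaded.

```c
/* Hypothetical model of the two user event source types. */
typedef enum { MERGE_ADD, MERGE_OR } merge_kind_t;

typedef struct {
    merge_kind_t kind;
    unsigned long pending;  /* data accumulated since the handler last ran */
} user_source_t;

/* Folds a new value into the pending data according to the source type. */
void source_merge(user_source_t *src, unsigned long value) {
    if (src->kind == MERGE_ADD)
        src->pending += value;   /* sum of all merged values */
    else
        src->pending |= value;   /* union of all merged bits */
}

/* Returns the coalesced data and resets it, modeling what a single
   handler invocation would observe. */
unsigned long source_take(user_source_t *src) {
    unsigned long data = src->pending;
    src->pending = 0;
    return data;
}
```

With the ADD mode, several merges before the handler runs are seen as one total; with the OR mode, repeated flags collapse into a single bitmask, so merging the same bit twice has no additional effect.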

File Descriptor Example

Here is a more sophisticated example involving reading from a file. Note the use of non-blocking I/O to avoid stalling a thread:

int fd = open(filename, O_RDONLY);

fcntl(fd, F_SETFL, O_NONBLOCK);  // Avoid blocking the read operation

dispatch_source_t reader =

  dispatch_source_create(DISPATCH_SOURCE_TYPE_READ, fd, 0, q_default);

We will also specify a “cancel handler” to clean up our descriptor:

dispatch_source_set_cancel_handler(reader, ^{ close(fd); } );

The cancellation will be invoked from the event handler on, e.g., end of file:

typedef struct my_work {…} my_work_t;

dispatch_source_set_event_handler(reader, ^{

   size_t estimate = dispatch_source_get_data(reader);

   my_work_t *work = produce_work_from_input(fd, estimate);

   if (NULL == work)

   	dispatch_source_cancel(reader);

   else

   	dispatch_async(q_default, ^{ consume_work(work); free(work); } );

});

dispatch_resume(reader);

To avoid bogging down the reads, the event handler packages up the data in a my_work_t and schedules the processing in another block. This separation of concerns is known as the producer/consumer pattern, and maps very naturally to Grand Central Dispatch queues. In case of imbalance, you may need to adjust the relative priorities of the producer and consumer queues or throttle them using semaphores.

Source: <https://developer.apple.com/library/mac/featuredarticles/BlocksGCD/_index.html>