梳理binder通信流程(binder驱动部分)

目录

0.前言

1.整体流程

1.1 BC_ENTER_LOOPER

1.2 BC_TRANSACTION

1.3 BR_TRANSACTION

1.4 BC_REPLY

1.5 BR_REPLY

2.BC_TRANSACTION

3.BR_TRANSACTION

4.BC_REPLY

5.BR_REPLY

6.binder内存管理

6.1 binder内存分配

6.2 binder内存回收

7.binder线程优先级继承

8.binder线程创建

9.匿名binder

10.binder死亡通知


0.前言

在android系统中,与binder打交道是避不开的,因此需要了解binder通信流程。但是binder通信流程比较复杂、涉及模块比较多,很难在短时间内把binder通信原理全部搞清楚。网上很多介绍binder通信的文章,一方面年代久远(大多都是5-10年前),虽说binder整个框架没有变,但是其中一些技术细节以及代码是有所变化的;另一方面,要么讲得太浅或流程不全,要么缺少关键代码的注释,因此读完之后仍是一肚子疑问,包括但不限于如下:

⭐server->client时会消耗client进程binder_buffer吗?
⭐server->client时reply数据是拷贝到内核还是写在server进程对应的binder内存中?
⭐binder线程创建与退出thread loop时机?
⭐BC_FREE_BUFFER时机?
⭐client->server时如何确认server端binder线程?
⭐server->client时如何确认client线程?
⭐匿名binder,如何找到对应的binder_node或binder_ref?
⭐binder的一次拷贝是发生在什么时候?
⭐binder线程数量最多是多少个?
⭐binder线程如何继承优先级?
⭐当请求size大于free_buffers中的最大size时,如何解决?
⭐free_buffers中的buffer更新过程与时机?
⭐当服务端线程没有空闲可用的binder线程时,如何通知进程新建binder线程?
⭐binder线程是否可继承rt调度?
⭐如何确定服务端binder_buffer大部分被谁占用?
⭐async space与sync space是否均有限制?
⭐BR_REPLY与BR_TRANSACTION时机?
⭐free_buffers中的binder_buffer可能被分割,那么释放时能否合并成size更大的binder_buffer?
⭐binder主线程与普通binder线程的区别?
⭐binder死亡通知原理?
⭐...


故写下此文,用于记录对binder通信的梳理细节,便于日后查阅。本文着重于binder驱动部分的流程,代码分析基于android14+kernel5.15。

1.整体流程

以最典型的同步binder通信为例,其整体流程大致可以分为以下几个阶段。

1.1 BC_ENTER_LOOPER

首先,在进程初始化阶段会执行一些与binder相关的操作,比如打开binder设备文件、初始化binder线程池、调用mmap为进程分配1MB左右大小的虚拟地址空间、创建binder主线程等。其中binder主线程创建完毕后会通过IPCThreadState::joinThreadPool加入binder线程池,并通过发送BC_ENTER_LOOPER指令来通知binder驱动用户空间binder主线程已创建完成,为后续执行binder通信做好准备。
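
这里顺便区分一下binder主线程与普通binder线程:在驱动中二者是通过binder_thread->looper上的状态位来区分的。binder主线程由进程自己创建并joinThreadPool,对应BC_ENTER_LOOPER;普通binder线程则是驱动发现没有空闲线程时,通过BR_SPAWN_LOOPER请求用户空间创建,新线程加入线程池后发送的是BC_REGISTER_LOOPER。下面是binder.c中looper状态位的简化摘录(具体取值以实际kernel版本为准):

/ drivers / android / binder.c
enum {
	BINDER_LOOPER_STATE_REGISTERED  = 0x01, /* 普通binder线程:BC_REGISTER_LOOPER时置位 */
	BINDER_LOOPER_STATE_ENTERED     = 0x02, /* binder主线程:BC_ENTER_LOOPER时置位 */
	BINDER_LOOPER_STATE_EXITED      = 0x04, /* 线程退出loop:BC_EXIT_LOOPER时置位 */
	BINDER_LOOPER_STATE_INVALID     = 0x08, /* 状态机异常,例如重复发送ENTER/REGISTER */
	BINDER_LOOPER_STATE_WAITING     = 0x10, /* 线程正在binder_thread_read中等待工作 */
	BINDER_LOOPER_STATE_POLL        = 0x20, /* 线程以poll方式等待工作 */
};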

1.2 BC_TRANSACTION

client进程向server进程发起binder通信,此时client进程会向kernel发送BC_TRANSACTION指令以及相关数据,binder驱动接收到指令后会将上层传来的数据(主要有binder_transaction_data和flat_binder_object等类型)填充到binder_transaction结构体中,包括server端对应的进程target_proc、binder线程target_thread、接收数据的缓冲区binder_buffer等。binder驱动将对应的binder_work添加到server端进程或线程的todo队列中,并唤醒server端binder线程。同时,binder驱动会向client发送BR_TRANSACTION_COMPLETE指令,告知client已经接收到数据。

1.3 BR_TRANSACTION

server端binder线程被唤醒后根据todo队列里的binder_work找到对应的binder_transaction,并确认对应的cmd指令为BR_TRANSACTION。IPCThreadState::executeCommand方法根据cmd指令去读取和处理client进程发来的数据,然后执行BBinder::transact方法去响应client端的业务请求。

1.4 BC_REPLY

server端执行业务结束,将处理结果发给binder驱动,对应的cmd指令为BC_REPLY。binder驱动接收到数据并处理后,会向server端发送BR_TRANSACTION_COMPLETE指令,告知server端回复数据已收到。

1.5 BR_REPLY

最后,binder驱动接收到server端发来的数据,先根据binder_transaction找到之前发起binder通信的client线程并将其唤醒;client线程在binder_thread_read中确认对应的cmd指令为BR_REPLY,随后返回上层继续执行IPCThreadState::waitForResponse逻辑,将server端返回的数据binder_transaction_data填充到reply(Parcel类型)中,执行client后续流程。至此,binder通信流程结束。

下面将梳理各个阶段的代码细节,以及主干之外的一些关键流程。
 

2.BC_TRANSACTION

在client发起binder通信时,会经由如下代码流程,最终通过IPCThreadState向binder驱动发送BC_TRANSACTION命令,同时把上层传下来的数据填充到binder_transaction_data中,一起发送到内核。binder_transaction_data这个结构体在用户空间与内核空间是通用的,其中比较关键的成员为handle(用于在内核中找到对应的binder_ref结构体)、code(用于确认server将要执行的业务)、data_size(用于在分配binder_buffer时确认所需的buffer size)、data(指向用户空间数据的地址,binder驱动据此将数据拷贝到binder内存中,并根据binder数据类型进行必要的转换)。最终,client端线程阻塞在waitForResponse方法中(非one_way),等待server端返回结果。

IPCThreadState::transact->waitForResponse->talkWithDriver->ioctl

/frameworks/native/libs/binder/IPCThreadState.cpp
807  status_t IPCThreadState::transact(int32_t handle,
808                                    uint32_t code, const Parcel& data,
809                                    Parcel* reply, uint32_t flags)
810  {
...
>>将client发来的数据写入到binder_transaction_data中
>>注意这里传入的cmd指令为BC_TRANSACTION
827      err = writeTransactionData(BC_TRANSACTION, flags, handle, code, data, nullptr);
...
>>判断binder通信类型是否为oneway
834      if ((flags & TF_ONE_WAY) == 0) {
...
>>执行waitForResponse,与binder驱动通信并等待结果返回
852          if (reply) {
853              err = waitForResponse(reply);
854          } else {
855              Parcel fakeReply;
856              err = waitForResponse(&fakeReply);
857          }
...
877      } else {
>>如果是oneway,则传入的参数均为null
878          err = waitForResponse(nullptr, nullptr);
879      }
880  
881      return err;
882  }


1231  status_t IPCThreadState::writeTransactionData(int32_t cmd, uint32_t binderFlags,
1232      int32_t handle, uint32_t code, const Parcel& data, status_t* statusBuffer)
1233  {
>>新建一个binder_transaction_data类型的变量tr,该数据类型在binder驱动中同样适用
1234      binder_transaction_data tr;
1235  
1236      tr.target.ptr = 0; /* Don't pass uninitialized stack data to a remote process */
>>handle代表BpBinder对象
1237      tr.target.handle = handle;
>>code表示将要执行的server端方法
1238      tr.code = code;
1239      tr.flags = binderFlags;
1240      tr.cookie = 0;
1241      tr.sender_pid = 0;
1242      tr.sender_euid = 0;
1243  
1244      const status_t err = data.errorCheck();
1245      if (err == NO_ERROR) {
>>data为client端写入Parcel中的数据
1246          tr.data_size = data.ipcDataSize();
1247          tr.data.ptr.buffer = data.ipcData();
1248          tr.offsets_size = data.ipcObjectsCount()*sizeof(binder_size_t);
1249          tr.data.ptr.offsets = data.ipcObjects();
...
>>将cmd命令、数据tr写入mOut中
1261      mOut.writeInt32(cmd);
1262      mOut.write(&tr, sizeof(tr));
1263  
1264      return NO_ERROR;
1265  }
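
上面writeTransactionData所填充的binder_transaction_data,其定义节选如下(出自uapi头文件include/uapi/linux/android/binder.h,这里省略了原有注释,具体以实际版本为准):

/ include / uapi / linux / android / binder.h
struct binder_transaction_data {
	/* 发送方填handle(对应BpBinder);驱动转换后,接收方拿到的是ptr/cookie(对应BBinder) */
	union {
		__u32			handle;
		binder_uintptr_t	ptr;
	} target;
	binder_uintptr_t	cookie;		/* BBinder对象指针,仅对接收方有意义 */
	__u32			code;		/* server端将要执行的方法编号 */
	__u32			flags;		/* 如TF_ONE_WAY等 */
	__kernel_pid_t		sender_pid;
	__kernel_uid32_t	sender_euid;
	binder_size_t		data_size;	/* 数据大小,决定binder_buffer的分配 */
	binder_size_t		offsets_size;	/* flat_binder_object偏移数组的大小 */
	union {
		struct {
			binder_uintptr_t	buffer;		/* 指向用户空间数据 */
			binder_uintptr_t	offsets;	/* 指向偏移数组 */
		} ptr;
		__u8	buf[8];
	} data;
};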


1004  status_t IPCThreadState::waitForResponse(Parcel *reply, status_t *acquireResult)
1005  {
1006      uint32_t cmd;
1007      int32_t err;
1008  
>>进入while死循环
1009      while (1) {
>>与binder driver通信
1010          if ((err=talkWithDriver()) < NO_ERROR) break;
1011          err = mIn.errorCheck();
1012          if (err < NO_ERROR) break;
1013          if (mIn.dataAvail() == 0) continue;
1014  
>>从mIn中读取数据cmd指令
1015          cmd = (uint32_t)mIn.readInt32();
1016  
1017          IF_LOG_COMMANDS() {
1018              std::ostringstream logStream;
1019              logStream << "Processing waitForResponse Command: " << getReturnString(cmd) << "\n";
1020              std::string message = logStream.str();
1021              ALOGI("%s", message.c_str());
1022          }
1023  
1024          switch (cmd) {
1025          case BR_ONEWAY_SPAM_SUSPECT:
...
>>读取到driver发来的BR_REPLY指令
1059          case BR_REPLY:
1060              {
1061                  binder_transaction_data tr;
>>读取server端返回的数据
1062                  err = mIn.read(&tr, sizeof(tr));
1063                  ALOG_ASSERT(err == NO_ERROR, "Not enough command data for brREPLY");
1064                  if (err != NO_ERROR) goto finish;
1065  
1066                  if (reply) {


1108  status_t IPCThreadState::talkWithDriver(bool doReceive)
1109  {
1110      if (mProcess->mDriverFD < 0) {
1111          return -EBADF;
1112      }
1113  
>>新建一个binder_write_read类型的变量bwr,该数据类型在binder驱动中同样适用
1114      binder_write_read bwr;
...
1124      bwr.write_size = outAvail;
>>将mOut中的数据赋给binder_write_read.write_buffer
1125      bwr.write_buffer = (uintptr_t)mOut.data();
...
>>通过系统调用ioctl与driver通信,这里传入的cmd指令为BINDER_WRITE_READ
1166          if (ioctl(mProcess->mDriverFD, BINDER_WRITE_READ, &bwr) >= 0)
1167              err = NO_ERROR;
1168          else
1169              err = -errno;
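
上面talkWithDriver中填充的binder_write_read,是BINDER_WRITE_READ这次ioctl在用户空间与内核之间往返的数据载体,其定义节选如下(出自uapi头文件,具体以实际版本为准):

/ include / uapi / linux / android / binder.h
struct binder_write_read {
	binder_size_t		write_size;	/* 用户空间->驱动 待写入的字节数,即mOut中的数据量 */
	binder_size_t		write_consumed;	/* 驱动已消费的字节数 */
	binder_uintptr_t	write_buffer;	/* 指向mOut.data() */
	binder_size_t		read_size;	/* 驱动->用户空间 可读取的字节数,即mIn的容量 */
	binder_size_t		read_consumed;	/* 驱动实际写入mIn的字节数 */
	binder_uintptr_t	read_buffer;	/* 指向mIn.data() */
};

#define BINDER_WRITE_READ	_IOWR('b', 1, struct binder_write_read)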

发起系统调用ioctl后,client端线程便进入kernel,开始执行binder驱动中的逻辑,主要代码流程如下:

binder_ioctl->binder_ioctl_write_read->1.binder_thread_write->binder_transaction->binder_alloc_new_buf; binder_proc_transaction; 2.binder_thread_read

首先执行binder_ioctl函数,根据用户空间传下来的cmd命令BINDER_WRITE_READ,执行binder_ioctl_write_read函数。若binder_write_read结构体中的write_size大于0,则说明有可写的内容,调用binder_thread_write函数处理其中的write_buffer,write_buffer中包含了上层传下来的主要数据,比如code、handle、flat_binder_object等。在binder_thread_write函数中,读取到write_buffer中的cmd指令为BC_TRANSACTION,说明此时是client端向server端发起binder通信,于是执行binder_transaction函数,该函数包含了很多binder通信的关键逻辑。

 / drivers / android / binder.c
static long binder_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
{
	int ret;
	struct binder_proc *proc = filp->private_data;
	struct binder_thread *thread;
	unsigned int size = _IOC_SIZE(cmd);
	void __user *ubuf = (void __user *)arg;
...
>>根据当前线程(task_struct类型)以及当前进程(binder_proc类型)获取到该线程在binder驱动中的binder_thread类型结构体
	thread = binder_get_thread(proc);
	if (thread == NULL) {
		ret = -ENOMEM;
		goto err;
	}

	switch (cmd) {
	case BINDER_WRITE_READ:
>>由于调用ioctl时传入的cmd指令为BINDER_WRITE_READ,所以会执行binder_ioctl_write_read
		ret = binder_ioctl_write_read(filp, cmd, arg, thread);
		if (ret)
			goto err;
		break;




static int binder_ioctl_write_read(struct file *filp,
				unsigned int cmd, unsigned long arg,
				struct binder_thread *thread)
{
	int ret = 0;
	struct binder_proc *proc = filp->private_data;
	unsigned int size = _IOC_SIZE(cmd);
	void __user *ubuf = (void __user *)arg;
	struct binder_write_read bwr;

	if (size != sizeof(struct binder_write_read)) {
		ret = -EINVAL;
		goto out;
	}
>>从用户空间拷贝binder_write_read结构体
	if (copy_from_user(&bwr, ubuf, sizeof(bwr))) {
		ret = -EFAULT;
		goto out;
	}
...
>>若有可写的内容,则执行binder_thread_write函数
	if (bwr.write_size > 0) {
		ret = binder_thread_write(proc, thread,
					  bwr.write_buffer,
					  bwr.write_size,
					  &bwr.write_consumed);
...
>>若有可读的内容,则执行binder_thread_read函数
	if (bwr.read_size > 0) {
		ret = binder_thread_read(proc, thread, bwr.read_buffer,
					 bwr.read_size,
					 &bwr.read_consumed,
					 filp->f_flags & O_NONBLOCK);


static int binder_thread_write(struct binder_proc *proc,
			struct binder_thread *thread,
			binder_uintptr_t binder_buffer, size_t size,
			binder_size_t *consumed)
{
	uint32_t cmd;
	struct binder_context *context = proc->context;
	void __user *buffer = (void __user *)(uintptr_t)binder_buffer;
	void __user *ptr = buffer + *consumed;
	void __user *end = buffer + size;

	while (ptr < end && thread->return_error.cmd == BR_OK) {
		int ret;
>>从用户空间读取write_buffer中的cmd命令:BC_TRANSACTION
		if (get_user(cmd, (uint32_t __user *)ptr))
			return -EFAULT;
		ptr += sizeof(uint32_t);
		trace_binder_command(cmd);
		if (_IOC_NR(cmd) < ARRAY_SIZE(binder_stats.bc)) {
			atomic_inc(&binder_stats.bc[_IOC_NR(cmd)]);
			atomic_inc(&proc->stats.bc[_IOC_NR(cmd)]);
			atomic_inc(&thread->stats.bc[_IOC_NR(cmd)]);
		}
		switch (cmd) {
...
		case BC_TRANSACTION:
		case BC_REPLY: {
			struct binder_transaction_data tr;

			if (copy_from_user(&tr, ptr, sizeof(tr)))
				return -EFAULT;
			ptr += sizeof(tr);
>>如果cmd命令为BC_TRANSACTION或BC_REPLY,则执行binder_transaction函数
>>对于BC_TRANSACTION,传入的proc、thread分别为client对应的进程与线程
			binder_transaction(proc, thread, &tr,
					   cmd == BC_REPLY, 0);
			break;
		}

binder_transaction函数中代码逻辑很长,大概做了这些事情:找到binder代理对象在binder驱动中的存在形式--binder_ref;根据binder_ref找到binder实体,即binder_node,并确认目标进程;找到目标线程;为此次binder通信分配binder buffer;将用户空间数据拷贝到binder buffer中;遍历用户数据中的binder_object并进行类型转换;将binder_work添加到todo队列中并唤醒target_thread。如果用一句话概括上面这些事情,便是新建binder_transaction类型变量并为其成员变量一一赋值。binder_transaction这个结构体中包含了此次binder通信的所有信息。
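
在往下看代码之前,先列一下binder_transaction结构体的主要成员(基于kernel5.15的节选示意,省略了fd_fixups、security_ctx、lock等成员,具体以实际源码为准),后文反复出现的from、to_proc、to_thread、buffer等都定义在这里:

struct binder_transaction {
	int debug_id;
	struct binder_work work;		/* 挂入todo队列的工作项,之后通过container_of反查本结构体 */
	struct binder_thread *from;		/* 发起方线程,oneway时为NULL */
	struct binder_transaction *from_parent;	/* 发起方线程的事务栈 */
	struct binder_proc *to_proc;		/* 目标进程 */
	struct binder_thread *to_thread;	/* 目标线程 */
	struct binder_transaction *to_parent;	/* 目标线程的事务栈 */
	unsigned need_reply:1;			/* 同步通信置1,表示需要回复 */
	struct binder_buffer *buffer;		/* 为本次通信分配的binder_buffer */
	unsigned int	code;
	unsigned int	flags;
	long	priority;			/* 发起方线程的nice值,用于优先级继承 */
	long	saved_priority;			/* 目标线程原来的nice值,回复时恢复 */
	kuid_t	sender_euid;
	/* ... */
};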

static void binder_transaction(struct binder_proc *proc,
			       struct binder_thread *thread,
			       struct binder_transaction_data *tr, int reply,
			       binder_size_t extra_buffers_size)
{
	int ret;
>>新建一个binder_transaction,根据上层传下来的信息填充binder_transaction
	struct binder_transaction *t;
	struct binder_work *w;
	struct binder_work *tcomplete;
	binder_size_t buffer_offset = 0;
	binder_size_t off_start_offset, off_end_offset;
	binder_size_t off_min;
	binder_size_t sg_buf_offset, sg_buf_end_offset;
	binder_size_t user_offset = 0;
>>目标进程,对于BC_TRANSACTION目标进程就是server进程,需要找到
	struct binder_proc *target_proc = NULL;
>>目标线程,对于BC_TRANSACTION目标线程就是server端响应此次binder通信的binder线程
	struct binder_thread *target_thread = NULL;
	struct binder_node *target_node = NULL;
	struct binder_transaction *in_reply_to = NULL;
	struct binder_transaction_log_entry *e;
	uint32_t return_error = 0;
	uint32_t return_error_param = 0;
	uint32_t return_error_line = 0;
	binder_size_t last_fixup_obj_off = 0;
	binder_size_t last_fixup_min_off = 0;
	struct binder_context *context = proc->context;
	int t_debug_id = atomic_inc_return(&binder_last_id);
	char *secctx = NULL;
	u32 secctx_sz = 0;
	struct list_head sgc_head;
	struct list_head pf_head;
>>这里tr->data.ptr.buffer表示用户空间数据地址,赋给user_buffer
	const void __user *user_buffer = (const void __user *)
				(uintptr_t)tr->data.ptr.buffer;
	INIT_LIST_HEAD(&sgc_head);
	INIT_LIST_HEAD(&pf_head);

	e = binder_transaction_log_add(&binder_transaction_log);
	e->debug_id = t_debug_id;
	e->call_type = reply ? 2 : !!(tr->flags & TF_ONE_WAY);
	e->from_proc = proc->pid;
	e->from_thread = thread->pid;
	e->target_handle = tr->target.handle;
	e->data_size = tr->data_size;
	e->offsets_size = tr->offsets_size;
	strscpy(e->context_name, proc->context->name, BINDERFS_MAX_NAME);
>>这里的reply=(cmd == BC_REPLY)
	if (reply) {//server端返回时执行这里
...
	} else {//client端发起时执行这里
		if (tr->target.handle) {
			struct binder_ref *ref;

			/*
			 * There must already be a strong ref
			 * on this node. If so, do a strong
			 * increment on the node to ensure it
			 * stays alive until the transaction is
			 * done.
			 */
			binder_proc_lock(proc);
>>根据tr->target.handle找到对应的binder_ref, handle对应着native层的BpBinder对象
			ref = binder_get_ref_olocked(proc, tr->target.handle,
						     true);
			if (ref) {
>>找到binder_ref后,再根据binder_ref找到对应的binder_node
>>同时根据binder_node找到server进程target_proc
				target_node = binder_get_node_refs_for_txn(
						ref->node, &target_proc,
						&return_error);
			} else {
				binder_user_error("%d:%d got transaction to invalid handle, %u\n",
						  proc->pid, thread->pid, tr->target.handle);
				return_error = BR_FAILED_REPLY;
			}
			binder_proc_unlock(proc);
...
>>当不是oneway且transaction_stack不为空时才满足条件
		if (!(tr->flags & TF_ONE_WAY) && thread->transaction_stack) {
			struct binder_transaction *tmp;

			tmp = thread->transaction_stack;
			if (tmp->to_thread != thread) {
				spin_lock(&tmp->lock);
				binder_user_error("%d:%d got new transaction with bad transaction stack, transaction %d has target %d:%d\n",
	...
					goto err_bad_call_stack;
			}
>>根据thread->transaction_stack找到server端目标线程target_thread;
>>个人理解这里找到target_thread并不是一般情况,大概率是找不到的,这里的情况适用于一些特殊情况,比如A进程向B进程发起binder
>>通信t1,而B进程在响应过程中又需要向A进程发起binder通信t2,A进程响应t2过程中又需要再次向B进程发起binder通信t3,即完成t1
>>需要先完成t2、t3。此时binder驱动为A进程发起的t3所分配的server端binder线程应该还是t1对应的binder线程,就不需要重新分配了。
			while (tmp) {
				struct binder_thread *from;

				spin_lock(&tmp->lock);
				from = tmp->from;
				if (from && from->proc == target_proc) {
					atomic_inc(&from->tmp_ref);
>>找到目标线程target_thread
					target_thread = from;
					spin_unlock(&tmp->lock);
					break;
				}
				spin_unlock(&tmp->lock);
				tmp = tmp->from_parent;
			}
		}
			binder_inner_proc_unlock(proc);
	}
	if (target_thread)
		e->to_thread = target_thread->pid;
	e->to_proc = target_proc->pid;
...
	if (!reply && !(tr->flags & TF_ONE_WAY))
		t->from = thread;
	else
		t->from = NULL;
	t->sender_euid = task_euid(proc->tsk);
	t->to_proc = target_proc;
	t->to_thread = target_thread;
	t->code = tr->code;
	t->flags = tr->flags;
>>保存当前client线程的priority
	t->priority = task_nice(current);
...
>>通过binder_alloc_new_buf函数去分配binder_buffer
	t->buffer = binder_alloc_new_buf(&target_proc->alloc, tr->data_size,
		tr->offsets_size, extra_buffers_size,
		!reply && (t->flags & TF_ONE_WAY), current->tgid);
...
>>通过binder_alloc_copy_user_to_buffer函数将用户空间数据(这里拷贝的是offsets数组)复制到刚分配的binder_buffer中
	if (binder_alloc_copy_user_to_buffer(
				&target_proc->alloc,
				t->buffer,
				ALIGN(tr->data_size, sizeof(void *)),
				(const void __user *)
					(uintptr_t)tr->data.ptr.offsets,
				tr->offsets_size)) {
...
	off_start_offset = ALIGN(tr->data_size, sizeof(void *));
	buffer_offset = off_start_offset;
>>对应着IPCThreadState中:tr->offsets_size=data.ipcObjectsCount()*sizeof(binder_size_t)
	off_end_offset = off_start_offset + tr->offsets_size;
	sg_buf_offset = ALIGN(off_end_offset, sizeof(void *));
	sg_buf_end_offset = sg_buf_offset + extra_buffers_size -
		ALIGN(secctx_sz, sizeof(u64));
	off_min = 0;
>>根据起始地址off_start_offset、结束地址off_end_offset、步长sizeof(binder_size_t)从buffer中循环读取binder_object
	for (buffer_offset = off_start_offset; buffer_offset < off_end_offset;
	     buffer_offset += sizeof(binder_size_t)) {
		struct binder_object_header *hdr;
		size_t object_size;
		struct binder_object object;
		binder_size_t object_offset;
		binder_size_t copy_size;
...
>>从buffer中读取数据到object中
		object_size = binder_get_object(target_proc, user_buffer,
				t->buffer, object_offset, &object);
...
		/*
		 * Set offset to the next buffer fragment to be
		 * copied
		 */
		user_offset = object_offset + object_size;

		hdr = &object.hdr;
		off_min = object_offset + object_size;
		switch (hdr->type) {
>>如果读取到的数据类型是BBinder类型
		case BINDER_TYPE_BINDER:
		case BINDER_TYPE_WEAK_BINDER: {
			struct flat_binder_object *fp;
>>转换为flat_binder_object类型
			fp = to_flat_binder_object(hdr);
>>调用binder_translate_binder进行类型转换:根据fp找到或新建binder_node,并为target_proc建立对应的binder_ref(详见下文)
			ret = binder_translate_binder(fp, t, thread);

			if (ret < 0 ||
			    binder_alloc_copy_to_buffer(&target_proc->alloc,
							t->buffer,
							object_offset,
							fp, sizeof(*fp))) {
				return_error = BR_FAILED_REPLY;
				return_error_param = ret;
				return_error_line = __LINE__;
				goto err_translate_failed;
			}
		} break;
...
>>如果读取到的数据类型为BpBinder
		case BINDER_TYPE_HANDLE:
		case BINDER_TYPE_WEAK_HANDLE: {
			struct flat_binder_object *fp;

			fp = to_flat_binder_object(hdr);
			ret = binder_translate_handle(fp, t, thread);
>>将新建的flat_binder_object拷贝到binder_buffer中,包含了刚刚binder_ref对应的desc,对应着BpBinder的handle
			if (ret < 0 ||
			    binder_alloc_copy_to_buffer(&target_proc->alloc,
							t->buffer,
							object_offset,
							fp, sizeof(*fp))) {
				return_error = BR_FAILED_REPLY;
				return_error_param = ret;
				return_error_line = __LINE__;
				goto err_translate_failed;
			}
		} break;
...
	if (t->buffer->oneway_spam_suspect)
		tcomplete->type = BINDER_WORK_TRANSACTION_ONEWAY_SPAM_SUSPECT;
	else
>>向tcomplete中添加BINDER_WORK_TRANSACTION_COMPLETE
		tcomplete->type = BINDER_WORK_TRANSACTION_COMPLETE;
>>向binder_transaction中添加BINDER_WORK_TRANSACTION
	t->work.type = BINDER_WORK_TRANSACTION;

>>如果是BC_REPLY
	if (reply) {
...
>>如果是BC_TRANSACTION,且不是oneway
	} else if (!(t->flags & TF_ONE_WAY)) {
...
>>将tcomplete添加到当前发起线程(即client线程)的todo队列中
		binder_enqueue_deferred_thread_work_ilocked(thread, tcomplete);
		t->need_reply = 1;
		t->from_parent = thread->transaction_stack;
		thread->transaction_stack = t;
		binder_inner_proc_unlock(proc);
>>执行binder_proc_transaction函数,主要作用是将binder_work添加到todo队列中且唤醒target_thread;
>>对于BC_TRANSACTION,则target_thread为server进程处理此次binder通信的binder线程,可能为null;
>>如果target_thread为null,会在binder_proc_transaction函数中找一个;
		return_error = binder_proc_transaction(t,
				target_proc, target_thread);
...
	} else {
>>如果是BC_TRANSACTION,且为oneway类型
		BUG_ON(target_node == NULL);
		BUG_ON(t->buffer->async_transaction != 1);
>>将work添加到当前线程的todo队列中
		binder_enqueue_thread_work(thread, tcomplete);
		return_error = binder_proc_transaction(t, target_proc, NULL);



/**
 * binder_proc_transaction() - sends a transaction to a process and wakes it up
 * @t:		transaction to send
 * @proc:	process to send the transaction to
 * @thread:	thread in @proc to send the transaction to (may be NULL)
 *
 * This function queues a transaction to the specified process. It will try
 * to find a thread in the target process to handle the transaction and
 * wake it up. If no thread is found, the work is queued to the proc
 * waitqueue.
 *
 * If the @thread parameter is not NULL, the transaction is always queued
 * to the waitlist of that specific thread.
 *
 * Return:	0 if the transaction was successfully queued
 *		BR_DEAD_REPLY if the target process or thread is dead
 *		BR_FROZEN_REPLY if the target process or thread is frozen
 */
static int binder_proc_transaction(struct binder_transaction *t,
				    struct binder_proc *proc,
				    struct binder_thread *thread)
{
	struct binder_node *node = t->buffer->target_node;
	bool oneway = !!(t->flags & TF_ONE_WAY);
	bool pending_async = false;

	BUG_ON(!node);
	binder_node_lock(node);
	if (oneway) {
		BUG_ON(thread);
		if (node->has_async_transaction)
			pending_async = true;
		else
			node->has_async_transaction = true;
	}

	binder_inner_proc_lock(proc);
	if (proc->is_frozen) {
		proc->sync_recv |= !oneway;
		proc->async_recv |= oneway;
	}

	if ((proc->is_frozen && !oneway) || proc->is_dead ||
			(thread && thread->is_dead)) {
		binder_inner_proc_unlock(proc);
		binder_node_unlock(node);
		return proc->is_frozen ? BR_FROZEN_REPLY : BR_DEAD_REPLY;
	}

	if (!thread && !pending_async)
>>如果尚未确定target_thread且没有pending的异步事务,则通过binder_select_thread_ilocked从server进程的
>>空闲binder线程(waiting_threads链表)中选择一个来响应此次binder通信
		thread = binder_select_thread_ilocked(proc);

	if (thread)
>>如果找到了binder线程,将binder_work添加到该线程的todo队列中
		binder_enqueue_thread_work_ilocked(thread, &t->work);
	else if (!pending_async)
>>否则添加到server进程对应的todo队列中
		binder_enqueue_work_ilocked(&t->work, &proc->todo);
	else
>>如果是oneway类型的binder通信,添加到async_todo队列中
		binder_enqueue_work_ilocked(&t->work, &node->async_todo);

	if (!pending_async)
>>唤醒server端binder线程去响应client的binder请求
		binder_wakeup_thread_ilocked(proc, thread, !oneway /* sync */);

	proc->outstanding_txns++;
	binder_inner_proc_unlock(proc);
	binder_node_unlock(node);

	return 0;
}
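
上面注释中提到的binder_select_thread_ilocked,其核心逻辑大致如下(基于kernel5.15的简化示意,细节以实际源码为准):从server进程的waiting_threads链表中取第一个空闲binder线程,也就是正阻塞在binder_thread_read中等待工作的线程;若链表为空则返回NULL,此时(非pending_async的情况下)binder_work会被挂到proc->todo,等某个binder线程空闲下来再处理。

static struct binder_thread *
binder_select_thread_ilocked(struct binder_proc *proc)
{
	struct binder_thread *thread;

	assert_spin_locked(&proc->inner_lock);
	/* waiting_threads中保存着正在等待工作的空闲binder线程(见后文binder_wait_for_work) */
	thread = list_first_entry_or_null(&proc->waiting_threads,
					  struct binder_thread,
					  waiting_thread_node);

	if (thread)
		/* 选中后从等待链表中摘除,避免被重复选中 */
		list_del_init(&thread->waiting_thread_node);

	return thread;
}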


//根据传入的fp找到对应的binder_node,并为target_proc找到或新建对应的binder_ref
static int binder_translate_binder(struct flat_binder_object *fp,
				   struct binder_transaction *t,
				   struct binder_thread *thread)
{
	struct binder_node *node;
	struct binder_proc *proc = thread->proc;
	struct binder_proc *target_proc = t->to_proc;
	struct binder_ref_data rdata;
	int ret = 0;
>>根据fp->binder找到对应的binder_node
	node = binder_get_node(proc, fp->binder);
	if (!node) {
>>如果没有找到,则为此BBinder对象新建一个binder_node
		node = binder_new_node(proc, fp);
		if (!node)
			return -ENOMEM;
	}
...
>>为target_proc找到或新建一个binder_node对应的binder引用binder_ref
>>并将binder_ref对应的data赋给入参rdata
	ret = binder_inc_ref_for_node(target_proc, node,
			fp->hdr.type == BINDER_TYPE_BINDER,
			&thread->todo, &rdata);
	if (ret)
		goto done;

	if (fp->hdr.type == BINDER_TYPE_BINDER)
>>转换binder类型为BINDER_TYPE_HANDLE(对应BpBinder)
		fp->hdr.type = BINDER_TYPE_HANDLE;
	else
		fp->hdr.type = BINDER_TYPE_WEAK_HANDLE;
	fp->binder = 0;
>>将binder_ref->data.desc赋给flat_binder_object->handle,以便在native层可以根据handle去新建一个BpBinder对象
	fp->handle = rdata.desc;
	fp->cookie = 0;
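
binder_translate_binder/binder_translate_handle所操作的flat_binder_object,其定义节选如下(出自uapi头文件,具体以实际版本为准)。可以看到binder与handle共用同一个union:在发送方进程中它携带的是BBinder相关指针(binder/cookie),经驱动转换后,接收方进程拿到的则是handle:

/ include / uapi / linux / android / binder.h
struct binder_object_header {
	__u32	type;	/* BINDER_TYPE_BINDER、BINDER_TYPE_HANDLE、BINDER_TYPE_FD等 */
};

struct flat_binder_object {
	struct binder_object_header	hdr;
	__u32				flags;
	union {
		binder_uintptr_t	binder;	/* 本进程内BBinder的弱引用指针,BINDER_TYPE_BINDER时使用 */
		__u32			handle;	/* binder_ref的desc,BINDER_TYPE_HANDLE时使用 */
	};
	binder_uintptr_t	cookie;		/* BBinder对象指针 */
};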


static int binder_inc_ref_for_node(struct binder_proc *proc,
			struct binder_node *node,
			bool strong,
			struct list_head *target_list,
			struct binder_ref_data *rdata)
{
	struct binder_ref *ref;
	struct binder_ref *new_ref = NULL;
	int ret = 0;

	binder_proc_lock(proc);
>>在proc中根据binder_node找到对应的binder_ref
	ref = binder_get_ref_for_node_olocked(proc, node, NULL);
	if (!ref) {
>>如果没找到,则为proc新建一个binder_ref
		binder_proc_unlock(proc);
		new_ref = kzalloc(sizeof(*ref), GFP_KERNEL);
		if (!new_ref)
			return -ENOMEM;
		binder_proc_lock(proc);
		ref = binder_get_ref_for_node_olocked(proc, node, new_ref);
	}
	ret = binder_inc_ref_olocked(ref, strong, target_list);
	*rdata = ref->data;

3.BR_TRANSACTION

BR_TRANSACTION阶段是server端binder线程被binder驱动唤醒后,读取数据并执行client请求的业务,再将结果返回给binder驱动。对于非oneway类型的binder通信,此阶段client线程处于挂起状态,直到后面被binder驱动唤醒。

前面在binder_proc_transaction函数中已经确定了server端执行此次binder通信的binder线程,将binder_work添加到对应线程的todo队列中,等待后面线程被唤醒后取出binder_work并执行。

static int binder_proc_transaction(struct binder_transaction *t,
				    struct binder_proc *proc,
				    struct binder_thread *thread)
{
...
	if (thread)
>>将binder_transaction内嵌的binder_work(&t->work)添加到目标线程的todo队列中
		binder_enqueue_thread_work_ilocked(thread, &t->work);


static void
binder_enqueue_thread_work_ilocked(struct binder_thread *thread,
				   struct binder_work *work)
{
	WARN_ON(!list_empty(&thread->waiting_thread_node));
>>将binder_work添加到binder线程的todo队列中
	binder_enqueue_work_ilocked(work, &thread->todo);

server端binder线程被唤醒后,会执行binder_thread_read函数逻辑,读取client发来的信息。首先检查当前线程的todo队列是否有待处理项(binder_work),如果没有,再检查进程的todo队列中是否有待处理项。然后从todo队列中取出binder_work,此时对应的type为BINDER_WORK_TRANSACTION。再根据binder_work找到对应的binder_transaction,根据binder_transaction中的target_node是否为空来判断当前是否处于回复阶段:此时正处于binder通信发送阶段,其target_node不为空,因此确认对应的cmd为BR_TRANSACTION;否则,为binder通信回复阶段,对应cmd为BR_REPLY。最后将cmd命令及数据发送给server进程用户空间,线程的执行逻辑从内核回到用户空间。

static int binder_thread_read(struct binder_proc *proc,
			      struct binder_thread *thread,
			      binder_uintptr_t binder_buffer, size_t size,
			      binder_size_t *consumed, int non_block)
{
	void __user *buffer = (void __user *)(uintptr_t)binder_buffer;
	void __user *ptr = buffer + *consumed;
	void __user *end = buffer + size;

	int ret = 0;
	int wait_for_proc_work;

	if (*consumed == 0) {
		if (put_user(BR_NOOP, (uint32_t __user *)ptr))
			return -EFAULT;
		ptr += sizeof(uint32_t);
	}
...
>>进入死循环
	while (1) {
		uint32_t cmd;
		struct binder_transaction_data_secctx tr;
		struct binder_transaction_data *trd = &tr.transaction_data;
		struct binder_work *w = NULL;
		struct list_head *list = NULL;
		struct binder_transaction *t = NULL;
		struct binder_thread *t_from;
		size_t trsize = sizeof(*trd);

		binder_inner_proc_lock(proc);
>>检查当前线程的todo列表是否有待处理项,如果有则赋值给list
		if (!binder_worklist_empty_ilocked(&thread->todo))
			list = &thread->todo;
>>否则将进程的todo列表赋给list
		else if (!binder_worklist_empty_ilocked(&proc->todo) &&
			   wait_for_proc_work)
			list = &proc->todo;
		else {
...
>>从todo队列中取出一个binder_work
		w = binder_dequeue_work_head_ilocked(list);
		if (binder_worklist_empty_ilocked(&thread->todo))
			thread->process_todo = false;

		switch (w->type) {
		case BINDER_WORK_TRANSACTION: {
>>前面binder_transaction中添加的work类型为BINDER_WORK_TRANSACTION: t->work.type = BINDER_WORK_TRANSACTION;
			binder_inner_proc_unlock(proc);
>>这里container_of函数作用是:根据成员变量名work、成员变量地址w找到类型为binder_transaction的结构体
>>即根据已知的两个条件找到目标结构体
			t = container_of(w, struct binder_transaction, work);
		} break;
...
>>target_node为binder_node类型,如果存在说明当前为binder通信发送阶段
		if (t->buffer->target_node) {
			struct binder_node *target_node = t->buffer->target_node;

			trd->target.ptr = target_node->ptr;
			trd->cookie =  target_node->cookie;
			t->saved_priority = task_nice(current);
			if (t->priority < target_node->min_priority &&
			    !(t->flags & TF_ONE_WAY))
>>如果是非oneway且client线程优先级数值小于server端的最小数值(数值越低则优先级越高),
>>则将binder线程优先级设置为client线程的优先级
				binder_set_nice(t->priority);
			else if (!(t->flags & TF_ONE_WAY) ||
				 t->saved_priority > target_node->min_priority)
>>否则设为server端最低优先级(此时client线程优先级更低)
				binder_set_nice(target_node->min_priority);
>>确定cmd为BR_TRANSACTION
			cmd = BR_TRANSACTION;
		} else {
>>如果target_node为null,则说明当前为binder通信回复阶段
			trd->target.ptr = 0;
			trd->cookie = 0;
>>确定cmd为BR_REPLY
			cmd = BR_REPLY;
		}
		trd->code = t->code;
		trd->flags = t->flags;
		trd->sender_euid = from_kuid(current_user_ns(), t->sender_euid);
...
>>将cmd命令发送给server进程用户空间
		if (put_user(cmd, (uint32_t __user *)ptr)) {
			if (t_from)
...
		ptr += sizeof(uint32_t);
		if (copy_to_user(ptr, &tr, trsize)) {
...
		}
		ptr += trsize;
...
		if (cmd != BR_REPLY && !(t->flags & TF_ONE_WAY)) {
			binder_inner_proc_lock(thread->proc);
			t->to_parent = thread->transaction_stack;
>>如果不是BR_REPLY且不为oneway,给to_thread赋值
			t->to_thread = thread;
			thread->transaction_stack = t;
			binder_inner_proc_unlock(thread->proc);
		} else {
			binder_free_transaction(t);
		}
		break;
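
上面代码中用到的container_of是内核里很常见的技巧:已知结构体某个成员的地址,反推出外层结构体的地址。下面给出一个用户态的最小示例帮助理解(其中fake_transaction是为演示而假设的结构体,并非binder源码):

#include <stddef.h>
#include <stdio.h>

/* 与内核中的container_of原理相同:成员地址减去该成员在结构体内的偏移,即为外层结构体地址 */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct fake_transaction {
	int debug_id;
	int work;	/* 模拟binder_transaction中内嵌的binder_work成员 */
};

int main(void)
{
	struct fake_transaction t = { .debug_id = 42, .work = 0 };
	int *w = &t.work;	/* 从todo队列中取出的只是成员work的地址 */
	struct fake_transaction *txn = container_of(w, struct fake_transaction, work);

	printf("debug_id=%d\n", txn->debug_id);	/* 输出42,成功反推出外层结构体 */
	return 0;
}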

回到用户空间,读取到BR_TRANSACTION命令后,先新建一个Parcel类型的对象buffer,用于从binder_transaction_data中读取client发来的数据;同时还会新建一个Parcel类型的对象reply,用于存放server端的执行结果。server端业务执行结束后,调用sendReply方法与binder驱动进行通信,以回复client进程。

/frameworks/native/libs/binder/IPCThreadState.cpp
1274  status_t IPCThreadState::executeCommand(int32_t cmd)
1275  {
...
1354      case BR_TRANSACTION_SEC_CTX:
>>server端处理BR_TRANSACTION
1355      case BR_TRANSACTION:
1356          {
...
1371              Parcel buffer;
>>从tr中读取数据到buffer(Parcel)中
1372              buffer.ipcSetDataReference(
1373                  reinterpret_cast<const uint8_t*>(tr.data.ptr.buffer),
1374                  tr.data_size,
1375                  reinterpret_cast<const binder_size_t*>(tr.data.ptr.offsets),
1376                  tr.offsets_size/sizeof(binder_size_t), freeBuffer);
...
>>新建变量reply,用于server端返回数据
1404              Parcel reply;
1405              status_t error;
...
1417              if (tr.target.ptr) {
1418                  // We only have a weak reference on the target object, so we must first try to
1419                  // safely acquire a strong reference before doing anything else with it.
1420                  if (reinterpret_cast<RefBase::weakref_type*>(
1421                          tr.target.ptr)->attemptIncStrong(this)) {
>>执行BBinder::transact方法,最终根据code执行到server端对应的方法
1422                      error = reinterpret_cast<BBinder*>(tr.cookie)->transact(tr.code, buffer,
1423                              &reply, tr.flags);
1424                      reinterpret_cast<BBinder*>(tr.cookie)->decStrong(this);
1425                  } else {
1426                      error = UNKNOWN_TRANSACTION;
1427                  }
...
1436              if ((tr.flags & TF_ONE_WAY) == 0) {
>>如果不是oneway,需要给client回复
1437                  LOG_ONEWAY("Sending reply to %d!", mCallingPid);
1438                  if (error < NO_ERROR) reply.setError(error);
1439  
1440                  // b/238777741: clear buffer before we send the reply.
1441                  // Otherwise, there is a race where the client may
1442                  // receive the reply and send another transaction
1443                  // here and the space used by this transaction won't
1444                  // be freed for the client.
>>clear buffer
1445                  buffer.setDataSize(0);
1446  
1447                  constexpr uint32_t kForwardReplyFlags = TF_CLEAR_BUF;
>>调用sendReply方法进行回复,与binder驱动进行通信
1448                  sendReply(reply, (tr.flags & kForwardReplyFlags));
1449              } else {

4.BC_REPLY

sendReply->waitForResponse->talkWithDriver->ioctl

经过上述代码流程,用户空间将BC_REPLY命令发送到内核空间,线程进入内核。后续流程与之前处理BC_TRANSACTION的流程基本一致,因此相同的地方不再赘述,只把其中一些不同的流程列举出来。最主要的不同点在于执行binder_transaction函数时,其入参reply为true,因此涉及reply的条件判断时,其执行逻辑与之前的BC_TRANSACTION不同。具体来说:一是确认target_thread的逻辑不同,BC_REPLY阶段可以根据binder线程的transaction_stack找到之前client发起通信时的binder_transaction,其from成员即为发起通信的client线程,也就是目标线程;二是binder线程的优先级会恢复为之前保存的saved_priority(之前继承了client线程的优先级)。

/frameworks/native/libs/binder/IPCThreadState.cpp
994  status_t IPCThreadState::sendReply(const Parcel& reply, uint32_t flags)
995  {
996      status_t err;
997      status_t statusBuffer;
>>返回时对应的cmd命令为BC_REPLY
998      err = writeTransactionData(BC_REPLY, flags, -1, 0, reply, &statusBuffer);
999      if (err < NO_ERROR) return err;
1000  
>>与kernel通信
1001      return waitForResponse(nullptr, nullptr);
1002  }



static int binder_thread_write(struct binder_proc *proc,
			struct binder_thread *thread,
			binder_uintptr_t binder_buffer, size_t size,
			binder_size_t *consumed)
{
...
		case BC_TRANSACTION:
>>server端回复client时对应cmd为BC_REPLY
		case BC_REPLY: {
			struct binder_transaction_data tr;

			if (copy_from_user(&tr, ptr, sizeof(tr)))
				return -EFAULT;
			ptr += sizeof(tr);
>>注意这里reply= (cmd == BC_REPLY) =true
			binder_transaction(proc, thread, &tr,
					   cmd == BC_REPLY, 0);
			break;
		}


static void binder_transaction(struct binder_proc *proc,
			       struct binder_thread *thread,
			       struct binder_transaction_data *tr, int reply,
			       binder_size_t extra_buffers_size)
{
...
	if (reply) {
		binder_inner_proc_lock(proc);
>>reply=true,获取当前thread对应的binder_transaction(为之前client端向server通信时的binder_transaction)
		in_reply_to = thread->transaction_stack;
...
		thread->transaction_stack = in_reply_to->to_parent;
		binder_inner_proc_unlock(proc);
>>restore流程,恢复当前线程原来的优先级
		binder_set_nice(in_reply_to->saved_priority);
>>binder_get_txn_from_and_acq_inner函数主要逻辑是从binder_transaction中获取到成员from
>>因此目标线程即为原来的client线程
		target_thread = binder_get_txn_from_and_acq_inner(in_reply_to);
...
>>目标进程为client线程对应的进程
		target_proc = target_thread->proc;
		target_proc->tmp_ref++;
		binder_inner_proc_unlock(target_thread->proc);
	} else {
		if (tr->target.handle) {
			struct binder_ref *ref;
...
	if (t->buffer->oneway_spam_suspect)
		tcomplete->type = BINDER_WORK_TRANSACTION_ONEWAY_SPAM_SUSPECT;
	else
		tcomplete->type = BINDER_WORK_TRANSACTION_COMPLETE;
	t->work.type = BINDER_WORK_TRANSACTION;

	if (reply) {
		binder_enqueue_thread_work(thread, tcomplete);
		binder_inner_proc_lock(target_proc);
		if (target_thread->is_dead) {
			return_error = BR_DEAD_REPLY;
			binder_inner_proc_unlock(target_proc);
			goto err_dead_proc_or_thread;
		}
		BUG_ON(t->buffer->async_transaction != 0);
		binder_pop_transaction_ilocked(target_thread, in_reply_to);
>>向client线程的todo队列添加binder_work
		binder_enqueue_thread_work_ilocked(target_thread, &t->work);
		target_proc->outstanding_txns++;
		binder_inner_proc_unlock(target_proc);
>>唤醒client线程
		wake_up_interruptible_sync(&target_thread->wait);
		binder_free_transaction(in_reply_to);
	} else if (!(t->flags & TF_ONE_WAY)) {

5.BR_REPLY

binder_ioctl->binder_ioctl_write_read->1.binder_thread_write->binder_transaction->binder_alloc_new_buf; binder_proc_transaction; 2.binder_thread_read

在前面BC_TRANSACTION阶段,binder_transaction函数执行完毕后,binder_transaction、binder_thread_write将依次出栈,回到binder_ioctl_write_read中,并开始执行binder_thread_read,检查当前client线程/进程的todo队列中是否有待处理的binder_work,如果没有则进入休眠状态(TASK_INTERRUPTIBLE),等待后面被server端唤醒。

static int binder_thread_read(struct binder_proc *proc,
			      struct binder_thread *thread,
			      binder_uintptr_t binder_buffer, size_t size,
			      binder_size_t *consumed, int non_block)
{
	void __user *buffer = (void __user *)(uintptr_t)binder_buffer;
	void __user *ptr = buffer + *consumed;
	void __user *end = buffer + size;
...
	if (non_block) {
		if (!binder_has_work(thread, wait_for_proc_work))
			ret = -EAGAIN;
	} else {
>>检查todo队列中是否存在binder_work,若没有则进入休眠状态(TASK_INTERRUPTIBLE)等待被唤醒
		ret = binder_wait_for_work(thread, wait_for_proc_work);
	}



static int binder_wait_for_work(struct binder_thread *thread,
				bool do_proc_work)
{
	DEFINE_WAIT(wait);
	struct binder_proc *proc = thread->proc;
	int ret = 0;

	freezer_do_not_count();
	binder_inner_proc_lock(proc);
>>死循环
	for (;;) {
>>将wait项加入thread->wait等待队列,并设置线程状态为TASK_INTERRUPTIBLE
		prepare_to_wait(&thread->wait, &wait, TASK_INTERRUPTIBLE);
>>判断当前线程或进程对应todo队列中是否存在待处理工作项,如果存在,则break跳出循环
		if (binder_has_work_ilocked(thread, do_proc_work))
			break;
		if (do_proc_work)
			list_add(&thread->waiting_thread_node,
				 &proc->waiting_threads);
		binder_inner_proc_unlock(proc);
>>调用schedule函数,进入休眠状态
		schedule();
		binder_inner_proc_lock(proc);
		list_del_init(&thread->waiting_thread_node);
		if (signal_pending(current)) {
			ret = -EINTR;
			break;
		}
	}
	finish_wait(&thread->wait, &wait);
	binder_inner_proc_unlock(proc);
	freezer_count();

	return ret;
}



static bool binder_has_work_ilocked(struct binder_thread *thread,
				    bool do_proc_work)
{
>>判断当前线程或进程的todo队列是否有待处理项
	return thread->process_todo ||
		thread->looper_need_return ||
		(do_proc_work &&
		 !binder_worklist_empty_ilocked(&thread->proc->todo));
}

后面server端执行业务结束,向binder驱动发送BC_REPLY,并进入binder_transaction函数,找到之前发起binder通信的client线程,向其todo队列添加binder_work,并唤醒client线程。client线程之前休眠在binder_wait_for_work函数中,被唤醒后继续执行for循环,检查当前client线程/进程的todo队列,此时不为空,跳出循环,返回binder_thread_read函数继续执行后续逻辑。后续流程与之前的BR_TRANSACTION阶段类似,client线程会检测当前binder_transaction结构体的target_node是否为空,以此来判断当前是否为回复阶段。很显然,target_node在前面BC_REPLY阶段并未被赋值,因此为空,故判定当前为回复阶段,并确定cmd命令为BR_REPLY,发送到用户空间。

static int binder_thread_read(struct binder_proc *proc,
			      struct binder_thread *thread,
			      binder_uintptr_t binder_buffer, size_t size,
			      binder_size_t *consumed, int non_block)
{
...
		if (t->buffer->target_node) {
			struct binder_node *target_node = t->buffer->target_node;
...
			cmd = BR_TRANSACTION;
		} else {
>>reply阶段target_node为空,故进入此分支,cmd = BR_REPLY
			trd->target.ptr = 0;
			trd->cookie = 0;
			cmd = BR_REPLY;
		}
...
>>将BR_REPLY命令发送到用户空间
		if (put_user(cmd, (uint32_t __user *)ptr)) {

前面client线程发起binder通信后,在waitForResponse方法中进入内核执行后续逻辑并最终进入休眠;server端唤醒client线程后,client线程回到用户空间,读取到cmd命令为BR_REPLY,进入对应case分支,将server端返回的数据(binder_transaction_data)写入到reply中,并设置release_func为freeBuffer函数,用于在Parcel使用结束后释放对应的buffer。然后,执行finish分支,waitForResponse方法执行完毕,将reply返回给上层。至此,一次完整的binder通信(驱动部分)基本就结束了,至于驱动中释放binder_buffer的流程后面会再总结。
 

1004  status_t IPCThreadState::waitForResponse(Parcel *reply, status_t *acquireResult)
1005  {
1006      uint32_t cmd;
1007      int32_t err;
1008  
1009      while (1) {
1010          if ((err=talkWithDriver()) < NO_ERROR) break;
1011          err = mIn.errorCheck();
1012          if (err < NO_ERROR) break;
1013          if (mIn.dataAvail() == 0) continue;
1014  
>>读取driver发来的cmd命令
1015          cmd = (uint32_t)mIn.readInt32();
...
1024          switch (cmd) {
1025          case BR_ONEWAY_SPAM_SUSPECT:
...
>>进入BR_REPLY
1059          case BR_REPLY:
1060              {
1061                  binder_transaction_data tr;
1062                  err = mIn.read(&tr, sizeof(tr));
1063                  ALOG_ASSERT(err == NO_ERROR, "Not enough command data for brREPLY");
1064                  if (err != NO_ERROR) goto finish;
1065  
1066                  if (reply) {
1067                      if ((tr.flags & TF_STATUS_CODE) == 0) {
>>进入此分支,将binder_transaction_data中的数据写入reply(Parcel类型),并设置release_func为freeBuffer
1068                          reply->ipcSetDataReference(
1069                              reinterpret_cast<const uint8_t*>(tr.data.ptr.buffer),
1070                              tr.data_size,
1071                              reinterpret_cast<const binder_size_t*>(tr.data.ptr.offsets),
1072                              tr.offsets_size/sizeof(binder_size_t),
1073                              freeBuffer);
1074                      } else {
1075                          err = *reinterpret_cast<const status_t*>(tr.data.ptr.buffer);
1076                          freeBuffer(reinterpret_cast<const uint8_t*>(tr.data.ptr.buffer),
1077                                     tr.data_size,
1078                                     reinterpret_cast<const binder_size_t*>(tr.data.ptr.offsets),
1079                                     tr.offsets_size / sizeof(binder_size_t));
1080                      }
1081                  } else {
1082                      freeBuffer(reinterpret_cast<const uint8_t*>(tr.data.ptr.buffer), tr.data_size,
1083                                 reinterpret_cast<const binder_size_t*>(tr.data.ptr.offsets),
1084                                 tr.offsets_size / sizeof(binder_size_t));
1085                      continue;
1086                  }
1087              }
>>进入finish分支,waitForResponse方法执行完毕
1088              goto finish;

6.binder内存管理

在项目中经常会遇到一些binder相关的问题,其中最常见的就是binder buffer耗尽导致binder通信失败的问题。因此,非常有必要将binder内存的分配与回收流程进行梳理。

6.1 binder内存分配

首先,当进程创建时会对ProcessState进行初始化,在这里会去打开binder设备并调用mmap给进程分配一段虚拟地址空间,为后面的binder通信做准备。传入mmap的参数中,第二个为虚拟地址空间大小,为1MB左右(具体来说是1016KB)。

frameworks/native/libs/binder/ProcessState.cpp
49  #define BINDER_VM_SIZE ((1 * 1024 * 1024) - sysconf(_SC_PAGE_SIZE) * 2)


529  ProcessState::ProcessState(const char* driver)
...
545      base::Result<int> opened = open_driver(driver);
546  
547      if (opened.ok()) {
548          // mmap the binder, providing a chunk of virtual address space to receive transactions.
>>BINDER_VM_SIZE=1024KB-8KB=1016KB
549          mVMStart = mmap(nullptr, BINDER_VM_SIZE, PROT_READ, MAP_PRIVATE | MAP_NORESERVE,
550                          opened.value(), 0);
551          if (mVMStart == MAP_FAILED) {

调用mmap后进入内核空间,经系统调用执行对应的函数---binder_mmap。先获取进程信息,再执行binder_alloc_mmap_handler函数去分配地址空间。

/ drivers / android / binder.c
static int binder_mmap(struct file *filp, struct vm_area_struct *vma)
{
	struct binder_proc *proc = filp->private_data;

	if (proc->tsk != current->group_leader)
		return -EINVAL;
...
	vma->vm_ops = &binder_vm_ops;
	vma->vm_private_data = proc;
>>调用binder_alloc_mmap_handler执行一系列初始化工作
	return binder_alloc_mmap_handler(&proc->alloc, vma);
}

binder_alloc_mmap_handler函数的作用,这里的注释已经说得很清楚了,主要就是为进程映射虚拟地址空间。同时还做了binder内存相关的一些初始化工作,比如:确定虚拟地址空间的最终大小为1016KB、初始化进程的第一个binder_buffer并将其置为空闲状态、将其插入到free_buffers中备用。
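
这里顺便列一下binder_alloc结构体中与内存管理相关的主要成员(节选示意,只保留本文代码中用到的成员,具体以drivers/android/binder_alloc.h为准),每个参与binder通信的进程都对应一个binder_alloc(即binder_proc->alloc):

struct binder_alloc {
	struct mutex mutex;
	void __user *buffer;			/* mmap得到的虚拟地址空间起始地址 */
	struct list_head buffers;		/* 所有binder_buffer,按地址顺序串成链表 */
	struct rb_root free_buffers;		/* 空闲binder_buffer,按buffer_size排序的红黑树 */
	struct rb_root allocated_buffers;	/* 已分配binder_buffer,按地址排序的红黑树 */
	size_t free_async_space;		/* 异步(oneway)通信剩余可用空间,初始为buffer_size/2 */
	struct binder_lru_page *pages;		/* 按页管理的物理页数组 */
	size_t buffer_size;			/* 整个虚拟地址空间大小,即1016KB */
	int pid;
	bool oneway_spam_detected;
	/* ... */
};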

网上一些binder相关文章提到进程在对binder设备初始化时会为其分配物理页,且异步空间与同步空间各限制为整个虚拟地址空间大小的一半。但是从代码来看,在mmap过程中并没有分配物理页,只是确定了地址空间的大小以及起始地址;不过在以前的版本(如kernel3.10)中确实是在mmap过程中分配了一个物理页,这说明不同kernel版本的实现细节是有一些变化的。另外,从之前binder通信的整个流程来看,只有异步binder通信时会对异步空间剩余进行检查,而每次binder通信都会对整个地址空间剩余进行检查;mmap过程中也只对异步空间大小做了限制(为总空间的一半),并未单独限制同步空间,由此也可以看出同步binder通信是优先于异步binder通信的。
 

/ drivers / android / binder_alloc.c
/**
 * binder_alloc_mmap_handler() - map virtual address space for proc
 * @alloc:	alloc structure for this proc
 * @vma:	vma passed to mmap()
 *
 * Called by binder_mmap() to initialize the space specified in
 * vma for allocating binder buffers
 *
 * Return:
 *      0 = success
 *      -EBUSY = address space already mapped
 *      -ENOMEM = failed to map memory to given address space
 */
int binder_alloc_mmap_handler(struct binder_alloc *alloc,
			      struct vm_area_struct *vma)
{
	int ret;
	const char *failure_string;
	struct binder_buffer *buffer;
...
>>映射的地址空间大小取(vma->vm_end - vma->vm_start)和4M的较小值
>>前者为用户空间传入的大小1016KB,显然buffer_size最终为1016KB
	alloc->buffer_size = min_t(unsigned long, vma->vm_end - vma->vm_start,
				   SZ_4M);
	mutex_unlock(&binder_alloc_mmap_lock);
>>虚拟地址空间的起始位置赋给alloc->buffer
	alloc->buffer = (void __user *)vma->vm_start;
>>给pages数组分配内存
	alloc->pages = kcalloc(alloc->buffer_size / PAGE_SIZE,
			       sizeof(alloc->pages[0]),
			       GFP_KERNEL);
	if (alloc->pages == NULL) {
		ret = -ENOMEM;
		failure_string = "alloc page array";
		goto err_alloc_pages_failed;
	}
>>给binder_buffer结构体分配内存
	buffer = kzalloc(sizeof(*buffer), GFP_KERNEL);
	if (!buffer) {
		ret = -ENOMEM;
		failure_string = "alloc buffer struct";
		goto err_alloc_buf_struct_failed;
	}
>>将alloc->buffer赋给新创建的binder_buffer的user_data,此binder_buffer也是当前进程对应的第一个binder_buffer
	buffer->user_data = alloc->buffer;
	list_add(&buffer->entry, &alloc->buffers);
>>将该binder_buffer的free状态置为1,表示当前处于空闲状态
	buffer->free = 1;
>>将binder_buffer插入到free_buffers中
	binder_insert_free_buffer(alloc, buffer);
>>异步空间大小为整个虚拟地址空间大小的一半
	alloc->free_async_space = alloc->buffer_size / 2;

	/* Signal binder_alloc is fully initialized */
	binder_alloc_set_vma(alloc, vma);

	return 0;

在BC_TRANSACTION或BC_REPLY阶段都会执行binder_transaction函数,在这个函数中会新建一个binder_transaction结构体并为其分配一个binder_buffer用于传输数据。其中,分配binder_buffer的核心逻辑在binder_alloc_new_buf_locked函数中,主要分配过程为:先遍历当前进程对应的free_buffers红黑树(按buffer_size大小排列),目标是找到一个buffer_size大于或等于所请求size的binder_buffer,buffer_size越接近请求的size越好。如果都不符合要求,则说明剩余地址空间不足以支持当前这一次binder通信,报错;如果刚好相等,那么是最理想的情况;如果buffer_size大于请求的size,则把该binder_buffer剩余的地址空间交给新建的binder_buffer去管理,并将新的binder_buffer插入到free_buffers中。对于后两种情况,都说明找到了满足要求的binder_buffer,于是把当前binder_buffer从free_buffers中移除,将其free状态置为0,并将其插入到allocated_buffers中。最终,返回该binder_buffer。
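
在看具体代码之前,先列一下binder_buffer结构体的主要成员(基于kernel5.15的节选示意,具体以drivers/android/binder_alloc.h为准),free_buffers、allocated_buffers两棵红黑树以及按地址排列的buffers链表上挂的都是它:

struct binder_buffer {
	struct list_head entry;		/* 挂入binder_alloc->buffers链表(按地址顺序) */
	struct rb_node rb_node;		/* 空闲时挂入free_buffers(按size),已分配时挂入allocated_buffers(按地址) */
	unsigned free:1;		/* 1表示空闲,0表示已被占用 */
	unsigned clear_on_free:1;
	unsigned allow_user_free:1;
	unsigned async_transaction:1;	/* 是否被oneway(异步)通信占用 */
	unsigned oneway_spam_suspect:1;
	unsigned debug_id:27;
	struct binder_transaction *transaction;	/* 当前占用该buffer的binder_transaction */
	struct binder_node *target_node;
	size_t data_size;
	size_t offsets_size;
	size_t extra_buffers_size;
	void __user *user_data;		/* 该buffer所管理地址空间的起始地址 */
	int pid;			/* 请求分配该buffer的进程pid,可用于排查buffer被谁占用 */
};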
 

 / drivers / android / binder.c
static void binder_transaction(struct binder_proc *proc,
			       struct binder_thread *thread,
			       struct binder_transaction_data *tr, int reply,
			       binder_size_t extra_buffers_size)
{
...
>>为新的binder_transaction分配新的buffer
	t->buffer = binder_alloc_new_buf(&target_proc->alloc, tr->data_size,
		tr->offsets_size, extra_buffers_size,
		!reply && (t->flags & TF_ONE_WAY), current->tgid);


struct binder_buffer *binder_alloc_new_buf(struct binder_alloc *alloc,
					   size_t data_size,
					   size_t offsets_size,
					   size_t extra_buffers_size,
					   int is_async,
					   int pid)
{
	struct binder_buffer *buffer;

	mutex_lock(&alloc->mutex);
>>分配binder_buffer核心逻辑在binder_alloc_new_buf_locked函数中
	buffer = binder_alloc_new_buf_locked(alloc, data_size, offsets_size,
					     extra_buffers_size, is_async, pid);
	mutex_unlock(&alloc->mutex);
	return buffer;
}


static struct binder_buffer *binder_alloc_new_buf_locked(
				struct binder_alloc *alloc,
				size_t data_size,
				size_t offsets_size,
				size_t extra_buffers_size,
				int is_async,
				int pid)
{
	struct rb_node *n = alloc->free_buffers.rb_node;
>>为新的binder_transaction新建一个binder_buffer
	struct binder_buffer *buffer;
	size_t buffer_size;
	struct rb_node *best_fit = NULL;
	void __user *has_page_addr;
	void __user *end_page_addr;
	size_t size, data_offsets_size;
	int ret;

	/* Check binder_alloc is fully initialized */
	if (!binder_alloc_get_vma(alloc)) {
		binder_alloc_debug(BINDER_DEBUG_USER_ERROR,
				   "%d: binder_alloc_buf, no vma\n",
				   alloc->pid);
		return ERR_PTR(-ESRCH);
	}

	data_offsets_size = ALIGN(data_size, sizeof(void *)) +
		ALIGN(offsets_size, sizeof(void *));

	if (data_offsets_size < data_size || data_offsets_size < offsets_size) {
		binder_alloc_debug(BINDER_DEBUG_BUFFER_ALLOC,
				"%d: got transaction with invalid size %zd-%zd\n",
				alloc->pid, data_size, offsets_size);
		return ERR_PTR(-EINVAL);
	}
>>请求的size,用于存放binder通信相关数据
	size = data_offsets_size + ALIGN(extra_buffers_size, sizeof(void *));
	if (size < data_offsets_size || size < extra_buffers_size) {
		binder_alloc_debug(BINDER_DEBUG_BUFFER_ALLOC,
				"%d: got transaction with invalid extra_buffers_size %zd\n",
				alloc->pid, extra_buffers_size);
		return ERR_PTR(-EINVAL);
	}

	/* Pad 0-size buffers so they get assigned unique addresses */
	size = max(size, sizeof(void *));
>>对异步空间剩余进行检查
	if (is_async && alloc->free_async_space < size) {
		binder_alloc_debug(BINDER_DEBUG_BUFFER_ALLOC,
			     "%d: binder_alloc_buf size %zd failed, no async space left\n",
			      alloc->pid, size);
		return ERR_PTR(-ENOSPC);
	}
>>开始遍历free_buffers,寻找大小最合适的binder_buffer
	while (n) {
>>取出一个binder_buffer
		buffer = rb_entry(n, struct binder_buffer, rb_node);
		BUG_ON(!buffer->free);
>>确定当前binder_buffer对应的buffer_size
		buffer_size = binder_alloc_buffer_size(alloc, buffer);
>>当请求的size小于当前buffer_size时,说明该buffer满足要求,记录为best_fit,并继续向左遍历确认是否有更接近的
		if (size < buffer_size) {
			best_fit = n;
			n = n->rb_left;
>>如果请求的size大于当前binder_buffer对应的buffer_size,则不满足要求,指针右移遍历下一个buffer_size更大的binder_buffer
		} else if (size > buffer_size)
			n = n->rb_right;
		else {
>>如果size=buffer_size,则是最合适的,直接退出循环
			best_fit = n;
			break;
		}
	}
>>如果best_fit为null,说明剩下的binder_buffer对应的buffer_size都比请求的size要小,直接退出
	if (best_fit == NULL) {
		size_t allocated_buffers = 0;
		size_t largest_alloc_size = 0;
		size_t total_alloc_size = 0;
		size_t free_buffers = 0;
		size_t largest_free_size = 0;
		size_t total_free_size = 0;

		for (n = rb_first(&alloc->allocated_buffers); n != NULL;
		     n = rb_next(n)) {
			buffer = rb_entry(n, struct binder_buffer, rb_node);
			buffer_size = binder_alloc_buffer_size(alloc, buffer);
			allocated_buffers++;
			total_alloc_size += buffer_size;
			if (buffer_size > largest_alloc_size)
				largest_alloc_size = buffer_size;
		}
		for (n = rb_first(&alloc->free_buffers); n != NULL;
		     n = rb_next(n)) {
			buffer = rb_entry(n, struct binder_buffer, rb_node);
			buffer_size = binder_alloc_buffer_size(alloc, buffer);
			free_buffers++;
			total_free_size += buffer_size;
			if (buffer_size > largest_free_size)
				largest_free_size = buffer_size;
		}
>>打印log,说明用于binder通信的剩余地址空间不够了
		binder_alloc_debug(BINDER_DEBUG_USER_ERROR,
				   "%d: binder_alloc_buf size %zd failed, no address space\n",
				   alloc->pid, size);
		binder_alloc_debug(BINDER_DEBUG_USER_ERROR,
				   "allocated: %zd (num: %zd largest: %zd), free: %zd (num: %zd largest: %zd)\n",
				   total_alloc_size, allocated_buffers,
				   largest_alloc_size, total_free_size,
				   free_buffers, largest_free_size);
		return ERR_PTR(-ENOSPC);
	}
	if (n == NULL) {
>>执行到这里说明best_fit不为null, 如果n为null,说明找到的binder_buffer并不是大小最合适的(但至少其buffer_size大于请求的size)
>>于是根据best_fit找到对应的binder_buffer
		buffer = rb_entry(best_fit, struct binder_buffer, rb_node);
>>并确认其buffer_size
		buffer_size = binder_alloc_buffer_size(alloc, buffer);
	}

	binder_alloc_debug(BINDER_DEBUG_BUFFER_ALLOC,
		     "%d: binder_alloc_buf size %zd got buffer %pK size %zd\n",
		      alloc->pid, size, buffer, buffer_size);

	has_page_addr = (void __user *)
		(((uintptr_t)buffer->user_data + buffer_size) & PAGE_MASK);
	WARN_ON(n && buffer_size != size);
>>确认结束地址为:当前binder_buffer起始地址+请求的size(注意不是buffer_size)
	end_page_addr =
		(void __user *)PAGE_ALIGN((uintptr_t)buffer->user_data + size);
	if (end_page_addr > has_page_addr)
		end_page_addr = has_page_addr;
>>调用binder_update_page_range分配物理页,第一个参数alloc指定了进程,第二个参数说明是分配而非回收
>>第三个参数为binder_buffer起始地址,第四个参数为binder_buffer的实际结束地址(由实际请求的data size决定)
	ret = binder_update_page_range(alloc, 1, (void __user *)
		PAGE_ALIGN((uintptr_t)buffer->user_data), end_page_addr);
	if (ret)
		return ERR_PTR(ret);

	if (buffer_size != size) {
>>如果找到的binder_buffer的buffer_size与请求的size不相等(肯定是buffer_size>size),则新建一个binder_buffer
>>用于管理当前binder_buffer剩下的地址空间(因为实际上用不了这么多)
		struct binder_buffer *new_buffer;

		new_buffer = kzalloc(sizeof(*buffer), GFP_KERNEL);
		if (!new_buffer) {
			pr_err("%s: %d failed to alloc new buffer struct\n",
			       __func__, alloc->pid);
			goto err_alloc_buf_struct_failed;
		}
>>新建的binder_buffer起始地址为:当前binder_buffer起始地址+请求的size
		new_buffer->user_data = (u8 __user *)buffer->user_data + size;
		list_add(&new_buffer->entry, &buffer->entry);
>>置为free
		new_buffer->free = 1;
>>将新建的binder_buffer插入到free_buffers中
		binder_insert_free_buffer(alloc, new_buffer);
	}
>>将当前找到的binder_buffer从free_buffers中移除
	rb_erase(best_fit, &alloc->free_buffers);
>>将free置为0,表示正在被使用
	buffer->free = 0;
	buffer->allow_user_free = 0;
>>将当前binder_buffer插入到allocated_buffers中
	binder_insert_allocated_buffer_locked(alloc, buffer);
	binder_alloc_debug(BINDER_DEBUG_BUFFER_ALLOC,
		     "%d: binder_alloc_buf size %zd got %pK\n",
		      alloc->pid, size, buffer);
	buffer->data_size = data_size;
	buffer->offsets_size = offsets_size;
	buffer->async_transaction = is_async;
	buffer->extra_buffers_size = extra_buffers_size;
	buffer->pid = pid;
	buffer->oneway_spam_suspect = false;
	if (is_async) {
>>若为异步binder通信,先从free_async_space中扣除本次size;当异步空间剩余不足20%(即总空间的10%)时,开始检测oneway spam并打印log
		alloc->free_async_space -= size;
		binder_alloc_debug(BINDER_DEBUG_BUFFER_ALLOC_ASYNC,
			     "%d: binder_alloc_buf size %zd async free %zd\n",
			      alloc->pid, size, alloc->free_async_space);
		if (alloc->free_async_space < alloc->buffer_size / 10) {
			/*
			 * Start detecting spammers once we have less than 20%
			 * of async space left (which is less than 10% of total
			 * buffer size).
			 */
			buffer->oneway_spam_suspect = debug_low_async_space_locked(alloc, pid);
		} else {
			alloc->oneway_spam_detected = false;
		}
	}
	return buffer;



static size_t binder_alloc_buffer_size(struct binder_alloc *alloc,
				       struct binder_buffer *buffer)
{
>>如果当前binder_buffer为最后一个Buffer,则其buffer_size为:size= 整个虚拟地址空间起始地址+地址空间大小-当前binder_buffer起始地址
	if (list_is_last(&buffer->entry, &alloc->buffers))
		return alloc->buffer + alloc->buffer_size - buffer->user_data;
>>否则,即为:size=下一个binder_buffer的起始地址-当前binder_buffer起始地址
	return binder_buffer_next(buffer)->user_data - buffer->user_data;
}



free_buffers中的binder_buffer是按照其buffer_size大小排列的,故插入新的binder_buffer时也是按照buffer_size去插入的。

static void binder_insert_free_buffer(struct binder_alloc *alloc,
				      struct binder_buffer *new_buffer)
{
	struct rb_node **p = &alloc->free_buffers.rb_node;
	struct rb_node *parent = NULL;
	struct binder_buffer *buffer;
	size_t buffer_size;
	size_t new_buffer_size;

	BUG_ON(!new_buffer->free);
>>获取新创建的binder_buffer的buffer_size
	new_buffer_size = binder_alloc_buffer_size(alloc, new_buffer);

	binder_alloc_debug(BINDER_DEBUG_BUFFER_ALLOC,
		     "%d: add free buffer, size %zd, at %pK\n",
		      alloc->pid, new_buffer_size, new_buffer);
>>遍历free_buffers中的每一个buffer
	while (*p) {
		parent = *p;
		buffer = rb_entry(parent, struct binder_buffer, rb_node);
		BUG_ON(!buffer->free);

		buffer_size = binder_alloc_buffer_size(alloc, buffer);
>>如果新的binder_buffer的size小于当前遍历到的buffer_size,则指针左移;否则右移。
		if (new_buffer_size < buffer_size)
			p = &parent->rb_left;
		else
			p = &parent->rb_right;
	}

6.2 binder内存回收

binder内存有分配,也必然有对应的回收,其回收时机取决于用户空间何时向binder驱动发送BC_FREE_BUFFER命令。binder驱动在接收到BC_FREE_BUFFER命令后,会先获取数据的地址指针data_ptr,再根据data_ptr找到对应的binder_buffer,最终调用binder_free_buf函数去释放binder_buffer,主要做了这几件事:释放binder_buffer对应的物理页、切换binder_buffer为free状态、合并与当前buffer地址连续的前一个/后一个buffer、将当前buffer插入到free_buffers中待用。

/ drivers / android / binder.c
static int binder_thread_write(struct binder_proc *proc,
			struct binder_thread *thread,
			binder_uintptr_t binder_buffer, size_t size,
			binder_size_t *consumed)
{
...
>>执行BC_FREE_BUFFER命令
		case BC_FREE_BUFFER: {
			binder_uintptr_t data_ptr;
			struct binder_buffer *buffer;
>>获取data_ptr
			if (get_user(data_ptr, (binder_uintptr_t __user *)ptr))
				return -EFAULT;
			ptr += sizeof(binder_uintptr_t);
>>根据data_ptr找到对应的binder_buffer
			buffer = binder_alloc_prepare_to_free(&proc->alloc,
							      data_ptr);
...
>>释放binder_buffer
			binder_free_buf(proc, thread, buffer, false);
			break;
		}



static void
binder_free_buf(struct binder_proc *proc,
		struct binder_thread *thread,
		struct binder_buffer *buffer, bool is_failure)
{
	binder_inner_proc_lock(proc);
	if (buffer->transaction) {
		buffer->transaction->buffer = NULL;
		buffer->transaction = NULL;
	}
	binder_inner_proc_unlock(proc);
	if (buffer->async_transaction && buffer->target_node) {
...
	}
	trace_binder_transaction_buffer_release(buffer);
	binder_release_entire_buffer(proc, thread, buffer, is_failure);
>>释放binder_buffer
	binder_alloc_free_buf(&proc->alloc, buffer);
}


void binder_alloc_free_buf(struct binder_alloc *alloc,
			    struct binder_buffer *buffer)
{
	/*
	 * We could eliminate the call to binder_alloc_clear_buf()
	 * from binder_alloc_deferred_release() by moving this to
	 * binder_free_buf_locked(). However, that could
	 * increase contention for the alloc mutex if clear_on_free
	 * is used frequently for large buffers. The mutex is not
	 * needed for correctness here.
	 */
	if (buffer->clear_on_free) {
		binder_alloc_clear_buf(alloc, buffer);
		buffer->clear_on_free = false;
	}
	mutex_lock(&alloc->mutex);
>>释放binder_buffer
	binder_free_buf_locked(alloc, buffer);
	mutex_unlock(&alloc->mutex);
}



static void binder_free_buf_locked(struct binder_alloc *alloc,
				   struct binder_buffer *buffer)
{
	size_t size, buffer_size;

	buffer_size = binder_alloc_buffer_size(alloc, buffer);

	size = ALIGN(buffer->data_size, sizeof(void *)) +
		ALIGN(buffer->offsets_size, sizeof(void *)) +
		ALIGN(buffer->extra_buffers_size, sizeof(void *));

	binder_alloc_debug(BINDER_DEBUG_BUFFER_ALLOC,
		     "%d: binder_free_buf %pK size %zd buffer_size %zd\n",
		      alloc->pid, buffer, size, buffer_size);
...
>>释放binder_buffer对应的物理页,第二个参数为0说明是释放而非分配
>>第三个参数为binder_buffer起始地址,第四个参数为binder_buffer结束地址
	binder_update_page_range(alloc, 0,
		(void __user *)PAGE_ALIGN((uintptr_t)buffer->user_data),
		(void __user *)(((uintptr_t)
			  buffer->user_data + buffer_size) & PAGE_MASK));
>>将其从allocated_buffers中移除
	rb_erase(&buffer->rb_node, &alloc->allocated_buffers);
>>将其free状态置为1,恢复为空闲状态
	buffer->free = 1;
>>如果当前binder_buffer不是alloc->buffers中的最后一个
	if (!list_is_last(&buffer->entry, &alloc->buffers)) {
>>找到当前binder_buffer的下一个buffer(按照地址)
		struct binder_buffer *next = binder_buffer_next(buffer);
>>如果下一个binder_buffer也是free状态,则将其合并到当前buffer
>>合并的逻辑是:删除下一个地址对应的binder_buffer,则当前binder_buffer的结束地址即为原下一个binder_buffer的结束地址;
>>假如当前binder_buffer对应的buffer_size为size1,起始地址为start1,结束地址为end1,下一个binder_buffer对应的buffer_size为size2,
>>起始地址为start2(等于end1),结束地址为end2,则合并后的buffer_size为size1+size2,起始地址仍为start1,结束地址变为end2。
		if (next->free) {
>>将下一个binder_buffer从free_buffers中移除
			rb_erase(&next->rb_node, &alloc->free_buffers);
>>将下一个binder_buffer从alloc中移除
			binder_delete_free_buffer(alloc, next);
		}
	}
>>如果当前buffer不是alloc->buffers中的第一个
	if (alloc->buffers.next != &buffer->entry) {
>>获取当前binder_buffer的前一个buffer(按照地址)
		struct binder_buffer *prev = binder_buffer_prev(buffer);
>>如果前面一个buffer也是free状态,则合并,与前面流程类似
		if (prev->free) {
>>删掉当前binder_buffer
			binder_delete_free_buffer(alloc, buffer);
			rb_erase(&prev->rb_node, &alloc->free_buffers);
>>这里将buffer置为prev,因为当前binder_buffer已经被删除,也就是将当前binder_buffer管理的地址合并到prev中
			buffer = prev;
		}
	}
>>将当前binder_buffer插入到free_buffers中
	binder_insert_free_buffer(alloc, buffer);
}



static void binder_delete_free_buffer(struct binder_alloc *alloc,
				      struct binder_buffer *buffer)
{
	struct binder_buffer *prev, *next = NULL;
>>置to_free = true
	bool to_free = true;

	BUG_ON(alloc->buffers.next == &buffer->entry);
>>获取前一个buffer
	prev = binder_buffer_prev(buffer);
	BUG_ON(!prev->free);
	if (prev_buffer_end_page(prev) == buffer_start_page(buffer)) {
>>如果前一个buffer的数据结束位置与当前buffer的起始位置落在同一物理页,说明该物理页被相邻buffer共享,
>>不能在这里释放,置to_free = false
		to_free = false;
		binder_alloc_debug(BINDER_DEBUG_BUFFER_ALLOC,
				   "%d: merge free, buffer %pK share page with %pK\n",
				   alloc->pid, buffer->user_data,
				   prev->user_data);
	}
...
	if (!list_is_last(&buffer->entry, &alloc->buffers)) {
>>获取下一个buffer
		next = binder_buffer_next(buffer);
		if (buffer_start_page(next) == buffer_start_page(buffer)) {
>>如果下一个buffer的start_page和当前buffer的start_page在同一物理页,则置to_free=false
			to_free = false;
			binder_alloc_debug(BINDER_DEBUG_BUFFER_ALLOC,
					   "%d: merge free, buffer %pK share page with %pK\n",
					   alloc->pid,
					   buffer->user_data,
					   next->user_data);
		}
	}
...
	if (to_free) {
		binder_alloc_debug(BINDER_DEBUG_BUFFER_ALLOC,
				   "%d: merge free, buffer %pK do not share page with %pK or %pK\n",
				   alloc->pid, buffer->user_data,
				   prev->user_data,
				   next ? next->user_data : NULL);
>>这里最多只释放一个物理页:即该buffer起始地址所在的那一页;该free buffer其余完整的数据页在之前释放时已被回收,
>>此处只需处理可能与前后buffer共享的边界页,且仅当它不与任何相邻buffer共享时才释放
		binder_update_page_range(alloc, 0, buffer_start_page(buffer),
					 buffer_start_page(buffer) + PAGE_SIZE);
	}
	list_del(&buffer->entry);
	kfree(buffer);
}

根据前面的代码流程,binder通信后释放binder_buffer的时机就是binder驱动处理BC_FREE_BUFFER命令时。而用户空间向binder驱动发送BC_FREE_BUFFER的时机主要有两个:其中一个是server端在BR_TRANSACTION阶段执行完业务后,最终通过IPCThreadState::freeBuffer方法向驱动发送BC_FREE_BUFFER命令,对应代码流程如下:

BBinder.transact->Parcel.setDataSize(0)->continueWrite->freeData->freeDataNoInit->IPCThreadState.freeBuffer

在server端响应BR_TRANSACTION时,会新建一个buffer(Parcel)主要用于存放client端发来的数据,并为其设置release_func,目的是为了在合适时机调用release_func去释放buffer。而这里的release_func就是freeBuffer方法,在这个方法中用户空间会向binder驱动发送BC_FREE_BUFFER命令,进而执行binder_buffer的释放流程,这样使用过的binder_buffer才能再次恢复为空闲状态。

所以,从binder_buffer的分配与释放时机可以看到:一次完整的同步binder通信,在发送(BC_TRANSACTION)与回复(BC_REPLY)两个阶段各会分配一次binder_buffer,随后也各释放一次;其中发送阶段消耗的是server进程的binder内存,回复阶段消耗的是client进程的binder内存。

/frameworks/native/libs/binder/IPCThreadState.cpp
1274  status_t IPCThreadState::executeCommand(int32_t cmd)
...
1354      case BR_TRANSACTION_SEC_CTX:
1355      case BR_TRANSACTION:
1356          {
1357              binder_transaction_data_secctx tr_secctx;
1358              binder_transaction_data& tr = tr_secctx.transaction_data;
...
>>新建parcel用于保存从client端发来的数据,并设置release_func为freeBuffer
1371              Parcel buffer;
1372              buffer.ipcSetDataReference(
1373                  reinterpret_cast<const uint8_t*>(tr.data.ptr.buffer),
1374                  tr.data_size,
1375                  reinterpret_cast<const binder_size_t*>(tr.data.ptr.offsets),
1376                  tr.offsets_size/sizeof(binder_size_t), freeBuffer);
...
1404              Parcel reply;
...
1417              if (tr.target.ptr) {
1418                  // We only have a weak reference on the target object, so we must first try to
1419                  // safely acquire a strong reference before doing anything else with it.
1420                  if (reinterpret_cast<RefBase::weakref_type*>(
1421                          tr.target.ptr)->attemptIncStrong(this)) {
>>找到server端BBinder对象,执行transact方法
1422                      error = reinterpret_cast<BBinder*>(tr.cookie)->transact(tr.code, buffer,
1423                              &reply, tr.flags);
1424                      reinterpret_cast<BBinder*>(tr.cookie)->decStrong(this);
1425                  } else {
...
1436              if ((tr.flags & TF_ONE_WAY) == 0) {
>>如果不是one_way,开始回复client端
1437                  LOG_ONEWAY("Sending reply to %d!", mCallingPid);
1438                  if (error < NO_ERROR) reply.setError(error);
1439  
1440                  // b/238777741: clear buffer before we send the reply.
1441                  // Otherwise, there is a race where the client may
1442                  // receive the reply and send another transaction
1443                  // here and the space used by this transaction won't
1444                  // be freed for the client.
>>server端业务执行完毕,准备释放用于存放client数据的parcel
1445                  buffer.setDataSize(0);
1446  
1447                  constexpr uint32_t kForwardReplyFlags = TF_CLEAR_BUF;
>>发送BC_REPLY
1448                  sendReply(reply, (tr.flags & kForwardReplyFlags));
1449              } else {


/frameworks/native/libs/binder/Parcel.cpp
2557  void Parcel::ipcSetDataReference(const uint8_t* data, size_t dataSize, const binder_size_t* objects,
2558                                   size_t objectsCount, release_func relFunc) {
2559      // this code uses 'mOwner == nullptr' to understand whether it owns memory
2560      LOG_ALWAYS_FATAL_IF(relFunc == nullptr, "must provide cleanup function");
2561  
2562      freeData();
...
2569      kernelFields->mObjects = const_cast<binder_size_t*>(objects);
2570      kernelFields->mObjectsSize = kernelFields->mObjectsCapacity = objectsCount;
>>通过ipcSetDataReference方法设置release_func(IPCThreadState::freeBuffer)
2571      mOwner = relFunc;


//buffer.setDataSize(0)
401  status_t Parcel::setDataSize(size_t size)
402  {
403      if (size > INT32_MAX) {
404          // don't accept size_t values which may have come from an
405          // inadvertent conversion from a negative int.
406          return BAD_VALUE;
407      }
408  
409      status_t err;
>>传入的size为0
410      err = continueWrite(size);
411      if (err == NO_ERROR) {
412          mDataSize = size;
413          ALOGV("setDataSize Setting data size of %p to %zu", this, mDataSize);
414      }
415      return err;
416  }


2853  status_t Parcel::continueWrite(size_t desired)
2854  {
...
2886      if (mOwner) {
2887          // If the size is going to zero, just release the owner's data.
>>之前mOwner已经赋值为IPCThreadState::freeBuffer,且desired为0,所以执行freeData方法
2888          if (desired == 0) {
2889              freeData();
2890              return NO_ERROR;
2891          }


2732  void Parcel::freeData()
2733  {
>>执行freeDataNoInit方法
2734      freeDataNoInit();
2735      initState();
2736  }


2738  void Parcel::freeDataNoInit()
2739  {
2740      if (mOwner) {
2741          LOG_ALLOC("Parcel %p: freeing other owner data", this);
2742          //ALOGI("Freeing data ref of %p (pid=%d)", this, getpid());
2743          auto* kernelFields = maybeKernelFields();
2744          // Close FDs before freeing, otherwise they will leak for kernel binder.
2745          closeFileDescriptors();
>>执行IPCThreadState::freeBuffer方法,开始释放buffer
2746          mOwner(mData, mDataSize, kernelFields ? kernelFields->mObjects : nullptr,
2747                 kernelFields ? kernelFields->mObjectsSize : 0);
2748      } else {



1602  void IPCThreadState::freeBuffer(const uint8_t* data, size_t /*dataSize*/,
1603                                  const binder_size_t* /*objects*/, size_t /*objectsSize*/) {
1604      //ALOGI("Freeing parcel %p", &parcel);
1605      IF_LOG_COMMANDS() {
1606          std::ostringstream logStream;
1607          logStream << "Writing BC_FREE_BUFFER for " << data << "\n";
1608          std::string message = logStream.str();
1609          ALOGI("%s", message.c_str());
1610      }
1611      ALOG_ASSERT(data != NULL, "Called with NULL data");
1612      IPCThreadState* state = self();
>>向驱动发送BC_FREE_BUFFER
1613      state->mOut.writeInt32(BC_FREE_BUFFER);
1614      state->mOut.writePointer((uintptr_t)data);
1615      state->flushIfNeeded();
1616  }

另一个释放buffer的时机是在BR_REPLY阶段:client端使用完存放server返回数据的reply(Parcel)后,Java层会调用其recycle方法、纯C++场景则是Parcel对象析构,二者最终都会走到freeDataNoInit函数,后续逻辑与前面第一种时机对应的流程相同。

/frameworks/native/libs/binder/IPCThreadState.cpp
1004  status_t IPCThreadState::waitForResponse(Parcel *reply, status_t *acquireResult)
1005  {
...
1059          case BR_REPLY:
1060              {
1061                  binder_transaction_data tr;
1062                  err = mIn.read(&tr, sizeof(tr));
1063                  ALOG_ASSERT(err == NO_ERROR, "Not enough command data for brREPLY");
1064                  if (err != NO_ERROR) goto finish;
1065  
1066                  if (reply) {
>>将server端返回的数据写入reply(Parcel)中,并设置release_func为freeBuffer
1067                      if ((tr.flags & TF_STATUS_CODE) == 0) {
1068                          reply->ipcSetDataReference(
1069                              reinterpret_cast<const uint8_t*>(tr.data.ptr.buffer),
1070                              tr.data_size,
1071                              reinterpret_cast<const binder_size_t*>(tr.data.ptr.offsets),
1072                              tr.offsets_size/sizeof(binder_size_t),
1073                              freeBuffer);
1074                      } else {


这里以ContentProviderNative为例,展示client端在BR_REPLY阶段释放buffer的时机。
/frameworks/base/core/java/android/content/ContentProviderNative.java
519      public String getType(AttributionSource attributionSource, Uri url) throws RemoteException
520      {
521          Parcel data = Parcel.obtain();
522          Parcel reply = Parcel.obtain();
523          try {
524              data.writeInterfaceToken(IContentProvider.descriptor);
525              attributionSource.writeToParcel(data, 0);
526              url.writeToParcel(data, 0);
527  
>>client端执行binder通信
528              mRemote.transact(IContentProvider.GET_TYPE_TRANSACTION, data, reply, 0);
529  
530              DatabaseUtils.readExceptionFromParcel(reply);
531              String out = reply.readString();
532              return out;
533          } finally {
>>结束生命周期,回收Parcel,最终同样会触发freeDataNoInit去释放buffer
534              data.recycle();
535              reply.recycle();
536          }
537      }


354  Parcel::~Parcel()
355  {
>>执行release_func去释放buffer
356      freeDataNoInit();
357      LOG_ALLOC("Parcel %p: destroyed", this);
358  }

最后来说说free_buffers的更新流程。前面提到,free_buffers是一棵红黑树,里面存放的空闲binder_buffer是按照对应的buffer_size来排序的。每次给新的binder_transaction分配binder_buffer时,都会从free_buffers中找到一个buffer_size最合适的。那么free_buffers中的binder_buffer都是在什么时机插入的?其buffer_size大小又如何?

首先,在进程调用mmap分配1M左右大小的虚拟地址空间时,在binder_alloc_mmap_handler函数中就会创建一个binder_buffer,对该buffer的起始地址赋值为刚刚分配的虚拟地址空间的首地址,然后将这个buffer置为free状态并插入free_buffers中。此时,free_buffers中就只有这一个binder_buffer,且其buffer_size与整个虚拟地址空间大小相同。待后面接收到binder请求、为binder_transaction分配binder_buffer时,binder_transaction所请求的size一般是远远小于1Mb,所以此时会再新建一个binder_buffer,用于管理剩余的地址空间。这样循环往复,free_buffers中的binder_buffer会不断的被分割,以便binder内存得到高效的利用。当binder通信结束,对应的binder_buffer会在binder_free_buf_locked函数中得到释放,其对应的物理页被回收,重新置为free状态,并插入到free_buffers中。插入前会判断与该buffer地址连续的前、后buffer是否为free状态,如果是则合并,即将多个地址连续的、均为free状态的binder_buffer合并为一个buffer_size更大的、处于free状态的binder_buffer。这样,binder_buffer既会被不断分割成更小块、也会不断合并成更大块,以便实现binder内存的高效利用。
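
下面用一个极简的用户态C程序来模拟上述"最佳适配分配+分割"与"释放+相邻合并"的过程,便于直观理解(仅为帮助理解的示意模型:用按地址排序的数组代替内核中的alloc->buffers链表与free_buffers红黑树,忽略对齐与物理页管理等细节,seg_alloc/seg_free等命名均为假设,并非内核实现):

#include <stdio.h>
#include <stddef.h>

#define MAX_SEG 32

/* 每个seg对应一个binder_buffer:起始地址、大小、是否空闲 */
struct seg { size_t start; size_t size; int free; };

static struct seg segs[MAX_SEG];
static int nseg;

/* 初始化:整个地址空间只有一个free buffer,对应mmap刚完成时的状态 */
static void seg_init(size_t total)
{
	segs[0] = (struct seg){ 0, total, 1 };
	nseg = 1;
}

/* 最佳适配:在所有free段中找size最小但足够大的那个,必要时分割出剩余部分 */
static int seg_alloc(size_t size)
{
	int best = -1;
	for (int i = 0; i < nseg; i++)
		if (segs[i].free && segs[i].size >= size &&
		    (best < 0 || segs[i].size < segs[best].size))
			best = i;
	if (best < 0)
		return -1;                       /* 没有足够大的free buffer */
	if (segs[best].size > size) {            /* 分割:剩余部分新建一个free段 */
		for (int i = nseg; i > best + 1; i--)
			segs[i] = segs[i - 1];
		segs[best + 1] = (struct seg){ segs[best].start + size,
					       segs[best].size - size, 1 };
		nseg++;
		segs[best].size = size;
	}
	segs[best].free = 0;
	return best;
}

/* 释放:置为free并与地址相邻的free段合并,对应binder_free_buf_locked的合并逻辑 */
static void seg_free(int idx)
{
	segs[idx].free = 1;
	if (idx + 1 < nseg && segs[idx + 1].free) {      /* 与后一个free段合并 */
		segs[idx].size += segs[idx + 1].size;
		for (int i = idx + 1; i < nseg - 1; i++)
			segs[i] = segs[i + 1];
		nseg--;
	}
	if (idx > 0 && segs[idx - 1].free) {             /* 与前一个free段合并 */
		segs[idx - 1].size += segs[idx].size;
		for (int i = idx; i < nseg - 1; i++)
			segs[i] = segs[i + 1];
		nseg--;
	}
}

static void seg_dump(const char *tag)
{
	printf("%-12s:", tag);
	for (int i = 0; i < nseg; i++)
		printf(" [start=%zu size=%zu %s]", segs[i].start, segs[i].size,
		       segs[i].free ? "free" : "used");
	printf("\n");
}

int main(void)
{
	seg_init(1 << 20);                 /* 模拟约1MB的binder虚拟地址空间 */
	int a = seg_alloc(4096);
	int b = seg_alloc(8192);
	seg_dump("alloc 4K/8K");           /* 大的free buffer被不断分割 */
	seg_free(a);
	seg_dump("free 4K");               /* 相邻的buffer仍被占用,暂不合并 */
	seg_free(b);
	seg_dump("free 8K");               /* 与前后free段合并,重新变回一整块 */
	return 0;
}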
 

7.binder线程优先级继承

binder通信中server端binder线程继承client优先级、以及优先级回落的代码逻辑,其实前面已经梳理过了,但是由于平时工作中这方面问题遇到的比较多,所以这里再专门拿出来说一说,也便于后面查询。

binder通信过程中的BC_TRANSACTION阶段,会获取当前client线程的nice值并将其保存到binder_transaction结构体中。当server端binder线程被唤醒后,会先把binder线程自身的nice值保存至saved_priority,再根据条件把binder_transaction中保存的client nice值设置给binder线程。在server端回复的BC_REPLY阶段,会把之前保存的saved_priority重新设置给server端binder线程,使其恢复为原来的优先级(默认nice为0,对应优先级120)。

根据上面binder通信过程中线程优先级的调整过程,可以知道binder线程是先被唤醒、再设置新的优先级,所以在trace中可以经常看到binder线程被唤醒后存在一些调度延迟。另外,binder线程也只能继承cfs调度策略的优先级(nice值),而无法继承rt线程的优先级。所以,这两点如果可以优化,将对性能提升有一定帮助。

static void binder_transaction(struct binder_proc *proc,
			       struct binder_thread *thread,
			       struct binder_transaction_data *tr, int reply,
			       binder_size_t extra_buffers_size)
{
...
	t->to_proc = target_proc;
	t->to_thread = target_thread;
	t->code = tr->code;
	t->flags = tr->flags;
>>将client线程优先级保存到binder_transaction结构体中
	t->priority = task_nice(current);



static int binder_thread_read(struct binder_proc *proc,
			      struct binder_thread *thread,
			      binder_uintptr_t binder_buffer, size_t size,
			      binder_size_t *consumed, int non_block)
{
	void __user *buffer = (void __user *)(uintptr_t)binder_buffer;
	void __user *ptr = buffer + *consumed;
	void __user *end = buffer + size;
...
>>target_node为binder_node类型,如果存在说明当前为binder通信发送阶段
		if (t->buffer->target_node) {
			struct binder_node *target_node = t->buffer->target_node;

			trd->target.ptr = target_node->ptr;
			trd->cookie =  target_node->cookie;
>>将当前server端binder线程优先级保存至saved_priority
			t->saved_priority = task_nice(current);
			if (t->priority < target_node->min_priority &&
			    !(t->flags & TF_ONE_WAY))
>>如果是非oneway且client线程优先级数值小于server端的最小数值(数值越低则优先级越高),
>>则将binder线程优先级设置为client线程的优先级
				binder_set_nice(t->priority);
			else if (!(t->flags & TF_ONE_WAY) ||
				 t->saved_priority > target_node->min_priority)
>>否则(非oneway但client优先级不高于min_priority对应的优先级,或oneway且binder线程自身nice值大于min_priority),
>>将binder线程的nice值设为binder_node的min_priority
				binder_set_nice(target_node->min_priority);
			cmd = BR_TRANSACTION;
		} else {
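
把上面binder_thread_read中设置binder线程nice值的判断逻辑抽成一个纯函数,大致如下(仅为帮助理解的示意代码,函数与参数命名均为假设,忽略rlimit等细节,并非内核实现):

#include <stdio.h>
#include <stdbool.h>

/* client_nice对应t->priority,server_nice对应唤醒时binder线程自身的nice(即saved_priority),
 * min_nice对应target_node->min_priority,返回值为binder线程最终运行所用的nice值 */
static int binder_pick_nice(int client_nice, int server_nice,
                            int min_nice, bool oneway)
{
	if (client_nice < min_nice && !oneway)
		return client_nice;   /* 非oneway且client优先级更高:继承client的nice值 */
	if (!oneway || server_nice > min_nice)
		return min_nice;      /* 否则回落到binder_node的min_priority */
	return server_nice;           /* oneway且自身不低于min_priority:保持不变 */
}

int main(void)
{
	/* 例:client的nice为-10,binder线程默认nice为0,binder_node的min_priority为0 */
	printf("sync  : nice=%d\n", binder_pick_nice(-10, 0, 0, false)); /* -10,发生继承 */
	printf("oneway: nice=%d\n", binder_pick_nice(-10, 0, 0, true));  /* 0,不继承 */
	return 0;
}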



static void binder_transaction(struct binder_proc *proc,
			       struct binder_thread *thread,
			       struct binder_transaction_data *tr, int reply,
			       binder_size_t extra_buffers_size)
{
...
>>BC_REPLY
	if (reply) {
		binder_inner_proc_lock(proc);
>>reply=true,获取当前thread对应的binder_transaction(为之前client端向server通信时的binder_transaction)
		in_reply_to = thread->transaction_stack;
...
		thread->transaction_stack = in_reply_to->to_parent;
		binder_inner_proc_unlock(proc);
>>restore流程,恢复当前线程原来的优先级
		binder_set_nice(in_reply_to->saved_priority);
>>binder_get_txn_from_and_acq_inner函数主要逻辑是从binder_transaction中获取到成员from
>>因此目标线程即为原来的client线程
		target_thread = binder_get_txn_from_and_acq_inner(in_reply_to);

前面提到binder线程目前无法继承caller thread的rt调度。实际上Android在framework侧已经预留了相关接口,上层可以通过BBinder::setInheritRt等接口设置某个binder实体是否允许binder线程继承rt调度,但binder driver中缺少对应实现,故默认binder线程无法继承rt调度;各平台可以自行补齐驱动侧的实现细节,以支持binder线程继承rt调度的功能。

/frameworks/native/libs/binder/Binder.cpp
577  bool BBinder::isInheritRt() {
578      Extras* e = mExtras.load(std::memory_order_acquire);
579  
>>判断是否可以继承rt调度
580      return e && e->mInheritRt;
581  }
582  
583  void BBinder::setInheritRt(bool inheritRt) {
584      LOG_ALWAYS_FATAL_IF(mParceled,
585                          "setInheritRt() should not be called after a binder object "
586                          "is parceled/sent to another process");
587  
588      Extras* e = mExtras.load(std::memory_order_acquire);
589  
590      if (!e) {
591          if (!inheritRt) {
592              return;
593          }
594  
595          e = getOrCreateExtras();
596          if (!e) return; // out of memory
597      }
598  
>>设置是否允许继承rt调度,对当前binder实体(服务)生效
599      e->mInheritRt = inheritRt;
600  }


/frameworks/native/libs/binder/Parcel.cpp
209  status_t Parcel::flattenBinder(const sp<IBinder>& binder) {
210      BBinder* local = nullptr;
211      if (binder) local = binder->localBinder();
212      if (local) local->setParceled();
...
240      if (binder != nullptr) {
241          if (!local) {
242              BpBinder *proxy = binder->remoteBinder();
...
257          } else {
258              int policy = local->getMinSchedulerPolicy();
259              int priority = local->getMinSchedulerPriority();
...
269              if (local->isInheritRt()) {
>>给binder实体加上FLAT_BINDER_FLAG_INHERIT_RT这个flag,配合驱动侧的对应实现,即可让binder线程继承rt调度
270                  obj.flags |= FLAT_BINDER_FLAG_INHERIT_RT;
271              }
272              obj.hdr.type = BINDER_TYPE_BINDER;
273              obj.binder = reinterpret_cast<uintptr_t>(local->getWeakRefs());
274              obj.cookie = reinterpret_cast<uintptr_t>(local);
275          }

8.binder线程创建

binder主线程与普通binder线程有什么区别?之前对这一点有一些疑问:有的文章说binder主线程执行业务结束后不会退出thread loop,而普通binder线程在执行业务结束后会退出thread loop,理由是代码中判断binder线程是否退出thread loop时会检查其是否为主线程。但是这次对binder通信流程梳理后发现并不完全是这样。

首先,binder主线程创建是在开启binder线程池的时候,初始化了一个PoolThread,然后调用joinThreadPool将自己加入binder线程池。由于传入的isMain参数为true,因此会向binder驱动发送BC_ENTER_LOOPER(如果是普通binder线程则会发送BC_REGISTER_LOOPER)。binder驱动会在binder_thread_write函数中处理此命令,主要是给线程的looper加上BINDER_LOOPER_STATE_ENTERED这个flag。如果是普通binder线程发送的BC_REGISTER_LOOPER命令,会先给binder_proc的成员变量requested_threads减一,表示当前请求创建的线程(或者说正在创建的线程)减一;同时把成员变量requested_threads_started加一,表示创建完成的普通binder线程加一;最后会给binder线程的looper加上BINDER_LOOPER_STATE_REGISTERED这个flag,该flag主要用于表示binder线程的身份,可以用于接收和处理client端发来的binder请求。


/frameworks/native/libs/binder/ProcessState.cpp
197  void ProcessState::startThreadPool()
198  {
199      AutoMutex _l(mLock);
200      if (!mThreadPoolStarted) {
201          if (mMaxThreads == 0) {
202              ALOGW("Extra binder thread started, but 0 threads requested. Do not use "
203                    "*startThreadPool when zero threads are requested.");
204          }
205          mThreadPoolStarted = true;
>>初始化binder主线程:isMain=true
206          spawnPooledThread(true);
207      }
208  }



406  void ProcessState::spawnPooledThread(bool isMain)
407  {
408      if (mThreadPoolStarted) {
>>确定binder线程名
409          String8 name = makeBinderThreadName();
410          ALOGV("Spawning new pooled thread, name=%s\n", name.string());
>>新建一个PoolThread
411          sp<Thread> t = sp<PoolThread>::make(isMain);
>>执行PoolThread.run方法
412          t->run(name.string());
413          pthread_mutex_lock(&mThreadCountLock);
414          mKernelStartedThreads++;
415          pthread_mutex_unlock(&mThreadCountLock);
416      }
417  }


63  class PoolThread : public Thread
64  {
65  public:
66      explicit PoolThread(bool isMain)
67          : mIsMain(isMain)
68      {
69      }
70  
71  protected:
72      virtual bool threadLoop()
73      {
>>将binder主线程添加到binder线程池:isMain=true
74          IPCThreadState::self()->joinThreadPool(mIsMain);
75          return false;
76      }


/frameworks/native/libs/binder/IPCThreadState.cpp
727  void IPCThreadState::joinThreadPool(bool isMain)
728  {
729      LOG_THREADPOOL("**** THREAD %p (PID %d) IS JOINING THE THREAD POOL\n", (void*)pthread_self(), getpid());
730      pthread_mutex_lock(&mProcess->mThreadCountLock);
731      mProcess->mCurrentThreads++;
732      pthread_mutex_unlock(&mProcess->mThreadCountLock);
>>如果是binder主线程,向binder驱动发送BC_ENTER_LOOPER
>>如果是普通binder线程,则向binder驱动发送BC_REGISTER_LOOPER
733      mOut.writeInt32(isMain ? BC_ENTER_LOOPER : BC_REGISTER_LOOPER);
734  
735      mIsLooper = true;
736      status_t result;


/drivers/android/binder.c
static int binder_thread_write(struct binder_proc *proc,
			struct binder_thread *thread,
			binder_uintptr_t binder_buffer, size_t size,
			binder_size_t *consumed)
{
...
		case BC_REGISTER_LOOPER:
			binder_debug(BINDER_DEBUG_THREADS,
				     "%d:%d BC_REGISTER_LOOPER\n",
				     proc->pid, thread->pid);
			binder_inner_proc_lock(proc);
			if (thread->looper & BINDER_LOOPER_STATE_ENTERED) {
				thread->looper |= BINDER_LOOPER_STATE_INVALID;
				binder_user_error("%d:%d ERROR: BC_REGISTER_LOOPER called after BC_ENTER_LOOPER\n",
					proc->pid, thread->pid);
			} else if (proc->requested_threads == 0) {
				thread->looper |= BINDER_LOOPER_STATE_INVALID;
				binder_user_error("%d:%d ERROR: BC_REGISTER_LOOPER called without request\n",
					proc->pid, thread->pid);
			} else {
>>请求创建的线程(或者正在创建中的线程)requested_threads减一
				proc->requested_threads--;
>>创建好的线程加一
				proc->requested_threads_started++;
			}
>>给普通binder线程加上flag: BINDER_LOOPER_STATE_REGISTERED
			thread->looper |= BINDER_LOOPER_STATE_REGISTERED;
			binder_inner_proc_unlock(proc);
			break;
...
		case BC_ENTER_LOOPER:
			binder_debug(BINDER_DEBUG_THREADS,
				     "%d:%d BC_ENTER_LOOPER\n",
				     proc->pid, thread->pid);
			if (thread->looper & BINDER_LOOPER_STATE_REGISTERED) {
				thread->looper |= BINDER_LOOPER_STATE_INVALID;
				binder_user_error("%d:%d ERROR: BC_ENTER_LOOPER called after BC_REGISTER_LOOPER\n",
					proc->pid, thread->pid);
			}
>>给binder主线程加上flag: BINDER_LOOPER_STATE_ENTERED
			thread->looper |= BINDER_LOOPER_STATE_ENTERED;
			break;

前面提到binder主线程的创建时机就是开启binder线程池的时候,那么普通binder线程创建的时机是什么呢?我们知道binder线程的主要作用就是处理client端发来的binder请求,因此在binder_thread_read函数中,当server端线程被binder驱动唤醒处理binder请求时,会做一系列条件判断:当前是否已有创建binder线程的请求?当前server进程是否有处于空闲状态的binder线程?已经创建的binder线程数量是否达到上限?当前被唤醒的线程是否为binder线程?当满足这一系列条件后,binder驱动才会向用户空间发送BR_SPAWN_LOOPER命令,通知用户空间创建新的binder线程。而这个数量上限是不包含binder主线程在内的,因为只有在创建普通binder线程时才会更新成员变量requested_threads_started。
 

/drivers/android/binder.c
static int binder_thread_read(struct binder_proc *proc,
			      struct binder_thread *thread,
			      binder_uintptr_t binder_buffer, size_t size,
			      binder_size_t *consumed, int non_block)
{
	void __user *buffer = (void __user *)(uintptr_t)binder_buffer;
	void __user *ptr = buffer + *consumed;
	void __user *end = buffer + size;
...
	*consumed = ptr - buffer;
	binder_inner_proc_lock(proc);
>>proc->requested_threads == 0: 当前没有正在创建的binder线程
	if (proc->requested_threads == 0 &&
>>list_empty(&thread->proc->waiting_threads): 进程没有空闲的binder线程
	    list_empty(&thread->proc->waiting_threads) &&
>>proc->requested_threads_started < proc->max_threads: 已经创建的binder线程数量小于最大限制
	    proc->requested_threads_started < proc->max_threads &&
>>thread->looper & (BINDER_LOOPER_STATE_REGISTERED |BINDER_LOOPER_STATE_ENTERED): 当前线程为binder线程
	    (thread->looper & (BINDER_LOOPER_STATE_REGISTERED |
	     BINDER_LOOPER_STATE_ENTERED)) /* the user-space code fails to */
	     /*spawn a new thread if we leave this out */) {
>>请求创建的线程数量加一
		proc->requested_threads++;
		binder_inner_proc_unlock(proc);
		binder_debug(BINDER_DEBUG_THREADS,
			     "%d:%d BR_SPAWN_LOOPER\n",
			     proc->pid, thread->pid);
>>向用户空间发送BR_SPAWN_LOOPER命令去创建普通binder线程
		if (put_user(BR_SPAWN_LOOPER, (uint32_t __user *)buffer))
			return -EFAULT;
		binder_stat_br(proc, thread, BR_SPAWN_LOOPER);


/frameworks/native/libs/binder/IPCThreadState.cpp
1274  status_t IPCThreadState::executeCommand(int32_t cmd)
1275  {
...
1512      case BR_SPAWN_LOOPER:
>>用户空间执行BR_SPAWN_LOOPER,通过spawnPooledThread去新建一个普通binder线程,后续流程前面已经介绍
1513          mProcess->spawnPooledThread(false);
1514          break;
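
把上面驱动中是否发送BR_SPAWN_LOOPER的四个条件抽成一个布尔函数,可以更直观地看到新建普通binder线程的触发时机(仅为示意代码,函数与参数命名均为假设,并非内核实现):

#include <stdio.h>
#include <stdbool.h>

/* requested对应proc->requested_threads,has_waiting表示proc->waiting_threads中是否有空闲binder线程,
 * started对应proc->requested_threads_started,max_threads对应proc->max_threads,
 * caller_is_binder_thread表示当前线程的looper是否带有REGISTERED/ENTERED标记 */
static bool should_spawn_binder_thread(int requested, bool has_waiting,
                                       int started, int max_threads,
                                       bool caller_is_binder_thread)
{
	return requested == 0 &&          /* 没有正在创建中的binder线程 */
	       !has_waiting &&            /* 进程没有空闲的binder线程 */
	       started < max_threads &&   /* 已创建的普通binder线程数未达上限 */
	       caller_is_binder_thread;   /* 当前被唤醒的是binder线程 */
}

int main(void)
{
	/* 例:没有在建线程、无空闲线程、已建3个(上限15)、当前是binder线程 -> 需要BR_SPAWN_LOOPER */
	printf("%d\n", should_spawn_binder_thread(0, false, 3, 15, true));  /* 1 */
	/* 例:普通binder线程数量已达上限 -> 不再新建 */
	printf("%d\n", should_spawn_binder_thread(0, false, 15, 15, true)); /* 0 */
	return 0;
}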

前面提到创建新的binder线程的前提条件之一就是已经创建的binder线程数量没有达到上限,这个数量上限在binder驱动中通过proc->max_threads来表征,那么这个值是怎么被确定的?具体是多少?

首先,proc->max_threads在binder驱动中被赋值只有一处,就是执行BINDER_SET_MAX_THREADS命令时。

/drivers/android/binder.c
static long binder_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
{
...
	case BINDER_SET_MAX_THREADS: {
		int max_threads;

		if (copy_from_user(&max_threads, ubuf,
				   sizeof(max_threads))) {
			ret = -EINVAL;
			goto err;
		}
		binder_inner_proc_lock(proc);
		proc->max_threads = max_threads;
		binder_inner_proc_unlock(proc);
		break;
	}

而用户空间向binder驱动发送BINDER_SET_MAX_THREADS命令有两个时机,一是进程通过open_driver开启binder设备时,会将系统默认值DEFAULT_MAX_BINDER_THREADS传入binder驱动,默认值为15,即最多创建15个普通binder线程。

 /frameworks/native/libs/binder/ProcessState.cpp
50  #define DEFAULT_MAX_BINDER_THREADS 15
 
496  static base::Result<int> open_driver(const char* driver) {
497      int fd = open(driver, O_RDWR | O_CLOEXEC);
498      if (fd < 0) {
499          return base::ErrnoError() << "Opening '" << driver << "' failed";
500      }
...
514      size_t maxThreads = DEFAULT_MAX_BINDER_THREADS;
515      result = ioctl(fd, BINDER_SET_MAX_THREADS, &maxThreads);
516      if (result == -1) {
517          ALOGE("Binder ioctl to set max threads failed: %s", strerror(errno));
518      }

另一个时机则是进程通过setThreadPoolMaxThreadCount方法将自定义的数量上限传给binder驱动,比如有的进程会设置数量上限为4。

419  status_t ProcessState::setThreadPoolMaxThreadCount(size_t maxThreads) {
420      LOG_ALWAYS_FATAL_IF(mThreadPoolStarted && maxThreads < mMaxThreads,
421             "Binder threadpool cannot be shrunk after starting");
422      status_t result = NO_ERROR;
423      if (ioctl(mDriverFD, BINDER_SET_MAX_THREADS, &maxThreads) != -1) {
424          mMaxThreads = maxThreads;
425      } else {
426          result = -errno;
427          ALOGE("Binder ioctl to set max threads failed: %s", strerror(-result));
428      }
429      return result;
430  }


/hardware/interfaces/graphics/composer/2.4/default/service.cpp
27  int main() {
28      // the conventional HAL might start binder services
29      android::ProcessState::initWithDriver("/dev/vndbinder");
>>设置上限为4
30      android::ProcessState::self()->setThreadPoolMaxThreadCount(4);
31      android::ProcessState::self()->startThreadPool();
32  

binder线程是否会退出线程池?退出的时机是什么?之前对这一点一直有疑问。从代码逻辑来看,binder线程在创建完成后会通过joinThreadPool方法加入到binder线程池,并一直在该方法中循环。只有当getAndExecuteCommand方法返回的result是TIMED_OUT、且不为binder主线程时,才会退出循环,并向驱动发送BC_EXIT_LOOPER命令,主要目的是给binder线程的looper加上BINDER_LOOPER_STATE_EXITED这个flag。

/frameworks/native/libs/binder/IPCThreadState.cpp

727  void IPCThreadState::joinThreadPool(bool isMain)
728  {
...
735      mIsLooper = true;
736      status_t result;
737      do {
738          processPendingDerefs();
739          // now get the next command to be processed, waiting if necessary
>>与binder驱动通信,获取result
740          result = getAndExecuteCommand();
741  
742          if (result < NO_ERROR && result != TIMED_OUT && result != -ECONNREFUSED && result != -EBADF) {
743              LOG_ALWAYS_FATAL("getAndExecuteCommand(fd=%d) returned unexpected error %d, aborting",
744                    mProcess->mDriverFD, result);
745          }
746  
747          // Let this thread exit the thread pool if it is no longer
748          // needed and it is not the main process thread.
>>如果result为TIMED_OUT且不为binder主线程,退出循环
749          if(result == TIMED_OUT && !isMain) {
750              break;
751          }
752      } while (result != -ECONNREFUSED && result != -EBADF);
>>退出binder线程池,打印log
754      LOG_THREADPOOL("**** THREAD %p (PID %d) IS LEAVING THE THREAD POOL err=%d\n",
755          (void*)pthread_self(), getpid(), result);
756  
>>向binder驱动发送BC_EXIT_LOOPER
757      mOut.writeInt32(BC_EXIT_LOOPER);
758      mIsLooper = false;


/drivers/android/binder.c
static int binder_thread_write(struct binder_proc *proc,
			struct binder_thread *thread,
			binder_uintptr_t binder_buffer, size_t size,
			binder_size_t *consumed)
{
...
		case BC_EXIT_LOOPER:
			binder_debug(BINDER_DEBUG_THREADS,
				     "%d:%d BC_EXIT_LOOPER\n",
				     proc->pid, thread->pid);
>>给binder线程加上BINDER_LOOPER_STATE_EXITED这个flag
			thread->looper |= BINDER_LOOPER_STATE_EXITED;
			break;

那什么时候会返回TIMED_OUT这个结果呢?查看代码可知,是在binder驱动向用户空间发送BR_FINISHED命令时。该命令以BR开头,说明应由binder驱动发出,但在binder驱动中并没有找到发送该命令的地方;而binder.h中定义BR_FINISHED处有一句注释"not currently supported stop threadpool thread",即目前尚不支持停止线程池中的线程,也就是BR_FINISHED协议尚未实现。因此,不管是binder主线程还是普通binder线程,一旦创建,只要没有异常,就不会退出线程池。

1274  status_t IPCThreadState::executeCommand(int32_t cmd)
1275  {
1276      BBinder* obj;
1277      RefBase::weakref_type* refs;
1278      status_t result = NO_ERROR;
...
1505      case BR_FINISHED:
>>当驱动发送BR_FINISHED命令到用户空间时,返回TIMED_OUT
1506          result = TIMED_OUT;
1507          break;


/include/uapi/linux/android/binder.h
>>目前尚不支持停止线程池中的线程
	BR_FINISHED = _IO('r', 14),
	/*
	 * not currently supported
	 * stop threadpool thread
	 */

至此,可以总结一下binder主线程与普通binder线程的区别与联系。

两者的区别在于:1.创建时机不同;2.数量上限不同(binder主线程一般是1个,而普通binder线程为0个或多个,默认上限为15个);3.binder驱动给两类线程的looper添加的flag不同,用于区分。

两者的联系在于:1.作用都是用于处理client端的binder请求;2.一旦创建完成、加入binder线程池后,不会主动退出线程池。

9.匿名binder

实名binder就是我们常见的一种情况:系统服务(如ActivityManagerService、WindowManagerService等)启动后会向ServiceManager注册,注册的具体内容为binder引用与服务名称,其他进程想要获取系统服务的binder代理对象时,就通过服务名称向ServiceManager去获取。而匿名binder则不会向ServiceManager注册,一般是通过实名binder(其实也可以是匿名binder)直接向目标进程发送binder代理对象或binder引用,之后目标进程就可以通过持有的binder代理对象与原进程进行通信。其实匿名binder也是一种非常常见的情况,只是大多数时候并没有被注意到,比如app进程创建时会把ApplicationThread这个binder对象通过ActivityManager实名binder发送到system_server进程,后续system_server就可以通过持有的ApplicationThread代理对象来和app进程通信。

/frameworks/base/core/java/android/app/ActivityThread.java
343      @UnsupportedAppUsage
>>初始化mAppThread为IApplicationThread类型binder对象
344      final ApplicationThread mAppThread = new ApplicationThread();

1047      private class ApplicationThread extends IApplicationThread.Stub {

7853      private void attach(boolean system, long startSeq) {
7854          sCurrentActivityThread = this;
7855          mConfigurationController = new ConfigurationController(this);
7856          mSystemThread = system;
7857          mStartSeq = startSeq;
7858  
7859          if (!system) {
7860              android.ddm.DdmHandleAppName.setAppName("<pre-initialized>",
7861                                                      UserHandle.myUserId());
7862              RuntimeInit.setApplicationObject(mAppThread.asBinder());
7863              final IActivityManager mgr = ActivityManager.getService();
7864              try {
>>通过IActivityManager发送binder对象到system_server进程
7865                  mgr.attachApplication(mAppThread, startSeq);
7866              } catch (RemoteException ex) {
7867                  throw ex.rethrowFromSystemServer();
7868              }

那么匿名binder对象在binder驱动中是怎么发送到目标进程的?在binder驱动中,当为binder_transaction结构体分配了对应的binder_buffer后,会读取从用户空间传来的binder_object数据,判断其类型并根据不同类型做出不同的反应,具体来说有如下几种情况:

1. 用户空间传入的是BBinder或者binder原始对象。这时候根据flat_binder_object中的成员变量binder,判断当前线程所属进程(可以理解为发起进程)是否存在与fp->binder对应的binder_node,如果不存在,则为当前进程新建一个与fp->binder对应的binder_node;然后再去目标进程中查找是否有与该binder_node对应的binder_ref,如果不存在,则为目标进程新建一个binder_ref,并将新建binder_ref的data.desc赋值给flat_binder_object中的handle成员,然后将fp拷贝到binder_buffer,用于目标进程初始化或查找对应的BpBinder对象。匿名binder就属于这种情况;

2. 用户空间传入的是BpBinder或者binder代理对象,这时候根据flat_binder_object中的成员变量handle,先在当前进程中找到与handle对应的binder_ref,然后再根据binder_ref找到对应的binder_node。找到binder_node后,再判断binder_node所属进程与目标进程是否为同一进程:

2.1 如果是同一进程,说明client与server为同一进程,则会给flat_binder_object中的binder和cookie成员进行赋值,用于在用户空间去初始化或找到对应的BBinder对象,比如System_server中通过ServiceManager获取wms服务,最终得到的不是BinderProxy对象而是Binder对象。
2.2 如果是不同进程,说明client与server为不同进程,则会去判断目标进程是否存在binder_node对应的binder_ref,如果不存在,则为目标进程新建一个binder_ref,然后将binder_ref的data.desc赋值给flat_binder_object中的handle成员,将fp拷贝到binder_buffer中,用户空间根据fp->handle去初始化或找到对应的BpBinder对象,比如app通过ServiceManager获取系统服务时,就属于这种情况。

/drivers/android/binder.c
static void binder_transaction(struct binder_proc *proc,
			       struct binder_thread *thread,
			       struct binder_transaction_data *tr, int reply,
			       binder_size_t extra_buffers_size)
{
...
	for (buffer_offset = off_start_offset; buffer_offset < off_end_offset;
	     buffer_offset += sizeof(binder_size_t)) {
		struct binder_object_header *hdr;
		size_t object_size;
		struct binder_object object;
		binder_size_t object_offset;
		binder_size_t copy_size;
...
		user_offset = object_offset + object_size;

		hdr = &object.hdr;
		off_min = object_offset + object_size;
>>判断从用户空间传来的binder_object类型
		switch (hdr->type) {
		case BINDER_TYPE_BINDER:
		case BINDER_TYPE_WEAK_BINDER: {
			struct flat_binder_object *fp;

			fp = to_flat_binder_object(hdr);
>>若binder_object为BBinder类型,则根据fp->binder找到对应binder_node,然后再为目标进程新建一个binder_ref
			ret = binder_translate_binder(fp, t, thread);

			if (ret < 0 ||
>>将fp拷贝到binder_buffer中
			    binder_alloc_copy_to_buffer(&target_proc->alloc,
							t->buffer,
							object_offset,
							fp, sizeof(*fp))) {
...
		case BINDER_TYPE_HANDLE:
		case BINDER_TYPE_WEAK_HANDLE: {
			struct flat_binder_object *fp;

			fp = to_flat_binder_object(hdr);
>>若binder_object为BpBinder类型,则根据fp->handle先找到对应binder_ref、再通过binder_ref找到对应binder_node
>>然后再为目标进程新建一个binder_ref
			ret = binder_translate_handle(fp, t, thread);
			if (ret < 0 ||
>>将fp拷贝到binder_buffer中
			    binder_alloc_copy_to_buffer(&target_proc->alloc,
							t->buffer,
							object_offset,
							fp, sizeof(*fp))) {
				return_error = BR_FAILED_REPLY;
				return_error_param = ret;
				return_error_line = __LINE__;
				goto err_translate_failed;
			}
		} break;



static int binder_translate_binder(struct flat_binder_object *fp,
				   struct binder_transaction *t,
				   struct binder_thread *thread)
{
	struct binder_node *node;
	struct binder_proc *proc = thread->proc;
	struct binder_proc *target_proc = t->to_proc;
	struct binder_ref_data rdata;
	int ret = 0;
>>传入的是BBinder对象,根据binder_get_node找到Binder对象对应的binder_node
	node = binder_get_node(proc, fp->binder);
	if (!node) {
		node = binder_new_node(proc, fp);
		if (!node)
			return -ENOMEM;
	}
	if (fp->cookie != node->cookie) {
		binder_user_error("%d:%d sending u%016llx node %d, cookie mismatch %016llx != %016llx\n",
				  proc->pid, thread->pid, (u64)fp->binder,
				  node->debug_id, (u64)fp->cookie,
				  (u64)node->cookie);
		ret = -EINVAL;
		goto done;
	}
	if (security_binder_transfer_binder(proc->cred, target_proc->cred)) {
		ret = -EPERM;
		goto done;
	}
>>为目标进程新创建一个binder_ref,返回binder_ref对应的data---rdata
	ret = binder_inc_ref_for_node(target_proc, node,
			fp->hdr.type == BINDER_TYPE_BINDER,
			&thread->todo, &rdata);
	if (ret)
		goto done;

	if (fp->hdr.type == BINDER_TYPE_BINDER)
		fp->hdr.type = BINDER_TYPE_HANDLE;
	else
		fp->hdr.type = BINDER_TYPE_WEAK_HANDLE;
	fp->binder = 0;
>>将新建binder_ref的data->desc赋值给fp->handle,用于用户空间初始化BpBinder对象
>>前述app进程将ApplicationThread对象发送给AMS就属于这种情况:发送端发送的是Binder实体对象,但AMS接收到的是BinderProxy对象
	fp->handle = rdata.desc;
	fp->cookie = 0;

	trace_binder_transaction_node_to_ref(t, node, &rdata);
	binder_debug(BINDER_DEBUG_TRANSACTION,
		     "        node %d u%016llx -> ref %d desc %d\n",
		     node->debug_id, (u64)node->ptr,
		     rdata.debug_id, rdata.desc);
done:
	binder_put_node(node);
	return ret;
}


static int binder_translate_handle(struct flat_binder_object *fp,
				   struct binder_transaction *t,
				   struct binder_thread *thread)
{
	struct binder_proc *proc = thread->proc;
	struct binder_proc *target_proc = t->to_proc;
	struct binder_node *node;
	struct binder_ref_data src_rdata;
	int ret = 0;
>>传入的是BpBinder对象,根据binder_get_node_from_ref找到BpBinder对象对应的binder_node
	node = binder_get_node_from_ref(proc, fp->handle,
			fp->hdr.type == BINDER_TYPE_HANDLE, &src_rdata);
	if (!node) {
		binder_user_error("%d:%d got transaction with invalid handle, %d\n",
				  proc->pid, thread->pid, fp->handle);
		return -EINVAL;
	}
	if (security_binder_transfer_binder(proc->cred, target_proc->cred)) {
		ret = -EPERM;
		goto done;
	}

	binder_node_lock(node);
>>如果binder_node对应进程与目标进程一致,说明client与server为同一进程
	if (node->proc == target_proc) {
		if (fp->hdr.type == BINDER_TYPE_HANDLE)
			fp->hdr.type = BINDER_TYPE_BINDER;
		else
			fp->hdr.type = BINDER_TYPE_WEAK_BINDER;
>>则赋值fp->binder、fp->cookie,用于在用户空间初始化BBinder对象
>>比如System_server中通过ServiceManager获取wms服务,最终得到的不是BinderProxy对象而是Binder对象
		fp->binder = node->ptr;
		fp->cookie = node->cookie;
		if (node->proc)
			binder_inner_proc_lock(node->proc);
		else
			__acquire(&node->proc->inner_lock);
		binder_inc_node_nilocked(node,
					 fp->hdr.type == BINDER_TYPE_BINDER,
					 0, NULL);
		if (node->proc)
			binder_inner_proc_unlock(node->proc);
		else
			__release(&node->proc->inner_lock);
		trace_binder_transaction_ref_to_node(t, node, &src_rdata);
		binder_debug(BINDER_DEBUG_TRANSACTION,
			     "        ref %d desc %d -> node %d u%016llx\n",
			     src_rdata.debug_id, src_rdata.desc, node->debug_id,
			     (u64)node->ptr);
		binder_node_unlock(node);
	} else {
>>如果binder_node对应进程与目标进程不一致,说明client与server为不同进程,是更普遍的情况
		struct binder_ref_data dest_rdata;

		binder_node_unlock(node);
>>根据node为目标进程新建一个binder_ref
		ret = binder_inc_ref_for_node(target_proc, node,
				fp->hdr.type == BINDER_TYPE_HANDLE,
				NULL, &dest_rdata);
		if (ret)
			goto done;

		fp->binder = 0;
>>将新创建的binder_ref->data->desc赋给fp->handle,用于用户空间创建BpBinder对象
>>比如app通过ServiceManager获取系统服务时,就属于这种情况,通过从ServiceManager获取到对应服务的handle,然后再新建一个binder代理对象
		fp->handle = dest_rdata.desc;
		fp->cookie = 0;
		trace_binder_transaction_ref_to_ref(t, node, &src_rdata,
						    &dest_rdata);
		binder_debug(BINDER_DEBUG_TRANSACTION,
			     "        ref %d desc %d -> ref %d desc %d (node %d)\n",
			     src_rdata.debug_id, src_rdata.desc,
			     dest_rdata.debug_id, dest_rdata.desc,
			     node->debug_id);
	}
done:
	binder_put_node(node);
	return ret;
}



static int binder_inc_ref_for_node(struct binder_proc *proc,
			struct binder_node *node,
			bool strong,
			struct list_head *target_list,
			struct binder_ref_data *rdata)
{
	struct binder_ref *ref;
	struct binder_ref *new_ref = NULL;
	int ret = 0;

	binder_proc_lock(proc);
>>先根据binder_node查找目标进程中是否已有对应的binder_ref
	ref = binder_get_ref_for_node_olocked(proc, node, NULL);
	if (!ref) {
		binder_proc_unlock(proc);
>>如果没有则为目标进程新建一个binder_ref
		new_ref = kzalloc(sizeof(*ref), GFP_KERNEL);
		if (!new_ref)
			return -ENOMEM;
		binder_proc_lock(proc);
>>给binder_ref结构体对应的data赋值
		ref = binder_get_ref_for_node_olocked(proc, node, new_ref);
	}
	ret = binder_inc_ref_olocked(ref, strong, target_list);
>>将ref->data赋给rdata,最终会赋给flat_binder_object中的handle成员
	*rdata = ref->data;
	if (ret && ref == new_ref) {
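
把上面对flat_binder_object的几种转换规则,可以概括成如下的决策小例子(仅为帮助理解的示意代码,枚举与命名均为假设的简化模型,并非内核实现):

#include <stdio.h>
#include <stdbool.h>

/* 对象的两种形式:本进程的binder实体,或指向他进程实体的引用(handle) */
enum obj_kind { OBJ_ENTITY, OBJ_HANDLE };

/* 返回目标进程将收到的对象形式 */
static enum obj_kind translate(enum obj_kind in, bool node_in_target_proc)
{
	if (in == OBJ_ENTITY)
		return OBJ_HANDLE;	/* 实体->引用:为目标进程新建/复用binder_ref */
	if (node_in_target_proc)
		return OBJ_ENTITY;	/* 引用传回实体所在进程:还原为BBinder */
	return OBJ_HANDLE;		/* 跨进程传引用:为目标进程新建/复用binder_ref */
}

static const char *name(enum obj_kind k)
{
	return k == OBJ_ENTITY ? "BBinder" : "BpBinder(handle)";
}

int main(void)
{
	/* app把ApplicationThread实体发给system_server:对端收到的是引用 */
	printf("%s\n", name(translate(OBJ_ENTITY, false)));
	/* system_server获取自己注册的wms:收到的是实体 */
	printf("%s\n", name(translate(OBJ_HANDLE, true)));
	/* app从ServiceManager获取系统服务:收到的是新建的引用 */
	printf("%s\n", name(translate(OBJ_HANDLE, false)));
	return 0;
}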

10.binder死亡通知

binder死亡通知,平时也见得很多,主要用于在server进程死亡后通知client进程做一些善后工作。比如,WindowState中就会向IWindow注册死亡通知,待死亡通知被回调时会调用其binderDied方法,来移除IWindow在system_server进程中对应的窗口。在这里,WindowState虽然是system_server进程中窗口的存在形式,但binder通信本来就是互为client与server,这里的IWindow类型变量反而代表着服务端。IWindow实际上是ViewRootImpl.W类型,继承了IWindow.Stub,因此这里也可看作是匿名binder,WindowState可通过IWindow去通知app一些事情。这里的IWindow类型变量本质上是BinderProxy类型,因此注册binder死亡通知后最终会调用BpBinder::linkToDeath方法。
注册binder死亡通知的过程,大概是这样:用户空间向binder驱动发送BC_REQUEST_DEATH_NOTIFICATION命令,同时将BpBinder对应的handle以及BpBinder对象指针发送到binder驱动,binder驱动接收到用户空间发来的cmd命令和数据后,会先通过handle找到对应的binder_ref,再检查binder_ref对应的death成员是否已经赋值,若没有则新建一个binder_ref_death类型变量,对其cookie成员赋值为BpBinder对象的指针,最后将binder_ref_death类型变量赋给binder_ref的death成员。至此,注册binder死亡通知的流程基本结束。

/frameworks/base/services/core/java/com/android/server/wm/WindowState.java
1083      WindowState(WindowManagerService service, Session s, IWindow c, WindowToken token,
1084              WindowState parentWindow, int appOp, WindowManager.LayoutParams a, int viewVisibility,
1085              int ownerId, int showUserId, boolean ownerCanAddInternalSystemWindow,
1086              PowerManagerWrapper powerManagerWrapper) {
1087          super(service);
1088          mTmpTransaction = service.mTransactionFactory.get();
1089          mSession = s;
1090          mClient = c;
...
1103          DeathRecipient deathRecipient = new DeathRecipient();
...
1121          try {
1122              c.asBinder().linkToDeath(deathRecipient, 0);
1123          } catch (RemoteException e) {


2916      private class DeathRecipient implements IBinder.DeathRecipient {
2917          @Override
2918          public void binderDied() {
2919              try {
2920                  synchronized (mWmService.mGlobalLock) {
2921                      final WindowState win = mWmService
2922                              .windowForClientLocked(mSession, mClient, false);
2923                      Slog.i(TAG, "WIN DEATH: " + win);
2924                      if (win != null) {
2925                          if (win.mActivityRecord != null
2926                                  && win.mActivityRecord.findMainWindow() == win) {
2927                              mWmService.mSnapshotController.onAppDied(win.mActivityRecord);
2928                          }
2929                          win.removeIfPossible();
2930                      } else if (mHasSurface) {
2931                          Slog.e(TAG, "!!! LEAK !!! Window removed but surface still valid.");
2932                          WindowState.this.removeIfPossible();
2933                      }
2934                  }
2935              } catch (IllegalArgumentException ex) {
2936                  // This will happen if the window has already been removed.
2937              }


385  // NOLINTNEXTLINE(google-default-arguments)
386  status_t BpBinder::linkToDeath(
387      const sp<DeathRecipient>& recipient, void* cookie, uint32_t flags)
388  {
...
409      Obituary ob;
410      ob.recipient = recipient;
411      ob.cookie = cookie;
412      ob.flags = flags;
...
427                  if (!isRpcBinder()) {
428                      if constexpr (kEnableKernelIpc) {
429                          getWeakRefs()->incWeak(this);
430                          IPCThreadState* self = IPCThreadState::self();
>>requestDeathNotification,传入handle与this
431                          self->requestDeathNotification(binderHandle(), this);
>>刷新命令
432                          self->flushCommands();
433                      }
434                  }
435              }


/frameworks/native/libs/binder/IPCThreadState.cpp
956  status_t IPCThreadState::requestDeathNotification(int32_t handle, BpBinder* proxy)
957  {
>>写入命令BC_REQUEST_DEATH_NOTIFICATION
958      mOut.writeInt32(BC_REQUEST_DEATH_NOTIFICATION);
>>写入int32_t类型的handle
959      mOut.writeInt32((int32_t)handle);
>>写入uintptr_t类型的proxy,指向BpBinder对象
960      mOut.writePointer((uintptr_t)proxy);
961      return NO_ERROR;
962  }




/drivers/android/binder.c
static int binder_thread_write(struct binder_proc *proc,
			struct binder_thread *thread,
			binder_uintptr_t binder_buffer, size_t size,
			binder_size_t *consumed)
{
...
		case BC_REQUEST_DEATH_NOTIFICATION:
		case BC_CLEAR_DEATH_NOTIFICATION: {
			uint32_t target;
			binder_uintptr_t cookie;
			struct binder_ref *ref;
>>binder_ref_death指针,若为注册死亡通知的命令,下面会通过kzalloc为其分配内存
			struct binder_ref_death *death = NULL;
>>从用户空间获取uint32_t类型数据handle,赋值给target
			if (get_user(target, (uint32_t __user *)ptr))
				return -EFAULT;
			ptr += sizeof(uint32_t);
>>从用户空间获取uintptr_t类型的数据proxy,赋值给cookie
			if (get_user(cookie, (binder_uintptr_t __user *)ptr))
				return -EFAULT;
			ptr += sizeof(binder_uintptr_t);
			if (cmd == BC_REQUEST_DEATH_NOTIFICATION) {
				/*
				 * Allocate memory for death notification
				 * before taking lock
				 */
				death = kzalloc(sizeof(*death), GFP_KERNEL);
...
			binder_proc_lock(proc);
>>根据target找到对应的binder_ref
			ref = binder_get_ref_olocked(proc, target, false);
...
			binder_node_lock(ref->node);
			if (cmd == BC_REQUEST_DEATH_NOTIFICATION) {
				if (ref->death) {
>>如果binder_ref对应的death已经赋值,则返回
					binder_user_error("%d:%d BC_REQUEST_DEATH_NOTIFICATION death notification already set\n",
						proc->pid, thread->pid);
					binder_node_unlock(ref->node);
					binder_proc_unlock(proc);
					kfree(death);
					break;
				}
				binder_stats_created(BINDER_STAT_DEATH);
				INIT_LIST_HEAD(&death->work.entry);
>>给binder_ref_death结构体对应的cookie赋值,用于后续找到对应的BpBinder对象
				death->cookie = cookie;
>>给binder_ref结构体对应的death成员赋值,至此,注册binder死亡通知的流程基本结束
				ref->death = death;

再来说说binder死亡通知的触发流程,主要是驱动部分的逻辑。

首先,server进程死亡后,会释放相关的资源,包括服务对应的binder_node。在binder_node_release函数中,会遍历binder_node对应的所有binder_ref,如果binder_ref注册了死亡通知,则将其binder_ref_death对应工作项的类型置为BINDER_WORK_DEAD_BINDER,然后将此工作项添加到binder_ref所在进程的todo队列,并唤醒该进程中的一个空闲binder线程来处理。binder_ref所在进程的binder线程被唤醒后,从进程todo队列中取出工作项来处理,检查到其类型为BINDER_WORK_DEAD_BINDER后,先根据工作项找到对应的binder_ref_death,再确认发给用户空间的cmd命令为BR_DEAD_BINDER,从binder_ref_death中取出cookie,最后将cmd命令和cookie发送到用户空间。用户空间接收到BR_DEAD_BINDER命令后,会根据cookie找到对应的BpBinder对象,然后调用其sendObituary方法去发送通知,最终回调DeathRecipient的binderDied方法通知到上层。

/drivers/android/binder.c
static int binder_node_release(struct binder_node *node, int refs)
{
	struct binder_ref *ref;
	int death = 0;
	struct binder_proc *proc = node->proc;

	binder_release_work(proc, &node->async_todo);

	binder_node_lock(node);
	binder_inner_proc_lock(proc);
	binder_dequeue_work_ilocked(&node->work);
...
	hlist_for_each_entry(ref, &node->refs, node_entry) {
		refs++;
		/*
		 * Need the node lock to synchronize
		 * with new notification requests and the
		 * inner lock to synchronize with queued
		 * death notifications.
		 */
		binder_inner_proc_lock(ref->proc);
>>如果binder_ref没有注册死亡通知,则遍历下一个
		if (!ref->death) {
			binder_inner_proc_unlock(ref->proc);
			continue;
		}

		death++;

		BUG_ON(!list_empty(&ref->death->work.entry));
>>给binder_ref->death->work.type赋值为BINDER_WORK_DEAD_BINDER
		ref->death->work.type = BINDER_WORK_DEAD_BINDER;
>>向binder_ref所在进程的todo队列添加待处理项
		binder_enqueue_work_ilocked(&ref->death->work,
					    &ref->proc->todo);
>>唤醒进程,实则是挑选目标进程中的一个空闲binder线程来唤醒
		binder_wakeup_proc_ilocked(ref->proc);
		binder_inner_proc_unlock(ref->proc);
	}


static int binder_thread_read(struct binder_proc *proc,
			      struct binder_thread *thread,
			      binder_uintptr_t binder_buffer, size_t size,
			      binder_size_t *consumed, int non_block)
{
...
>>binder_ref所在进程被唤醒后,从todo队列中取出binder_work,其类型为BINDER_WORK_DEAD_BINDER
		case BINDER_WORK_DEAD_BINDER:
		case BINDER_WORK_DEAD_BINDER_AND_CLEAR:
		case BINDER_WORK_CLEAR_DEATH_NOTIFICATION: {
			struct binder_ref_death *death;
			uint32_t cmd;
			binder_uintptr_t cookie;
>>根据成员变量名work、成员变量地址w找到对应结构体binder_ref_death
			death = container_of(w, struct binder_ref_death, work);
			if (w->type == BINDER_WORK_CLEAR_DEATH_NOTIFICATION)
				cmd = BR_CLEAR_DEATH_NOTIFICATION_DONE;
			else
>>确认cmd命令为BR_DEAD_BINDER
				cmd = BR_DEAD_BINDER;
>>从binder_ref_death中取出cookie
			cookie = death->cookie;

			binder_debug(BINDER_DEBUG_DEATH_NOTIFICATION,
				     "%d:%d %s %016llx\n",
				      proc->pid, thread->pid,
				      cmd == BR_DEAD_BINDER ?
				      "BR_DEAD_BINDER" :
				      "BR_CLEAR_DEATH_NOTIFICATION_DONE",
				      (u64)cookie);
			if (w->type == BINDER_WORK_CLEAR_DEATH_NOTIFICATION) {
				binder_inner_proc_unlock(proc);
				kfree(death);
				binder_stats_deleted(BINDER_STAT_DEATH);
			} else {
				binder_enqueue_work_ilocked(
						w, &proc->delivered_death);
				binder_inner_proc_unlock(proc);
			}
>>向用户空间发送BR_DEAD_BINDER命令
			if (put_user(cmd, (uint32_t __user *)ptr))
				return -EFAULT;
			ptr += sizeof(uint32_t);
>>向用户空间发送binder_uintptr_t类型数据cookie
			if (put_user(cookie,
				     (binder_uintptr_t __user *)ptr))
				return -EFAULT;
			ptr += sizeof(binder_uintptr_t);
			binder_stat_br(proc, thread, cmd);
			if (cmd == BR_DEAD_BINDER)
				goto done; /* DEAD_BINDER notifications can cause transactions */
		} break;


/frameworks/native/libs/binder/IPCThreadState.cpp
1274  status_t IPCThreadState::executeCommand(int32_t cmd)
1275  {
...
>>用户空间接收到BR_DEAD_BINDER
1491      case BR_DEAD_BINDER:
1492          {
>>读取binder_uintptr_t类型的数据proxy,即确认binder_ref_death对应的BpBinder对象
1493              BpBinder *proxy = (BpBinder*)mIn.readPointer();
>>回调BpBinder对象的sendObituary方法,发送死亡通知
1494              proxy->sendObituary();
1495              mOut.writeInt32(BC_DEAD_BINDER_DONE);
1496              mOut.writePointer((uintptr_t)proxy);
1497          } break;



/frameworks/native/libs/binder/BpBinder.cpp
489  void BpBinder::sendObituary()
490  {
...
502      mLock.lock();
>>获取已注册的Obituary(死亡通知回调)列表
503      Vector<Obituary>* obits = mObituaries;
...
521      if (obits != nullptr) {
522          const size_t N = obits->size();
523          for (size_t i=0; i<N; i++) {
>>遍历每个Obituary,逐个调用reportOneDeath上报
524              reportOneDeath(obits->itemAt(i));
525          }
526  
527          delete obits;
528      }


531  void BpBinder::reportOneDeath(const Obituary& obit)
532  {
>>根据Obituary找到对应的DeathRecipient
533      sp<DeathRecipient> recipient = obit.recipient.promote();
534      ALOGV("Reporting death to recipient: %p\n", recipient.get());
535      if (recipient == nullptr) return;
536  
>>回调DeathRecipient的binderDied方法
537      recipient->binderDied(wp<BpBinder>::fromExisting(this));
538  }
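
最后,把驱动侧触发死亡通知的过程抽象成一个极简的用户态模拟,便于把上面的流程串起来(仅为示意代码,结构体与命名均为假设的简化模型,并非内核实现):

#include <stdio.h>
#include <stddef.h>

struct ref {
	const char *proc_name;   /* binder_ref所在进程 */
	void *death_cookie;      /* 注册死亡通知时保存的BpBinder指针,NULL表示未注册 */
};

/* 模拟binder_node_release:对每个注册了死亡通知的ref发出通知 */
static void node_release(struct ref *refs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!refs[i].death_cookie)
			continue;        /* 未注册死亡通知,跳过 */
		/* 真实流程:入队BINDER_WORK_DEAD_BINDER -> binder线程读到后
		 * 发送BR_DEAD_BINDER和cookie -> 用户空间sendObituary -> binderDied */
		printf("BR_DEAD_BINDER -> %s, cookie=%p\n",
		       refs[i].proc_name, refs[i].death_cookie);
	}
}

int main(void)
{
	int bp1, bp2;            /* 用局部变量地址冒充BpBinder对象指针 */
	struct ref refs[] = {
		{ "system_server",  &bp1 },
		{ "some_client",    NULL },
		{ "another_client", &bp2 },
	};
	node_release(refs, sizeof(refs) / sizeof(refs[0]));
	return 0;
}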
