From the official Caffe documentation:
// Assuming that data are on the CPU initially, and we have a blob.
const Dtype* foo;
Dtype* bar;
foo = blob.gpu_data(); // data copied cpu->gpu.
foo = blob.cpu_data(); // no data copied since both have up-to-date contents.
bar = blob.mutable_gpu_data(); // no data copied.
// ... some operations ...
bar = blob.mutable_gpu_data(); // no data copied when we are still on GPU.
foo = blob.cpu_data(); // data copied gpu->cpu, since the gpu side has modified the data
foo = blob.gpu_data(); // no data copied since both have up-to-date contents
//
bar = blob.mutable_cpu_data(); // still no data copied.
bar = blob.mutable_gpu_data(); // data copied cpu->gpu.
bar = blob.mutable_cpu_data(); // data copied gpu->cpu
Why do some of these calls require a data copy while others do not?
First, be clear about the following:
.gpu_data and .cpu_data are used in cases where the data is used only as input and will not be modified by the algorithm. .mutable_* is used when the data itself gets updated while running the algorithm.
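As a rough illustration of the read-only versus mutable distinction, here is a minimal CPU-only sketch. ToyBlob, sum, and scale are hypothetical names made up for this example and are not part of Caffe; only the accessor names cpu_data and mutable_cpu_data mirror the ones discussed above, and float stands in for Dtype.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a blob; only the CPU accessors are modeled.
class ToyBlob {
 public:
  explicit ToyBlob(std::size_t n) : data_(n, 0.0f) {}

  // Read-only access: the caller gets a const pointer and cannot write.
  const float* cpu_data() const { return data_.data(); }

  // Mutable access: the caller is expected to modify the contents.
  float* mutable_cpu_data() { return data_.data(); }

  std::size_t count() const { return data_.size(); }

 private:
  std::vector<float> data_;
};

// The data is only an input here, so the read-only accessor is enough.
float sum(const ToyBlob& blob) {
  const float* in = blob.cpu_data();
  float total = 0.0f;
  for (std::size_t i = 0; i < blob.count(); ++i) total += in[i];
  return total;
}

// The data is updated in place, so the mutable accessor is required.
void scale(ToyBlob* blob, float factor) {
  assert(blob != nullptr);
  float* out = blob->mutable_cpu_data();
  for (std::size_t i = 0; i < blob->count(); ++i) out[i] *= factor;
}
```

Writing through the pointer returned by cpu_data() would fail to compile, which is the point of keeping the two accessors separate.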
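Second, pay attention to two things: (1) whether the two consecutive operations on the Blob use the same processor, and (2) whether the earlier operation could have modified the Blob.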
Whenever one of these accessors is called, it checks whether the previous access was a mutable_* call and, if so, whether that call used the same processor (GPU or CPU). If it used the same processor, the data need not be copied. If it used the other processor, the data might have been updated by that previous .mutable_* call, and hence a data copy is required. A read-only call that triggers a copy leaves both sides up to date, which is why the read that follows it on either processor copies nothing.
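The rule above can be checked with a small simulation. This is a toy model under stated assumptions, not Caffe's implementation: ToySyncedBlob, Head, and replay_tutorial_sequence are hypothetical names, no real data is moved, and the model only tracks which side holds the latest contents. It assumes, as the tutorial does, that the data starts on the CPU.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Which side holds the latest contents (toy model; no uninitialized state).
enum class Head { AtCpu, AtGpu, Synced };

class ToySyncedBlob {
 public:
  // Assumption from the tutorial: the data is on the CPU initially.
  ToySyncedBlob() : head_(Head::AtCpu) {}

  // Read-only accessors: sync if needed; afterwards both sides are current.
  void cpu_data() { to_cpu(); }
  void gpu_data() { to_gpu(); }

  // Mutable accessors: sync if needed, then mark this side as the only
  // up-to-date one, because the caller may modify the data.
  void mutable_cpu_data() { to_cpu(); head_ = Head::AtCpu; }
  void mutable_gpu_data() { to_gpu(); head_ = Head::AtGpu; }

  // One entry per accessor call: "cpu->gpu", "gpu->cpu", or "none".
  const std::vector<std::string>& log() const { return log_; }

 private:
  void to_cpu() {
    if (head_ == Head::AtGpu) {
      log_.push_back("gpu->cpu");
      head_ = Head::Synced;
    } else {
      log_.push_back("none");
    }
  }

  void to_gpu() {
    if (head_ == Head::AtCpu) {
      log_.push_back("cpu->gpu");
      head_ = Head::Synced;
    } else {
      log_.push_back("none");
    }
  }

  Head head_;
  std::vector<std::string> log_;
};

// Replays the nine accessor calls from the tutorial snippet, in order.
std::vector<std::string> replay_tutorial_sequence() {
  ToySyncedBlob blob;
  blob.gpu_data();          // cpu->gpu
  blob.cpu_data();          // none
  blob.mutable_gpu_data();  // none
  blob.mutable_gpu_data();  // none
  blob.cpu_data();          // gpu->cpu
  blob.gpu_data();          // none
  blob.mutable_cpu_data();  // none
  blob.mutable_gpu_data();  // cpu->gpu
  blob.mutable_cpu_data();  // gpu->cpu
  assert(blob.log().size() == 9);
  return blob.log();
}
```

The returned log matches the comments in the tutorial snippet line by line: copies happen only on the first, fifth, eighth, and ninth calls.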