While testing a project on a Qualcomm 8.1 platform recently, every incremental (differential) OTA package came out around 800 MB. The build contains a ~700 MB APK, so effectively that APK never went through the diff path. I started with two guesses: first, that there is a size limit and APKs over a certain size skip diffing; second, that a chunking scheme exists which splits the APK into smaller pieces before diffing.
1. Overall call flow
2. Where the problem comes from
3. The fix
4. Digging deeper
1. Overall call flow

I won't walk through the whole flow in detail here; we are chasing one specific problem, so this is just enough context:
```cpp
payload_generator/generate_delta_main.cc
    GenerateUpdatePayloadFile(payload_config, FLAGS_out_file,
                              FLAGS_private_key, &metadata_size)

payload_generator/delta_diff_generator.cc
    strategy->GenerateOperations(config, old_part, new_part, &blob_file, &aops)

payload_generator/ab_generator.cc
    GenerateOperations(const PayloadGenerationConfig& config,
                       const PartitionConfig& old_part,
                       const PartitionConfig& new_part,
                       BlobFileWriter* blob_file,
                       vector<AnnotatedOperation>* aops)

payload_generator/delta_diff_utils.cc
    DeltaReadPartition(vector<AnnotatedOperation>* aops,
                       const PartitionConfig& old_part,
                       const PartitionConfig& new_part,
                       ssize_t hard_chunk_blocks,
                       size_t soft_chunk_blocks,
                       const PayloadVersion& version,
                       BlobFileWriter* blob_file)
    DeltaReadFile(aops, old_part.path, new_part.path, old_unvisited,
                  new_unvisited, "<non-file-data>", soft_chunk_blocks,
                  version, blob_file)
    ReadExtentsToDiff(old_part, new_part, old_extents_chunk,
                      new_extents_chunk, version, &data, &operation)
```
That is the rough call order. With the chain in hand, let's see how to track the problem down.
2. Where the problem comes from

First, I dropped the 8.1 intermediate package into a 9.0 build environment. The resulting diff package was about 400 MB, and the APK did go through the diff path, so no question about it: the 9.0 payload_generator clearly handles this case better.

Comparing the packaging logs of 9.0 and 8.1 led me here:
```cpp
// payload_generator/delta_diff_utils.cc

bool DeltaReadPartition(vector<AnnotatedOperation>* aops,
                        const PartitionConfig& old_part,
                        const PartitionConfig& new_part,
                        ssize_t hard_chunk_blocks,
                        size_t soft_chunk_blocks,
                        const PayloadVersion& version,
                        BlobFileWriter* blob_file) {
  ExtentRanges old_visited_blocks;
  ExtentRanges new_visited_blocks;
  // Handle the MOVE and ZERO cases.
  TEST_AND_RETURN_FALSE(DeltaMovedAndZeroBlocks(
      aops,
      old_part.path,
      new_part.path,
      old_part.size / kBlockSize,
      new_part.size / kBlockSize,
      soft_chunk_blocks,
      version,
      blob_file,
      &old_visited_blocks,
      &new_visited_blocks));

  map<string, vector<Extent>> old_files_map;
  if (old_part.fs_interface) {
    vector<FilesystemInterface::File> old_files;
    old_part.fs_interface->GetFiles(&old_files);
    for (const FilesystemInterface::File& file : old_files)
      old_files_map[file.name] = file.extents;
  }

  TEST_AND_RETURN_FALSE(new_part.fs_interface);
  vector<FilesystemInterface::File> new_files;
  new_part.fs_interface->GetFiles(&new_files);

  vector<FileDeltaProcessor> file_delta_processors;

  // The processing is very straightforward here, we generate operations for
  // every file (and pseudo-file such as the metadata) in the new filesystem
  // based on the file with the same name in the old filesystem, if any.
  // Files with overlapping data blocks (like hardlinks or filesystems with tail
  // packing or compression where the blocks store more than one file) are only
  // generated once in the new image, but are also used only once from the old
  // image due to some simplifications (see below).
  for (const FilesystemInterface::File& new_file : new_files) {
    // Ignore the files in the new filesystem without blocks. Symlinks with
    // data blocks (for example, symlinks bigger than 60 bytes in ext2) are
    // handled as normal files. We also ignore blocks that were already
    // processed by a previous file.
    vector<Extent> new_file_extents = FilterExtentRanges(
        new_file.extents, new_visited_blocks);
    new_visited_blocks.AddExtents(new_file_extents);

    if (new_file_extents.empty())
      continue;

    LOG(INFO) << "Encoding file " << new_file.name << " ("
              << BlocksInExtents(new_file_extents) << " blocks)";

    // We can't visit each dst image inode more than once, as that would
    // duplicate work. Here, we avoid visiting each source image inode
    // more than once. Technically, we could have multiple operations
    // that read the same blocks from the source image for diffing, but
    // we choose not to avoid complexity. Eventually we will move away
    // from using a graph/cycle detection/etc to generate diffs, and at that
    // time, it will be easy (non-complex) to have many operations read
    // from the same source blocks. At that time, this code can die. -adlr
    vector<Extent> old_file_extents = FilterExtentRanges(
        old_files_map[new_file.name], old_visited_blocks);
    old_visited_blocks.AddExtents(old_file_extents);
    // The per-file diff path.
    TEST_AND_RETURN_FALSE(DeltaReadFile(aops,
                                        old_part.path,
                                        new_part.path,
                                        old_file_extents,
                                        new_file_extents,
                                        new_file.name,  // operation name
                                        hard_chunk_blocks,
                                        version,
                                        blob_file));
  }
  // Process all the blocks not included in any file. We provided all the unused
  // blocks in the old partition as available data.
  vector<Extent> new_unvisited = {
      ExtentForRange(0, new_part.size / kBlockSize)};
  new_unvisited = FilterExtentRanges(new_unvisited, new_visited_blocks);
  if (new_unvisited.empty())
    return true;

  vector<Extent> old_unvisited;
  if (old_part.fs_interface) {
    old_unvisited.push_back(ExtentForRange(0, old_part.size / kBlockSize));
    old_unvisited = FilterExtentRanges(old_unvisited, old_visited_blocks);
  }

  LOG(INFO) << "Scanning " << BlocksInExtents(new_unvisited)
            << " unwritten blocks using chunk size of "
            << soft_chunk_blocks << " blocks.";
  // We use the soft_chunk_blocks limit for the <non-file-data> as we don't
  // really know the structure of this data and we should not expect it to have
  // redundancy between partitions.
  // Handle the data that does not belong to any file.
  TEST_AND_RETURN_FALSE(DeltaReadFile(aops,
                                      old_part.path,
                                      new_part.path,
                                      old_unvisited,
                                      new_unvisited,
                                      "<non-file-data>",  // operation name
                                      soft_chunk_blocks,
                                      version,
                                      blob_file));

  return true;
}
```
This function dispatches three diff passes for the different cases: DeltaMovedAndZeroBlocks for moved/zero blocks, the per-file DeltaReadFile loop, and a final DeltaReadFile for <non-file-data>. If the per-file pass in the middle runs at all, it must print:

```cpp
LOG(INFO) << "Encoding file " << new_file.name << " ("
          << BlocksInExtents(new_file_extents) << " blocks)";
```

In the 8.1 environment this line never appears in the log, which means the for loop is never entered, so new_part.fs_interface clearly plays the decisive role here. All we need to do is find where it is assigned and why the value is wrong. (In practice I wasted a lot of effort at this point, convinced for a long time that some other parameter was to blame.)
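As a quick sanity check, counting occurrences of that "Encoding file" marker in the build log tells you immediately whether the per-file loop ran. The snippet below fabricates a tiny sample log for illustration (the file name and block counts are hypothetical; a real delta_generator log is far noisier). In the failing 8.1 environment the count was 0.

```shell
# Build a small sample log (hypothetical content) and count
# per-file diff entries emitted by DeltaReadPartition.
printf '%s\n' \
  'Encoding file /system/priv-app/Big.apk (179200 blocks)' \
  'Scanning 42 unwritten blocks using chunk size of 512 blocks.' \
  > delta_generator.log
grep -c 'Encoding file' delta_generator.log
```

A count of zero means the per-file diff loop never executed and the fs_interface assignment is worth inspecting first.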
3. The fix

Find where fs_interface gets assigned:
```cpp
// payload_generator/payload_generation_config.cc

bool PartitionConfig::OpenFilesystem() {
  if (path.empty())
    return true;
  fs_interface.reset();
  // lbb test: the crux of the problem is this branch. fs_interface is set to
  // the correct value here, but without a return the code falls through to
  // the end of the method, where the RAW fallback overwrites it.
  if (utils::IsExtFilesystem(path)) {
    fs_interface = Ext2Filesystem::CreateFromFile(path);
    if (fs_interface) {
      // Fix: check the block size (4096) and return.
      TEST_AND_RETURN_FALSE(fs_interface->GetBlockSize() == kBlockSize);
      return true;
    }
  }

  if (!mapfile_path.empty()) {
    fs_interface = MapfileFilesystem::CreateFromFile(path, mapfile_path);
    if (fs_interface) {
      TEST_AND_RETURN_FALSE(fs_interface->GetBlockSize() == kBlockSize);
      return true;
    }
  }

  // Fall back to a RAW filesystem.
  TEST_AND_RETURN_FALSE(size % kBlockSize == 0);
  fs_interface = RawFilesystem::Create(
      "<" + name + "-partition>", kBlockSize, size / kBlockSize);
  return true;
}
```
After adding that code and rebuilding delta_generator, the problem is solved and 8.1 takes the diff path as well.
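The bug pattern is easy to model in isolation. The toy function below is my own sketch, not code from update_engine: the first branch assigns the correct value, and without the added return, execution falls through and the RAW fallback clobbers it.

```cpp
#include <string>

// Toy model of OpenFilesystem's control flow. `fixed` toggles the one-line
// fix: returning as soon as the ext filesystem branch succeeds.
std::string OpenFilesystemModel(bool is_ext_image, bool fixed) {
  std::string fs_interface;
  if (is_ext_image) {
    fs_interface = "ext2";  // correct value assigned here
    if (fixed)
      return fs_interface;  // the return that was missing in 8.1
  }
  // Without the return, execution falls through to the RAW fallback,
  // overwriting the ext2 filesystem interface.
  fs_interface = "raw";
  return fs_interface;
}
```

With `fixed == false` the function models the 8.1 behavior and always reports "raw"; with `fixed == true` an ext image correctly reports "ext2", which is exactly what the per-file diff loop needs.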
4. Digging deeper

Before pinning down the root cause I burned a lot of time on dead ends. A few parameters matter here:
1. payload_generator/generate_delta_main.cc

```cpp
DEFINE_int32(chunk_size, 200 * 1024 * 1024,
             "Payload chunk size (-1 for whole files)");
```
2. payload_generator/delta_diff_utils.cc

```cpp
// The maximum destination size allowed for bsdiff. In general, bsdiff should
// work for arbitrary big files, but the payload generation and payload
// application requires a significant amount of RAM. We put a hard-limit of
// 200 MiB that should not affect any released board, but will limit the
// Chrome binary in ASan builders.
const uint64_t kMaxBsdiffDestinationSize = 100 * 1024 * 1024;  // bytes

// The maximum destination size allowed for imgdiff. In general, imgdiff should
// work for arbitrary big files, but the payload application is quite memory
// intensive, so we limit these operations to 50 MiB.
// lbb test
const uint64_t kMaxImgdiffDestinationSize = 100 * 1024 * 1024;  // bytes
```

(The 100 * 1024 * 1024 values above are my local "lbb test" experiment; the stock sources use the 200 MiB and 50 MiB limits that the comments describe.)
These two constants look unrelated to chunking at first glance, yet together with chunk_size they decide whether a file gets diffed at all.
```cpp
// payload_generator/delta_diff_utils.cc

bool DeltaReadFile(vector<AnnotatedOperation>* aops,
                   const string& old_part,
                   const string& new_part,
                   const vector<Extent>& old_extents,
                   const vector<Extent>& new_extents,
                   const string& name,
                   ssize_t chunk_blocks,
                   const PayloadVersion& version,
                   BlobFileWriter* blob_file) {
  brillo::Blob data;
  InstallOperation operation;

  uint64_t total_blocks = BlocksInExtents(new_extents);
  if (chunk_blocks == -1)
    chunk_blocks = total_blocks;

  for (uint64_t block_offset = 0; block_offset < total_blocks;
       block_offset += chunk_blocks) {
    // Split the old/new file in the same chunks. Note that this could drop
    // some information from the old file used for the new chunk. If the old
    // file is smaller (or even empty when there's no old file) the chunk will
    // also be empty.
    // Split the file into chunk_size-sized pieces -- a critical step.
    vector<Extent> old_extents_chunk = ExtentsSublist(
        old_extents, block_offset, chunk_blocks);
    vector<Extent> new_extents_chunk = ExtentsSublist(
        new_extents, block_offset, chunk_blocks);
    NormalizeExtents(&old_extents_chunk);
    NormalizeExtents(&new_extents_chunk);

    // lbb test
    LOG(INFO) << "file name " << name.c_str() << ".";
    LOG(INFO) << "old_extents_DeltaReadFile "
              << utils::BlocksInExtents(old_extents_chunk) * kBlockSize
              << " bytes";

    TEST_AND_RETURN_FALSE(ReadExtentsToDiff(old_part,
                                            new_part,
                                            old_extents_chunk,
                                            new_extents_chunk,
                                            version,
                                            &data,
                                            &operation));

    ......
    ......
  return true;
}
```
```cpp
// payload_generator/delta_diff_utils.cc

bool ReadExtentsToDiff(const string& old_part,
                       const string& new_part,
                       const vector<Extent>& old_extents,
                       const vector<Extent>& new_extents,
                       const PayloadVersion& version,
                       brillo::Blob* out_data,
                       InstallOperation* out_op) {
  InstallOperation operation;

  // We read blocks from old_extents and write blocks to new_extents.
  uint64_t blocks_to_read = BlocksInExtents(old_extents);
  uint64_t blocks_to_write = BlocksInExtents(new_extents);

  // Disable bsdiff and imgdiff when the data is too big.
  // If the data exceeds kMaxBsdiffDestinationSize (200 MB), skip bsdiff and
  // ship the data whole instead of diffing it.
  bool bsdiff_allowed =
      version.OperationAllowed(InstallOperation::SOURCE_BSDIFF) ||
      version.OperationAllowed(InstallOperation::BSDIFF);
  if (bsdiff_allowed &&
      blocks_to_read * kBlockSize > kMaxBsdiffDestinationSize) {
    LOG(INFO) << "bsdiff blacklisted, data too big: "
              << blocks_to_read * kBlockSize << " bytes";
    bsdiff_allowed = false;
  }
  // If the data exceeds kMaxImgdiffDestinationSize (50 MB), skip imgdiff and
  // ship the data whole instead of diffing it.
  bool imgdiff_allowed = version.OperationAllowed(InstallOperation::IMGDIFF);
  if (imgdiff_allowed &&
      blocks_to_read * kBlockSize > kMaxImgdiffDestinationSize) {
    LOG(INFO) << "imgdiff blacklisted, data too big: "
              << blocks_to_read * kBlockSize << " bytes";
    imgdiff_allowed = false;
  }
  ......
  ......
}
```
The logic behind these two values works like this:

1. Every incoming file is first split into chunks of chunk_size (200 MB).
2. Each chunk is then checked against kMaxBsdiffDestinationSize; if it exceeds the limit, the diff is skipped and the data is shipped whole, keeping the device's memory and CPU cost bounded.
3. With the defaults, chunk_size equals kMaxBsdiffDestinationSize, so an APK of any size will be diffed; only the number of chunks varies. That makes the kMaxBsdiffDestinationSize check look somewhat redundant.

If instead you want files larger than chunk_size to skip diffing entirely, that's easy: set kMaxBsdiffDestinationSize smaller than chunk_size.
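The chunking arithmetic above can be sketched in a few lines. This is a minimal model, not update_engine code: the constants mirror the 200 MiB defaults discussed above, and the helper names are mine. The real check in ReadExtentsToDiff compares the old extents of each chunk, but the size reasoning is the same.

```cpp
#include <cstdint>

// Defaults discussed above: chunk_size and kMaxBsdiffDestinationSize are
// both 200 MiB in the stock sources.
constexpr uint64_t kChunkSize = 200ULL * 1024 * 1024;
constexpr uint64_t kMaxBsdiffDestinationSize = 200ULL * 1024 * 1024;

// Number of chunks DeltaReadFile's loop splits a file of `file_size`
// bytes into (ceiling division by the chunk size).
constexpr uint64_t NumChunks(uint64_t file_size) {
  return (file_size + kChunkSize - 1) / kChunkSize;
}

// Whether a chunk of `chunk_bytes` still qualifies for bsdiff: the real
// code blacklists bsdiff when the size is strictly greater than the limit.
constexpr bool BsdiffAllowed(uint64_t chunk_bytes) {
  return chunk_bytes <= kMaxBsdiffDestinationSize;
}
```

For the 700 MB APK, NumChunks yields 4 chunks, and since no chunk can exceed kChunkSize == kMaxBsdiffDestinationSize, every chunk is still diffed; lowering kMaxBsdiffDestinationSize below kChunkSize is what would make full-size chunks skip the diff.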
This analysis deliberately skips a detailed walkthrough of the overall flow; we only looked at how to solve this one problem. I'll explain delta_generator's flow in detail in a later post.

To sum up: on the Qualcomm 8.1 platform, large files (such as a 700 MB APK) were not being diffed during incremental package generation. Comparing against the 9.0 behavior led into the payload_generator code, in particular the roles of DeltaReadFile and ReadExtentsToDiff. Fixing the fs_interface assignment (and, if desired, tuning the kMaxBsdiffDestinationSize threshold) makes large files diff correctly.