MKV parser and quick seek.

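This article examines the structure of the MKV file format, including the Meta Seek index, track information, chapters, and clusters, and explains how the Meta Seek index can be used to locate and decode key elements efficiently. It also describes how data sizes are encoded in MKV files and how to compute them, and closes with the strengths and limitations of the format.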

1: The problem of seeking within the file.

 

Overview of the file's sections:

The Meta Seek section contains an index of where all of the other groups in the file are located, such as the Track information, Chapters, Tags, Cues, Attachments, and so on. This element isn't technically required, but without it you would have to search the entire file to find all of the other Level 1 elements. This is because the items can occur in any order. For instance, you could have the Chapters section in the middle of the Clusters. This is part of the flexibility of EBML and Matroska.
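As a rough illustration of how such an index can be read, here is a minimal Python sketch that turns the payload of a Meta Seek (SeekHead) element into a map from element ID to position. The element IDs (Seek 0x4DBB, SeekID 0x53AB, SeekPosition 0x53AC) come from the Matroska specification rather than from this article, and the helper names `read_vint`, `children`, and `parse_seek_head` are made up for the example. Positions are relative to the start of the Segment payload.

```python
from typing import Dict, Iterator, Tuple

SEEK = 0x4DBB           # one index entry (ID from the Matroska spec)
SEEK_ID = 0x53AB        # ID of the element the entry points at
SEEK_POSITION = 0x53AC  # offset relative to the start of the Segment payload


def read_vint(data: bytes, offset: int, keep_marker: bool) -> Tuple[int, int]:
    """Read an EBML variable-length integer; return (value, length in bytes)."""
    first = data[offset]
    if first == 0:
        raise ValueError("unsupported lead byte 0x00")
    length = 9 - first.bit_length()  # leading zero bits + 1
    if offset + length > len(data):
        raise ValueError("truncated field")
    value = first if keep_marker else first & (0xFF >> length)
    for byte in data[offset + 1:offset + length]:
        value = (value << 8) | byte
    return value, length


def children(data: bytes, start: int, end: int) -> Iterator[Tuple[int, int, int]]:
    """Yield (element_id, payload_offset, payload_size) for sibling elements."""
    offset = start
    while offset < end:
        element_id, id_len = read_vint(data, offset, keep_marker=True)
        size, size_len = read_vint(data, offset + id_len, keep_marker=False)
        payload = offset + id_len + size_len
        yield element_id, payload, size
        offset = payload + size


def parse_seek_head(payload: bytes) -> Dict[int, int]:
    """Map each indexed element ID to its Segment-relative position."""
    index: Dict[int, int] = {}
    for element_id, off, size in children(payload, 0, len(payload)):
        if element_id != SEEK:
            continue  # skip anything that is not an index entry
        target_id = position = None
        for child_id, child_off, child_size in children(payload, off, off + size):
            body = payload[child_off:child_off + child_size]
            if child_id == SEEK_ID:
                target_id = int.from_bytes(body, "big")
            elif child_id == SEEK_POSITION:
                position = int.from_bytes(body, "big")
        if target_id is not None and position is not None:
            index[target_id] = position
    return index


# Hand-built sample payload with two entries: Tracks at 0x1000, Cues at 0x2345.
sample = bytes.fromhex(
    "4DBB8C53AB841654AE6B53AC821000"
    "4DBB8C53AB841C53BB6B53AC822345"
)
print({hex(k): v for k, v in parse_seek_head(sample).items()})
# {'0x1654ae6b': 4096, '0x1c53bb6b': 9029}
```

To turn a position into an absolute file offset, a reader would add the file offset of the first byte of the Segment payload.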

 

 

The Track section has basic information about each of the tracks.

 

The Clusters section has all of the Clusters. These contain all of the video frames and audio for each track.

 

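Key to decoding:

Determine the length from the leading marker bits. For example, in 1A 45 DF A3 the first byte is 1A = 0001 1010. Three zero bits precede the first 1 bit, so the ID is 3 + 1 = 4 bytes long.

Element IDs are coded with a UTF-8-like system: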

bits, big-endian
1xxx xxxx - Class A IDs (2^7 -1 possible values) (base 0x8X)
01xx xxxx xxxx xxxx - Class B IDs (2^14-1 possible values) (base 0x4X 0xXX)
001x xxxx xxxx xxxx xxxx xxxx - Class C IDs (2^21-1 possible values) (base 0x2X 0xXX 0xXX)
0001 xxxx xxxx xxxx xxxx xxxx xxxx xxxx - Class D IDs (2^28-1 possible values) (base 0x1X 0xXX 0xXX 0xXX)
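The table above can be sketched in Python as follows. This is a minimal illustration, not a full parser: the function names `element_id_length` and `read_element_id` are made up for the example, and the length marker bits are kept as part of the ID, which is how IDs such as 0x1A45DFA3 are normally written.

```python
from typing import Tuple


def element_id_length(first_byte: int) -> int:
    """Return the ID length in bytes, taken from the leading bits of the first byte."""
    for length in range(1, 5):         # the table covers 1 to 4 bytes
        marker = 0x80 >> (length - 1)  # 0x80, 0x40, 0x20, 0x10
        if first_byte & marker:
            return length
    raise ValueError(f"invalid element ID lead byte: 0x{first_byte:02X}")


def read_element_id(data: bytes, offset: int = 0) -> Tuple[int, int]:
    """Read one element ID; return (id_value, id_length)."""
    length = element_id_length(data[offset])
    raw = data[offset:offset + length]
    if len(raw) < length:
        raise ValueError("truncated element ID")
    return int.from_bytes(raw, "big"), length


header = bytes.fromhex("1A45DFA3A3")
element_id, id_len = read_element_id(header)
print(hex(element_id), id_len)  # 0x1a45dfa3 4
```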

 

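The data size is computed the same way.

For example, in 1A 45 DF A3 A3 the size field is the byte that follows the 4-byte ID: A3 = 1010 0011. The first bit is 1, so the size field is 1 byte long, and the remaining 7 bits, 010 0011, give the value 35.

 

Data size, in octets, is also coded with a UTF-8-like system: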

bits, big-endian
1xxx xxxx - value 0 to 2^7-2
01xx xxxx xxxx xxxx - value 0 to 2^14-2
001x xxxx xxxx xxxx xxxx xxxx - value 0 to 2^21-2
0001 xxxx xxxx xxxx xxxx xxxx xxxx xxxx - value 0 to 2^28-2
0000 1xxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx - value 0 to 2^35-2
0000 01xx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx - value 0 to 2^42-2
0000 001x xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx - value 0 to 2^49-2
0000 0001 xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx - value 0 to 2^56-2
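A minimal Python sketch of the size decoding in the table above. The function name `read_data_size` is made up for the example. Unlike an element ID, the length marker bit is cleared before the value is assembled.

```python
from typing import Tuple


def read_data_size(data: bytes, offset: int = 0) -> Tuple[int, int]:
    """Read one EBML data size; return (value, field_length_in_bytes)."""
    first = data[offset]
    if first == 0:
        raise ValueError("size fields longer than 8 bytes are not supported")
    length = 9 - first.bit_length()   # leading zero bits + 1
    if offset + length > len(data):
        raise ValueError("truncated data size")
    value = first & (0xFF >> length)  # clear the length marker bit
    for byte in data[offset + 1:offset + length]:
        value = (value << 8) | byte
    return value, length


header = bytes.fromhex("1A45DFA3A3")
print(read_data_size(header, 4))  # (35, 1)
```

The upper bounds in the table stop at 2^n - 2 because the all-ones value is reserved in EBML to mean "size unknown"; the sketch returns that value as an ordinary number and leaves its interpretation to the caller.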

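The MKV file format resembles a singly linked list: each element's header gives its size, which tells you where the next element starts. This makes sequential parsing easy.
Seeking, however, is inconvenient.
The Meta Seek information is often of little practical use for this, since it only locates the Level 1 elements and not positions inside the Clusters.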
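The linked-list comparison can be made concrete with a short Python sketch. Each step reads an ID and a size, then jumps over the payload to reach the next element, so reaching the N-th element means visiting the N-1 before it. The names `read_vint` and `walk_elements` are made up for the example, the sample bytes are hand-built, and elements of unknown size (all value bits set to 1) are not handled.

```python
from typing import Iterator, Optional, Tuple


def read_vint(data: bytes, offset: int, keep_marker: bool) -> Tuple[int, int]:
    """Read an EBML variable-length integer; return (value, length in bytes)."""
    first = data[offset]
    if first == 0:
        raise ValueError("unsupported lead byte 0x00")
    length = 9 - first.bit_length()  # leading zero bits + 1
    if offset + length > len(data):
        raise ValueError("truncated field")
    value = first if keep_marker else first & (0xFF >> length)
    for byte in data[offset + 1:offset + length]:
        value = (value << 8) | byte
    return value, length


def walk_elements(data: bytes, start: int = 0,
                  end: Optional[int] = None) -> Iterator[Tuple[int, int, int]]:
    """Yield (element_id, payload_offset, payload_size) for consecutive elements."""
    end = len(data) if end is None else end
    offset = start
    while offset < end:
        element_id, id_len = read_vint(data, offset, keep_marker=True)
        size, size_len = read_vint(data, offset + id_len, keep_marker=False)
        payload = offset + id_len + size_len
        yield element_id, payload, size
        offset = payload + size  # the "next pointer": skip over the payload


# Two hand-built elements: ID 0x4286 with 1 payload byte, ID 0xEC with 2.
sample = bytes.fromhex("42868101EC820000")
for element_id, payload_offset, size in walk_elements(sample):
    print(hex(element_id), payload_offset, size)
# 0x4286 3 1
# 0xec 6 2
```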


Header
Meta Seek Information
Segment Information
Track
Chapters
Clusters
Cueing Data
Attachment
Tagging
