Original:
http://gitbook.liuhui998.com/7_1.html
<wbr><div><span style="font-size:16px; line-height:28px"><strong>1. Introduction</strong></span></div>
<div> <span style="color:#000080">Every object is stored compressed with </span><span style="color:#993300">zlib</span><span style="color:#000080"> and indexed by its </span><span style="color:#993300">SHA</span><span style="color:#000080"> value. Each object contains its </span><span style="color:#99cc00">type</span><span style="color:#000080">, </span><span style="color:#339966">size</span><span style="color:#000080"> and </span><span style="color:#008000">content</span><span style="color:#000080">.</span> </div>
<div> <span style="color:#993300">Git</span><span style="color:#000080"> stores objects in two formats: </span><span style="color:#ff00ff">loose objects</span><span style="color:#000080"> and </span><span style="color:#ff00ff">packed objects</span><span style="color:#000080">.</span> </div>
<div><strong><span style="font-size:16px; line-height:28px">2. Loose Objects</span></strong></div>
<div> <span style="color:#ff00ff">Loose objects</span><span style="color:#003366"> are the simpler format. A loose object is just a file of compressed data on disk, and every object is written to its own file.</span> </div>
<div> <span style="color:#000080">If the </span><span style="color:#993300">SHA</span><span style="color:#000080"> value of your object is </span><span style="color:#808000">ab04d884140f7b0cf8bbf86d6883869f16a46f65</span><span style="color:#000080">, the corresponding file is stored at:</span> </div>
<div><span style="color:#3366ff">GIT_DIR/objects/ab/04d884140f7b0cf8bbf86d6883869f16a46f65</span></div>
<div> <span style="color:#993300">Git</span><span style="color:#003366"> uses the first two characters of the </span><span style="color:#993300">SHA</span><span style="color:#003366"> value as the subdirectory name, so that no single directory ever holds too many objects. The file name is the remaining 38 characters</span><span style="color:#000080">.</span> </div>
<div><span style="color:rgb(0,0,128)">The following Ruby code illustrates how the object data is stored:</span></div>
<div> <pre class="prettyprint" style="padding-top:2px; padding-right:2px; padding-bottom:2px; padding-left:2px; border-top-width:1px; border-right-width:1px; border-bottom-width:1px; border-left-width:1px; border-top-style:solid; border-right-style:solid; border-bottom-style:solid; border-left-style:solid; border-top-color:rgb(136,136,136); border-right-color:rgb(136,136,136); border-bottom-color:rgb(136,136,136); border-left-color:rgb(136,136,136)"><div><span class="pln">require 'digest/sha1'</span></div><div><span class="pln">require 'zlib'</span></div><div><span class="pln">require 'fileutils'</span></div><div><span class="com" style="color:rgb(136,0,0)"># @git_dir is assumed to be the directory holding the two-character subdirectories</span></div><div><span class="kwd" style="color:rgb(0,0,136)">def</span><span class="pln"> put_raw_object(content, type)</span></div><div><span class="pln">  size = content.bytesize.to_s </span><span class="com" style="color:rgb(136,0,0)"># Git records the size in bytes, not characters</span></div><div><span class="pln">  header = </span><span class="str" style="color:rgb(0,136,0)">"#{type} #{size}\0"</span><span class="pln"> </span><span class="com" style="color:rgb(136,0,0)"># type(space)size(null byte)</span></div><div><span class="pln">  store = header + content</span></div><div><span class="pln">  sha1 = Digest::SHA1.hexdigest(store)</span></div><div><span class="pln">  dir = @git_dir + '/' + sha1[0...2] </span><span class="com" style="color:rgb(136,0,0)"># first two hex characters</span></div><div><span class="pln">  path = dir + '/' + sha1[2..-1] </span><span class="com" style="color:rgb(136,0,0)"># remaining 38 characters</span></div><div><span class="pln">  unless File.exist?(path)</span></div><div><span class="pln">    compressed = Zlib::Deflate.deflate(store)</span></div><div><span class="pln">    FileUtils.mkdir_p(dir)</span></div><div><span class="pln">    File.open(path, 'wb') do |f|</span></div><div><span class="pln">      f.write compressed</span></div><div><span class="pln">    end</span></div><div><span class="pln">  end</span></div><div><span class="pln">  sha1</span></div><div><span class="kwd" style="color:rgb(0,0,136)">end</span></div></pre> </div>
<div><strong><span style="font-size:16px; line-height:28px">3. Packed Objects</span></strong></div>
<div> <span style="color:#003366">The other object storage format is the </span><span style="color:#ff00ff">packfile</span><span style="color:#003366">. Because Git stores every version of every file as a separate object, the loose format can be very inefficient. Imagine changing a single line in a file that is several thousand lines long: Git stores the entire modified file again, which wastes a lot of space.</span> </div>
<div> <span style="color:#993300">Git</span><span style="color:#003366"> uses packfiles to save space. In this format, Git stores only the part of the second file that changed, together with a pointer to the similar file (translator's note: that is, the first file).</span> </div>
<div> <span style="color:#000080">Objects are usually written to disk in the loose format, because that format is cheap to access. Eventually, however, you will want to move objects into the </span><span style="color:#993300">packed format</span><span style="color:#000080"> to save disk space; this can be done with </span><span style="color:#ff00ff">git gc</span><span style="color:#000080">. It uses a fairly complicated heuristic to decide which files are most similar, and computes the deltas based on that analysis. There can be multiple packfiles; when necessary, they can be unpacked into loose objects (git unpack-objects) or repacked (git repack).</span> </div>
<div> <span style="color:#000080">Git also creates a smaller index file for each packfile. </span><span style="color:#993300">The index file</span><span style="color:#000080"> contains the offset of each object within the packfile, so that a specific object can be found quickly by its SHA value.</span> </div>
For more about <span style="color:#993300">packed objects</span>, see "<a title="Read the full article" target="_blank" href="http://hubingforever.blog.163.com/blog/static/171040579201257950790/" style="color:rgb(245,149,19); text-decoration:none; white-space:nowrap">Git打包文件</a>" (Git Packfiles).</wbr>
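<div>The loose-object layout from section 2 can be checked by writing an object and reading it back. The following is a minimal sketch, not Git's own code: the helper names <code>write_loose_object</code> and <code>read_loose_object</code> and the <code>objects_dir</code> parameter are hypothetical, and the sketch assumes SHA-1 object names and an uncorrupted file.</div>

```ruby
require 'digest/sha1'
require 'zlib'
require 'fileutils'
require 'tmpdir'

# Hypothetical helper: stores "type size\0content", zlib-compressed, under
# objects_dir/<first 2 hex chars>/<remaining 38 hex chars>.
def write_loose_object(objects_dir, type, content)
  store = "#{type} #{content.bytesize}\0".b + content.b
  sha1 = Digest::SHA1.hexdigest(store)
  dir = File.join(objects_dir, sha1[0, 2])
  path = File.join(dir, sha1[2..-1])
  unless File.exist?(path)
    FileUtils.mkdir_p(dir)
    File.binwrite(path, Zlib::Deflate.deflate(store))
  end
  sha1
end

# Hypothetical helper: inflates the file, splits the header from the content,
# and verifies both the recorded size and the SHA-1 of the stored bytes.
def read_loose_object(objects_dir, sha1)
  path = File.join(objects_dir, sha1[0, 2], sha1[2..-1])
  store = Zlib::Inflate.inflate(File.binread(path))
  nul = store.index("\0")                 # end of the "type size" header
  type, size = store[0...nul].split(' ', 2)
  content = store[(nul + 1)..-1]
  raise 'size mismatch' unless size.to_i == content.bytesize
  raise 'hash mismatch' unless Digest::SHA1.hexdigest(store) == sha1
  [type, content]
end

if __FILE__ == $0
  Dir.mktmpdir do |objects_dir|
    sha1 = write_loose_object(objects_dir, 'blob', "hello world\n")
    type, content = read_loose_object(objects_dir, sha1)
    puts "#{sha1} #{type} #{content.bytesize} bytes"
  end
end
```

<div>Because the name is the SHA-1 of the header plus the content, the reader can detect a damaged file simply by hashing what it inflated and comparing the result with the path it was asked for.</div>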
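<div>The role of the packfile index described in section 3 can be pictured with a toy model: a list of (SHA, offset) pairs kept sorted by SHA, so that a lookup is a binary search. This is only an illustration of the idea. Git's real index file is a binary format with additional structure such as a fan-out table, and <code>ToyPackIndex</code>, <code>build_toy_index</code> and the offsets below are made up for the example.</div>

```ruby
# Toy model only: entries is an array of [sha1_hex, offset] pairs,
# sorted by sha1_hex so that it can be binary-searched.
ToyPackIndex = Struct.new(:entries) do
  # Returns the offset recorded for sha1, or nil if it is not in this pack.
  def offset_of(sha1)
    hit = entries.bsearch { |pair| pair[0] >= sha1 } # first entry not below sha1
    hit && hit[0] == sha1 ? hit[1] : nil
  end
end

# Builds the toy index from a hash of sha1_hex => offset (hypothetical input).
def build_toy_index(offsets_by_sha)
  ToyPackIndex.new(offsets_by_sha.sort_by { |sha1, _offset| sha1 })
end

if __FILE__ == $0
  index = build_toy_index(
    'e69de29bb2d1d6434b8b29ae775ad8c2e48c5391' => 12,
    '3b18e512dba79e4c8300dd08aeb37f8e728b8dad' => 150
  )
  p index.offset_of('3b18e512dba79e4c8300dd08aeb37f8e728b8dad')
  p index.offset_of('ab04d884140f7b0cf8bbf86d6883869f16a46f65')
end
```

<div>Keeping the entries sorted is what makes the lookup fast: finding one object among n takes on the order of log n comparisons instead of a scan through the whole packfile.</div>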