/var/lib/yum/yumdb

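This article explains how Yum works: how a Yum server organizes RPM packages and their metadata, and how clients use that information to install and manage packages. It also covers the structure and purpose of YumDB, the external database that Yum uses.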

YumDB

Since version 3.2.26, yum has stored additional information about installed packages in a location outside of the rpm database. None of the information stored there is critical to yum performing its function, but it enhances the user experience and makes it possible to know more about the context in which a package was installed.

Format

The yumdb is a simple flat file database. The filesystem creates a simple tree structure:

   /var/lib/yum/yumdb/
                      p/
                        $checksum-packagename-$ver-$rel.$arch/keyname

Each keyname is a file and the contents of that file are the values.

Note: since 3.2.28, hardlinks are allowed between different keys. This saves on load time and storage, but it means that if you try to change the data using a text editor, it will probably change more than you want it to.
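Because each key is just a file, reading a value takes only a few lines. The sketch below is a minimal illustration, not yum's own code. It builds a throwaway tree in a temporary directory that mimics the layout above and reads keys back. The checksum `0123abcd`, the package `foo-1.0-1.noarch`, and the helper names `yumdb_get` and `yumdb_keys` are made up for the example, and the single-letter directory is assumed to be the first letter of the package name.

```python
import os
import tempfile


def yumdb_get(pkg_dir, key):
    """Return the contents of the key file, or None if the key is not set."""
    try:
        with open(os.path.join(pkg_dir, key)) as fh:
            return fh.read()
    except FileNotFoundError:
        return None


def yumdb_keys(pkg_dir):
    """List the keys stored for one package (one file per key)."""
    return sorted(os.listdir(pkg_dir))


# Throwaway tree that mimics the layout; every name here is hypothetical.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "f", "0123abcd-foo-1.0-1.noarch")
os.makedirs(pkg_dir)
for key, value in (("from_repo", "base"), ("reason", "user")):
    with open(os.path.join(pkg_dir, key), "w") as fh:
        fh.write(value)

print(yumdb_keys(pkg_dir))
print(yumdb_get(pkg_dir, "from_repo"))
print(yumdb_get(pkg_dir, "note"))  # a key that was never set
```

On a real system the yum API or the yumdb script remains the recommended way to read these values.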

Why not a "real" database

The two main operations that yum uses the yumdb for are:

  • Given an installed package XYZ-2-1.noarch, get the value of yumdb key FOO. (Eg. yumdb get from_repo yum).
  • Given an installed package XYZ-2-1.noarch, set the value of yumdb key FOO to BAR. (Eg. yumdb set from_repo special yum).

...using the filesystem allows both of those operations to be fast and atomic. Any other approach is unlikely to be significantly better for the two main uses, and the most common suggestions, "sqlite" and a key/value store (like "libdb*"), fail at least one of those tests. Using the filesystem makes it easy to:

  • Keep all the yum code simple.
  • Have isolation. Eg. Something goes wrong and the "reason" key for package XYZ is broken, nothing else should be affected.
  • Have a knowledgeable sysadmin fix any problems.
  • Have interoperability (it's trivial to do the get/set operations from any language without having to use the yum API -- although we still don't recommend it).
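One common way to get an atomic "set" on a filesystem is to write the new value to a temporary file in the same directory and then rename it over the key file. The sketch below shows that pattern under stated assumptions: `yumdb_set` is a hypothetical helper, the package directory is a temporary stand-in, and nothing here claims to be how yum itself writes its keys.

```python
import os
import tempfile


def yumdb_set(pkg_dir, key, value):
    """Atomically replace the value of `key` for one package directory."""
    # Write the new value to a temporary file next to the key file, then
    # rename it into place. A rename within one filesystem is atomic on
    # POSIX, so a reader sees either the old value or the new one.
    fd, tmp_path = tempfile.mkstemp(dir=pkg_dir, prefix="." + key + ".")
    try:
        with os.fdopen(fd, "w") as fh:
            fh.write(value)
        os.replace(tmp_path, os.path.join(pkg_dir, key))
    except BaseException:
        os.unlink(tmp_path)  # do not leave a stray temporary file behind
        raise


pkg_dir = tempfile.mkdtemp()  # stands in for one package directory
yumdb_set(pkg_dir, "from_repo", "base")
yumdb_set(pkg_dir, "from_repo", "special")

with open(os.path.join(pkg_dir, "from_repo")) as fh:
    current = fh.read()
print(current)
print(os.listdir(pkg_dir))
```

A side effect of this pattern is that the key file gets a fresh inode on every set, so a hardlink shared with another key is broken rather than modified. Editing the file in place with a text editor does not give that guarantee.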

There are two minor downsides to using the filesystem:

  • Searching is not fast (Eg. yumdb search from_repo updates-testing). The main thing to realize here is that no yum tool currently needs to perform operations like this.
  • Loading all keys of XYZ from all installed packages is not fast either. The only use case here is loading the checksum data to calculate rpmdb-versions, on install/etc. ... however we need a separate index for this anyway, as when we need to know this information quickly we don't want to load the packages at all.
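To see why searching is the slow case, note that a search has to open one key file per installed package. The following sketch walks a small made-up tree to find every package whose `from_repo` equals a given value. The helper `yumdb_search`, the package names, and the checksums are all hypothetical.

```python
import os
import tempfile


def yumdb_search(root, key, wanted):
    """Return package directory names whose `key` file equals `wanted`."""
    matches = []
    for bucket in sorted(os.listdir(root)):
        bucket_path = os.path.join(root, bucket)
        for pkg in sorted(os.listdir(bucket_path)):
            try:
                with open(os.path.join(bucket_path, pkg, key)) as fh:
                    if fh.read() == wanted:
                        matches.append(pkg)
            except FileNotFoundError:
                pass  # this package does not have the key set
    return matches


# Made-up fixture: three packages installed from two repos.
root = tempfile.mkdtemp()
fixture = [
    ("a", "1111-aaa-1-1.noarch", "base"),
    ("b", "2222-bbb-2-1.noarch", "updates-testing"),
    ("c", "3333-ccc-3-1.noarch", "updates-testing"),
]
for bucket, pkg, repo in fixture:
    pkg_dir = os.path.join(root, bucket, pkg)
    os.makedirs(pkg_dir)
    with open(os.path.join(pkg_dir, "from_repo"), "w") as fh:
        fh.write(repo)

print(yumdb_search(root, "from_repo", "updates-testing"))
```

The cost grows with the number of installed packages, which is acceptable only because, as noted above, no yum tool depends on this kind of query.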

Stored information

One of the desires for the yumdb is that users/plugins/etc. could store almost arbitrary information in the yumdb, and have it attached to specific packages. So listing a "canonical" set of keys is never going to be possible. At some point there may be an API to get a list of "keys that should migrate on a package update", but that isn't in 3.2.29 atm.

So here's a list of all the items that should be set for every package (from yumdb info) from 3.2.29 onwards:

  • from_repo: the name of the repo from which the pkg was installed
  • from_repo_revision: Repo. revision. Or ctime for a local package.
  • from_repo_timestamp: Repo. timestamp. Or mtime for a local package.
  • reason: reason for installing this pkg (user, dep, etc)
  • releasever: $releasever of the system at the time the pkg was installed (so you can look for pkgs which have lingered across release updates)
  • installed_by (3.2.28): The loginuid of the user who first installed this package (note that some tools which call yum don't obey loginuid, this not being set is one of many problems that introduces). This doesn't cross Obsoletes.
  • changed_by (3.2.28): The loginuid of the user who last installed this package.

These are known other keys:

  • checksum_type: The type of the checksum for the installed pkg. Eg. md5, sha1, sha256.
  • checksum_data: The value of the checksum for the installed pkg.
  • origin_url (3.2.29): Requires a newer urlgrabber. This is the URL that the package was downloaded from.
  • command_line: The command line used to install this pkg (only set if pkg. installed from a tool that has a command line).
  • installonly: Not set by yum, but looked at to see if installonly packages should be automatically removed.
  • group_member (3.2.29+?): Set by yum if a package was installed as part of a "group install" (beta patch).

Accessing this information

There is a script called 'yumdb' in yum-utils which allows you to access this information:

  • get the repo from which yum-utils was installed:
           yumdb get from_repo yum-utils
    

  • set a note on the packages 'joe' and 'geany'
           yumdb set note "installed by seth b/c he likes them" joe geany
    

  • Dump out all yumdb values about yum and yum-utils:
           yumdb info yum-utils yum
    

History

Long ago in a galaxy far away known as 2007 - we asked for the ability to write this kind of data into the rpmdb itself. We asked again in 2009. With no formal answer on the subject, but having been told informally "no", we decided to implement it in a db outside of the rpmdb. In order to keep it flexible we just needed key,value pairs tied to a pkgid.

other info: rpm.org ticket on this subject: http://rpm.org/ticket/43


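How yum works

yum relies on two parts working together: the yum server and the yum tool on the client. The two are described in turn below.

  • yum server

    All RPM packages to be released are placed on the yum server for others to download. Packages are built and published separately by kernel version and CPU architecture. The yum server only needs to offer plain downloads, over either FTP or HTTP. The most important job on the server side is to compile the basic information about every RPM package: its version, conf files, binary information, and, crucially, its dependencies. The createrepo tool is used on the yum server to turn this summary information into a "manifest", which describes the information from each RPM package's spec file.

  • yum client

    Every time the client runs yum install or yum search, it parses every configuration file ending in .repo under /etc/yum.repos.d; these files specify the addresses of the yum servers. yum periodically refreshes the package "manifest" from the yum server and stores the downloaded manifest in its own cache, whose location is configured in /etc/yum.conf (by default under /var/cache/yum). Whenever yum installs a package, it looks up the manifest in this cache directory and uses the package descriptions in it to determine the package name, version, and required dependencies. It then downloads the RPM packages from the yum server and installs them (provided the RPM packages are not already cached).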

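Setting up a yum server

1. Install the createrepo tool
$ yum install createrepo

2. Create the repo publish directories (-p creates the missing parent directories)
$ mkdir -p /var/www/yum/centos/5/{i386,x86_64}
$ mkdir -p /var/www/yum/centos/6/{i386,x86_64}

3. Build a few RPM packages with rpmbuild, for example the ones below, which differ in version and in package name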
rpm_test-0.0.1-3.noarch.rpm
rpm_test-0.0.2-3.noarch.rpm
rpm_test2-0.0.2-3.noarch.rpm

4. Copy the RPM packages into the publish directory
$ cp rpm_test*-0.0.*.rpm /var/www/yum/centos/5/i386/

5. Generate the metadata with createrepo
$ createrepo -o /var/www/yum/centos/5/i386/ /var/www/yum/centos/5/i386
3/3 - rpm_test-0.0.1-3.noarch.rpm
Saving Primary metadata
Saving file lists metadata
Saving other metadata

6. Point apache or nginx at the publish directory

After createrepo runs, the following directory and files are generated under /var/www/yum/centos/5/i386/:

$ tree repodata/
repodata/
|-- filelists.xml.gz
|-- other.xml.gz
|-- primary.xml.gz
`-- repomd.xml

$ gunzip filelists.xml.gz
$ gunzip primary.xml.gz

filelists.xml records the list of all RPM packages, their versions, and the files they ship (configuration files and so on)

<package pkgid="19c82aa653a394ee1f7dbc7b694fbf0221bc1848" name="rpm_test" arch="noarch"><version epoch="0" ver="0.0.1" rel="3"/><file>/usr/local/rpm_test/conf/test.conf</file><file>/usr/local/rpm_test/test.py</file><file type="dir">/usr/local/rpm_test/conf</file></package> ...
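A package entry like the one above can be read with any XML parser. The sketch below parses that single fragment with Python's standard library and splits its entries into files and directories. It is only an illustration: a complete filelists.xml wraps many such elements in a root element and, as far as I know, declares an XML namespace, in which case the tag lookups would need the namespace prefix.

```python
import xml.etree.ElementTree as ET

# The <package> fragment shown above, copied verbatim.
fragment = (
    '<package pkgid="19c82aa653a394ee1f7dbc7b694fbf0221bc1848" '
    'name="rpm_test" arch="noarch">'
    '<version epoch="0" ver="0.0.1" rel="3"/>'
    '<file>/usr/local/rpm_test/conf/test.conf</file>'
    '<file>/usr/local/rpm_test/test.py</file>'
    '<file type="dir">/usr/local/rpm_test/conf</file>'
    '</package>'
)

pkg = ET.fromstring(fragment)
version = pkg.find("version")

# name-version-release.arch, assembled from the attributes.
nvra = "%s-%s-%s.%s" % (
    pkg.get("name"), version.get("ver"), version.get("rel"), pkg.get("arch")
)

# Entries without a type attribute are regular files; type="dir" marks directories.
files = [f.text for f in pkg.findall("file") if f.get("type") != "dir"]
dirs = [f.text for f in pkg.findall("file") if f.get("type") == "dir"]

print(nvra)
print(files)
print(dirs)
```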

primary.xml records the description of each RPM package, including its dependencies

Configuring the client

$ vim /etc/yum.repos.d/firefoxbug.repo

[firefoxbug]
name=firefoxbug
baseurl=http://42.120.7.71/centos/5/i386/
enabled=1
gpgcheck=0
gpgkey=
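A .repo file is an INI-style file, so its contents are easy to inspect programmatically. The sketch below reads the configuration above with Python's configparser and collects the base URL of every enabled repo. It is a rough illustration only: yum uses its own parser, and yum variables such as $releasever or $basearch would not be expanded here. Treating a missing `enabled` line as enabled is an assumption of this sketch.

```python
import configparser

# The repo definition from above, embedded so the example is self-contained.
repo_text = """\
[firefoxbug]
name=firefoxbug
baseurl=http://42.120.7.71/centos/5/i386/
enabled=1
gpgcheck=0
gpgkey=
"""

# interpolation=None keeps any literal % in a URL from being interpreted.
parser = configparser.ConfigParser(interpolation=None)
parser.read_string(repo_text)

enabled = {}
for repo_id in parser.sections():
    section = parser[repo_id]
    # Assumption: a repo without an explicit "enabled" line counts as enabled.
    if section.getboolean("enabled", fallback=True):
        enabled[repo_id] = section["baseurl"]

print(enabled)
```

To read a real file instead of the embedded string, `parser.read("/etc/yum.repos.d/firefoxbug.repo")` can replace the `read_string` call.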

Inspecting the local yum cache

By default the cache is under /var/cache/yum, which holds the cache for each repo:

/var/cache/yum/
|-- base
|   |-- cachecookie
|   |-- mirrorlist.txt
|   |-- packages
|   |-- primary.xml.gz
|   |-- primary.xml.gz.sqlite
|   `-- repomd.xml
|-- epel
|   |-- 76c4dcbfaf075e55d5876839eb11c4f33b3a2495-primary.sqlite
|   |-- cachecookie
|   |-- mirrorlist.txt
|   |-- packages
|   `-- repomd.xml
|-- firefoxbug
|   |-- cachecookie
|   |-- packages
|   |-- primary.xml.gz
|   |-- primary.xml.gz.sqlite
|   `-- repomd.xml
|-- timedhosts.txt
|-- updates
|   |-- cachecookie
|   |-- mirrorlist.txt
|   |-- packages
|   |-- primary.sqlite
|   `-- repomd.xml
  • Look at the firefoxbug repo: primary.xml.gz is the "manifest" from the yum server, but here it is also stored in sqlite form, so the sqlite db can be inspected:
$ sqlite3 primary.xml.gz.sqlite
sqlite> .table
conflicts  db_info    files      obsoletes  packages   provides   requires
sqlite> select * from packages;
1|896712eb4b4af2d61745dd30e0a6f6513043fd69|rpm_test|noarch|0.0.2|0|3|rpm_test|rpm_test by Wanghua||1406360629|1406360561|Commercial||tools|firefoxbug|rpm_test-0.0.2-3.src.rpm|280|2402||2734|268|816|rpm_test-0.0.2-3.noarch.rpm||sha
2|3ad546bd3ce28b0a82a1387f438f456349e20c78|rpm_test2|noarch|0.0.2|0|3|rpm_test|rpm_test by Wanghua||1406363739|1406363674|Commercial||tools|firefoxbug|rpm_test2-0.0.2-3.src.rpm|280|2406||2738|268|816|rpm_test2-0.0.2-3.noarch.rpm||sha
3|19c82aa653a394ee1f7dbc7b694fbf0221bc1848|rpm_test|noarch|0.0.1|0|3|rpm_test|rpm_test by Wanghua||1406360629|1406356964|Commercial||tools|firefoxbug|rpm_test-0.0.1-3.src.rpm|280|2402||2733|268|816|rpm_test-0.0.1-3.noarch.rpm||sha
sqlite> select * from requires;
/bin/sh|||||1|TRUE
python|GE|0|2.4.3||1|FALSE
/bin/sh|||||2|TRUE
python|GE|0|2.4.3||2|FALSE
/bin/sh|||||3|TRUE
python|GE|0|2.4.3||3|FALSE
  • Every time yum installs or removes a package, it queries this sqlite DB and then acts accordingly.
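The kind of lookup yum needs from this database is "what does package X require?". The sketch below rebuilds a cut-down copy of the two tables in memory, loads the rows from the session above, and joins them. The column names (`pkgKey`, `pkgId`, `flags`, `pre`, and so on) are my reading of the session output and of the schema createrepo generates, and only a subset of the `packages` columns is kept, so treat the schema as an assumption rather than a reference.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE packages (pkgKey INTEGER PRIMARY KEY, pkgId TEXT, name TEXT,
                       arch TEXT, version TEXT, epoch TEXT, release TEXT);
CREATE TABLE requires (name TEXT, flags TEXT, epoch TEXT, version TEXT,
                       release TEXT, pkgKey INTEGER, pre BOOLEAN);
""")

# Rows taken from the sqlite session above (packages trimmed to 7 columns).
conn.executemany("INSERT INTO packages VALUES (?, ?, ?, ?, ?, ?, ?)", [
    (1, "896712eb4b4af2d61745dd30e0a6f6513043fd69", "rpm_test", "noarch", "0.0.2", "0", "3"),
    (2, "3ad546bd3ce28b0a82a1387f438f456349e20c78", "rpm_test2", "noarch", "0.0.2", "0", "3"),
    (3, "19c82aa653a394ee1f7dbc7b694fbf0221bc1848", "rpm_test", "noarch", "0.0.1", "0", "3"),
])
for key in (1, 2, 3):
    conn.executemany("INSERT INTO requires VALUES (?, ?, ?, ?, ?, ?, ?)", [
        ("/bin/sh", None, None, None, None, key, "TRUE"),
        ("python", "GE", "0", "2.4.3", None, key, "FALSE"),
    ])

# Requirements of one package: join requires to packages through pkgKey.
deps = conn.execute(
    """SELECT r.name, r.flags, r.version
       FROM requires AS r JOIN packages AS p ON p.pkgKey = r.pkgKey
       WHERE p.name = ? ORDER BY r.name""",
    ("rpm_test2",),
).fetchall()
print(deps)
```

Against a real cache, the same query can be run by connecting to the primary sqlite file of a repo instead of `:memory:` and skipping the CREATE and INSERT steps.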

Clearing the local yum cache

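Running sudo yum clean all removes all of these "manifests" (yum clean needs an argument that says what to clean; all removes everything). The next yum install or similar operation regenerates them.

$ sudo yum clean all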

/var/cache/yum/
|-- base
|   |-- packages
|-- epel
|   |-- packages
|-- firefoxbug
|   |-- packages
|-- updates
|   |-- packages
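  • timedhosts.txt records the time needed to reach each source address, so you can see which sources are slow
  • If the keepcache option in /etc/yum.conf is 1, downloaded RPM packages are kept under /var/cache/yum/xxx/packages (where xxx is the repo name)
  • How does yum install decide whether a package is already installed? That information does not come from /var/cache/yum but from /var/lib/rpm/. This is also why installing a package needs root: it has to write the db under that directory. For the details you would have to read the rpm source code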

