OpenWrt: The Boot Process

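This article walks through the key components and stages of the OpenWrt boot process: the bootloader, kernel initialization, user space startup, and system shutdown. It uses concrete examples to show how booting works on different devices, and explains the role and configuration of the init scripts.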

The Boot Process

Please also see requirements.boot.process
This guide is not up to date! It does not mention procd.

This guide shall help you understand, e.g.

Process Trinity

The machine gets powered on and some very basic, low-level hardware initialization takes place. At this stage you could connect to it over the JTAG port and issue commands.

Bootloader

  1. the bootloader on the flash gets executed
  2. the bootloader performs the POST, which is a low-level hardware initialization
  3. the bootloader decompresses the kernel image from its (known!) location on the flash storage into main memory (RAM)
  4. the bootloader executes the kernel with the init=… option (default is /etc/preinit)

Kernel

  1. the kernel further bootstraps itself (sic!)
  2. issues the command/op-code start_kernel
  3. the kernel scans the mtd partition rootfs for a valid superblock and, once found, mounts the SquashFS partition (which contains /etc). (More info at technical.details)
  4. /etc/preinit does pre-initialization setup (creates directories, mounts filesystems such as /proc and /sys, …)
  5. the kernel mounts any other partition (e.g. the jffs2 partition) under rootfs (the root file system). See flash.layout, preinit and root mount, and also udev. FIXME: verify this
  6. if "INITRAMFS" is not defined, calls /sbin/init (the mother of all processes)
  7. finally, a kernel thread becomes the userspace init process

Init

User space starts when the kernel mounts rootfs, and the very first program to run is (by default) /sbin/init. Remember that the interface between applications and the kernel is the C library and the syscalls it exposes.

  1. init reads /etc/inittab for the "sysinit" entry (default is "::sysinit:/etc/init.d/rcS S boot")
  2. init calls /etc/init.d/rcS S boot
  3. rcS executes the symlinks to the actual startup scripts, located in /etc/rc.d/S##xxxxxx, with the option "boot" (scripts that define no separate boot action treat it as "start")
  4. after rcS finishes, the system should be up and running

Vanilla Startup Scripts

NOTE: Packages you install with opkg will likely add additional scripts!

S05defconfig creates config files with default values for the platform if they do not exist yet; in practice this happens on the first boot after OpenWrt is installed (copies missing files from /etc/defconfig/$board/ to /etc/config/)
S10boot starts hotplug-script, mounts filesystems, starts .., starts syslogd, …
S39usb mount -t usbfs none /proc/bus/usb
S40network starts the network subsystem (runs /sbin/netifd, brings up interfaces and wifi)
S45firewall creates and applies firewall rules from /etc/config/firewall
S50cron starts crond, see → /etc/crontabs/root for configuration
S50dropbear starts dropbear, see → /etc/config/dropbear for configuration
S50telnet checks for a root password; if none is set, /usr/sbin/telnetd gets started
S60dnsmasq starts dnsmasq, see → /etc/config/dhcp for configuration
S95done executes /etc/rc.local
S96led loads the LED configuration from /etc/config/system and sets up the LEDs (writes values to /sys/class/leds/*/*)
S97watchdog starts the watchdog daemon (/sbin/watchdog)
S99sysctl interprets /etc/sysctl.conf
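
The two-digit number in each script name determines the execution order, because the shell expands a pattern such as /etc/rc.d/S* in sorted order. The following is a minimal sketch of that behaviour, assuming a scratch directory in place of the real /etc/rc.d; the directory and the empty files in it are hypothetical stand-ins, not real init scripts.

```shell
#!/bin/sh
# Sketch: show that glob expansion orders S##name entries by their number.
# demo_dir is a hypothetical scratch directory, not the real /etc/rc.d.
demo_dir=$(mktemp -d)

# Create the entries in a deliberately shuffled order.
for name in S50cron S05defconfig S99sysctl S10boot S40network; do
    : > "$demo_dir/$name"
done

# The shell sorts pathname expansion results, so S05 comes before S10, etc.
order=""
for path in "$demo_dir"/S*; do
    order="$order ${path##*/}"
done
order=${order# }

echo "$order"
rm -rf "$demo_dir"
```

The same ordering rule applies to the K## entries used at shutdown.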

The init daemon runs for as long as the system is up. On a shutdown command, init

  1. reads /etc/inittab for the "shutdown" entry (default is "::shutdown:/etc/init.d/rcS K stop")
  2. calls /etc/init.d/rcS K stop
  3. rcS executes the shutdown scripts located in /etc/rc.d/K##xxxxxx with the option "stop"
  4. the system halts/reboots

K50dropbear kills all instances of dropbear
K90network brings down all interfaces and stops netifd
K98boot stops the logger daemons: /sbin/syslogd and /sbin/klogd
K99umount writes caches to disk, unmounts all filesystems

Detailed boot sequence

Boot loader

After the bootloader (grub, in this example) initializes and parses any options that are presented at the boot menu, the bootloader loads the kernel.

Example from the openwrt-x86-ext2-image.kernel file entry for normal boot:

  • "kernel /boot/vmlinuz root=/dev/hda2 init=/etc/preinit [rest of options]"

This entry in the boot/grub/menu.lst file tells grub that the kernel is located under the /boot directory and that its filename is vmlinuz. The rest of the line consists of the options that are passed to the kernel. To see which options grub actually passed, log into the device and type "cat /proc/cmdline".

For my test system, the options that were passed to the kernel at load time were:

  • "root=/dev/hda2 rootfstype=ext2 init=/etc/preinit noinitrd console=ttyS0,38400,n,8,1 reboot=bios"

The options are:

  1. root: root device/partition where the rest of the OpenWrt system is located
  2. rootfstype: format of the root partition, ext2 in this example
  3. init: the first program to call after the kernel is loaded and initialized
  4. noinitrd: all drivers needed to access the rest of the system are built into the kernel, so there is no need to load an initial ramdisk with extra drivers
  5. console: which device to use for the console; here ttyS0 (the first serial port) at 38400 baud with no parity, eight data bits and one stop bit. See the kernel documentation for other options
  6. reboot: tells the kernel how to perform a reboot; "bios" requests a reboot through the BIOS
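
To make the option list concrete, here is a small sketch that picks individual options out of a kernel command line. It uses the example string from this guide rather than the live /proc/cmdline so that it runs anywhere; the helper names get_opt and has_flag are made up for this illustration.

```shell
#!/bin/sh
# Sketch: pull individual options out of a kernel command line.
# The string below is the example from this guide; on a running system
# you could use cmdline=$(cat /proc/cmdline) instead.
cmdline="root=/dev/hda2 rootfstype=ext2 init=/etc/preinit noinitrd console=ttyS0,38400,n,8,1 reboot=bios"

# get_opt KEY: print the value of the first KEY=value word, if any.
get_opt() {
    for word in $cmdline; do
        case "$word" in
            "$1"=*) echo "${word#*=}"; return 0 ;;
        esac
    done
    return 1
}

# has_flag NAME: succeed if the bare word NAME (no value) is present.
has_flag() {
    for word in $cmdline; do
        [ "$word" = "$1" ] && return 0
    done
    return 1
}

echo "init program: $(get_opt init)"
echo "root device:  $(get_opt root)"
has_flag noinitrd && echo "no initial ramdisk requested"
```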

The first program called after the kernel loads is specified in the kernel options entry of the boot loader. For grub, that entry is located in the /boot/grub/menu.lst file of the openwrt–.image.kernel.image file.

[ NOTE: See the man page on grub for all of the grub parameters ] In this example, the entry "init=/etc/preinit" tells the kernel that the first program to run after initializing is "preinit" found in the "/etc" directory located on the disk "/dev/hda" and partition "hda2".

/etc/preinit script

The preinit script's primary purpose is initial checks and setups for the rest of the startup scripts. One primary job is to mount the /proc and /sys pseudo filesystems so access to status information and some control functions are made available. Another primary function is to prepare the /dev directory for access to things like console, tty, and media access devices. The final job of preinit is to start the init daemon process itself.
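
The three jobs described above can be sketched as a dry run. This is not the real /etc/preinit: the run wrapper and the function name preinit_sketch are invented for this illustration, and mounting a tmpfs on /dev is an assumption, since the guide does not say how /dev is prepared. With DRY_RUN=1 the commands are only printed, so the sketch is safe to try on any machine.

```shell
#!/bin/sh
# Sketch of the preinit jobs described above, in dry-run form.
# Set DRY_RUN=0 only as root on a system you are prepared to break.
DRY_RUN=1

# run CMD...: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

preinit_sketch() {
    # 1. Pseudo filesystems for status information and control functions.
    run mount -t proc proc /proc
    run mount -t sysfs sysfs /sys
    # 2. Prepare /dev (tmpfs is an assumption, see the text above).
    run mount -t tmpfs tmpfs /dev
    # 3. Hand over to the init daemon.
    run exec /sbin/init
}

preinit_sketch
```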

Busybox init

Init is considered the "Mother Of All Processes" since it controls things like starting daemons, changing runlevels, setting up the console/pseudo-consoles/tty access daemons, as well as some other housekeeping chores.

Once init is started, it reads the /etc/inittab configuration file to tell it what process to start, monitor for certain activities, and when an activity is triggered to call the relevant program.

The init program used by busybox is a minimalistic daemon. It has no notion of runlevels, so its config file is abbreviated compared to the normal init config file. If you are running a full Linux desktop, you can "man inittab" and read about the normal init process and entries. Fields are separated by a colon and are defined as:

  • [ID] : [Runlevel(s)] : [Action] : [Process to execute ]

For busybox init, the only fields needed are the "ID" (1st), "Action" (3rd) and "Process" (4th) entries. Busybox init differs from a normal init in two ways: the ID field is used to select the TTY/console, and there are no defined runlevels. A minimalistic /etc/inittab would look like:

  1. ::sysinit:/etc/init.d/rcS S boot
  2. ::shutdown:/etc/init.d/rcS K stop
  3. tts/0::askfirst:/bin/ash --login
  4. ttyS0::askfirst:/bin/ash --login
  5. tty1::askfirst:/bin/ash --login

Lines 1 and 2 with a blank ID field indicate they are not specific to any terminal or console. The other lines are directed to specific terminals/consoles.
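
As a rough illustration of the field layout, the following sketch splits inittab-style lines on colons. It is not how busybox parses the file internally; parse_inittab_line is a made-up helper name, and the sample entries are the ones from the listing above.

```shell
#!/bin/sh
# Sketch: split inittab-style lines into their four colon-separated fields.
parse_inittab_line() {
    # IFS=: makes read split on colons; empty fields are preserved and the
    # last variable receives the rest of the line, spaces included.
    # The runlevels field is read but ignored, as busybox init does.
    printf '%s\n' "$1" | {
        IFS=: read -r id runlevels action process
        echo "id='${id}' action='${action}' process='${process}'"
    }
}

parse_inittab_line "::sysinit:/etc/init.d/rcS S boot"
parse_inittab_line "ttyS0::askfirst:/bin/ash --login"
```

An empty id in the output corresponds to an entry that is not tied to any terminal or console.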

Notice that both the "sysinit" and "shutdown" actions are calling the same program (the "/etc/init.d/rcS" script). The only difference is the options that are passed to the rcS script. This will become clearer later on.

At this point, init has parsed the configuration file and is looking for what to do next. So now we get to the "sysinit" entry: call /etc/init.d/rcS with the options "S" and "boot".

/etc/init.d/rcS Script At Startup

At this point, all basic setup has been done, all programs and system/configuration files are accessible, and we are now ready to start the rest of the processes.

The rcS script is quite simple in its function: its sole purpose is to execute all of the scripts in the /etc/rc.d directory with the appropriate options. If you paid attention to the sysinit entry, rcS was called with two options, "S" and "boot", so the script substitutes $1 with "S" and $2 with "boot". The relevant lines in rcS are:

  1.  for i in /etc/rc.d/$1* ; do
  2.      [ -x $i ] && $i $2
  3.  done

The basic breakdown is:

  1. Execute the following line once for every entry (file/link) in the /etc/rc.d directory that begins with "S"
  2. If the file is executable, execute the file with the option "boot"
  3. Repeat at step 1, replacing $i with the next filename until there are no more files to check
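
The loop can be tried out safely against a scratch directory instead of /etc/rc.d. In this sketch rc_dir, make_script, run_rc and the three demo scripts are hypothetical stand-ins; one of the scripts is deliberately left without the executable bit to show that the -x test skips it.

```shell
#!/bin/sh
# Sketch: the rcS loop run against a scratch directory instead of /etc/rc.d.
rc_dir=$(mktemp -d)

make_script() {
    # Each demo script reports its own name and the argument it received.
    printf '#!/bin/sh\necho "%s called with $1"\n' "$1" > "$rc_dir/$1"
}
make_script S10boot
make_script S40network
make_script S50cron
chmod +x "$rc_dir/S10boot" "$rc_dir/S40network"   # S50cron stays non-executable

# Same logic as rcS: $1 selects the prefix ("S" or "K"), $2 is passed through.
run_rc() {
    for i in "$rc_dir"/"$1"*; do
        [ -x "$i" ] && "$i" "$2"
    done
    return 0
}

result=$(run_rc S boot)
echo "$result"
rm -rf "$rc_dir"
```

Unlike the original lines, the sketch quotes "$i" and "$2", which makes no difference for names like S10boot but avoids surprises with unusual file names.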

Unlike Windows, Linux uses file permissions rather than filename extensions to decide whether an entry is executable. For an explanation of permissions and executable files, see "man chmod" on a Linux/Unix machine.

If you look at the /etc/rc.d directory, you may notice that some scripts have links for startup but none for shutdown (e.g., /etc/init.d/httpd), while others have no startup link but do have a shutdown link (e.g., /etc/init.d/umount).

In the case of httpd (the webserver), it doesn't matter if it dies or not, there's nothing to clean up before quitting.

On the other hand, the umount script MUST be executed before shutdown to ensure that all data is flushed to the media before any storage is unmounted; otherwise data corruption could occur. There is no need to call umount at startup, since mounting of storage media is handled elsewhere (for example in /etc/preinit), so there is no startup link for this one.

After the last startup script is executed, you should have a fully operational OpenWrt system.

Notes
