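Problem 1: the kernel hangs during boot without an obvious error
Symptom
During kernel boot, the log output stops at the USB stage, and RCU stall warnings are then printed repeatedly: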
[ 0.948601] libphy: stmmac: probed
[ 0.952107] usbcore: registered new interface driver rtl8150
[ 0.952217] usbcore: registered new interface driver r8152
[ 0.952298] usbcore: registered new interface driver asix
[ 0.952376] usbcore: registered new interface driver ax88179_178a
[ 0.952452] usbcore: registered new interface driver cdc_ether
[ 0.952527] usbcore: registered new interface driver rndis_host
[ 0.952631] usbcore: registered new interface driver cdc_ncm
[ 0.952707] usbcore: registered new interface driver cdc_mbim
[ 0.954561] dwc3 fcc00000.dwc3: Failed to get clk 'ref': -2
[ 0.955032] naneng-combphy fe820000.phy: phy type select 4 overwriting type 1
[ 0.962840] dwc3 fd000000.dwc3: Failed to get clk 'ref': -2
[ 0.970340] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[ 0.970402] ehci-pci: EHCI PCI platform driver
[ 0.970533] ehci-platform: EHCI generic platform driver
[ 0.972955] ehci-platform fd800000.usb: EHCI Host Controller
[ 0.973272] ehci-platform fd800000.usb: new USB bus registered, assigned bus number 1
[ 0.973945] ehci-platform fd800000.usb: irq 13, io mem 0xfd800000
[ 0.985264] ehci-platform fd800000.usb: USB 2.0 started, EHCI 1.00
[ 0.985658] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 4.19
[ 0.985691] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[ 0.985718] usb usb1: Product: EHCI Host Controller
[ 0.985740] usb usb1: Manufacturer: Linux 4.19.206-CGELV6.03.10.B1 ehci_hcd
[ 0.985761] usb usb1: SerialNumber: fd800000.usb
[ 0.986496] hub 1-0:1.0: USB hub found
[ 0.986572] hub 1-0:1.0: 1 port detected
[ 0.989438] ehci-platform fd880000.usb: EHCI Host Controller
[ 0.989749] ehci-platform fd880000.usb: new USB bus registered, assigned bus number 2
[ 0.990314] ehci-platform fd880000.usb: irq 15, io mem 0xfd880000
[ 1.001954] ehci-platform fd880000.usb: USB 2.0 started, EHCI 1.00
[ 1.002355] usb usb2: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 4.19
[ 1.002388] usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[ 1.002425] usb usb2: Product: EHCI Host Controller
[ 1.002448] usb usb2: Manufacturer: Linux 4.19.206-CGELV6.03.10.B1 ehci_hcd
[ 1.002469] usb usb2: SerialNumber: fd880000.usb
[ 1.003251] hub 2-0:1.0: USB hub found
[ 1.003329] hub 2-0:1.0: 1 port detected
[ 1.004592] ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
[ 1.004656] ohci-platform: OHCI generic platform driver
[ 1.005072] ohci-platform fd840000.usb: Generic Platform OHCI controller
[ 1.005470] ohci-platform fd840000.usb: new USB bus registered, assigned bus number 3
[ 1.115131] ata2: SATA link down (SStatus 0 SControl 300)
[ 1.115240] ata1: SATA link down (SStatus 0 SControl 300)
[ 61.018540] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[ 61.018581] rcu: 1-...0: (1 GPs behind) idle=eb6/1/0x4000000000000000 softirq=29/29 fqs=6000
[ 61.018599] rcu: 2-...0: (1 GPs behind) idle=db2/1/0x4000000000000000 softirq=30/30 fqs=6000
[ 61.018610] rcu: (detected by 3, t=18002 jiffies, g=-1119, q=1)
[ 61.018626] Task dump for CPU 1:
[ 61.018638] kworker/u8:2 R running task 0 75 2 0x0000002a
[ 61.018675] Workqueue: events_power_efficient hub_init_func2
[ 61.018689] Call trace:
[ 61.018708] __switch_to+0xc0/0x124
[ 61.018722] 0xffffffc03c0e39f0
[ 61.018733] Task dump for CPU 2:
[ 61.018743] swapper/0 R running task 0 1 0 0x0000002a
[ 61.018757] Call trace:
[ 61.018771] __switch_to+0xc0/0x124
[ 61.018782] 0xffffffc03ef18080
......
[36679.944299] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[36679.944340] rcu: 2-...0: (1 GPs behind) idle=82e/1/0x4000000000000000 softirq=44/45 fqs=3430204
[36679.944352] rcu: (detected by 3, t=16281339 jiffies, g=-1127, q=4)
[36679.944368] Task dump for CPU 2:
[36679.944379] swapper/0 R running task 0 1 0 0x0000002a
[36679.944394] Call trace:
[36679.944410] __switch_to+0xc0/0x124
[36679.944423] 0xffffffc03ef18080
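To narrow down where the boot stalled, it can help to extract the last message printed before the first RCU stall report, together with any workqueue handler named in the task dumps. The following is a minimal Python sketch, assuming the console log has been saved as text in the format shown above; `summarize_stall` is a hypothetical helper name, not part of any kernel tooling.

```python
import re

# Matches a kernel log line such as "[ 61.018540] rcu: INFO: ..."
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")


def summarize_stall(log_text):
    """Return details about the first RCU stall report in a kernel log.

    Returns None when the log contains no stall report.
    """
    last_time, last_msg = None, None
    for raw in log_text.splitlines():
        m = LINE_RE.match(raw.strip())
        if not m:
            continue  # skip lines without a timestamp, e.g. "......"
        stamp, msg = float(m.group(1)), m.group(2)
        if "detected stalls on CPUs/tasks" in msg:
            stall_time = stamp
            break
        last_time, last_msg = stamp, msg
    else:
        return None  # loop finished without finding a stall report

    # Workqueue handlers named in task dumps anywhere in the log.
    workqueues = re.findall(r"Workqueue:\s+\S+\s+(\S+)", log_text)
    silent = None if last_time is None else round(stall_time - last_time, 3)
    return {
        "stall_time": stall_time,
        "last_message_time": last_time,
        "last_message": last_msg,
        "silent_seconds": silent,
        "workqueue_functions": sorted(set(workqueues)),
    }
```

Applied to the log above, this should report the ata1 "SATA link down" line at 1.115240 as the last message before the stall and hub_init_func2 as the workqueue handler, which points at USB hub initialization as the place to look first.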
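Cause
This is commonly caused by a faulty module driver or a hardware configuration error, for example a USB node that is enabled in the device tree while the kernel USB driver is broken or does not match it, or a fault in the USB hardware circuit.
The NARI (南瑞) RK3568 board does not use USB2, so the unused USB2 supply can be left unpowered; tying it to ground is recommended.
Note one point in particular: when the PHY is unpowered and the USB controller node corresponding to that PHY is not disabled in the kernel DTS, the kernel hangs while initializing the USB controller.
Solution
Disable the usb2 PHY nodes in the device tree file arch/arm64/boot/dts/rockchip/rk3568-evb.dtsi by setting the status property of usb2phy0 and usb2phy1 to "disabled". After the change: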
&usb2phy0 {
	status = "disabled";
};

&usb2phy1 {
	status = "disabled";
};
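Before rebuilding the device tree, the board file can be scanned to confirm that both overrides are in place. This is a minimal Python sketch, assuming label-reference overrides of the form shown above, with no nested child nodes and no commented-out copies of the block; `override_status` is a hypothetical helper, not an existing tool.

```python
import re


def override_status(dts_text, label):
    """Return the status string set by the last `&label { ... };` override.

    Returns None when the label is not overridden or no status is set.
    Assumes the override block contains no nested child nodes.
    """
    block_re = re.compile(r"&" + re.escape(label) + r"\s*\{([^{}]*)\}\s*;")
    status = None
    for body in block_re.findall(dts_text):
        m = re.search(r'\bstatus\s*=\s*"([^"]*)"\s*;', body)
        if m:
            status = m.group(1)  # a later override replaces an earlier one
    return status
```

A check could read the dtsi file as text and verify that `override_status(text, "usb2phy0")` and `override_status(text, "usb2phy1")` both return "disabled". Because a board-level file that includes this dtsi may override the status again, decompiling the final dtb remains the more reliable check.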
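Problem 2: the kernel repeatedly prints "rcu: INFO: rcu_sched detected stalls on CPUs/tasks" during boot
Symptom
The message rcu: INFO: rcu_sched detected stalls on CPUs/tasks means that the listed CPUs have not passed through an RCU quiescent state (for example a context switch) for a long time; in practice, no scheduling has taken place on those CPUs.
Cause
Most likely the system is stuck in some piece of code during boot, for example an infinite loop, a deadlock, or a wait for an interrupt that never arrives.
This is commonly caused by a faulty module driver or a hardware configuration error.
Solution
Disable the modules that may be running around the point where the log stops, such as USB, PCIE and GMAC, to rule out each module's driver in turn.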
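To make this elimination repeatable, one candidate subsystem can be switched off per build by editing the kernel .config text. The sketch below is a minimal Python illustration; `disable_symbols` is a hypothetical helper, and the symbol names in the usage note (USB_SUPPORT, PCI, STMMAC_ETH) are assumptions that should be checked against the Kconfig files of the kernel tree in use.

```python
def disable_symbols(config_text, symbols):
    """Return .config text with each given symbol marked as not set.

    `symbols` are names without the CONFIG_ prefix, e.g. "USB_SUPPORT".
    """
    targets = {"CONFIG_" + s for s in symbols}
    suffix = " is not set"
    out, seen = [], set()
    for line in config_text.splitlines():
        name = None
        if line.startswith("CONFIG_") and "=" in line:
            name = line.split("=", 1)[0]           # e.g. CONFIG_PCI=y
        elif line.startswith("# CONFIG_") and line.endswith(suffix):
            name = line[2:-len(suffix)]            # already disabled
        if name in targets:
            seen.add(name)
            line = "# " + name + suffix
        out.append(line)
    # Symbols absent from the input are appended so the intent is explicit.
    for name in sorted(targets - seen):
        out.append("# " + name + suffix)
    return "\n".join(out) + "\n"
```

For example, `disable_symbols(text, ["USB_SUPPORT"])` turns a `CONFIG_USB_SUPPORT=y` line into `# CONFIG_USB_SUPPORT is not set` and leaves other lines unchanged. Options that depend on the disabled symbol are not touched by this sketch, so the result should still be passed through `make olddefconfig` (or edited in menuconfig) before building.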
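References
A similar problem reported by another user
The kernel boots partway and then hangs at some point, with no obvious error message.
In this situation, first disable the unused modules in the kernel and check them one by one; a module driver is probably at fault.
Solution
In the kernel menuconfig interface, disable pcie.
The boot then got past the pcie messages, but USB error messages appeared. To get the system booting quickly, USB was disabled as well.
In the kernel menuconfig interface, disable usb.
The kernel then booted successfully.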