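By the time boot.s finishes, the kernel has been moved to 0x00000000 and execution continues there. The first code to run is head.s. Once it has done its work, this code is itself overwritten, so strictly speaking it is not entirely part of the kernel proper.
head.s is written in AT&T-syntax assembly. Its function is analyzed in detail below: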
/*
* head.s contains the 32-bit startup code.
*
* NOTE!!! Startup happens at absolute address 0x00000000, which is also where
* the page directory will exist. The startup code will be overwritten by
* the page directory.
*/
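The entry point of head.s sits at 0x00000000 and is overwritten by the page directory once it has run. boot.s has already switched the CPU into protected mode (by setting the PE bit in CR0) and loaded IDTR and GDTR, but paging is not yet enabled: the page directory and page tables have not been initialized. Setting them up is the main job of head.s.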
.text
.globl _idt,_gdt,_pg_dir
_pg_dir:
startup_32:
//...
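As the name suggests, _pg_dir marks where the page directory will live once paging is initialized. It shares address 0 with startup_32, the entry point of head.s.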
movl $0x10,%eax
mov %ax,%ds
mov %ax,%es
mov %ax,%fs
mov %ax,%gs
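This loads the data segment registers. The selector 0x10 is 0001 0000 in binary: RPL is 00, TI is 0 (so the GDT is used), and the index is 10 binary, i.e. 2. Counting the dummy descriptor as entry 0, that is entry 2 of the GDT, the data segment. boot.s set up the GDT as follows: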
gdt:
.word 0,0,0,0 | dummy
.word 0x07FF | 8Mb - limit=2047 (2048*4096=8Mb)
.word 0x0000 | base address=0
.word 0x9A00 | code read/exec
.word 0x00C0 | granularity=4096, 386
.word 0x07FF | 8Mb - limit=2047 (2048*4096=8Mb)
.word 0x0000 | base address=0
.word 0x9200 | data read/write
.word 0x00C0 | granularity=4096, 386
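As the listing shows, the base address of that entry is 0. The next step is to set up the kernel stack; until paging is in place, the programmer has to lay out memory by hand.
After that, the IDT and GDT are set up again.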
lss _stack_start,%esp
call setup_idt
call setup_gdt
/*
* setup_idt
*
* sets up a idt with 256 entries pointing to
* ignore_int, interrupt gates. It then loads
* idt. Everything that wants to install itself
* in the idt-table may do so themselves. Interrupts
* are enabled elsewhere, when we can be relatively
* sure everything is ok. This routine will be over-
* written by the page tables.
*/
setup_idt:
lea ignore_int,%edx
movl $0x00080000,%eax
movw %dx,%ax /* selector = 0x0008 = cs */
movw $0x8E00,%dx /* interrupt gate - dpl=0, present */
lea _idt,%edi
mov $256,%ecx
rp_sidt:
movl %eax,(%edi)
movl %edx,4(%edi)
addl $8,%edi
dec %ecx
jne rp_sidt
lidt idt_descr
ret
Every one of the 256 interrupt vectors is pointed at ignore_int.
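As a rough illustration of how setup_idt packs each gate, the sketch below rebuilds the two 32-bit words (the values left in %eax and %edx) in Python rather than assembly. The handler address 0x5428 is a made-up value for illustration only; the real address of ignore_int depends on the build.

```python
def make_interrupt_gate(handler, selector=0x0008, flags=0x8E00):
    """Return the (low, high) 32-bit words of an interrupt gate.

    Mirrors what setup_idt leaves in %eax (low) and %edx (high).
    """
    # low word: selector in bits 31..16, handler offset bits 15..0
    low = (selector << 16) | (handler & 0xFFFF)
    # high word: handler offset bits 31..16, then P/DPL/type (0x8E00)
    high = (handler & 0xFFFF0000) | flags
    return low, high


HANDLER = 0x00005428  # hypothetical address of ignore_int

low, high = make_interrupt_gate(HANDLER)
print(hex(low), hex(high))  # prints: 0x85428 0x8e00

# setup_idt stores the same pair into all 256 slots of _idt
idt = [(low, high)] * 256
```

Because the same pair is written to every slot, any interrupt that arrives before a real handler is installed ends up in ignore_int.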
Next comes the GDT setup:
/*
* setup_gdt
*
* This routines sets up a new gdt and loads it.
* Only two entries are currently built, the same
* ones that were built in init.s. The routine
* is VERY complicated at two whole lines, so this
* rather long comment is certainly needed :-).
* This routine will beoverwritten by the page tables.
*/
setup_gdt:
lgdt gdt_descr
ret
lgdt loads GDTR with the limit and base address of the GDT, taken from gdt_descr:
gdt_descr:
.word 256*8-1 # so does gdt (not that that's any
.long _gdt # magic number, but it works for me :^)
.align 3
_gdt: .quad 0x0000000000000000 /* NULL descriptor */
.quad 0x00c09a00000007ff /* 8Mb */
.quad 0x00c09200000007ff /* 8Mb */
.quad 0x0000000000000000 /* TEMPORARY - don't use */
.fill 252,8,0 /* space for LDT's and TSS's etc */
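Each GDT entry is a quad, i.e. 64 bits, and four entries are filled in here. Intel's design intends each task to have its own descriptors in the GDT as part of the segmentation mechanism; the 252 zero-filled slots are reserved for those LDT and TSS descriptors. At this stage only the second and third entries (the kernel code and data segments) are meaningful. Comparing them against the layout of a GDT descriptor shows what they are for: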
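The segment limit occupies bits 0-15 and bits 48-51, 20 bits in total. With G=0 it is counted in bytes and can cover up to 1 MB; with G=1 it is counted in 4 KB units and can cover up to 4 GB.
The segment base occupies bits 16-31, 32-39 and 56-63, 32 bits in total, so a segment can start anywhere in the 4 GB address space.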
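For 0x00c09a00000007ff, matching the fields against this layout gives a segment base of 0x00000000, so the segment starts at address 0, and a segment limit of 0x007ff.
That is 2048 units of 4 KB, i.e. 8 MB. The c nibble is 1100 binary: G=1 selects 4 KB granularity and D=1 marks a 32-bit segment. The 9 is 1001 binary: P=1 (present), DPL=00 (kernel privilege level) and S=1 (a code or data segment). The a is 1010 binary: type 0xa means the segment is code that can be executed and read.
The same analysis applies to 0x00c09200000007ff. Only the type differs: type 2 means the segment is data that can be read and written.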
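The field-by-field reading above can be checked mechanically. The following is a small Python sketch, not kernel code, that splits a 64-bit descriptor into the fields discussed; the function name and the dictionary keys are made up for this example.

```python
def decode_descriptor(d):
    """Split a 64-bit code/data segment descriptor into its fields."""
    limit = (d & 0xFFFF) | (((d >> 48) & 0xF) << 16)            # bits 0-15, 48-51
    base = ((d >> 16) & 0xFFFFFF) | (((d >> 56) & 0xFF) << 24)  # bits 16-39, 56-63
    access = (d >> 40) & 0xFF                                   # type, S, DPL, P
    flags = (d >> 52) & 0xF                                     # AVL, 0, D, G
    granular = bool(flags & 0x8)
    return {
        "base": base,
        "limit": limit,
        "type": access & 0xF,
        "s": (access >> 4) & 1,
        "dpl": (access >> 5) & 3,
        "present": (access >> 7) & 1,
        "d": (flags >> 2) & 1,
        "g": int(granular),
        # the limit counts 4 KB units when G=1, bytes when G=0
        "size_bytes": (limit + 1) * (4096 if granular else 1),
    }


code = decode_descriptor(0x00C09A00000007FF)
data = decode_descriptor(0x00C09200000007FF)
print(hex(code["type"]), hex(data["type"]), code["size_bytes"] // (1024 * 1024))
# prints: 0xa 0x2 8
```

Both descriptors decode to base 0 and a size of 8 MB; the only field that differs is the type.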
With the IDT and GDT in place, the segment registers and the stack pointer have to be reloaded:
movl $0x10,%eax # reload all the segment registers
mov %ax,%ds # after changing gdt. CS was already
mov %ax,%es # reloaded in 'setup_gdt'
mov %ax,%fs
mov %ax,%gs
lss _stack_start,%esp
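The selector value 0x10 loaded into the data segment registers can be taken apart the same way. This is a minimal Python sketch with a made-up helper name, shown only to make the bit layout of a selector concrete.

```python
def decode_selector(sel):
    """Split a 16-bit segment selector into index, table indicator and RPL."""
    return {
        "index": sel >> 3,      # which descriptor in the table
        "ti": (sel >> 2) & 1,   # 0 = GDT, 1 = LDT
        "rpl": sel & 3,         # requested privilege level
    }


data_sel = decode_selector(0x10)  # loaded into ds, es, fs, gs
code_sel = decode_selector(0x08)  # used for cs
print(data_sel, code_sel)

# each descriptor is 8 bytes, so the selector with its low 3 bits cleared
# is also the byte offset of the descriptor inside the GDT
assert data_sel["index"] * 8 == 0x10
```

So 0x08 selects entry 1 (the code segment) and 0x10 selects entry 2 (the data segment), both in the GDT at privilege level 0.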
Next comes the preparation for running main() in main.c:
xorl %eax,%eax
1: incl %eax # check that A20 really IS enabled
movl %eax,0x000000
cmpl %eax,0x100000
je 1b
movl %cr0,%eax # check math chip
andl $0x80000011,%eax # Save PG,ET,PE
testl $0x10,%eax
jne 1f # ET is set - 387 is present
orl $4,%eax # else set emulate bit
1: movl %eax,%cr0
jmp after_page_tables
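The loop at label 1 relies on address wrap-around: with A20 disabled, address bit 20 is forced to 0, so 0x100000 aliases 0x000000. The toy Python model below is only a sketch of that behaviour under this assumption; the function name, the dictionary used as memory, and the iteration cap are all invented for the example.

```python
def a20_loop_iterations(a20_enabled, value_at_1mb=0, max_iterations=5):
    """Model the A20 test loop; return the iteration on which it exits, or None."""
    # with A20 off, bit 20 of every address is forced to 0
    mask = 0xFFFFFFFF if a20_enabled else 0xFFEFFFFF
    memory = {0x100000 & mask: value_at_1mb}
    eax = 0
    for iteration in range(1, max_iterations + 1):
        eax += 1                               # incl %eax
        memory[0x000000 & mask] = eax          # movl %eax,0x000000
        if memory[0x100000 & mask] != eax:     # cmpl %eax,0x100000 ; je 1b
            return iteration
    return None  # still looping: A20 is off


print(a20_loop_iterations(True))       # prints: 1
print(a20_loop_iterations(True, 1))    # prints: 2
print(a20_loop_iterations(False))      # prints: None
```

The second call shows why the code increments %eax on every pass: a value at 0x100000 that happens to match once cannot match on the next pass unless the two addresses really are the same memory.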
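The code first checks whether A20 is enabled. If it is, a value written to 0x000000 does not show up at 0x100000, the comparison fails and the loop exits; if it is not, 0x100000 wraps around to 0x000000, the two always compare equal and the loop never ends. CR0 is then rewritten, keeping only PG, ET and PE and setting the emulate bit when no 387 is present. After that, execution jumps to after_page_tables: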
.org 0x4000
after_page_tables:
pushl $0 # These are the parameters to main :-)
pushl $0
pushl $0
pushl $L6 # return address for main, if it decides to.
pushl $_main
jmp setup_paging
L6:
jmp L6 # main should never return here, but
# just in case, we know what happens.
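L6 is an infinite loop that runs only if main returns, which should not normally happen. setup_paging is entered with jmp rather than call, so its final ret pops the address of _main that was pushed last, and execution continues in main with L6 as its return address. Now for setup_paging itself: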
/*
* Setup_paging
*
* This routine sets up paging by setting the page bit
* in cr0. The page tables are set up, identity-mapping
* the first 8MB. The pager assumes that no illegal
* addresses are produced (ie >4Mb on a 4Mb machine).
*
* NOTE! Although all physical memory should be identity
* mapped by this routine, only the kernel page functions
* use the >1Mb addresses directly. All "normal" functions
* use just the lower 1Mb, or the local data space, which
* will be mapped to some other place - mm keeps track of
* that.
*
* For those with more memory than 8 Mb - tough luck. I've
* not got it, why should you :-) The source is here. Change
* it. (Seriously - it shouldn't be too difficult. Mostly
* change some constants etc. I left it at 8Mb, as my machine
* even cannot be extended past that (ok, but it was cheap :-)
* I've tried to show which constants to change by having
* some kind of marker at them (search for "8Mb"), but I
* won't guarantee that's all :-( )
*/
.align 2
setup_paging:
movl $1024*3,%ecx
xorl %eax,%eax
xorl %edi,%edi /* pg_dir is at 0x000 */
cld;rep;stosl
movl $pg0+7,_pg_dir /* set present bit/user r/w */
movl $pg1+7,_pg_dir+4 /* --------- " " --------- */
movl $pg1+4092,%edi
movl $0x7ff007,%eax /* 8Mb - 4096 + 7 (r/w user,p) */
std
1: stosl /* fill pages backwards - more efficient :-) */
subl $0x1000,%eax
jge 1b
xorl %eax,%eax /* pg_dir is at 0x0000 */
movl %eax,%cr3 /* cr3 - page directory start */
movl %cr0,%eax
orl $0x80000000,%eax
movl %eax,%cr0 /* set paging (PG) bit */
ret /* this also flushes prefetch-queue */
.align 2
.word 0
idt_descr:
.word 256*8-1 # idt contains 256 entries
.long _idt
.align 2
.word 0
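To check what setup_paging leaves in memory, the routine can be replayed in a few lines of Python. This is a sketch under stated assumptions, not the kernel's code: the addresses of pg0 and pg1 (0x1000 and 0x2000) are not shown in the excerpt above and are assumed here, and memory is modeled as a dictionary of 32-bit words.

```python
PG_DIR = 0x0000
PG0 = 0x1000  # assumed address of pg0 (not shown in the listing above)
PG1 = 0x2000  # assumed address of pg1 (not shown in the listing above)


def build_page_tables():
    """Replay setup_paging on a dict that maps address -> 32-bit word."""
    mem = {addr: 0 for addr in range(0, 3 * 4096, 4)}  # rep stosl: clear 3 pages
    mem[PG_DIR] = PG0 + 7        # present, read/write, user
    mem[PG_DIR + 4] = PG1 + 7
    edi, eax = PG1 + 4092, 0x7FF007
    while True:                  # std; stosl; subl $0x1000,%eax; jge 1b
        mem[edi] = eax
        edi -= 4
        eax -= 0x1000
        if eax < 0:
            break
    return mem


def translate(mem, linear):
    """Walk the two-level tables; raise LookupError if an entry is not present."""
    pde = mem[PG_DIR + 4 * (linear >> 22)]
    if not pde & 1:
        raise LookupError("page directory entry not present")
    pte = mem[(pde & ~0xFFF) + 4 * ((linear >> 12) & 0x3FF)]
    if not pte & 1:
        raise LookupError("page table entry not present")
    return (pte & ~0xFFF) | (linear & 0xFFF)


mem = build_page_tables()
print(hex(mem[PG0]), hex(mem[PG1 + 4092]))  # prints: 0x7 0x7ff007
print(hex(translate(mem, 0x123456)))        # prints: 0x123456
```

The backward fill writes exactly 2048 entries, which covers pg0 and pg1 completely, so every linear address below 8 MB maps to the identical physical address and anything at or above 8 MB has no present page directory entry.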