32-bit / 64-bit checks
Process priority
Zombie processes
Contents:
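32/64-bit
Check whether a program is 32-bit or 64-bit:
# dump -ov filename | grep bit
# dump -Xany -ov filename | grep bit
Check whether the AIX kernel is 32-bit or 64-bit:
# bootinfo -K
Check whether the CPU is 32-bit or 64-bit:
# bootinfo -y
# pmcycles -> show the CPU clock speed
# prtconf -> show more details, for example: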
Number Of Processors: 4
Processor Clock Speed: 4005 MHz
CPU Type: 64-bit
Kernel Type: 64-bit
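The two lines that matter in the prtconf excerpt above can be pulled out with a small filter. This is a minimal sketch, not part of the original notes: the function name bitness_summary is made up for illustration, and it assumes prtconf prints plain "Key: value" lines exactly as in the excerpt.

```shell
# Read prtconf-style "Key: value" lines on stdin and report
# the CPU and kernel word sizes on a single line.
# Assumption: the keys are spelled "CPU Type" and "Kernel Type" as in the excerpt.
bitness_summary() {
    awk -F': *' '
        $1 == "CPU Type"    { cpu = $2 }
        $1 == "Kernel Type" { kernel = $2 }
        END { print "cpu=" cpu " kernel=" kernel }
    '
}
```

On an AIX host this would be used as: prtconf | bitness_summary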
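HMT, SMT and HT
SMT (Simultaneous Multithreading)
HMT (Hardware Multithreading)
HT (Hyper-Threading, Intel's name for simultaneous multithreading)
Two instruction streams run simultaneously on one CPU core within the same clock cycle.
The AIX command for controlling SMT is smtctl: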
[node1:root] smtctl
[node1:root] lsdev -Cc processor
[node1:root] lsattr -El proc0
[node1:root] sar -P ALL 1 2
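Using schedo to influence process priority
-D flag -> reset all scheduler tunables (the former schedtune parameters) to their default values
# vmo -> new command available since AIX 5.2, replaces vmtune
# schedo -> new command available since AIX 5.2, replaces schedtune
# bindprocessor -q -> list the available CPUs
# bindprocessor 13039 3 -> bind process 13039 to CPU 3
# bindprocessor -u 13039 -> remove the binding of process 13039
# nice -5 vmstat 10 3 -> run vmstat with its priority lowered by 5 (priority value goes from 60 to 65)
# nice --5 vmstat 10 3 -> run vmstat with its priority raised by 5 (priority value goes from 60 to 55)
# renice -5 7569 -> raise the priority value by 5 relative to the current value (for example from 65 to 70)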
Viewing the priority of the oracle user's processes
[ora:root:/] ps -lu oracle
PRI: priority of the process/thread
NI: the nice value
SZ: (no note recorded)
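To look only at the PID, PRI and NI columns described above, the ps -l listing can be filtered by header name. This is a rough sketch: the function name pri_ni is hypothetical, and it assumes every column up to NI is non-empty on each row, so that whitespace splitting lines up with the header.

```shell
# Read `ps -l`-style output on stdin and print "PID PRI NI" for each process.
# Columns are located by name from the header line, not by fixed position.
# Assumption: no column to the left of NI is ever blank.
pri_ni() {
    awk '
        NR == 1 {
            for (i = 1; i <= NF; i++) col[$i] = i
            next
        }
        { print $col["PID"], $col["PRI"], $col["NI"] }
    '
}
```

On an AIX host this would be used as: ps -lu oracle | pri_ni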
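Zombie (defunct) processes
1 If the system has many zombie processes, rebooting AIX is the recommended way to clear them.
2 How can the original command name of a zombie process be determined? Use the kdb command; see "IBM Defunct processes on AIX.pdf".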
# kdb
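(In the commands below, HEX_PID, SLOT and THREAD_SLOT are placeholders for the values found in the previous step.)
List all zombie processes:
(0)> p * |grep -i defunct
Filter by the hexadecimal PID and note the slot number of the process:
(0)> p * |grep HEX_PID
Look the process up by its slot number:
(0)> p SLOT
List all threads of the process; note the thread slot number and the thread name:
(0)> tpid HEX_PID
Show the details of one thread:
(0)> th THREAD_SLOT
Show the user area of one thread:
(0)> u THREAD_SLOT |grep "exec file"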
Example (there is no defunct process on my machine, so an oracle LOCAL process is used for the test):
[ora:root:/] ps -ef|grep LOCAL
oracle 114846 1 0 Oct 24 - 0:00 oracleptdb (LOCAL=NO)
oracle 749604 1 0 Oct 24 - 3:30 oracleptdb (LOCAL=NO)
oracle 778468 1 0 Oct 24 - 2:51 oracleptdb (LOCAL=NO)
oracle 786432 1 0 Oct 24 - 0:51 oracleptdb (LOCAL=NO)
114846 in hexadecimal is 001C09E
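The conversion above can be done in the shell with printf. A minimal sketch: the helper name pid_to_hex is made up, and the 7-digit zero padding is only chosen to match the width of the PID column in the kdb listings in these notes.

```shell
# Convert a decimal PID to uppercase hexadecimal, zero-padded to 7 digits
# so that it can be pasted into kdb's `p * | grep` filter.
pid_to_hex() {
    printf '%07X\n' "$1"
}

pid_to_hex 114846   # prints 001C09E
```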
[ora:root:/] kdb
The specified kernel file is a 64-bit kernel
Preserving 1424670 bytes of symbol table
First symbol __mulh
START END
0000000000001000 0000000003E5C050 start+000FD8
F00000002FF47600 F00000002FFDC940 __ublock+000000
000000002FF22FF4 000000002FF22FF8 environ+000000
000000002FF22FF8 000000002FF22FFC errno+000000
F100070F00000000 F100070F10000000 pvproc+000000
F100070F10000000 F100070F18000000 pvthread+000000
PFT:
PVT:
id....................0002
raddr.....0000000002000000 eaddr.....F200800050000000
size..............00080000 align.............00001000
valid..1 ros....0 fixlmb.1 seg....0 wimg...2
(0)> p * |grep -i defunct
(0)> p * |grep 001C09E
pvproc+007000 28 oracle ACTIVE 001C09E 0000001 00000003B37F4590 0 0001
E(0)>
(0)> p * |more
SLOT NAME STATE PID PPID ADSPACE CL #THS
pvproc+000000 0 swapper ACTIVE 0000000 0000000 0000000030004190 0 0001
pvproc+000400 1 init ACTIVE 0000001 0000000 0000000018001480 0 0001
pvproc+000800 2 wait ACTIVE 0002004 0000000 0000000020006190 0 0001
pvproc+000C00 3 sched ACTIVE 0003006 0000000 0000000050008190 0 0001
pvproc+001000 4 lrud ACTIVE 0004008 0000000 000000004000A190 0 0006
pvproc+001400 5 vmptacrt ACTIVE 000500A 0000000 000000007000C190 0 0001
pvproc+001800 6 psmd ACTIVE 000600C 0000000 000000006000E190 0 0006
pvproc+001C00 7 xmfreed ACTIVE 000700E 0000000 0000000090010190 0 0002
pvproc+002000 8 memp_rbd ACTIVE 0008010 0000000 0000000080012190 0 0001
pvproc+002400 9 memgrdd ACTIVE 0009012 0000000 00000000B0014190 0 0001
pvproc+002800 10 psgc ACTIVE 000A014 0000000 00000000A0016190 0 0001
pvproc+002C00 11 pilegc ACTIVE 000B016 0000000 0000000000002190 0 0003
pvproc+003000 12 xmgc ACTIVE 000C018 0000000 000000017002C190 0 0001
pvproc+003400 13 wait ACTIVE 000D01A 0000000 000000016002E190 0 0001
pvproc+003800 14 wait ACTIVE 000E01C 0000000 0000000190030190 0 0001
pvproc+003C00 15 wait ACTIVE 000F01E 0000000 0000000188033190 0 0001
pvproc+004800 18 aioserve ACTIVE 0012056 0000001 0000000330164190 0 0001
pvproc+004C00 19 lvmbb ACTIVE 0013046 0000000 00000001E00BE190 0 0001
pvproc+005000 20 aioserve ACTIVE 001407A 0000001 0000000310160190 0 0001
pvproc+005400 21 dog ACTIVE 0015080 0000000 00000000C811B190 0 0009
pvproc+005800 22 aioserve ACTIVE 0016084 0000001 0000000352568190 0 0001
pvproc+005C00 23 aioserve ACTIVE 001706E 0000001 00000002F015C190 0 0001
pvproc+006000 24 aioserve ACTIVE 0018062 0000001 0000000350168190 0 0001
pvproc+006400 25 aioserve ACTIVE 001905C 0000001 000000037016C190 0 0001
pvproc+006800 26 dtlogin ACTIVE 001A05E 0000001 0000000038185480 0 0001
pvproc+006C00 27 syncd ACTIVE 001B07C 0000001 0000000388173480 0 000E
pvproc+007000 28 oracle ACTIVE 001C09E 0000001 00000003B37F4590 0 0001
(0)>
The slot number of process 001C09E is 28
(0)> tpid 001C09E (this process has only one thread, so there is a single row)
SLOT NAME STATE TID PRI RQ CPUID CL WCHAN
pvthread+013E00 318 oracle SLEEP 13E07F 03C 2 0 F10006000079C8C8
(0)> (now try a multithreaded process)
(0)> tpid 001B07C
SLOT NAME STATE TID PRI RQ CPUID CL WCHAN
pvthread+003B00 59 syncd SLEEP 03B0F1 03C 2 0 F10001003D7AF4B0
pvthread+004200 66 syncd SLEEP 0420F9 03C 0 0 F10001003D907FB0
pvthread+803900 32825 syncd SLEEP 0391F7 03C 6 0 F1000110114713B0
pvthread+004100 65 syncd SLEEP 0410F5 03C 2 0 F10001003D5E2130
pvthread+803800 32824 syncd SLEEP 0381F3 03C 6 0 F10001100FFB24B0
pvthread+004000 64 syncd SLEEP 0400F1 03C 0 0 F10001003D99FF30
pvthread+803700 32823 syncd SLEEP 0371EF 03C 6 0 F1000110145BE630
pvthread+003F00 63 syncd SLEEP 03F0ED 03C 2 0 F10001003D598BB0
pvthread+803600 32822 syncd SLEEP 0361EB 03C 6 0 F100011011471230
pvthread+003E00 62 syncd SLEEP 03E0E9 03C 2 0 F10001003D5DD4B0
pvthread+803500 32821 syncd SLEEP 0351E7 03C 4 0 F10001101412BB30
pvthread+001C00 28 syncd SLEEP 01C0B7 03C 0 0 F10001003D743D30
pvthread+803400 32820 syncd SLEEP 0341E5 03C 2 0
pvthread+003100 49 syncd SLEEP 0310FF 03C 0 0 F10001003D9073B0
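(0)> (How do you know that 001B07C has multiple threads? Look at the #THS column in the output of the previous command: 000E means 14 threads, including the process's own.)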
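The #THS column is hexadecimal, so it helps to convert it before comparing with the number of rows printed by tpid. A small sketch, with a made-up helper name hex_to_dec:

```shell
# Convert a hexadecimal count, as shown in kdb's #THS column, to decimal.
hex_to_dec() {
    printf '%d\n' "0x$1"
}

hex_to_dec 000E   # prints 14
```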
(now back to one of the oracle LOCAL processes...)
(0)> tpid 001C09E (this process has only one thread, so there is a single row)
SLOT NAME STATE TID PRI RQ CPUID CL WCHAN
pvthread+013E00 318 oracle SLEEP 13E07F 03C 2 0 F10006000079C8C8
(Note that this thread of the process has thread slot 318. Next, look at the details of thread 318.)
(0)> th 318
SLOT NAME STATE TID PRI RQ CPUID CL WCHAN
pvthread+013E00 318 oracle SLEEP 13E07F 03C 2 0 F10006000079C8C8
NAME................ oracle
WTYPE............... WEVENT
.................tid :000000000013E07F ......tsleep :FFFFFFFFFFFFFFFF
...............flags :00000000 ..............flags2 :00000000
...........pmcontext :00000000
DATA.........pvprocp :F100070F00007000
LINKS.....prevthread :F100070F10013E00
..........nextthread :F100070F10013E00
DISPATCH.......synch :FFFFFFFFFFFFFFFF
SCHEDULER...affinity :00000002 .................pri :0000003C
.............boosted :00000000 ...............wchan :F10006000079C8C8
...............state :00000003 ...............wtype :00000001
MISC ..tv_eyec :7076746850524F43 (pvthPROC)
CHECKPOINT......vtid :00000000 .............chkfile :0000000000000000
LOCK........ lock_d @ F100070F10013E30 0000000000000000
PROCFS......procfsvn :0000000000000000
NUMA............rset :0000000000000000
PROFILING.....prbase :0000000000000000 ....prpinned :0000000000000000
.....prflags :00000000 ..........prbufcount :00000000
WLM........class/wlm :00/0000
.............wlm_tag :
THREAD.......threadp :F10001003D9CB400 ........size :00000100
FLAGS............... WAKEONSIG WAKEONCHKPT CDEFER
(0)> more (^C to quit) ?
Query the user area of thread 318:
(0)> u 318 |grep "exec file"
Current exec file information:
exec file..oracle
(0)>
This approach makes it possible to find out which command a defunct process was originally running.
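When there really are defunct processes, their PIDs first have to be turned into hexadecimal before they can be searched for in kdb. The following is a sketch only, not something from the original notes: the function name defunct_hex_pids is hypothetical, and it assumes the PID is the second field of each ps -ef line, as in the listing earlier.

```shell
# Read `ps -ef` output on stdin and print "decimal_pid hex_pid"
# for every entry marked as defunct.
# The pattern is written as defunc[t] so that this awk process does not
# match its own command line when ps -ef is piped straight into it.
# Assumption: field 2 of each line is the PID.
defunct_hex_pids() {
    awk '/defunc[t]/ { printf "%d %07X\n", $2, $2 }'
}
```

On an AIX host this would be used as: ps -ef | defunct_hex_pids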
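How to find the process that created a zombie when its PPID is 1?
When PPID=1, trace can help locate the process that is creating the zombie processes.
Note: this only catches zombies that are created while the trace is running.
I have not tested this; it is recorded here as notes only.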
# trace -an -J tidhk ; sleep 5 ; trcstop
# trcrpt -O pid=on,ids=on,timestamp=3 > /tmp/trace.out
# more /tmp/trace.out
...
...
106 dispatch: cmd=ksh pid=18784 tid=46551 priority=60
old_tid=28153 old_priority=62 CPUID=0
139 fork: pid=24218 tid=28155
106 dispatch: cmd=ksh pid=24218 tid=28155 priority=60
old_tid=46551 old_priority=60 CPUID=0
134 exec: cmd=sleep 5 pid=24218 tid=28155
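In the excerpt above, the fork event follows the dispatch of ksh pid=18784, and the new pid=24218 later execs sleep. Pairing each fork with the most recently dispatched process can be sketched as below. This rests on assumptions that are not verified in these notes: the function name fork_parents is made up, the pid on the fork line is taken to be the child, and the events are taken to be in time order from a single CPU (on a multi-CPU trace the CPUID would also have to be matched).

```shell
# Read trcrpt output on stdin and, for every fork event, print the
# command and pid of the process that was dispatched just before it.
# Assumptions: single-CPU ordering; the pid on the fork line is the child.
fork_parents() {
    awk '
        /dispatch:/ {
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^cmd=/) cmd = substr($i, 5)
                if ($i ~ /^pid=/) pid = substr($i, 5)
            }
        }
        /fork:/ {
            child = ""
            for (i = 1; i <= NF; i++)
                if ($i ~ /^pid=/) child = substr($i, 5)
            print "parent cmd=" cmd " pid=" pid " -> child pid=" child
        }
    '
}
```

It would be run as: fork_parents < /tmp/trace.out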
From the ITPUB blog, link: http://blog.itpub.net/271063/viewspace-1060114/. If you repost this, please credit the source; otherwise legal liability will be pursued.