pio and topio

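Introduction to the pio and topio programs
pio and topio are a pair of tools for monitoring per-process I/O activity on a system. pio shows read/write statistics for processes, and topio sorts on those statistics to help users identify I/O-intensive processes. The tools support several operating systems, including Solaris, HP-UX and Windows.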

I originally wrote this program for Solaris. There are now Windows and HP-UX versions as well. The Solaris section comes first, followed by the HP-UX and Windows sections. If you are on HP-UX, I recommend you glance through the Solaris section first.

pio shows per-process I/O statistics, which the top or prstat command cannot do. Jump to the topio -d example if you're impatient.

 

For instance, the following command shows that your current shell has read and written 10994 characters so far (we'll talk about other columns later).

$ pio -p $$
PID     InpBlk  OutpBlk RWChar  MjPgFlt Comm
392     0       33      10994   0       -ksh

You can watch a process continuously (-H removes the header):
$ pio -p 554
PID     InpBlk  OutpBlk RWChar  MjPgFlt Comm
554     991     0       116099  922     find /
$ while true; do
> pio -Hp 554
> sleep 1
> done
554     2266    0       243623  2095    find /
554     2339    0       251878  2166    find /
554     2408    0       258030  2229    find /
^C$

Or look at all processes (note that pio doesn't need SUID permission to look at other users' processes, and -A prevents the header from being printed):
$ pio -A #columns are pid, InpBlk, OutpBlk, RWChar, MjPgFlt, Comm
0       280     1       0       93      sched
0       280     1       0       93      sched
0       280     1       0       93      sched
1       125     1       68121   92      /etc/init -
2       0       0       0       0       pageout
3       0       210     0       0       fsflush
342     12      4       8691    11      /usr/lib/saf/sac -t 300
...

The next useful thing to do is write a program to sort the RWChar column. I wrote a Perl script specifically for this purpose, appropriately named topio.

$ topio
** WARNING: Running topio without -d may not be **
** what you want. Type topio -h for help.       **
PID     InpBlk  OutpBlk RWChar  MjPgFlt Command
338     530     4       2394164 408     /usr/openwin/bin/Xsun :0 -nobanner -defdepth 24 -auth /var/dt/A:0-oHaGLa
368     96      5       2027102 86      /usr/lib/ssh/sshd
363     188     0       834232  163     dtgreet -display :0
348     196     0       679939  147     /usr/sfw/sbin/snmpd
388     0       4       291163  0       /usr/lib/ssh/sshd
214     7       0       128703  7       /usr/sbin/inetd -s
350     50      1       84051   40      /usr/dt/bin/dtlogin -daemon
1       125     1       68121   92      /etc/init -
247     0       3       57536   0       /usr/lib/utmpd
314     8       0       34965   4       /usr/lib/snmp/snmpdx -y -c /etc/snmp/conf
^C$
Download source code pio.c and type gcc -o pio pio.c. Also download topio and read the line below #!. You can put pio and topio in /usr/local/bin and chmod 755 them. Compared with the previous versions of pio.c and topio, the current version additionally probes the process's major page faults, in the hope that true disk I/O, excluding page cache I/O, can be deduced. More research is being done on this. (Note for x86 Solaris users: gcc 3 has problems with some headers. Use gcc 2.95 instead, unless you want to fix the header files.)

Probably the most useful feature of pio and topio is the -d option of topio, which sorts on the delta, or difference, of a process's Read/Write Characters between two consecutive runs. (While the examples above were run on my laptop, the screen shot below was captured on a server, so the numbers differ.)

$ topio -d -s2 -n5	#display 5 top Delta-I/O processes every 2 seconds
--PID-------RWChar-----DltRWC-----MjPgFlt-DltMPF Command------------------------
 8872     64025286    1835008           7      0 ora_dbw0_ORATEST
 5945    289675626      56832          62      0 ora_lgwr_ORATRN
 5947    497918441      49152         324      0 ora_ckpt_ORATRN
 5773   3917910706      28672          47      0 ora_lgwr_INTTST
 5943   3609392512      16384           1      0 ora_dbw0_ORATRN
--PID-------RWChar-----DltRWC-----MjPgFlt-DltMPF Command------------------------
 8874   3130856681    2108416         112      0 ora_lgwr_ORATEST
18528     11064724     831589           0      0 oracleORATEST
 5945    289729898      54272          62      0 ora_lgwr_ORATRN
 5775   1537267122      49152         223      0 ora_ckpt_INTTST
 8876    302708422      16384         241      0 ora_ckpt_ORATEST
--PID-------RWChar-----DltRWC-----MjPgFlt-DltMPF Command------------------------
 8872     68752070    4726784           7      0 ora_dbw0_ORATEST
 8874   3132257001    1400320         112      0 ora_lgwr_ORATEST
18528     11361185     296461           0      0 oracleORATEST
18526     48811015     178640         113      0 oracleORATEST
 5775   1537381810     114688         223      0 ora_ckpt_INTTST
^C$

Process 5773 has the highest absolute I/O count in the RWChar column according to topio output (without -d, not shown here). But its delta I/O, the difference in absolute I/O between two consecutive runs, only shows up near the top occasionally. This process happens to be the Oracle background process LGWR, which writes to the redo logfiles of the INTTST database. At the time of the capture, this LGWR process wrote 28672 bytes to logfiles in a 2-second period. (LGWR does not read unless the database is being recovered from a crash.) If you're only checking Oracle processes' I/O, you may want to supplement this information with Oracle's own tools, such as the statistics collected in the v$sess_io view. (Unfortunately v$sess_io doesn't record physical writes.)
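
The delta sort behind topio -d can be illustrated in C. This is a minimal sketch, not the actual topio code (which is a Perl script): struct io_sample and the functions compute_deltas and sort_by_delta are hypothetical names introduced here, and giving a newly seen process a delta of 0 is an assumption.

```c
#include <assert.h>
#include <stdlib.h>

/* One process sample; field names are illustrative, not taken from topio. */
struct io_sample {
    long pid;
    unsigned long long rwchar;  /* cumulative RWChar from the current run */
    unsigned long long delta;   /* filled in by compute_deltas() */
};

/* delta = current RWChar - previous RWChar for the same pid.
   A pid not present in prev (or whose counter went backwards, e.g. a
   reused pid) gets delta 0; that choice is an assumption of this sketch.
   The nested loop is quadratic, which is fine for a few hundred processes. */
void compute_deltas(const struct io_sample *prev, size_t nprev,
                    struct io_sample *curr, size_t ncurr)
{
    assert(curr != NULL || ncurr == 0);
    for (size_t i = 0; i < ncurr; i++) {
        curr[i].delta = 0;
        for (size_t j = 0; j < nprev; j++) {
            if (prev[j].pid == curr[i].pid) {
                if (curr[i].rwchar >= prev[j].rwchar)
                    curr[i].delta = curr[i].rwchar - prev[j].rwchar;
                break;
            }
        }
    }
}

/* qsort comparator: largest delta first. */
static int by_delta_desc(const void *a, const void *b)
{
    const struct io_sample *x = a, *y = b;
    if (x->delta < y->delta) return 1;
    if (x->delta > y->delta) return -1;
    return 0;
}

/* Order the current samples so the busiest process comes first. */
void sort_by_delta(struct io_sample *curr, size_t ncurr)
{
    qsort(curr, ncurr, sizeof curr[0], by_delta_desc);
}
```

For example, an RWChar going from 64025286 to 68752070 yields a delta of 4726784, the value shown for PID 8872 in the last block of output above.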

How does it work? Before Solaris 10 (note1), there were two ways to get the I/O count of a process on Solaris. Brendan Gregg's psio Perl program uses the prex utility to probe into the kernel and filter on a specific process. My pio, originally written by looking at Jim Mauro and Richard McDougall's msacct (note2) published in Appendix C of Solaris Internals, fetches the I/O count from the /proc filesystem. (I'm not using microstate accounting as in Jim's program, which is essential in CPU costing but would impose some performance overhead.) Basically, pio gets process I/O statistics from /proc/pid/usage, specifically the fields pr_inblock, pr_oublock and pr_ioch of struct prusage, as explained on pp. 314-5 of Solaris Internals and in the proc(4) man page. You may wonder how much precious information is collected by our UNIX box without ever being used! That's right. If you don't write programs like this to fetch the data, it is collected and simply thrown away.

What the numbers mean: pr_inblock and pr_oublock are generally not very useful. According to Adrian Cockcroft, "inblock and outblock [sic] counters are uninteresting as they only refer to filesystem metadata for the old-style buffer cache". Indeed, beginning with Solaris 2, the old buffer cache is largely replaced by the page cache and is only used to store metadata. So if you see an occasional jump in InpBlk and OutpBlk, it is, for instance, because the allocated file blocks need to be extended or shrunk to accommodate more or less data, so the inode is updated. What I observed is that when a process continuously does I/O, RWChar keeps increasing. InpBlk and OutpBlk remain the same for some time and suddenly jump, remain the same for a while again and jump again. But the ratio of this jump in blocks to the number of characters added to RWChar is not consistent from file to file. That's why the 2nd and 3rd columns of pio output don't look important to me.

The statistic pr_ioch or Read/Write Characters lumps reads and writes together and there's no way to separate them. The only workaround I can think of is something like

#trace read/write syscalls, redirect stderr (which truss outputs to) to Perl filter,
#which prints syscall return value, i.e. number of chars read/written
truss -t read,pread -p pid 2>&1 | perl -nle 'print $1 if /= (\d+)$/'
truss -t write,pwrite -p pid 2>&1 | perl -nle 'print $1 if /= (\d+)$/'
You can do some math there to sum up the return values for a period of time. If scatter-gather I/O is used, you may want to add readv and writev to -t.
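
The summing step can also be done in a small C filter instead of Perl. The following is a minimal sketch under the same assumption as the Perl filter, namely that a successful call ends its truss line with "= <number>"; truss_return_value and sum_truss_stream are hypothetical names introduced here.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Return the number at the end of a line of the form "... = 123",
   or -1 if the line does not end in "= <digits>" (for example when the
   syscall failed and truss printed an error instead of a byte count). */
long long truss_return_value(const char *line)
{
    assert(line != NULL);
    size_t len = strlen(line);

    /* Ignore a trailing newline or carriage return. */
    while (len > 0 && (line[len - 1] == '\n' || line[len - 1] == '\r'))
        len--;

    /* Walk back over the trailing digits. */
    size_t end = len;
    while (len > 0 && isdigit((unsigned char)line[len - 1]))
        len--;
    if (len == end)
        return -1;                      /* no trailing digits */

    /* The digits must be preceded by "= ", as in the Perl regex. */
    if (len < 2 || line[len - 1] != ' ' || line[len - 2] != '=')
        return -1;

    long long value = 0;
    for (size_t i = len; i < end; i++)
        value = value * 10 + (line[i] - '0');
    return value;
}

/* Add up the return values of all lines read from fp.
   Lines longer than the buffer are not handled specially. */
long long sum_truss_stream(FILE *fp)
{
    char buf[4096];
    long long total = 0;

    while (fgets(buf, sizeof buf, fp) != NULL) {
        long long v = truss_return_value(buf);
        if (v > 0)
            total += v;
    }
    return total;
}
```

Calling sum_truss_stream(stdin) from a small main and piping the truss output into it gives the total number of characters for the traced period; running one filter per truss command keeps reads and writes separate.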

Another problem with pio is that RWChar includes all kinds of I/O, i.e. disk I/O as well as terminal and network I/O. If somebody has left the top program running for a long time (because he doesn't know the lighter-weight prstat!), topio may show that this top process has accumulated a lot of RWChar, and possibly a lot of delta I/O in topio -d output, particularly if top was launched with a short interval (like top -s1). You can reproduce this problem with a tight loop of echo "some characters" without a sleep in the loop. The current version of my program incorporates the major page fault statistic in the hope of identifying real disk I/O. Fortunately, people often use topio to monitor daemon processes, including Oracle server processes, which do no terminal I/O. But disk I/O and network I/O (if any) are still mixed.

____________________
note1 Solaris 10 has the powerful DTrace facility, which can be used to provide process I/O statistics.

note2 See the remark on msacct's printf format after the Other OSes section.

HP-UX

Since most of the background is covered in the Solaris section, I only highlight a few points here. pio on HP-UX tells you how many read and write operations a process has performed. topio sorts all processes by either reads or writes.

$ pio -p $$
PID     InpOps  OutpOps MjPgFlt Comm
25240   8       16      0       sh

$ topio -n3 -s2 -kW #display 3 top Delta-Write processes every 2 seconds
--PID ProcName--------- -----Reads ---DltR -----Writs ---DltW -----PFlts ---DltF
   52 vxfsd                    225       0     322383       6          0       0
 1700 midaemon                   0       0          0       0          0       0
13730 ia64_corehw                0       0          0       0          0       0
--PID ProcName--------- -----Reads ---DltR -----Writs ---DltW -----PFlts ---DltF
   52 vxfsd                    225       0     322388       5          0       0
 1429 java                     191       0       8080       1        794       0
 1700 midaemon                   0       0          0       0          0       0

While the Solaris version lumps read and write characters together, the HP-UX version separately counts input and output, and it counts read and write operations, not number of characters. In addition, the HP-UX version no longer needs -d to sort on deltas.

Download source code pio.c and type cc -D_PSTAT64 -o pio pio.c. Also download topio and read the line below #!. You can put pio and topio in /usr/local/bin and chmod 755. [Dec 2008, Alexander Beyn comments "on HPUX 11.00, I had to #define _RUSAGE_EXTENDED before sys/pstat.h was included, otherwise pst_inblock and pst_oublock were not part of the pst_status structure...It looks like HP-UX 11.11 (released in 2000) and newer expose those fields without _RUSAGE_EXTENDED."]

How does it work? pio fetches I/O statistics from pstat, specifically pst_inblock and pst_oublock fields of struct pst_status. You can see these fields in /usr/include/sys/pstat/pm_pstat_body.h (thanks to Don Morris and Christof Meerwald on the newsgroup). Note that judging by the names, you would think they represent number of input and output blocks, just like pr_inblock and pr_oublock on Solaris. But the header file comment says they are block input and output operations.

Windows

Windows Task Manager allows you to view process statistics. On Windows 2000, XP and above, if you go to View | Select Columns, you can add I/O-related counters. There are, however, two limitations. First, Task Manager can't display processes on a remote computer. Second, the I/O counters are absolute values accumulated since process startup. The absolute values answer questions such as "What process has done the most reading, in bytes or in number of reads?" But in reality, one would more often ask another question: "What process is currently doing the most reading?" My topio program answers the second question. Here's a screen shot showing the top 5 processes on server 123.45.67.89 every 2 seconds, sorted by delta write bytes (DltWBts column). I launched Winzip to compress some files right after I started topio.

D:\>perl d:\systools\topio.pl -m123.45.67.89 -n5 -s2 -kw
--PID ProcName---- -----RBytes DltRBts --Reads DltR -----WBytes DltWBts --Writs DltW -----CBytes DltCBts ---Cntls DltC --PFlts DltF
 1864 WinMgmt         24645682   16410    4973   90     5778555   48320   20850   87   448413706       0  7765109   14 1367503  257
  304 SERVICES       377067071    2208 6365120   48   618084696   32980 5756074   51    49798992     510  5599270   56   82296    3
    8 System             34644       0      83    0   504278961    8258  708219   12    71108078       0  4601016   11   41190    5
  316 LSASS            6986726    5024  102785   10    10817344    1756   84879    9     7439645       8   185753    8   48880   25
  552 svchost           562472    1157     999    4      265419    1286     605    2      140400       0     3367    6    3141    1
--PID ProcName---- -----RBytes DltRBts --Reads DltR -----WBytes DltWBts --Writs DltW -----CBytes DltCBts ---Cntls DltC --PFlts DltF
 4256 WINZIP32         1396315 1396168      60   55      216465  216463      20   19       22268    9218      796  417    1506  196
 1864 WinMgmt         24662620   16938    5067   94     5828011   49456   20942   92   448413706       0  7765123   14 1367530  262
  304 SERVICES       377067959     888 6365136   16   618119520   34824 5756100   26    49807164    8172  5599355   85   82296    0
  316 LSASS            6993022    6296  102827   42    10821296    3952   84910   31     7439733      88   185814   61   48880    0
  552 svchost           563629    1157    1003    4      266705    1286     607    2      140400       0     3373    6    3142    1
^C

You might think that if a process is doing a lot of I/O, it must be burning a lot of CPU. That's not always true or obvious. For instance, when the Oracle database is running on my laptop but with all database sessions idle, I notice hard disk activity about once every three seconds. Task Manager doesn't show oracle.exe as a top CPU process, but topio shows it as a top I/O process. A trivial example can also be set up where one process does nothing but busy-loop on a null operation while another process genuinely reads a big file, and the first process is higher in CPU usage. These are the cases where topio can be of some use. It can actually sort on any I/O counter, including page faults, which hopefully can be used to deduce real disk I/O as opposed to I/O against the system cache. (Task Manager has a PF Delta column, equivalent to my DltF, but I include it here for your convenience.) One caveat, though, is that all these I/O counters lump disk, terminal and network I/O together. There's no way to separate them out. You have to use other information to know which of the three types it really is. But generally, a Windows service process such as oracle.exe has no terminal I/O, so you can eliminate that.

Download pio.vbs and topio.pl to the same folder, say, d:\systools. (Rename the files to pio.vbs and topio.pl after you download.) Change the $PIO line in topio.pl as needed. Download and install ActivePerl. First, type perl d:\systools\topio.pl -h to verify that Help works. Then type perl d:\systools\topio.pl to run the program with all default values. (If you have associated .pl with perl.exe, you can try just topio.pl. But I find that it may mess up command line options. Always prepending perl solves the problem.) Please read the help first, or find it in the Usage part of the topio.pl source code. Make sure your console window is 132 characters wide to avoid line wrapping.

How does it work? topio is a Perl script that sorts on values supplied by pio.vbs, a VBScript that fetches I/O-related statistics for all processes running on the system. The statistics are collected by WMI (Windows Management Instrumentation) so make sure that service is not stopped on the target machine. pio.vbs uses Microsoft WMI class Win32_PerfRawData_PerfProc_Process to obtain this data.

The functionality of topio may eventually be merged into pstats, another freeware tool for Windows. The reason I'm developing topio separately is that Microsoft WMI class Win32_PerfRawData_PerfProc_Process is totally flawed: all those "... per second" counters are not per-second values at all; instead they're cumulative since process startup (see my message posted to the Microsoft official newsgroup without an answer.) Until that problem is addressed by Microsoft, we have to sort on delta values for I/O as well as CPU usage counters. Only memory counters can stay as absolute values, because a question "What process is using the most memory now?" is more practical than "What process has gained the most memory in the past few seconds?" If you do need an answer to the second question, pmon has a "Mem Diff" column in its output.

Other OSes: The only other major OSes I care about that I can't port my program to are AIX and Linux. Unfortunately, Linux does not have process I/O usage info in either the proc filesystem or the getrusage call. If we really want a process I/O count, we may have to write a kernel module to catch read(2) and write(2) syscalls. I think this is exactly what AT Consultancy's atop does. Also see this thread. Alternatively, SystemTap's uid-iotop can be used as well. For AIX, alternatives are discussed here. I tried nmon. It works as expected on AIX (press t then 5 to sort on I/O). They also have nmon for Linux. But on Linux, it actually sorts on delta page faults.

[Update 2010-04] Linux has been planning to add /proc/pid/io to the mainstream build for some time. For example, kernel 2.6.18-164 already has it; in fact it appeared much earlier if TASK_DELAY_ACCT and TASK_IO_ACCOUNTING are configured in the kernel. A Red Hat DTrace article talks about it. It's been a FAQ. Fortunately, Guillaume Chazarain has already implemented iotop based on these I/O counters. The only minor issue is that he used fancy features of Python. Before long, Red Hat 6 will be widely deployed, and it comes with iotop. So I'm hesitant about porting my topio to Linux. If I do, it will run on any Linux box that has /proc/pid/io, and it definitely will not require anything else.
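
For reference, reading those counters from /proc/pid/io takes little code. The sketch below is not a port of pio; it is a minimal illustration that assumes the "name: value" line format of that file, and struct linux_io, parse_proc_io and read_proc_io are hypothetical names. rchar plus wchar is the closest analogue of the Solaris RWChar, while syscr and syscw resemble the HP-UX operation counts.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Counters of interest from /proc/<pid>/io (names follow the file's keys). */
struct linux_io {
    unsigned long long rchar;        /* bytes requested through read-type syscalls */
    unsigned long long wchar;        /* bytes requested through write-type syscalls */
    unsigned long long syscr;        /* number of read-type syscalls */
    unsigned long long syscw;        /* number of write-type syscalls */
    unsigned long long read_bytes;   /* bytes actually fetched from storage */
    unsigned long long write_bytes;  /* bytes actually sent to storage */
};

/* Parse text in "key: value" lines. Returns how many of the six
   fields above were found; unknown keys are ignored. */
int parse_proc_io(const char *text, struct linux_io *out)
{
    assert(text != NULL && out != NULL);
    int found = 0;
    memset(out, 0, sizeof *out);

    while (*text != '\0') {
        char key[32];
        unsigned long long val;

        if (sscanf(text, "%31[^:\n]: %llu", key, &val) == 2) {
            if (strcmp(key, "rchar") == 0)            { out->rchar = val; found++; }
            else if (strcmp(key, "wchar") == 0)       { out->wchar = val; found++; }
            else if (strcmp(key, "syscr") == 0)       { out->syscr = val; found++; }
            else if (strcmp(key, "syscw") == 0)       { out->syscw = val; found++; }
            else if (strcmp(key, "read_bytes") == 0)  { out->read_bytes = val; found++; }
            else if (strcmp(key, "write_bytes") == 0) { out->write_bytes = val; found++; }
        }

        const char *nl = strchr(text, '\n');   /* advance to the next line */
        if (nl == NULL)
            break;
        text = nl + 1;
    }
    return found;
}

/* Read and parse /proc/<pid>/io. Returns -1 if the file cannot be opened
   (no such process, no permission, or a kernel without I/O accounting). */
int read_proc_io(long pid, struct linux_io *out)
{
    char path[64], buf[1024];

    snprintf(path, sizeof path, "/proc/%ld/io", pid);
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    size_t n = fread(buf, 1, sizeof buf - 1, fp);
    fclose(fp);
    buf[n] = '\0';
    return parse_proc_io(buf, out);
}
```

Sampling read_proc_io twice and subtracting gives the same kind of delta that topio sorts on.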

note2: Jim Mauro's msacct uses printf("%ld", ...) for the process usage fields. I changed it to printf("%lu", ...) in pio.c. Otherwise numbers greater than 2 billion would show as negative. The fields are defined as unsigned long anyway.
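
The effect described in this note can be reproduced with fixed-width types. This is a minimal sketch, assuming the original counters were 32-bit unsigned long values (as on a 32-bit build); format_signed and format_unsigned are hypothetical helpers, not functions from pio.c or msacct.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Print a 32-bit counter as a signed number, which is what "%ld" did
   when long was 32 bits wide. The cast of an out-of-range value is
   implementation-defined in C; common compilers wrap modulo 2^32. */
int format_signed(char *buf, size_t n, uint32_t counter)
{
    assert(buf != NULL && n > 0);
    return snprintf(buf, n, "%" PRId32, (int32_t)counter);
}

/* Print the same counter as unsigned, the equivalent of "%lu". */
int format_unsigned(char *buf, size_t n, uint32_t counter)
{
    assert(buf != NULL && n > 0);
    return snprintf(buf, n, "%" PRIu32, counter);
}
```

With the RWChar value 3917910706 from the topio -d output earlier, the signed form comes out negative on compilers that wrap, while the unsigned form prints 3917910706.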

--------------------------------------------------------------------

HP-UX monitoring:

pio source code:

/*
pio (Process I/O, Ver 1.0): Process I/O, shows I/O statistics for processes.
(C) Yong Huang, 2006
yong321.freeshell.org/freeware/pio.html
Thanks to Don Morris and Christof Meerwald for helping with HPUX port
http://groups.google.com/group/comp.sys.hp.hpux/browse_frm/thread/fa83e5a31c9864a6
*/

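#include <stdio.h>
#include <stdlib.h>	/* exit, atoi */
#include <string.h>	/* strrchr */
#include <unistd.h>	/* getopt, optarg */
#include <sys/types.h>	/* pid_t */
#include <sys/pstat.h>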

static char *command;
void probe_one(pid_t);
void print_usage(void);
void probe_all(void);

int nl = 1;

int main(int argc, char **argv)
{
  pid_t pid = 0;	/* stays 0 unless -p supplies a PID */
  int hdr = 1, probeall = 0, i;

  if ((command = (char *)strrchr(argv[0], '/')) != NULL)
    command++;
  else
    command = argv[0];

  if (argc <= 1)
  {
    print_usage();
    exit(1);
  }

  while ((i = getopt(argc, argv, "HhnAp:")) != EOF)
   {
    switch(i)
    {
      case 'H':			/* No header */
	hdr = 0;
	break;
      case 'h':			/* Show Help or Usage */
	print_usage();
	exit(1);
	break;
      case 'n':			/* No Newline at line end */
	nl = 0;
	break;
      case 'p':			/* Process to be Probed */
	pid = atoi(optarg);
	break;
      case 'A':
	probeall = 1;		/* All processes to be probed */
	pid = 0;
	hdr = 0;
	break;
     }
   }

  if (hdr == 1)
    printf("PID\tComm\tInpOps\tOutpOps\tMjPgFlt\n");

  if (pid)
    probe_one(pid);
  else if (probeall)
    probe_all();

  exit(0);
}

void probe_one(pid_t pid)
{
  struct pst_status pst;

  if (pstat_getproc(&pst, sizeof(pst), (size_t)0, pid) != -1)
    printf("%d\t%s\t%lu\t%lu\t%lu\n", (int)pst.pst_pid, pst.pst_ucomm, (unsigned long)pst.pst_inblock, (unsigned long)pst.pst_oublock, (unsigned long)pst.pst_majorfaults);
  else
    perror("pstat_getproc");

  if (nl == 1)
    printf("\n");
}

void probe_all()
{ /* modified from Example 5 of `man pstat` */
#define BURST ((size_t)10)

  struct pst_status pst[BURST];
  int i, count;
  int idx = 0; /* index within the context */

  /* loop until count == 0, will occur when all have been returned */
  while ((count=pstat_getproc(pst, sizeof(pst[0]),BURST,idx))>0) {
    /* got count (max of BURST) this time. process them */
    for (i = 0; i < count; i++) {
      printf("%d\t%s\t%lu\t%lu\t%lu\n", (int)pst[i].pst_pid, pst[i].pst_ucomm, (unsigned long)pst[i].pst_inblock, (unsigned long)pst[i].pst_oublock, (unsigned long)pst[i].pst_majorfaults);
    }

    /*
     * now go back and do it again, using the next index after
     * the current 'burst'
     */
    idx = pst[count-1].pst_idx + 1;
  }

  if (count == -1)
    perror("pstat_getproc()");

#undef BURST
}

void print_usage()
{
  (void) fprintf(stderr, "Usage: %s [-Hhn] -p <PID>\n\t-H: no header\n\t-h: help\n\t-n: no newline at line end\n\t-p: process ID follows\n\t-A: print I/O stats for all processes\n", command);
}
--------------------------------------
topio source code:
#!/usr/bin/perl
#Modify the first line and the line below ($PIO =...) as needed.
#topio for HP-UX 1.0: This Perl script launches pio (Process I/O) to probe all
#processes on the system and sort them based on delta Read/Write Operations
#for each process.
#(C) Yong Huang, 2006,2010
#http://yong321.freeshell.org/freeware/pio.html

$PIO = "/usr/local/bin/pio";    #path to pio

########## No need to modify below this line but hacking is welcome. ##########

use Getopt::Std;

getopts('s:n:k:h');

if (defined $opt_h)
 { print "Usage: $0 [-s Delay] [-n Top_n_lines] [-k sortkey] [-h]
    -s: Number of seconds delay between calls (default 5)
    -n: Only show top n processes (default 10)
    -k: sort key (default DltR)
        R: DltR (delta reads); I: Reads;
        W: DltW (delta writes); O: Writes
        f: DltPF (delta page faults); F: PF
    -h: Show this Usage
    Example (show top 3 processes every 2 seconds, sorted by DltW): perl $0 -s2 -n3 -kW\n";
   exit;
 }

$opt_s = 5 if !defined $opt_s;
$opt_n = 10 if !defined $opt_n;
$opt_k = "R" if !defined $opt_k;

#Format to be used by write:
#pid, name, reads, delta reads, writes, delta writes, pagefaults, delta pagefaults
format =
@>>>> @<<<<<<<<<<<<<<<< @>>>>>>>>> @>>>>>> @>>>>>>>>> @>>>>>> @>>>>>>>>> @>>>>>>
$_,$HoA{$_}[0],$HoA{$_}[1],$dltrd{$_},$HoA{$_}[2],$dltwt{$_},$HoA{$_}[3],$dltpf{$_}
.

undef $/;
while (1)
 { $_ = qx{$PIO -A};	#slurp in all pio -A output

   @lines = split /\n/;
   %allpids = ();	#%allpids is used to delete pid's that're gone between
			#calls to pio
   foreach (@lines)
    { #Each line is: pid ProcName Rds Wts MjPgFlt
      next unless /^(\d+)\t(.*)/;
      $pid = $1;
      #@pval: all last 4 columns of one line, i.e. values for this pid
      #pval[0]:ProcName; [1]:Rds; [2]:Wts; [3]:MjPgFlt
      @pval = split /\t/,$2;

      if (defined $HoA{$pid}[1]) #it's not defined in first iteration
       {
         $dltrd{$pid} = $pval[1] - $HoA{$pid}[1];    #Delta Rds
         $dltwt{$pid} = $pval[2] - $HoA{$pid}[2];    #Delta Wts
         $dltpf{$pid} = $pval[3] - $HoA{$pid}[3];    #Delta PFs
       }
      $HoA{$pid} = [ @pval ];	#$pid is key to this Hash of Array

      $allpids{$pid} = 1;	#hash for all current processes
    }

   foreach (keys %HoA)
    { if (! exists $allpids{$_}) #this process has disappeared, clean the hashes
       { delete $HoA{$_}; delete $dltrd{$_};
         delete $dltwt{$_}; delete $dltpf{$_};
       }
    }

   if (defined $show)	#prevent printing all 0's the first time around
    { $n = 0;
      print "--PID ProcName--------- -----Reads ---DltR -----Writs ---DltW -----PFlts ---DltF\n";

      if ($opt_k eq "R") #sort on delta reads
       { foreach (sort { $dltrd{$b} <=> $dltrd{$a} } keys %dltrd)
          { write; last if ++$n == $opt_n; }
       }
      elsif ($opt_k eq "I") #sort on reads
       { foreach (sort { $HoA{$b}[1] <=> $HoA{$a}[1] } keys %HoA)
          { write; last if ++$n == $opt_n; }
       }
      elsif ($opt_k eq "W") #sort on delta writes
       { foreach (sort { $dltwt{$b} <=> $dltwt{$a} } keys %dltwt)
          { write; last if ++$n == $opt_n; }
       }
      elsif ($opt_k eq "O") #sort on writes
       { foreach (sort { $HoA{$b}[2] <=> $HoA{$a}[2] } keys %HoA)
          { write; last if ++$n == $opt_n; }
       }
      elsif ($opt_k eq "f") #sort on delta pagefaults
       { foreach (sort { $dltpf{$b} <=> $dltpf{$a} } keys %dltpf)
          { write; last if ++$n == $opt_n; }
       }
      elsif ($opt_k eq "F") #sort on pagefaults
       { foreach (sort { $HoA{$b}[3] <=> $HoA{$a}[3] } keys %HoA)
          { write; last if ++$n == $opt_n; }
       }

     }
   $show = 1;

   sleep $opt_s;
 }