MongoDB Network Troubleshooting Notes

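This article looks at how to use strace, ethtool, sar and similar tools to check MongoDB's system calls, the network interface speed, network saturation, and which process is consuming network resources. It walks through a real example of monitoring and analyzing network traffic with iftop and mongostat, and points out and corrects a common misunderstanding about units. Finally, it discusses an error in the units stated in the official documentation and gives the correct interpretation.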


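The other day Snow and Xfan were looking into this problem and I took a look with them. I collected a few commands along the way, which should save a fair amount of time the next time a similar problem comes up.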

1. I want to check which system calls from mongo cost the most time

[jianxu1@phx7b02c-1d65 ~]$ sudo strace -f -c -p 16931    # make sure to add -f so that vfork children and LWPs (threads) are recorded too

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 79.27   90.786037        6783     13385           recvfrom
  8.24    9.439600        8684      1087           nanosleep
  7.87    9.014995         240     37634      6914 futex
  3.00    3.440658        2870      1199           select
  1.47    1.681699      210212         8           restart_syscall
  0.09    0.107694          20      5282           sendto
  0.03    0.034485         454        76           mmap
  0.01    0.009579         195        49           write
  0.00    0.004785          10       492       164 stat
  0.00    0.001218          25        49           fdatasync
  0.00    0.000556          37        15           read
  0.00    0.000342           7        49           lseek
  0.00    0.000153           8        19           close
  0.00    0.000000           0        19           open
  0.00    0.000000           0        15           fstat
  0.00    0.000000           0        15           munmap
  0.00    0.000000           0         8           getdents
------ ----------- ----------- --------- --------- ----------------
100.00  114.521801                 59401      7078 total

 

2. How to find my network interface speed

[jianxu1@phx7b02c-1d65 ~]$ sudo ethtool eth0

Settings for eth0:
       Supported ports: [ TP ]
       Supported link modes:   10baseT/Half 10baseT/Full
                               100baseT/Half 100baseT/Full
                               1000baseT/Full
       Supports auto-negotiation: Yes
       Advertised link modes:  10baseT/Half 10baseT/Full
                               100baseT/Half 100baseT/Full
                               1000baseT/Full
       Advertised pause frame use: No
       Advertised auto-negotiation: Yes
       Speed: 1000Mb/s   network is 1G NIC  --> Note: Duplex is Full, so 1G is the speed in one direction only; both directions combined is 2G
       Duplex: Full
       Port: Twisted Pair
       PHYAD: 1
       Transceiver: internal
       Auto-negotiation: on
       MDI-X: Unknown
       Supports Wake-on: pumbg
       Wake-on: g
       Current message level: 0x00000003 (3)
       Link detected: yes

 

3. How to check if my network is saturated

[jianxu1@phx7b02c-1d65 ~]$ sar -n DEV | head -10
Linux 2.6.32-220.el6.x86_64 (phx7b02c-1d65.stratus.phx.ebay.com)       11/08/2012     _x86_64_        (24 CPU)

12:00:01AM     IFACE   rxpck/s   txpck/s    rxkB/s    txkB/s   rxcmp/s   txcmp/s  rxmcst/s
12:10:01AM        lo      7.52      7.52     18.32     18.32      0.00      0.00      0.00
12:10:01AM      eth0  33121.49  77322.77  27613.92 110934.60      0.00      0.00      0.02
12:10:01AM      eth1      0.00      0.00      0.00      0.00      0.00      0.00      0.00


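My understanding at the time was: receive is 27 MB/s, send is 110 MB/s, our NIC speed is 128 MB/s (strictly, 1000 Mb/s ÷ 8 = 125 MB/s), so this is a sign of network saturation.

Actually, my explanation above was wrong. For a 1G NIC with Duplex=Full, the theoretical peak for inbound and outbound traffic added together is 2G, not 1G. That said, the theoretical value is not the same as what you get in practice.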
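To make the per-direction reasoning concrete, here is a minimal sketch that turns the eth0 numbers from the sar output above into a utilization figure for each direction. It assumes that sar's kB means 1024 bytes and that the 1000 Mb/s link speed reported by ethtool applies separately to each direction; the function and constant names are made up for illustration.

```python
# Sketch only: per-direction utilization of a full-duplex 1 Gb/s link.
LINK_BITS_PER_SEC = 1000 * 1_000_000  # "Speed: 1000Mb/s" from ethtool
BYTES_PER_KB = 1024                   # assumption: sar reports kB as 1024 bytes


def direction_utilization(kb_per_sec):
    """Fraction of one direction's capacity used by a sar rxkB/s or txkB/s value."""
    bits_per_sec = kb_per_sec * BYTES_PER_KB * 8
    return bits_per_sec / LINK_BITS_PER_SEC


# eth0 values copied from the sar sample above
rx_util = direction_utilization(27613.92)
tx_util = direction_utilization(110934.60)

print(f"rx: {rx_util:.1%} of 1 Gb/s, tx: {tx_util:.1%} of 1 Gb/s")
```

Under these assumptions the receive side sits at roughly 23% and the send side at roughly 91% of its own 1 Gb/s budget, so the two directions should be judged separately instead of being added together and compared with a single 1G limit.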

 

4. How to check which process is eating my network resources

 

When we use mongostat:

[jianxu1@xxxxxxx ~]$ mongostat
connected to: 127.0.0.1
insert  query update delete getmore command flushes mapped  vsize    res faults locked% idx miss %  qr|qw  ar|aw  netIn netOut  conn      set repl       time
     0    500    100      0     142     108       0  32.2g  77.2g    26g      0     6.2          0    0|0    1|0    12m    62m   400  aaaaaaa    M   17:47:18
     0    976    181      0     170     107       0  32.2g  77.2g    26g      0    26.2          0    0|0    0|0    16m    93m   400  aaaaaaa    M   17:47:19
     0   1077    219      0     203     133       0  32.2g  77.2g    26g      0    11.8          0    0|0    2|0    17m   100m   400  aaaaaaa    M   17:47:20

 

netIn/netOut's unit is bits, so total net in/out is around 120 Mbit/s, much less than 1 Gbit/s, meaning mongostat does not cover all of the network traffic related to mongo.

Note: my conclusion above that the unit is bits was also wrong; this is explained further down.


Use iftop !  http://www.ex-parrot.com/~pdw/iftop/

You can download it from http://pkgs.repoforge.org/iftop/. I downloaded http://pkgs.repoforge.org/iftop/iftop-0.17-1.el6.rf.x86_64.rpm because our Red Hat server is 6.2.

 

[jianxu1@xxxxxx ~]$ sudo iftop -i eth0

 

Press 1 to sort, press p to display port info, and press p again to display only host information.

Here, take yyyyyyy:43892 on the other side as an example. Log into that machine and run:

 

[jianxu1@yyyyyyy ~]$ sudo netstat -tup | grep 43892
tcp       0      0 yyyyyyy:43892  xxxxxxx:27017 ESTABLISHED 21811/mongod   // now we know most of the network bandwidth is consumed by sync among the mongo nodes within the cluster.

 

I'm wondering whether we should split the NICs and use a dedicated NIC just for sync.


After I sent out the summary above, a senior DBA named John corrected my mistake:

>> NetIn/Out's unit is bits, so total net in/out is around 120 M bits/second,

>> much less than 1G bits/second, meaning mongostat does not cover all network related with mongo.

 

I don't think that's correct. I think mongostat gets its network data from db.serverStatus().network, which names the fields "bytesIn" and "bytesOut". Anyway, the numbers make a lot more sense if you assume bytes. 100M is about 80% of the theoretical max for a GigE interface, and is likely to be the practical limit in this case. Note that the interface is full-duplex so you should not add in and out traffic; the limit is 1 Gb/sec in and 1 Gb/sec out.
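John's arithmetic can be checked with a small sketch that takes the peak netOut reading of 100m from the mongostat sample and compares the two possible interpretations of the unit against the 1 Gb/s line rate in one direction. The helper name is hypothetical, and the "m" suffix is assumed to mean 10^6, which matches the divisor of 1000 in the mongostat source quoted later.

```python
# Sketch only: what a mongostat reading of "100m" means under each unit assumption.
LINK_BITS_PER_SEC = 1_000_000_000  # 1 Gb/s in one direction (GigE, full duplex)


def link_utilization(value_in_m, unit_is_bytes):
    """Fraction of one direction of the link used by a mongostat 'Nm' reading."""
    per_second = value_in_m * 1_000_000  # assumption: 'm' means 10^6
    bits_per_second = per_second * 8 if unit_is_bytes else per_second
    return bits_per_second / LINK_BITS_PER_SEC


as_bits = link_utilization(100, unit_is_bytes=False)
as_bytes = link_utilization(100, unit_is_bytes=True)

print(f"100m read as bits:  {as_bits:.0%} of the link")
print(f"100m read as bytes: {as_bytes:.0%} of the link")
```

Read as bits, 100m would be only a tenth of the link, which does not fit the send-side load that sar reported. Read as bytes, it comes to about 80% of the outbound capacity, which is the figure John quotes.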


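I later looked into this myself, in particular the unit of netIn/netOut. John was right, and he later filed a bug with 10gen asking them to correct the official documentation:

For the unit of netIn and netOut: I said it's bits because I was referring to http://docs.mongodb.org/manual/reference/mongostat/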

netIn

The amount of network traffic, in bits, received by the MongoDB. This includes traffic from mongostat itself.

But after I dug into the source code, it seems the official document is wrong! So you are right again :)

 

Here's why I say the official document is wrong:

 

When I check https://github.com/mongodb/mongo/blob/master/src/mongo/tools/stat.cpp -> the help text in the source code also says it's bits.

            out << "   netIn    \t- network traffic in - bits\n";
            out << "   netOut   \t- network traffic out - bits\n";

 

 

Then I dug into the source code here:

You are right: mongostat actually leverages the output of serverStatus(). What mongostat does is diff the "network" output from serverStatus() between two consecutive samples.

 

Output of serverStatus:

"network" : {
        "bytesIn" : 62159758,
        "bytesOut" : 216745737,
        "numRequests" : 591863
},

 

https://github.com/mongodb/mongo/blob/master/src/mongo/tools/stat_util.cpp

 

    if ( a["network"].isABSONObj() && b["network"].isABSONObj() ) {
        BSONObj ax = a["network"].embeddedObject();
        BSONObj bx = b["network"].embeddedObject();
        _appendNet( result , "netIn" , diff( "bytesIn" , ax , bx ) ); // jianxu1: ax and bx represent two samples from serverStatus()
        _appendNet( result , "netOut", diff( "bytesOut", ax , bx ) );
    }

 

    double StatUtil::diff( const string& name , const BSONObj& a , const BSONObj& b ) {
        BSONElement x = a.getFieldDotted( name.c_str() );
        BSONElement y = b.getFieldDotted( name.c_str() );
        if ( ! x.isNumber() || ! y.isNumber() )
            return -1;
        return ( y.number() - x.number() ) / _seconds;   // jianxu1: the result of diff is still in bytes at this point
    }

 

    void StatUtil::_appendNet( BSONObjBuilder& result , const string& name , double diff ) { // jianxu1: only formats the output; there is no bytes-to-bits conversion
        // I think 1000 is correct for megabit, but I've seen conflicting things (ERH 11/2010)
        const double div = 1000;
        string unit = "b";
        if ( diff >= div ) {
            unit = "k";
            diff /= div;
        }
 
        if ( diff >= div ) {
            unit = "m";
            diff /= div;
        }
 
        if ( diff >= div ) {
            unit = "g";
            diff /= div;
        }
        string out = str::stream() << (int)diff << unit;
        _append( result , name , 6 , out );
    }
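To see what the two functions above do to real numbers, here is a rough Python port of diff and _appendNet, written only for illustration (it is not the mongostat code itself). The first sample reuses the serverStatus() "network" values shown earlier; the second sample and the one-second interval are made-up values chosen to give round numbers.

```python
# Sketch only: Python port of the diff / _appendNet logic quoted above.

def diff(name, a, b, seconds):
    """Per-second change of a counter between two serverStatus() samples."""
    x, y = a.get(name), b.get(name)
    if not isinstance(x, (int, float)) or not isinstance(y, (int, float)):
        return -1
    return (y - x) / seconds  # still bytes per second


def format_net(value):
    """Scale by 1000 and attach b/k/m/g; no bytes-to-bits conversion happens."""
    div = 1000.0
    unit = "b"
    for bigger_unit in ("k", "m", "g"):
        if value >= div:
            unit = bigger_unit
            value /= div
    return f"{int(value)}{unit}"


# First sample: values from the serverStatus() output above.
sample_a = {"bytesIn": 62159758, "bytesOut": 216745737}
# Second sample: hypothetical values, one second later.
sample_b = {"bytesIn": 79159758, "bytesOut": 316745737}

net_in = format_net(diff("bytesIn", sample_a, sample_b, seconds=1))
net_out = format_net(diff("bytesOut", sample_a, sample_b, seconds=1))
print(net_in, net_out)
```

With these inputs the counters grow by 17,000,000 and 100,000,000 bytes in one second, and the formatted results are 17m and 100m. The "m" in mongostat's netIn/netOut columns therefore stands for millions of bytes per second, not bits.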


