Metrics collection endpoint (using the HDFS of the HBase39 cluster as an example): http://xxxxxx:50070/jmx?qry=Hadoop:service=NameNode,name=*
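A minimal Python sketch of reading this endpoint, assuming the servlet returns a JSON object with a top-level `beans` array whose entries each carry a `name` field. The host `xxxxxx` is the placeholder from the URL above; `build_jmx_url`, `index_beans`, and `fetch_beans` are illustrative helper names, not part of Hadoop.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def build_jmx_url(host, port=50070, query="Hadoop:service=NameNode,name=*"):
    """Build the /jmx query URL, leaving ':', ',', '=' and '*' unescaped."""
    return "http://{}:{}/jmx?qry={}".format(host, port, quote(query, safe=":,=*"))


def index_beans(payload):
    """Map each bean's full name to its attribute dict."""
    return {bean["name"]: bean for bean in payload.get("beans", [])}


def fetch_beans(host, port=50070, timeout=10):
    """Fetch and index all NameNode beans (performs a network call)."""
    with urlopen(build_jmx_url(host, port), timeout=timeout) as resp:
        return index_beans(json.load(resp))
```

Calling `fetch_beans("xxxxxx")` against a reachable NameNode would return one dict per bean, keyed by names such as `Hadoop:service=NameNode,name=FSNamesystem`.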
I. NameNode file system details (core metrics)
Hadoop:service=NameNode,name=FSNamesystem
Hadoop:service=NameNode,name=FSNamesystemState
Metric | Type (GAUGE, COUNTER) | Unit | Description | Notes |
---|---|---|---|---|
MissingBlocks | GAUGE | | Current number of missing blocks | |
ExpiredHeartbeats | GAUGE | | Total number of expired heartbeats | |
TransactionsSinceLastCheckpoint | GAUGE | | Total number of transactions since last checkpoint | |
TransactionsSinceLastLogRoll | GAUGE | | Total number of transactions since last edit log roll | |
LastCheckpointTime | GAUGE | ms | Time in milliseconds since epoch of last checkpoint | |
CapacityTotal | GAUGE | Byte | Current raw capacity of DataNodes in bytes | |
CapacityUsed | GAUGE | Byte | Current used capacity across all DataNodes in bytes | |
CapacityRemaining | GAUGE | Byte | Current remaining capacity in bytes | |
TotalLoad | GAUGE | | Current number of connections | |
SnapshottableDirectories | GAUGE | | Current number of snapshottable directories | |
Snapshots | GAUGE | | Current number of snapshots | |
BlocksTotal | GAUGE | | Number of blocks | |
FilesTotal | GAUGE | | Number of files | |
NumLiveDataNodes | GAUGE | | Number of live DataNodes | |
NumDeadDataNodes | GAUGE | | Number of dead DataNodes | |
NumDecomLiveDataNodes | GAUGE | | Number of live DataNodes in the Decommission state | |
NumDecomDeadDataNodes | GAUGE | | Number of dead DataNodes in the Decommission state | |
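The gauges above are usually easier to alert on as ratios. The sketch below derives a few such values from a single sample, assuming `fsn` is a dict holding the attributes of the `FSNamesystem` bean. The function name `fs_summary` and the output keys are hypothetical.

```python
import time


def fs_summary(fsn, now_ms=None):
    """Derive capacity ratios and checkpoint age from one FSNamesystem sample."""
    now_ms = int(time.time() * 1000) if now_ms is None else now_ms
    total = fsn["CapacityTotal"]
    return {
        # Guard against a zero total, e.g. before any DataNode has registered.
        "used_pct": 100.0 * fsn["CapacityUsed"] / total if total else 0.0,
        "remaining_pct": 100.0 * fsn["CapacityRemaining"] / total if total else 0.0,
        # LastCheckpointTime is in milliseconds since the epoch.
        "checkpoint_age_s": (now_ms - fsn["LastCheckpointTime"]) / 1000.0,
        "has_missing_blocks": fsn["MissingBlocks"] > 0,
    }
```

Passing `now_ms` explicitly keeps the function deterministic, which helps when replaying stored samples.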
II. NameNode JvmMetrics details (core metrics)
Hadoop:service=NameNode,name=JvmMetrics
Metric | Type (GAUGE, COUNTER) | Unit | Description | Notes |
---|---|---|---|---|
GcCountParNew | COUNTER | | Young-generation GC count | |
GcTimeMillisParNew | COUNTER | ms | Young-generation GC time | |
GcCountConcurrentMarkSweep | COUNTER | | Old-generation GC count | |
GcTimeMillisConcurrentMarkSweep | COUNTER | ms | Old-generation GC time | |
GcCount | COUNTER | | Total GC count | |
GcTimeMillis | COUNTER | ms | Total GC time | |
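Because these are COUNTER metrics, a single reading says little; the useful number is the increase between two samples. A minimal sketch, assuming `prev` and `curr` are dicts of `JvmMetrics` attributes taken `interval_ms` apart. The helper names are hypothetical, and treating a decrease as a restart is an assumption of this sketch.

```python
def counter_delta(prev, curr):
    """Increase of a cumulative counter between two samples.

    If the value went down, assume the process restarted and the
    counter began again from zero.
    """
    return curr - prev if curr >= prev else curr


def gc_pause_fraction(prev, curr, interval_ms):
    """Fraction of the sampling interval spent in GC."""
    spent_ms = counter_delta(prev["GcTimeMillis"], curr["GcTimeMillis"])
    return spent_ms / interval_ms
```

For example, 600 ms of GC over a 60 000 ms interval corresponds to a pause fraction of 1%.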
III. NameNode operation details (core metrics)
Hadoop:service=NameNode,name=NameNodeActivity
Metric | Type (GAUGE, COUNTER) | Unit | Description | Notes |
---|---|---|---|---|
CreateFileOps | COUNTER | | Total number of files created | |
FilesCreated | COUNTER | | Total number of files and directories created by create or mkdir operations | |
FilesAppended | COUNTER | | Total number of files appended | |
GetBlockLocations | COUNTER | | Total number of getBlockLocations operations | |
FilesRenamed | COUNTER | | Total number of rename operations (NOT number of files/dirs renamed) | |
GetListingOps | COUNTER | | Total number of directory listing operations | |
DeleteFileOps | COUNTER | | Total number of delete operations | |
FilesDeleted | COUNTER | | Total number of files and directories deleted by delete or rename operations | |
FileInfoOps | COUNTER | | Total number of getFileInfo and getLinkFileInfo operations | |
AddBlockOps | COUNTER | | Total number of successful addBlock operations | |
GetAdditionalDatanodeOps | COUNTER | | Total number of getAdditionalDatanode operations | |
CreateSymlinkOps | COUNTER | | Total number of createSymlink operations | |
GetLinkTargetOps | COUNTER | | Total number of getLinkTarget operations | |
FilesInGetListingOps | COUNTER | | Total number of files and directories listed by directory listing operations | |
AllowSnapshotOps | COUNTER | | Total number of allowSnapshot operations | |
DisallowSnapshotOps | COUNTER | | Total number of disallowSnapshot operations | |
CreateSnapshotOps | COUNTER | | Total number of createSnapshot operations | |
DeleteSnapshotOps | COUNTER | | Total number of deleteSnapshot operations | |
RenameSnapshotOps | COUNTER | | Total number of renameSnapshot operations | |
ListSnapshottableDirOps | COUNTER | | Total number of snapshottableDirectoryStatus operations | |
SnapshotDiffReportOps | COUNTER | | Total number of getSnapshotDiffReport operations | |
TransactionsNumOps | COUNTER | | Total number of Journal transactions | |
TransactionsAvgTime | GAUGE | ms | Average time of Journal transactions | |
SyncsNumOps | COUNTER | | Total number of Journal syncs | |
SyncsAvgTime | GAUGE | ms | Average time of Journal syncs | |
TransactionsBatchedInSync | COUNTER | | Total number of Journal transactions batched in sync | |
BlockReportNumOps | COUNTER | | Total number of block reports processed from DataNodes | |
BlockReportAvgTime | GAUGE | ms | Average time of processing block reports | |
CacheReportNumOps | COUNTER | | Total number of cache reports processed from DataNodes | |
CacheReportAvgTime | GAUGE | ms | Average time of processing cache reports | |
SafeModeTime | GAUGE | ms | Time between FSNamesystem startup and the last exit from safe mode | |
FsImageLoadTime | GAUGE | ms | Time spent loading the FS image at startup | |
GetEditNumOps | COUNTER | | Total number of edits downloads from SecondaryNameNode | |
GetEditAvgTime | GAUGE | ms | Average edits download time | |
GetImageNumOps | COUNTER | | Total number of fsimage downloads from SecondaryNameNode | |
GetImageAvgTime | GAUGE | ms | Average fsimage download time | |
PutImageNumOps | COUNTER | | Total number of fsimage uploads to SecondaryNameNode | |
PutImageAvgTime | GAUGE | ms | Average fsimage upload time | |
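The Type column decides how a collector should treat each value: GAUGE values can be stored as-is, while COUNTER values are more useful as per-second rates. A minimal sketch under that reading of the table, assuming `prev` and `curr` are dicts of `NameNodeActivity` attributes sampled `interval_s` seconds apart. The helper names and the `PerSec` suffix are hypothetical.

```python
# Gauges from the NameNodeActivity table above that do not end in "AvgTime".
EXTRA_GAUGES = {"SafeModeTime", "FsImageLoadTime"}


def is_gauge(name):
    """Classify a metric by the naming pattern visible in the table."""
    return name.endswith("AvgTime") or name in EXTRA_GAUGES


def to_rates(prev, curr, interval_s):
    """Pass gauges through and convert counters to per-second rates."""
    out = {}
    for name, value in curr.items():
        # Skip non-numeric attributes such as the bean's own "name" field.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            continue
        if is_gauge(name):
            out[name] = value
        elif name in prev:
            # A drop is treated as a counter reset after a restart.
            delta = value - prev[name] if value >= prev[name] else value
            out[name + "PerSec"] = delta / interval_s
    return out
```

Counters that are absent from the previous sample are skipped rather than guessed, so the first sample after startup yields gauges only.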
IV. NameNode RPC details (non-core metrics, not collected for now)
Hadoop:service=NameNode,name=RpcDetailedActivityForPort*
Metric | Type (GAUGE, COUNTER) | Unit | Description | Notes |
---|---|---|---|---|
SetSafeModeNumOps | COUNTER | |||
SetSafeModeAvgTime | GAUGE | ms | ||
GetFileInfoNumOps | COUNTER | | Total number of getFileInfo and getLinkFileInfo operations | |
GetFileInfoAvgTime | GAUGE | ms | ||
GetBlockLocationsNumOps | COUNTER | |||
GetBlockLocationsAvgTime | GAUGE | ms | ||
GetListingNumOps | COUNTER | |||
GetListingAvgTime | GAUGE | ms | ||
GetContentSummaryNumOps | COUNTER | |||
GetContentSummaryAvgTime | GAUGE | ms | ||
MkdirsNumOps | COUNTER | |||
MkdirsAvgTime | GAUGE | ms | ||
SetPermissionNumOps | COUNTER | |||
SetPermissionAvgTime | GAUGE | ms | ||
CreateNumOps | COUNTER | |||
CreateAvgTime | GAUGE | ms | ||
AddBlockNumOps | COUNTER | |||
AddBlockAvgTime | GAUGE | ms | ||
GetServerDefaultsNumOps | COUNTER | |||
GetServerDefaultsAvgTime | GAUGE | ms | ||
CompleteNumOps | COUNTER | |||
CompleteAvgTime | GAUGE | ms | ||
DeleteNumOps | COUNTER | |||
DeleteAvgTime | GAUGE | ms | ||
AppendNumOps | COUNTER | |||
AppendAvgTime | GAUGE | ms | ||
RenameNumOps | COUNTER | |||
RenameAvgTime | GAUGE | ms | ||
FileNotFoundExceptionNumOps | COUNTER | |||
FileNotFoundExceptionAvgTime | GAUGE | ms | ||
SetOwnerNumOps | COUNTER | |||
SetOwnerAvgTime | GAUGE | ms | ||
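If these metrics are collected later, the `*` in the bean name has to be resolved to a concrete port, and each RPC method contributes a `NumOps`/`AvgTime` pair. The sketch below groups those pairs by port, assuming `beans` is a list of bean dicts with a `name` field and that the port suffix is numeric. The function name and the port 8020 in the usage note are illustrative only.

```python
import re

# Matches the per-port bean name and captures the port digits.
PORT_BEAN = re.compile(
    r"^Hadoop:service=NameNode,name=RpcDetailedActivityForPort(\d+)$"
)


def rpc_detail_by_port(beans):
    """Return {port: {method: (num_ops, avg_time_ms)}} for matching beans."""
    result = {}
    for bean in beans:
        match = PORT_BEAN.match(bean.get("name", ""))
        if not match:
            continue
        methods = {}
        for key, value in bean.items():
            if key.endswith("NumOps"):
                method = key[: -len("NumOps")]
                # AvgTime may be missing; None marks that case.
                methods[method] = (value, bean.get(method + "AvgTime"))
        result[int(match.group(1))] = methods
    return result
```

With a bean named `...RpcDetailedActivityForPort8020` containing `MkdirsNumOps` and `MkdirsAvgTime`, the result would hold a `Mkdirs` entry under key `8020`.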