How to fix HDFS entering safe mode because too many blocks are lost

 Error message

Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.RetriableException): org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot create file/user/root/.flink/application_1616479659685_0007/plugins/README.txt. Name node is in safe mode.
The reported blocks 5761 needs additional 14 blocks to reach the threshold 0.9990 of total blocks 5781.
The number of live datanodes 3 has reached the minimum number 1. Safe mode will be turned off automatically once the thresholds have been reached. NamenodeHostName:devcdh2
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkNameNodeSafeMode(FSNamesystem.java:1439)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2372)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2318)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:771)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:451)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
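
The "needs additional 14 blocks" figure in the message can be reproduced by hand. The following is a minimal sketch, assuming the NameNode truncates `total blocks * threshold` to a whole number before comparing it with the reported count. The helper name `blocks_needed` is made up for illustration and is not part of Hadoop.

```shell
# blocks_needed TOTAL REPORTED THRESHOLD
# Prints how many more blocks must be reported before the threshold is met.
# Assumption: the required count is TOTAL * THRESHOLD truncated to an integer.
blocks_needed() {
  awk -v total="$1" -v reported="$2" -v pct="$3" 'BEGIN {
    required = int(total * pct)     # 5781 * 0.999 = 5775.219 -> 5775
    missing  = required - reported  # 5775 - 5761 = 14
    if (missing < 0) missing = 0    # threshold already met
    print missing
  }'
}

# Values taken from the error message above.
blocks_needed 5781 5761 0.999   # prints 14
```

Under that assumption the NameNode stays in safe mode until 14 more blocks are reported, or until the total drops because the files that own the unreported blocks are removed.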

 Checking which HDFS blocks are lost

 hdfs fsck /

 

 

Connecting to namenode via http://devcdh2:9870/fsck?ugi=root&delete=1&path=%2F
FSCK started by root (auth:SIMPLE) from /192.168.6.76 for path / at Tue Mar 23 16:02:31 CST 2021

/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47124/83c4457d-249d-4bbd-a7c6-dcc74eb61b53: CORRUPT blockpool BP-199721927-192.168.6.76-1609304252767 block blk_1075448618

/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47124/83c4457d-249d-4bbd-a7c6-dcc74eb61b53: CORRUPT 1 blocks of total size 1240 B.
/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47124/_metadata: CORRUPT blockpool BP-199721927-192.168.6.76-1609304252767 block blk_1075448620

/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47124/_metadata: CORRUPT 1 blocks of total size 911 B.
/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47124/a1ff8d8b-e377-4d4b-a13b-f7d464d7386c: CORRUPT blockpool BP-199721927-192.168.6.76-1609304252767 block blk_1075448619

/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47124/a1ff8d8b-e377-4d4b-a13b-f7d464d7386c: CORRUPT 1 blocks of total size 1535 B.
/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47125/4e6cc205-6f92-49a7-b0af-13d28a2a3f93: CORRUPT blockpool BP-199721927-192.168.6.76-1609304252767 block blk_1075448623

/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47125/4e6cc205-6f92-49a7-b0af-13d28a2a3f93: CORRUPT 1 blocks of total size 1535 B.
/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47125/_metadata: CORRUPT blockpool BP-199721927-192.168.6.76-1609304252767 block blk_1075448624

/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47125/_metadata: CORRUPT 1 blocks of total size 911 B.
/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47125/bdf1ed78-9c1c-4beb-a517-ca9ac407b6d4: CORRUPT blockpool BP-199721927-192.168.6.76-1609304252767 block blk_1075448622

/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47125/bdf1ed78-9c1c-4beb-a517-ca9ac407b6d4: CORRUPT 1 blocks of total size 1240 B.
/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47126/8bb95575-6366-4b33-bd75-5b27a503b94c: CORRUPT blockpool BP-199721927-192.168.6.76-1609304252767 block blk_1075448626

/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47126/8bb95575-6366-4b33-bd75-5b27a503b94c: CORRUPT 1 blocks of total size 1240 B.
/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47126/_metadata: CORRUPT blockpool BP-199721927-192.168.6.76-1609304252767 block blk_1075448628

/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47126/_metadata: CORRUPT 1 blocks of total size 911 B.
/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47126/bc2a28b6-df8f-4d32-9ff9-635dd0687337: CORRUPT blockpool BP-199721927-192.168.6.76-1609304252767 block blk_1075448627

/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47126/bc2a28b6-df8f-4d32-9ff9-635dd0687337: CORRUPT 1 blocks of total size 1535 B.
/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47127/3c2b5cc6-4b14-480e-807a-2f8a1945bd79: CORRUPT blockpool BP-199721927-192.168.6.76-1609304252767 block blk_1075448630

/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47127/3c2b5cc6-4b14-480e-807a-2f8a1945bd79: CORRUPT 1 blocks of total size 1240 B.
/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47127/5efea232-a61a-4a9b-90bc-7b19fb1eebdc: CORRUPT blockpool BP-199721927-192.168.6.76-1609304252767 block blk_1075448631

/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47127/5efea232-a61a-4a9b-90bc-7b19fb1eebdc: CORRUPT 1 blocks of total size 1535 B.
/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47127/_metadata: CORRUPT blockpool BP-199721927-192.168.6.76-1609304252767 block blk_1075448632

/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47127/_metadata: CORRUPT 1 blocks of total size 911 B.
/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47128/5eb9d4d3-62c6-4d06-9972-7bc6f0b19dbe: CORRUPT blockpool BP-199721927-192.168.6.76-1609304252767 block blk_1075448635

/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47128/5eb9d4d3-62c6-4d06-9972-7bc6f0b19dbe: CORRUPT 1 blocks of total size 1535 B.
/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47128/71bdf921-e9e8-443c-ad13-ce3e5f58c8de: CORRUPT blockpool BP-199721927-192.168.6.76-1609304252767 block blk_1075448634

/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47128/71bdf921-e9e8-443c-ad13-ce3e5f58c8de: CORRUPT 1 blocks of total size 1240 B.
/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47128/_metadata: CORRUPT blockpool BP-199721927-192.168.6.76-1609304252767 block blk_1075448636

/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47128/_metadata: CORRUPT 1 blocks of total size 911 B.
/user/flink/checkpoint/dbecccd1c9948bd10b9ed39062552d46/chk-16747/70d5887c-7987-4ef1-9317-b8993fa284e1:  Under replicated BP-199721927-192.168.6.76-1609304252767:blk_1073857082_116258. Target Replicas is 3 but found 1 live replica(s), 0 decommissioned replica(s), 0 decommissioning replica(s).

/user/flink/checkpoint/dbecccd1c9948bd10b9ed39062552d46/chk-16747/_metadata:  Under replicated BP-199721927-192.168.6.76-1609304252767:blk_1073857084_116260. Target Replicas is 3 but found 1 live replica(s), 0 decommissioned replica(s), 0 decommissioning replica(s).

/user/flink/checkpoint/dbecccd1c9948bd10b9ed39062552d46/chk-16747/eea2bc39-c441-4339-84cb-b5c994fe1ad2:  Under replicated BP-199721927-192.168.6.76-1609304252767:blk_1073857083_116259. Target Replicas is 3 but found 1 live replica(s), 0 decommissioned replica(s), 0 decommissioning replica(s).

/user/flink/ha/completedCheckpoint09231ef9ff2b: CORRUPT blockpool BP-199721927-192.168.6.76-1609304252767 block blk_1075448633

/user/flink/ha/completedCheckpoint09231ef9ff2b: CORRUPT 1 blocks of total size 4505 B.
/user/flink/ha/completedCheckpoint5e461d0cf506: CORRUPT blockpool BP-199721927-192.168.6.76-1609304252767 block blk_1075448621

/user/flink/ha/completedCheckpoint5e461d0cf506: CORRUPT 1 blocks of total size 4505 B.
/user/flink/ha/completedCheckpoint740c86d9dede: CORRUPT blockpool BP-199721927-192.168.6.76-1609304252767 block blk_1075448625

/user/flink/ha/completedCheckpoint740c86d9dede: CORRUPT 1 blocks of total size 4505 B.
/user/flink/ha/completedCheckpointd5d2a51448b6: CORRUPT blockpool BP-199721927-192.168.6.76-1609304252767 block blk_1075448637

/user/flink/ha/completedCheckpointd5d2a51448b6: CORRUPT 1 blocks of total size 4505 B.
/user/flink/ha/completedCheckpointff0168a104e2: CORRUPT blockpool BP-199721927-192.168.6.76-1609304252767 block blk_1075448629

/user/flink/ha/completedCheckpointff0168a104e2: CORRUPT 1 blocks of total size 4505 B.
Status: CORRUPT
 Number of data-nodes:  3
 Number of racks:               1
 Total dirs:                    3883
 Total symlinks:                0

Replicated Blocks:
 Total size:    30817714125 B
 Total files:   5783
 Total blocks (validated):      5781 (avg. block size 5330862 B)
  ********************************
  UNDER MIN REPL'D BLOCKS:      20 (0.34596092 %)
  dfs.namenode.replication.min: 1
  CORRUPT FILES:        20
  CORRUPT BLOCKS:       20
  CORRUPT SIZE:         40955 B
  ********************************
 Minimally replicated blocks:   5761 (99.65404 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       3 (0.051894136 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    3
 Average block replication:     2.9885833
 Missing blocks:                0
 Corrupt blocks:                20
 Missing replicas:              6 (0.03459609 %)
 Blocks queued for replication: 0

Erasure Coded Block Groups:
 Total size:    0 B
 Total files:   0
 Total block groups (validated):        0
 Minimally erasure-coded block groups:  0
 Over-erasure-coded block groups:       0
 Under-erasure-coded block groups:      0
 Unsatisfactory placement block groups: 0
 Average block group size:      0.0
 Missing block groups:          0
 Corrupt block groups:          0
 Missing internal blocks:       0
 Blocks queued for replication: 0
FSCK ended at Tue Mar 23 16:02:32 CST 2021 in 1798 milliseconds
FSCK ended at Tue Mar 23 16:02:32 CST 2021 in 1800 milliseconds
fsck encountered internal errors!


Fsck on path '/' FAILED
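
The report above is long, so it can help to reduce it to the list of affected files. The sketch below is a hypothetical helper (`corrupt_files` is not part of Hadoop). It assumes the `path: CORRUPT ...` line format seen in this output and reads the fsck report from standard input.

```shell
# corrupt_files: read `hdfs fsck` output on stdin and print every path that
# has at least one line of the form "<path>: CORRUPT ...", without duplicates.
corrupt_files() {
  awk -F': CORRUPT ' '/^\/.*: CORRUPT /{ print $1 }' | sort -u
}

# Typical use on a cluster (not run here):
#   hdfs fsck / | corrupt_files
#   hdfs fsck / | corrupt_files | wc -l
```

For the run above this gives 20 paths, all under /user/flink/checkpoint and /user/flink/ha, which agrees with `CORRUPT FILES: 20` in the summary. fsck also offers a built-in `-list-corruptfileblocks` option that prints the affected files directly.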
[root@devcdh1 data]# hadoop dfsadmin -safemode leave
WARNING: Use of this script to execute dfsadmin is deprecated.
WARNING: Attempting to execute replacement "hdfs dfsadmin" instead.

Safe mode is OFF in devcdh1/192.168.6.76:8020
Safe mode is OFF in devcdh2/192.168.6.77:8020
[root@devcdh1 data]# hdfs fsck  /  -delete
Connecting to namenode via http://devcdh2:9870/fsck?ugi=root&delete=1&path=%2F
FSCK started by root (auth:SIMPLE) from /192.168.6.76 for path / at Tue Mar 23 16:04:25 CST 2021

/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47124/83c4457d-249d-4bbd-a7c6-dcc74eb61b53: CORRUPT blockpool BP-199721927-192.168.6.76-1609304252767 block blk_1075448618

/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47124/83c4457d-249d-4bbd-a7c6-dcc74eb61b53: CORRUPT 1 blocks of total size 1240 B.
/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47124/_metadata: CORRUPT blockpool BP-199721927-192.168.6.76-1609304252767 block blk_1075448620

/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47124/_metadata: CORRUPT 1 blocks of total size 911 B.
/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47124/a1ff8d8b-e377-4d4b-a13b-f7d464d7386c: CORRUPT blockpool BP-199721927-192.168.6.76-1609304252767 block blk_1075448619

/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47124/a1ff8d8b-e377-4d4b-a13b-f7d464d7386c: CORRUPT 1 blocks of total size 1535 B.
/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47125/4e6cc205-6f92-49a7-b0af-13d28a2a3f93: CORRUPT blockpool BP-199721927-192.168.6.76-1609304252767 block blk_1075448623

/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47125/4e6cc205-6f92-49a7-b0af-13d28a2a3f93: CORRUPT 1 blocks of total size 1535 B.
/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47125/_metadata: CORRUPT blockpool BP-199721927-192.168.6.76-1609304252767 block blk_1075448624

/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47125/_metadata: CORRUPT 1 blocks of total size 911 B.
/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47125/bdf1ed78-9c1c-4beb-a517-ca9ac407b6d4: CORRUPT blockpool BP-199721927-192.168.6.76-1609304252767 block blk_1075448622

/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47125/bdf1ed78-9c1c-4beb-a517-ca9ac407b6d4: CORRUPT 1 blocks of total size 1240 B.
/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47126/8bb95575-6366-4b33-bd75-5b27a503b94c: CORRUPT blockpool BP-199721927-192.168.6.76-1609304252767 block blk_1075448626

/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47126/8bb95575-6366-4b33-bd75-5b27a503b94c: CORRUPT 1 blocks of total size 1240 B.
/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47126/_metadata: CORRUPT blockpool BP-199721927-192.168.6.76-1609304252767 block blk_1075448628

/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47126/_metadata: CORRUPT 1 blocks of total size 911 B.
/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47126/bc2a28b6-df8f-4d32-9ff9-635dd0687337: CORRUPT blockpool BP-199721927-192.168.6.76-1609304252767 block blk_1075448627

/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47126/bc2a28b6-df8f-4d32-9ff9-635dd0687337: CORRUPT 1 blocks of total size 1535 B.
/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47127/3c2b5cc6-4b14-480e-807a-2f8a1945bd79: CORRUPT blockpool BP-199721927-192.168.6.76-1609304252767 block blk_1075448630

/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47127/3c2b5cc6-4b14-480e-807a-2f8a1945bd79: CORRUPT 1 blocks of total size 1240 B.
/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47127/5efea232-a61a-4a9b-90bc-7b19fb1eebdc: CORRUPT blockpool BP-199721927-192.168.6.76-1609304252767 block blk_1075448631

/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47127/5efea232-a61a-4a9b-90bc-7b19fb1eebdc: CORRUPT 1 blocks of total size 1535 B.
/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47127/_metadata: CORRUPT blockpool BP-199721927-192.168.6.76-1609304252767 block blk_1075448632

/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47127/_metadata: CORRUPT 1 blocks of total size 911 B.
/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47128/5eb9d4d3-62c6-4d06-9972-7bc6f0b19dbe: CORRUPT blockpool BP-199721927-192.168.6.76-1609304252767 block blk_1075448635

/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47128/5eb9d4d3-62c6-4d06-9972-7bc6f0b19dbe: CORRUPT 1 blocks of total size 1535 B.
/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47128/71bdf921-e9e8-443c-ad13-ce3e5f58c8de: CORRUPT blockpool BP-199721927-192.168.6.76-1609304252767 block blk_1075448634

/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47128/71bdf921-e9e8-443c-ad13-ce3e5f58c8de: CORRUPT 1 blocks of total size 1240 B.
/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47128/_metadata: CORRUPT blockpool BP-199721927-192.168.6.76-1609304252767 block blk_1075448636

/user/flink/checkpoint/39e8593bbff53abc516aeeffeaf549bc/chk-47128/_metadata: CORRUPT 1 blocks of total size 911 B.
/user/flink/checkpoint/dbecccd1c9948bd10b9ed39062552d46/chk-16747/70d5887c-7987-4ef1-9317-b8993fa284e1:  Under replicated BP-199721927-192.168.6.76-1609304252767:blk_1073857082_116258. Target Replicas is 3 but found 1 live replica(s), 0 decommissioned replica(s), 0 decommissioning replica(s).

/user/flink/checkpoint/dbecccd1c9948bd10b9ed39062552d46/chk-16747/_metadata:  Under replicated BP-199721927-192.168.6.76-1609304252767:blk_1073857084_116260. Target Replicas is 3 but found 1 live replica(s), 0 decommissioned replica(s), 0 decommissioning replica(s).

/user/flink/checkpoint/dbecccd1c9948bd10b9ed39062552d46/chk-16747/eea2bc39-c441-4339-84cb-b5c994fe1ad2:  Under replicated BP-199721927-192.168.6.76-1609304252767:blk_1073857083_116259. Target Replicas is 3 but found 1 live replica(s), 0 decommissioned replica(s), 0 decommissioning replica(s).

/user/flink/ha/completedCheckpoint09231ef9ff2b: CORRUPT blockpool BP-199721927-192.168.6.76-1609304252767 block blk_1075448633

/user/flink/ha/completedCheckpoint09231ef9ff2b: CORRUPT 1 blocks of total size 4505 B.
/user/flink/ha/completedCheckpoint5e461d0cf506: CORRUPT blockpool BP-199721927-192.168.6.76-1609304252767 block blk_1075448621

/user/flink/ha/completedCheckpoint5e461d0cf506: CORRUPT 1 blocks of total size 4505 B.
/user/flink/ha/completedCheckpoint740c86d9dede: CORRUPT blockpool BP-199721927-192.168.6.76-1609304252767 block blk_1075448625

/user/flink/ha/completedCheckpoint740c86d9dede: CORRUPT 1 blocks of total size 4505 B.
/user/flink/ha/completedCheckpointd5d2a51448b6: CORRUPT blockpool BP-199721927-192.168.6.76-1609304252767 block blk_1075448637

/user/flink/ha/completedCheckpointd5d2a51448b6: CORRUPT 1 blocks of total size 4505 B.
/user/flink/ha/completedCheckpointff0168a104e2: CORRUPT blockpool BP-199721927-192.168.6.76-1609304252767 block blk_1075448629

/user/flink/ha/completedCheckpointff0168a104e2: CORRUPT 1 blocks of total size 4505 B.
Status: CORRUPT
 Number of data-nodes:  3
 Number of racks:               1
 Total dirs:                    3883
 Total symlinks:                0

Replicated Blocks:
 Total size:    30817714125 B
 Total files:   5783
 Total blocks (validated):      5781 (avg. block size 5330862 B)
  ********************************
  UNDER MIN REPL'D BLOCKS:      20 (0.34596092 %)
  dfs.namenode.replication.min: 1
  CORRUPT FILES:        20
  CORRUPT BLOCKS:       20
  CORRUPT SIZE:         40955 B
  ********************************
 Minimally replicated blocks:   5761 (99.65404 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       3 (0.051894136 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    3
 Average block replication:     2.9885833
 Missing blocks:                0
 Corrupt blocks:                20
 Missing replicas:              6 (0.03459609 %)
 Blocks queued for replication: 0

Erasure Coded Block Groups:
 Total size:    0 B
 Total files:   0
 Total block groups (validated):        0
 Minimally erasure-coded block groups:  0
 Over-erasure-coded block groups:       0
 Under-erasure-coded block groups:      0
 Unsatisfactory placement block groups: 0
 Average block group size:      0.0
 Missing block groups:          0
 Corrupt block groups:          0
 Missing internal blocks:       0
 Blocks queued for replication: 0
FSCK ended at Tue Mar 23 16:04:26 CST 2021 in 763 milliseconds


The filesystem under path '/' is CORRUPT

 Solution

 hdfs dfsadmin -safemode leave
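
Before deleting anything it is worth confirming that every NameNode has really left safe mode. This is a small sketch, assuming `hdfs dfsadmin -safemode get` prints one `Safe mode is ON ...` or `Safe mode is OFF ...` line per NameNode in the same format as the `leave` output shown earlier. `all_safemode_off` is a hypothetical helper.

```shell
# all_safemode_off: read `hdfs dfsadmin -safemode get` output on stdin.
# Succeeds only if at least one status line is present and none says ON.
all_safemode_off() {
  awk '/Safe mode is/ { seen = 1; if ($0 ~ /Safe mode is ON/) on = 1 }
       END { exit (seen && !on) ? 0 : 1 }'
}

# Typical use on a cluster (not run here):
#   hdfs dfsadmin -safemode get | all_safemode_off && echo "safe mode is off everywhere"
```

In an HA setup such as this one (devcdh1 and devcdh2), checking all lines avoids continuing while one NameNode is still in safe mode.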


// Note: -delete removes the corrupted files, so the data in them is lost.

 hdfs fsck  /  -delete 
Status: HEALTHY
 Number of data-nodes:  3
 Number of racks:               1
 Total dirs:                    3883
 Total symlinks:                0

Replicated Blocks:
 Total size:    30817673170 B
 Total files:   5763
 Total blocks (validated):      5761 (avg. block size 5349361 B)
 Minimally replicated blocks:   5761 (100.0 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       3 (0.05207429 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    3
 Average block replication:     2.9989586
 Missing blocks:                0
 Corrupt blocks:                0
 Missing replicas:              6 (0.034716196 %)
 Blocks queued for replication: 0

Erasure Coded Block Groups:
 Total size:    0 B
 Total files:   0
 Total block groups (validated):        0
 Minimally erasure-coded block groups:  0
 Over-erasure-coded block groups:       0
 Under-erasure-coded block groups:      0
 Unsatisfactory placement block groups: 0
 Average block group size:      0.0
 Missing block groups:          0
 Corrupt block groups:          0
 Missing internal blocks:       0
 Blocks queued for replication: 0
FSCK ended at Tue Mar 23 16:04:57 CST 2021 in 452 milliseconds


The filesystem under path '/' is HEALTHY
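
Because `-delete` discards data, it may be preferable to narrow the scope and inspect the damage first. The sketch below uses only standard fsck options (`-list-corruptfileblocks`, `-move`, `-delete`). The wrapper `fsck_cmd` and the choice of `/user/flink` as the target path are illustrative assumptions, based on where the corrupt files were reported in this case.

```shell
# fsck_cmd MODE [PATH]: print the fsck command line for a clean-up mode.
#   list   -> only list files with corrupt blocks (changes nothing)
#   move   -> move corrupted files to /lost+found
#   delete -> delete corrupted files (their data is lost)
# PATH defaults to "/" when omitted.
fsck_cmd() {
  mode="$1"
  path="${2:-/}"
  case "$mode" in
    list)   echo "hdfs fsck $path -list-corruptfileblocks" ;;
    move)   echo "hdfs fsck $path -move" ;;
    delete) echo "hdfs fsck $path -delete" ;;
    *)      echo "unknown mode: $mode" >&2; return 1 ;;
  esac
}

# Show the non-destructive command for the Flink directory.
fsck_cmd list /user/flink
```

Printing the command instead of running it keeps the destructive step explicit. Once the listing has been reviewed, either the `move` or the `delete` variant can be run by hand.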

Check the status again

hdfs fsck /

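When the primary NameNode starts, HDFS first enters safe mode and checks values such as the number of replicas per block, the number of live DataNodes, and the proportion of blocks that are available in the cluster. Once these values reach their (configurable) thresholds, the system is considered safe and HDFS leaves safe mode automatically. In safe mode the file system accepts read requests only and rejects changes such as deletes and modifications. Blocks are not replicated in safe mode either, so whether the minimum replica count is met is judged from the state the DataNodes report at startup. No replication is performed during startup to reach that minimum.

 Safe mode configuration

When does the system leave safe mode, and which conditions must hold? The settings below decide this. If necessary, safe mode can also be left by force with a command. The main safe mode settings are in hdfs-site.xml:

dfs.namenode.replication.min: the minimum number of replicas a block must have. The default is 1.

dfs.namenode.safemode.threshold-pct: the fraction of all blocks in the system that must meet the minimum replica count. Safe mode can only be left once the actual fraction reaches this value, and only if the other conditions also hold. The default is 0.999f, meaning that 99.9% of blocks must satisfy the minimum replica count. A value less than or equal to 0 means the NameNode does not wait for any particular fraction of blocks. A value greater than 1 keeps the NameNode in safe mode permanently.

dfs.namenode.safemode.min.datanodes: the minimum number of live DataNodes required to leave safe mode. The default is 0, so safe mode can be left even if no DataNode is available.

dfs.namenode.safemode.extension: after the block fraction and the DataNode count have both met their requirements, the cluster leaves safe mode only if they are still met after this period. The unit is milliseconds, and the value shipped in hdfs-default.xml is 30000 (30 seconds). This setting is a further check on cluster stability, so that the cluster does not leave safe mode and then immediately fall below the safety criteria again.

To sum up, leaving safe mode requires all of the following:
1) the fraction of blocks that meet the replica requirement reaches the configured threshold;
2) the number of live DataNodes reaches the configured minimum;
3) conditions 1 and 2 stay satisfied for the configured extension period.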
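
The first two exit conditions can be written down as a small check. This is a sketch under stated assumptions, not the NameNode's actual code: the helper `safemode_ready` is hypothetical, it assumes the block target is `total * threshold` truncated to an integer, and it does not model the time-based extension period.

```shell
# safemode_ready SAFE TOTAL PCT LIVE_DN MIN_DN
# Succeeds when the block-fraction and DataNode conditions both hold.
# The extension period (condition 3) is time-based and not modelled here.
# Assumption: the block target is TOTAL * PCT truncated to an integer.
safemode_ready() {
  awk -v safe="$1" -v total="$2" -v pct="$3" -v live="$4" -v min_dn="$5" 'BEGIN {
    blocks_ok    = (safe >= int(total * pct))
    datanodes_ok = (live >= min_dn)
    exit (blocks_ok && datanodes_ok) ? 0 : 1
  }'
}

# Values from the error message: 5761 of 5781 blocks reported, threshold 0.999,
# 3 live DataNodes against a minimum of 1.
if safemode_ready 5761 5781 0.999 3 1; then echo "ready"; else echo "not ready"; fi
# After the 20 corrupt files are deleted, the total drops to 5761 blocks.
if safemode_ready 5761 5761 0.999 3 1; then echo "ready"; else echo "not ready"; fi
```

The first call prints `not ready` and the second prints `ready`, which matches what happened in this incident: the DataNode condition was satisfied all along, and only the block fraction kept the cluster in safe mode. On a live cluster the configured values can be read with `hdfs getconf -confKey`, for example `hdfs getconf -confKey dfs.namenode.safemode.threshold-pct`.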