Hadoop DataNode with disks of different sizes…

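This post discusses how a Hadoop DataNode with disks of different sizes can reserve space through `dfs.datanode.du.reserved` and `dfs.datanode.du.pct`, and how to rebalance blocks across disks by hand when those settings do not help. Hadoop currently has no mechanism for automatically balancing blocks across the disks of a node, so the manual procedure is to stop the DataNode, move the block files, and restart the DataNode.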


Please credit the source when reposting.


Many reposted articles online give the following fix:

Edit hdfs-site.xml and add:

<property>
  <name>dfs.datanode.du.reserved</name>
  <value>214748364800</value>
  <description>Reserved space in bytes per volume. Always leave this much space free for non dfs use. 200G</description>
</property>
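The value is given in bytes, so it is worth checking the arithmetic. A minimal sketch, assuming 1 GiB = 1024^3 bytes; the class and method names are made up for illustration:

```java
final class ReservedBytes {
    /** Converts a size in GiB (1 GiB = 1024^3 bytes) to bytes. */
    static long gibToBytes(long gib) {
        return Math.multiplyExact(gib, 1024L * 1024L * 1024L);
    }

    public static void main(String[] args) {
        // 200 GiB expressed in bytes, the unit dfs.datanode.du.reserved expects.
        System.out.println(gibToBytes(200)); // prints 214748364800
    }
}
```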

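When the disks differ in size, Hadoop spreads writes evenly across them. It does not automatically prefer the disk with more free space, so the small disks still fill up first. Once a small disk is full, MapReduce has nowhere to write its temporary files and jobs fail, so the small disks need to keep some free space. According to the Hadoop documentation, the dfs.datanode.du.reserved setting reserves the given amount of space (in bytes) on every disk. After I set it, it did take effect in the sense that the reported total capacity dropped, but data was still being written to the small disks, which was frustrating. The Hadoop version I use is Cloudera CDH 4.6.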
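To make the "small disk fills up first" effect concrete, here is a toy model of round-robin placement. It is only a sketch under simplifying assumptions (fixed block size, a disk is skipped only when it cannot hold one more block) and is not the DataNode's actual code; `RoundRobinFill` is a hypothetical name.

```java
import java.util.Arrays;

final class RoundRobinFill {
    /**
     * Writes `blocks` blocks of `blockSize` units round-robin over volumes
     * with the given free space, skipping volumes that cannot hold a block.
     * Returns the free space left on each volume.
     */
    static long[] fill(long[] free, long blockSize, int blocks) {
        long[] remaining = free.clone();
        int next = 0;
        for (int b = 0; b < blocks; b++) {
            int tried = 0;
            // Advance to the next volume that still has room for one block.
            while (tried < remaining.length && remaining[next] < blockSize) {
                next = (next + 1) % remaining.length;
                tried++;
            }
            if (tried == remaining.length) {
                break; // every volume is full
            }
            remaining[next] -= blockSize;
            next = (next + 1) % remaining.length;
        }
        return remaining;
    }

    public static void main(String[] args) {
        // A small disk (100 units free) next to a large one (1000 units free).
        long[] left = fill(new long[] {100, 1000}, 10, 20);
        System.out.println(Arrays.toString(left)); // prints [0, 900]
    }
}
```

After only 20 blocks the small disk has nothing left while the large one is still 90% free, which is the situation described above.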


I then also set:

<property>
  <name>dfs.datanode.du.pct</name>
  <value>0.96</value>
  <description>When calculating remaining space, only use this percentage of the real available space</description>
</property>

After testing, the behavior was still the same.


Looking at the source code:

./hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeImpl.java

  // Capacity usable by HDFS: raw disk capacity minus the reserved space.
  long getCapacity() {
    long remaining = usage.getCapacity() - reserved;
    return remaining > 0 ? remaining : 0;
  }

  // Space still available to HDFS on this volume, capped by the real free space.
  @Override
  public long getAvailable() throws IOException {
    long remaining = getCapacity()-getDfsUsed();
    long available = usage.getAvailable();
    if (remaining > available) {
      remaining = available;
    }
    return (remaining > 0) ? remaining : 0;
  }

  long getReserved(){
    return reserved;
  }
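
To see what these two methods report for disks of different sizes, the same arithmetic can be run on made-up numbers. This is a standalone sketch that mirrors the logic of the excerpt; `VolumeMath` and the disk sizes are hypothetical, and it does not touch any Hadoop class.

```java
final class VolumeMath {
    /** Mirrors getCapacity(): raw capacity minus reserved, never negative. */
    static long capacity(long rawCapacity, long reserved) {
        long remaining = rawCapacity - reserved;
        return remaining > 0 ? remaining : 0;
    }

    /** Mirrors getAvailable(): min(capacity - dfsUsed, free on disk), never negative. */
    static long available(long rawCapacity, long reserved, long dfsUsed, long freeOnDisk) {
        long remaining = capacity(rawCapacity, reserved) - dfsUsed;
        if (remaining > freeOnDisk) {
            remaining = freeOnDisk;
        }
        return remaining > 0 ? remaining : 0;
    }

    public static void main(String[] args) {
        // All values in GiB. Hypothetical 500 GiB disk: 200 reserved, 250 used by HDFS, 250 free.
        System.out.println(available(500, 200, 250, 250));   // prints 50
        // Hypothetical 2000 GiB disk with the same reservation and usage, 1750 free.
        System.out.println(available(2000, 200, 250, 1750)); // prints 1550
    }
}
```

Both disks report a positive amount, so any placement rule that only asks "does this volume have room for one more block?" would keep treating the small disk as eligible until it reaches the reserved threshold. Whether that is the rule in effect on a given cluster is an assumption here, not something the excerpt shows.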

./hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/LocalDirAllocator.java

  private static class AllocatorPerContext {

    private final Log LOG =
      LogFactory.getLog(AllocatorPerContext.class);

    private int dirNumLastAccessed;
    private Random dirIndexRandomizer = new Random();
    private FileSystem localFS;
    private DF[] dirDF;
    private String contextCfgItemName;
    private String[] localDirs;
    private String savedLocalDirs = "";

    public AllocatorPerContext(String contextCfgItemName) {
      this.contextCfgItemName = contextCfgItemName;
    }

    // Re-reads the configured local dirs whenever the setting has changed.
    private synchronized void confChanged(Configuration conf)
        throws IOException {
      String newLocalDirs = conf.get(contextCfgItemName);
      if (!newLocalDirs.equals(savedLocalDirs)) {
        localDirs = StringUtils.getTrimmedStrings(newLocalDirs);
        localFS = FileSystem.getLocal(conf);
        int numDirs = localDirs.length;
        ArrayList<String> dirs = new ArrayList<String>(numDirs);
        ArrayList<DF> dfList = new ArrayList<DF>(numDirs);
        for (int i = 0; i < numDirs; i++) {
          try {
            // filter problematic directories
            Path tmpDir = new Path(localDirs[i]);
            if (localFS.mkdirs(tmpDir) || localFS.exists(tmpDir)) {
              try {
                File tmpFile = tmpDir.isAbsolute()
                    ? new File(localFS.makeQualified(tmpDir).toUri())
                    : new File(localDirs[i]);

                DiskChecker.checkDir(tmpFile);
                dirs.add(tmpFile.getPath());
                dfList.add(new DF(tmpFile, 30000));
              } catch (DiskErrorException de) {
                LOG.warn( localDirs[i] + " is not writable\n", de);
              }
            } else {
              LOG.warn( "Failed to create " + localDirs[i]);
            }
          } catch (IOException ie) {
            LOG.warn( "Failed to create " + localDirs[i] + ": " +
                ie.getMessage() + "\n", ie);
          } //ignore
        }
        localDirs = dirs.toArray(new String[dirs.size()]);
        dirDF = dfList.toArray(new DF[dirs.size()]);
        savedLocalDirs = newLocalDirs;

        // randomize the first disk picked in the round-robin selection
        dirNumLastAccessed = dirIndexRandomizer.nextInt(dirs.size());
      }
    }

    public synchronized Path getLocalPathForWrite(String pathStr, long size,
        Configuration conf, boolean checkWrite) throws IOException {
      confChanged(conf);
      int numDirs = localDirs.length;
      int numDirsSearched = 0;
      //remove the leading slash from the path (to make sure that the uri
      //resolution results in a valid path on the dir being checked)
      if (pathStr.startsWith("/")) {
        pathStr = pathStr.substring(1);
      }
      Path returnPath = null;

      if(size == SIZE_UNKNOWN) {  //do roulette selection: pick dir with probability
                                  //proportional to available size
        long[] availableOnDisk = new long[dirDF.length];
        long totalAvailable = 0;

        //build the "roulette wheel"
        for(int i =0; i < dirDF.length; ++i) {
          availableOnDisk[i] = dirDF[i].getAvailable();
          totalAvailable += availableOnDisk[i];
        }

        // Keep rolling the wheel till we get a valid path
        Random r = new java.util.Random();
        while (numDirsSearched < numDirs && returnPath == null) {
          long randomPosition = Math.abs(r.nextLong()) % totalAvailable;
          int dir = 0;
          while (randomPosition > availableOnDisk[dir]) {
            randomPosition -= availableOnDisk[dir];
            dir++;
          }
          dirNumLastAccessed = dir;
          returnPath = createPath(pathStr, checkWrite);
          if (returnPath == null) {
            totalAvailable -= availableOnDisk[dir];
            availableOnDisk[dir] = 0; // skip this disk
            numDirsSearched++;
          }
        }
      } else {
        // size is known: plain round-robin over dirs with enough free space
        while (numDirsSearched < numDirs && returnPath == null) {
          long capacity = dirDF[dirNumLastAccessed].getAvailable();
          if (capacity > size) {
            returnPath = createPath(pathStr, checkWrite);
          }
          dirNumLastAccessed++;
          dirNumLastAccessed = dirNumLastAccessed % numDirs;
          numDirsSearched++;
        }
      }
      if (returnPath != null) {
        return returnPath;
      }

      //no path found
      throw new DiskErrorException("Could not find any valid local " +
          "directory for " + pathStr);
    }

    // ... remaining methods omitted in this excerpt
  }
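
The "roulette" branch in the excerpt above picks a directory with probability proportional to its free space. A minimal standalone sketch of that idea follows. It is not the Hadoop code: it deliberately uses `Math.floorMod` and a `>=` comparison instead of the `Math.abs(...) %` and `>` of the excerpt, and `RouletteSketch` is a hypothetical name.

```java
import java.util.Random;

final class RouletteSketch {
    /**
     * Picks an index with probability proportional to available[i]
     * (entries must be non-negative). Returns -1 if nothing has space.
     */
    static int pick(long[] available, Random rng) {
        long total = 0;
        for (long a : available) {
            total += a;
        }
        if (total <= 0) {
            return -1;
        }
        // Position in [0, total); floorMod never returns a negative value here.
        long position = Math.floorMod(rng.nextLong(), total);
        int dir = 0;
        // Walk the "wheel" until the position falls inside a directory's slice.
        while (position >= available[dir]) {
            position -= available[dir];
            dir++;
        }
        return dir;
    }

    public static void main(String[] args) {
        long[] available = {100, 900}; // small disk vs. large disk, arbitrary units
        int[] hits = new int[available.length];
        Random rng = new Random(42);
        for (int i = 0; i < 100_000; i++) {
            hits[pick(available, rng)]++;
        }
        System.out.println("small=" + hits[0] + " large=" + hits[1]);
    }
}
```

With these numbers the large disk should receive roughly nine out of ten picks. Note that in the excerpt this branch only runs when the size is unknown; otherwise the round-robin branch is used.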

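The code does take these settings into account, so why do they have no effect? I am at a loss; I will dig deeper when I have time.


The other configuration options still cannot guarantee that less data is written to the small disks.


Better to go by the FAQ: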

http://wiki.apache.org/hadoop/FAQ#On_an_individual_data_node.2C_how_do_you_balance_the_blocks_on_the_disk.3F

3.12. On an individual data node, how do you balance the blocks on the disk?

Hadoop currently does not have a method by which to do this automatically. To do this manually:

  1. Take down the HDFS

  2. Use the UNIX mv command to move the individual blocks and meta pairs from one directory to another on each host

  3. Restart the HDFS

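For step 1, stopping HDFS only requires stopping the DataNode: $HADOOP_HOME/bin/hadoop-daemon.sh stop datanode

For step 2, the destination must be a subdirectory of the current directory under a dfs.data.dir directory: mv path-to-data-dir/current/finalized/subdir11/* path-to-data-dir/current/finalized/subdir11 (here the first path-to-data-dir is the data directory on the full disk and the second is the data directory on the disk with free space)

For step 3: $HADOOP_HOME/bin/hadoop-daemon.sh start datanode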
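
The important detail in step 2 is that a block file and its meta file move together. Below is a rough sketch of that step in Java instead of mv, to be run only while the DataNode is stopped. It assumes the usual naming of a block file `blk_<id>` with a companion `blk_<id>_<generation stamp>.meta`; `BlockMover` and the demo directory names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

final class BlockMover {
    /**
     * Moves the block file blockName and its "<blockName>_*.meta" companions
     * from srcDir to dstDir. Returns the number of files moved.
     */
    static int moveBlockWithMeta(Path srcDir, Path dstDir, String blockName) {
        try {
            Path block = srcDir.resolve(blockName);
            if (!Files.isRegularFile(block)) {
                throw new IOException("block file not found: " + block);
            }
            // Collect the files first so nothing moves while the directory is being listed.
            List<Path> toMove = new ArrayList<>();
            toMove.add(block);
            try (DirectoryStream<Path> metas =
                     Files.newDirectoryStream(srcDir, blockName + "_*.meta")) {
                for (Path meta : metas) {
                    toMove.add(meta);
                }
            }
            Files.createDirectories(dstDir);
            for (Path file : toMove) {
                Files.move(file, dstDir.resolve(file.getFileName()));
            }
            return toMove.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Runs the move on throwaway files in a temp directory; true if the pair moved together. */
    static boolean demoInTempDir() {
        try {
            Path root = Files.createTempDirectory("blockmover");
            Path src = Files.createDirectories(root.resolve("disk1/subdir11"));
            Path dst = root.resolve("disk2/subdir11");
            Files.createFile(src.resolve("blk_1001"));
            Files.createFile(src.resolve("blk_1001_5.meta"));
            Files.createFile(src.resolve("blk_2002")); // unrelated block, must stay put
            int moved = moveBlockWithMeta(src, dst, "blk_1001");
            return moved == 2
                && Files.exists(dst.resolve("blk_1001"))
                && Files.exists(dst.resolve("blk_1001_5.meta"))
                && Files.exists(src.resolve("blk_2002"))
                && !Files.exists(src.resolve("blk_1001"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demoInTempDir());
    }
}
```

The demo only touches a temporary directory. Pointing `moveBlockWithMeta` at real DataNode directories is at your own risk, and the target subdirectory layout has to match what the DataNode expects.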


See also: http://search-hadoop.com/m/fSof91EYbe9 (a walkthrough from an overseas user; from there you are on your own).

