hadoop sources reading
leibnitz09
Articles in this column
-
sources study-part 1-outline
I chose to study the source by following the workflow from FS to MR, so the steps are: a. FS (IO); b. MR; c. IPC. Outline of the API: Simple flow of Hadoop:
2011-04-20 02:54:16 · 97 reads · 0 comments
-
Hadoop source reading - shell startup flow
Open the bin/hadoop file and you will see that there is a config file to load: either libexec/hadoop-config.sh or bin/hadoop-config.sh. The former is loaded if it exists; otherwise the latter is loaded. ...
2012-05-03 01:58:54 · 140 reads · 0 comments
-
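The lookup order described in the shell startup entry can be sketched as follows. This is a minimal Python illustration of the "prefer libexec, fall back to bin" rule, not the actual shell logic of bin/hadoop; the function name `pick_config_script` and its `exists` parameter are hypothetical and exist only for this example.

```python
import os


def pick_config_script(hadoop_home, exists=os.path.exists):
    """Return the hadoop-config.sh path that would be loaded.

    Hypothetical helper: prefers libexec/hadoop-config.sh and falls back
    to bin/hadoop-config.sh when the libexec copy is absent.
    `exists` is injectable so the rule can be checked without real files.
    """
    preferred = os.path.join(hadoop_home, "libexec", "hadoop-config.sh")
    fallback = os.path.join(hadoop_home, "bin", "hadoop-config.sh")
    return preferred if exists(preferred) else fallback
```

Calling `pick_config_script("/opt/hadoop")` (the path is only an example) checks the real filesystem; passing a custom `exists` predicate makes the fallback easy to exercise in isolation.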
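Hadoop source reading - starting the second read-through
Because of work needs and the changes brought by version updates, I am now diving back into the source, this time hadoop-1.0.1. The goals of this read-through are: to read the whole codebase fairly comprehensively and understand the overall workflow and how the components connect and interact; the changes to common and its main responsibilities; the config/shell startup flow; the concrete design and implementation of HDFS; the detailed design and implementation of MapReduce; the detailed implementation of IPC; others ...
Original · 2012-05-03 01:03:17 · 104 reads · 0 comments
-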
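A brief introduction to several kinds of sorting in Hadoop
In the MapReduce framework, besides the usual distributed computation, sorting is also a fairly important part, much as ordering the result set matters in an SQL query. 1. No sorting: when writing the code, setting mapred.reduce.tasks=0 (same effect as setNumReduceTasks(0)) achieves this. The map output is then written out directly as part files (a single part file only when there is a single map task), and the entries in them are unordered. ...
Original · 2011-12-16 21:52:54 · 245 reads · 0 comments
-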
sources study-part 6- remote debugging
When running in a (pseudo-distributed) cluster, the code is difficult to debug. I have tried a few tricks for debugging it: a. use JPDA (the Java Platform Debugger Architecture): add these options to the Hadoop opts after start and before ...
2011-05-04 03:07:51 · 104 reads · 0 comments
-
sources study-part 7-summary
Here is my summary from reading the source. Given my ability and the complexity of Hadoop, and since I have not read all of the source, there will be some illogical statements in it, so if you find a ...
2011-05-04 02:55:52 · 112 reads · 0 comments
-
sources study-part 5-hdfs - advanced features - blocks allocation policy
TODO
2011-05-04 02:49:06 · 91 reads · 0 comments
-
sources study-part 4-mapreduce - advanced features - spill,merge and sort
TODO
2011-05-04 02:47:09 · 115 reads · 0 comments
-
sources study-part 3-mapreduce -taskscheduler
It is used to schedule both jobs and tasks. There are several implementations: JobQueueTaskScheduler (the default, a FIFO algorithm); FairScheduler; CapacityScheduler; ... Here is the simple flow of...
2011-05-04 02:44:31 · 173 reads · 0 comments
-
sources study-part 3-mapreduce - what is a split?
A "split" is a logical concept, in contrast to a "block", which is the real storage unit. When a client submits a job to the JT (JobTracker), it computes the splits file by file; then the TT (TaskTracker) generates an InputSplit to map ...
2011-04-28 13:33:19 · 122 reads · 0 comments
-
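To make the split/block distinction concrete, here is a minimal Python sketch of carving one file into logical splits. It assumes a sizing rule of max(min_size, min(goal_size, block_size)), modeled on what the older FileInputFormat does, and it ignores details such as the slack allowed on the final split and block locations. All function and parameter names are hypothetical.

```python
def compute_split_size(goal_size, min_size, block_size):
    """Assumed sizing rule: never below min_size, otherwise the smaller
    of the goal size and the block size."""
    return max(min_size, min(goal_size, block_size))


def file_splits(file_length, split_size):
    """Carve a file into logical (offset, length) splits.

    A split is only a byte range over the file; it does not have to
    coincide with a physical block. Simplified: the last split simply
    takes whatever bytes remain.
    """
    if split_size <= 0:
        raise ValueError("split_size must be positive")
    splits = []
    offset = 0
    while offset < file_length:
        length = min(split_size, file_length - offset)
        splits.append((offset, length))
        offset += length
    return splits


# Example: a 150-byte file, 64-byte blocks, goal of 200 bytes per split.
size = compute_split_size(goal_size=200, min_size=1, block_size=64)
ranges = file_splits(150, size)
```

In this example the split size is capped by the block size, so each split lines up with one block; raising `min_size` above the block size would make a single split span more than one block, which is what makes the split a logical rather than a physical unit.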
sources study-part 2-hdfs get file
As when writing a file to HDFS, the client gets a DistributedFileSystem to communicate with the NameNode, and the DS creates a DFSClient, which creates a DFSInputStream that is encapsulated in an FSDataInputStream. off ...
2011-04-27 02:15:00 · 119 reads · 0 comments
-
sources study-part 2-hdfs client
Data structure transformations as a file is written from the client to HDFS
2011-04-24 16:43:42 · 104 reads · 0 comments
-
Hadoop source reading - shell startup flow - start-all
When start-all.sh (or start-dfs.sh, start-mapred.sh) is executed, the config files listed below are loaded: libexec/hadoop-config.sh (bin/hadoop-config.sh is used if the former is absent) confi...
2012-05-06 01:13:54 · 140 reads · 0 comments