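Material on this circulating online is messy and inconsistent.
Version: giraph-1.1.0
I went through the bin/giraph launch script.
This assumes Hadoop 1.2.1 is already installed.
Configuration: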
core-site.xml
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://mu02:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/hadoop-1.2.1/tmp</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
mapred-site.xml
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>mu02:9001</value>
</property>
</configuration>
hdfs-site.xml
<configuration>
<property>
<name>dfs.name.dir</name>
<value>/opt/hadoop-1.2.1/nndir</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/opt/hadoop-1.2.1/dndir</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.block.size</name>
<value>2097152</value>
</property>
</configuration>
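The three files above can be double-checked from the command line before starting anything. The sketch below is a minimal helper, not part of Hadoop or Giraph: `get_prop` is a hypothetical function name, and it assumes every `<name>` and `<value>` tag sits on its own line, as in the listings above. Note that `dfs.replication` is set in both core-site.xml (1) and hdfs-site.xml (2), so it is worth confirming which value is intended.

```shell
# Print the <value> that follows a given <name> in a Hadoop *-site.xml file.
# Assumes each <name> and <value> tag sits on its own line, as in the files above.
get_prop() {
  local file="$1" name="$2"
  awk -v want="<name>${name}</name>" '
    { gsub(/^[ \t]+|[ \t]+$/, "") }        # trim surrounding whitespace
    found && /^<value>/ {                   # first <value> after the matching <name>
      sub(/^<value>/, ""); sub(/<\/value>$/, ""); print; exit
    }
    $0 == want { found = 1 }
  ' "$file"
}

# Assumed location of the config directory; adjust to your installation.
CONF_DIR="${HADOOP_CONF_DIR:-/opt/hadoop-1.2.1/conf}"
if [ -d "$CONF_DIR" ]; then
  for f in core-site.xml hdfs-site.xml; do
    echo "$f: dfs.replication=$(get_prop "$CONF_DIR/$f" dfs.replication)"
  done
  echo "fs.default.name=$(get_prop "$CONF_DIR/core-site.xml" fs.default.name)"
fi
```

The helper is deliberately line-based; for config files with a different layout, a real XML parser would be the safer choice.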
Edit bin/giraph-env.sh
and add:
HADOOP_HOME="/opt/hadoop-1.2.1"
HADOOP_CONF_DIR="$HADOOP_HOME/conf"
if [ "$HADOOP_CONF_DIR" = "" ] ; then
  HADOOP_CONF_DIR=$HADOOP_HOME/conf
fi
That is all that is needed. Then follow the quick start on the official website:
Set the environment variables (the giraph launcher lives in bin/, so that directory goes on PATH):
GIRAPH_HOME=/opt/giraph-1.1.0-for-hadoop-1.2.1
PATH=...:$GIRAPH_HOME/bin
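The job reads its graph from /test/test.txt on HDFS, so that file has to exist first. The post does not show the input, so the sketch below assumes it is the tiny graph from the Giraph quick start; that graph has 5 vertices and 12 edges, which is consistent with the counters in the run log further down. Each line follows the layout read by JsonLongDoubleFloatDoubleVertexInputFormat: `[vertex_id, vertex_value, [[target_id, edge_weight], ...]]`.

```shell
# Write a small graph, one vertex per line:
#   [vertex_id, vertex_value, [[target_id, edge_weight], ...]]
# The data is the tiny graph from the Giraph quick start (an assumption;
# the original post does not show its input file).
cat > test.txt <<'EOF'
[0,0,[[1,1],[3,3]]]
[1,0,[[0,1],[2,2],[3,1]]]
[2,0,[[1,2],[4,4]]]
[3,0,[[0,3],[1,1],[4,4]]]
[4,0,[[3,4],[2,4]]]
EOF

# Upload to the HDFS path passed to -vip, if a hadoop client is on PATH.
if command -v hadoop >/dev/null 2>&1; then
  hadoop fs -mkdir /test
  hadoop fs -put test.txt /test/test.txt
fi
```

The upload is guarded so the snippet can also be run on a machine without Hadoop just to produce the file.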
Run the job:
giraph ../giraph-examples-1.1.0.jar org.apache.giraph.examples.SimpleShortestPathsComputation \
  -vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat \
  -vip /test/test.txt \
  -vof org.apache.giraph.io.formats.IdWithValueTextOutputFormat \
  -op /output1 \
  -w 1 \
  -ca giraph.SplitMasterWorker=false
Output:
16/03/15 16:56:57 INFO mapred.JobClient: Running job: job_201603151444_0012
16/03/15 16:56:58 INFO mapred.JobClient: map 100% reduce 0%
16/03/15 16:57:00 INFO mapred.JobClient: Job complete: job_201603151444_0012
16/03/15 16:57:00 INFO mapred.JobClient: Counters: 41
16/03/15 16:57:00 INFO mapred.JobClient: Zookeeper halt node
16/03/15 16:57:00 INFO mapred.JobClient: /_hadoopBsp/job_201603151444_0012/_haltComputation=0
16/03/15 16:57:00 INFO mapred.JobClient: Zookeeper base path
16/03/15 16:57:00 INFO mapred.JobClient: /_hadoopBsp/job_201603151444_0012=0
16/03/15 16:57:00 INFO mapred.JobClient: Job Counters
16/03/15 16:57:00 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=22801
16/03/15 16:57:00 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
16/03/15 16:57:00 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
16/03/15 16:57:00 INFO mapred.JobClient: Launched map tasks=1
16/03/15 16:57:00 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=0
16/03/15 16:57:00 INFO mapred.JobClient: Giraph Timers
16/03/15 16:57:00 INFO mapred.JobClient: Input superstep (ms)=241
16/03/15 16:57:00 INFO mapred.JobClient: Total (ms)=9575
16/03/15 16:57:00 INFO mapred.JobClient: Superstep 2 SimpleShortestPathsComputation (ms)=73
16/03/15 16:57:00 INFO mapred.JobClient: Shutdown (ms)=8926
16/03/15 16:57:00 INFO mapred.JobClient: Superstep 0 SimpleShortestPathsComputation (ms)=101
16/03/15 16:57:00 INFO mapred.JobClient: Initialize (ms)=419
16/03/15 16:57:00 INFO mapred.JobClient: Superstep 3 SimpleShortestPathsComputation (ms)=64
16/03/15 16:57:00 INFO mapred.JobClient: Superstep 1 SimpleShortestPathsComputation (ms)=92
16/03/15 16:57:00 INFO mapred.JobClient: Setup (ms)=75
16/03/15 16:57:00 INFO mapred.JobClient: Zookeeper server:port
16/03/15 16:57:00 INFO mapred.JobClient: mu02:22181=0
16/03/15 16:57:00 INFO mapred.JobClient: Giraph Stats
16/03/15 16:57:00 INFO mapred.JobClient: Aggregate edges=12
16/03/15 16:57:00 INFO mapred.JobClient: Sent message bytes=0
16/03/15 16:57:00 INFO mapred.JobClient: Superstep=4
16/03/15 16:57:00 INFO mapred.JobClient: Last checkpointed superstep=0
16/03/15 16:57:00 INFO mapred.JobClient: Current workers=1
16/03/15 16:57:00 INFO mapred.JobClient: Aggregate sent messages=12
16/03/15 16:57:00 INFO mapred.JobClient: Current master task partition=0
16/03/15 16:57:00 INFO mapred.JobClient: Sent messages=0
16/03/15 16:57:00 INFO mapred.JobClient: Aggregate finished vertices=5
16/03/15 16:57:00 INFO mapred.JobClient: Aggregate sent message message bytes=267
16/03/15 16:57:00 INFO mapred.JobClient: Aggregate vertices=5
16/03/15 16:57:00 INFO mapred.JobClient: File Output Format Counters
16/03/15 16:57:00 INFO mapred.JobClient: Bytes Written=0
16/03/15 16:57:00 INFO mapred.JobClient: FileSystemCounters
16/03/15 16:57:00 INFO mapred.JobClient: HDFS_BYTES_READ=156
16/03/15 16:57:00 INFO mapred.JobClient: FILE_BYTES_WRITTEN=134412
16/03/15 16:57:00 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=30
16/03/15 16:57:00 INFO mapred.JobClient: File Input Format Counters
16/03/15 16:57:00 INFO mapred.JobClient: Bytes Read=0
16/03/15 16:57:00 INFO mapred.JobClient: Map-Reduce Framework
16/03/15 16:57:00 INFO mapred.JobClient: Map input records=1
16/03/15 16:57:00 INFO mapred.JobClient: Physical memory (bytes) snapshot=398675968
16/03/15 16:57:00 INFO mapred.JobClient: Spilled Records=0
16/03/15 16:57:00 INFO mapred.JobClient: CPU time spent (ms)=6710
16/03/15 16:57:00 INFO mapred.JobClient: Total committed heap usage (bytes)=657981440
16/03/15 16:57:00 INFO mapred.JobClient: Virtual memory (bytes) snapshot=1741172736
16/03/15 16:57:00 INFO mapred.JobClient: Map output records=0
16/03/15 16:57:00 INFO mapred.JobClient: SPLIT_RAW_BYTES=44
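The counters worth checking in a log like this are the Giraph Stats: `Aggregate vertices`, `Aggregate edges` and `Superstep`. The sketch below pulls a single counter out of a saved copy of the console output. `giraph_counter` and the file name giraph-run.log are hypothetical, and the output file pattern `part-*` under /output1 is an assumption about how the job names its result files.

```shell
# Print the value of one counter from a saved JobClient log.
# Usage: giraph_counter LOGFILE "Counter name"
giraph_counter() {
  local log="$1" name="$2"
  # Keep only the text after "mapred.JobClient: ", then match "name=value".
  sed -n 's/^.*mapred\.JobClient:[[:space:]]*//p' "$log" |
    awk -F= -v want="$name" '$1 == want { print $2; exit }'
}

# giraph-run.log is a hypothetical file holding the console output above,
# e.g. captured with: giraph ... 2>&1 | tee giraph-run.log
if [ -f giraph-run.log ]; then
  echo "vertices:   $(giraph_counter giraph-run.log 'Aggregate vertices')"
  echo "edges:      $(giraph_counter giraph-run.log 'Aggregate edges')"
  echo "supersteps: $(giraph_counter giraph-run.log 'Superstep')"
fi

# The per-vertex distances are written under the -op directory on HDFS.
if command -v hadoop >/dev/null 2>&1; then
  hadoop fs -cat '/output1/part-*'
fi
```

Matching on the exact counter name keeps `Superstep` from also picking up the per-superstep timer lines such as `Superstep 2 SimpleShortestPathsComputation (ms)`.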
This post covers installing and configuring Giraph on Hadoop and runs a simple shortest-paths example as a Giraph job, recording the core configuration files and the command used.