Preface:
Hadoop background and core components:
(1) Distributed file system: HDFS (modeled on Google's GFS).
(2) Data computation: distributed computing.
1. MapReduce, which originated in search ranking.
2. A large task is split into many small tasks.
3. The Map phase splits the work; the Reduce phase aggregates the computed results (see the pipeline analogy after this list).
(3) BigTable → HBase (NoSQL): data is addressed by row key and column family.
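The Map/Reduce split can be illustrated with a plain Unix pipeline; this word-count analogy is only a sketch and does not involve Hadoop (input.txt is a hypothetical text file):
[root@nn ~]# cat input.txt | tr -s ' ' '\n' | sort | uniq -c
Here tr plays the Map role (emit one word per line), sort plays the shuffle (group identical keys together), and uniq -c plays the Reduce role (count each group).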
Startup: start-all.sh
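After start-all.sh finishes, jps shows whether the daemons came up; in pseudo-distributed mode the expected processes (PIDs omitted here) are roughly:
[root@nn ~]# jps
NameNode
DataNode
SecondaryNameNode
ResourceManager
NodeManager
Jps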
HDFS: stores the data.
YARN: the container in which MapReduce runs.
Access methods:
(1) Command line
(2) Java API
(3) Web Console (web management UI); example commands and URLs follow below
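For example, command-line access goes through hdfs dfs, and the Web Console listens on the Hadoop 2.x default ports (nn is the hostname from my setup; substitute your own):
[root@nn ~]# hdfs dfs -ls /
[root@nn ~]# hdfs dfs -mkdir /input
HDFS NameNode UI: http://nn:50070
YARN ResourceManager UI: http://nn:8088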
Local (standalone) mode:
Characteristics: no HDFS; it can only be used to test MapReduce programs, which read and write the local filesystem (see the smoke test below).
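Because local mode uses the local filesystem, a quick smoke test needs no daemons at all (the in/out paths are illustrative, and $HADOOP_HOME is assumed to point at the install directory):
[root@nn ~]# mkdir in && echo "hello hadoop hello" > in/data.txt
[root@nn ~]# hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.3.jar wordcount in out
[root@nn ~]# cat out/part-r-00000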
Pseudo-distributed mode:
Characteristics: has all of Hadoop's functionality, simulating a distributed environment on a single machine.
(1) HDFS: master: NameNode; data node: DataNode.
(2) YARN: the container that runs MapReduce programs.
Master node: ResourceManager.
Worker node: NodeManager.
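A minimal pseudo-distributed configuration is sketched below; the hostname nn is from my machine, so substitute your own, and remember to format the NameNode once with hdfs namenode -format before the first start:
etc/hadoop/core-site.xml:
<configuration>
  <property>
    <name>fs.defaultFS</name>        <!-- NameNode address -->
    <value>hdfs://nn:9000</value>
  </property>
</configuration>
etc/hadoop/hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.replication</name>     <!-- one replica on a single machine -->
    <value>1</value>
  </property>
</configuration>
etc/hadoop/mapred-site.xml:
<configuration>
  <property>
    <name>mapreduce.framework.name</name>  <!-- run MapReduce on YARN -->
    <value>yarn</value>
  </property>
</configuration>
etc/hadoop/yarn-site.xml:
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>  <!-- enable the shuffle service -->
    <value>mapreduce_shuffle</value>
  </property>
</configuration>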
Using MapReduce: the example jobs ship as agui/hadoop/hadoop-2.8.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.3.jar. Running the jar without arguments lists the available programs:
[root@nn mapreduce]# hadoop jar hadoop-mapreduce-examples-2.8.3.jar
An example program must be given as the first argument.
Valid program names are:
aggregatewordcount: An Aggregate based map/reduce program that counts the words in the input files.
aggregatewordhist: An Aggregate based map/reduce program that computes the histogram of the words in the input files.
bbp: A map/reduce program that uses Bailey-Borwein-Plouffe to compute exact digits of Pi.
dbcount: An example job that count the pageview counts from a database.
distbbp: A map/reduce program that uses a BBP-type formula to compute exact bits of Pi.
grep: A map/reduce program that counts the matches of a regex in the input.
join: A job that effects a join over sorted, equally partitioned datasets
multifilewc: A job that counts words from several files.
pentomino: A map/reduce tile laying program to find solutions to pentomino problems.
pi: A map/reduce program that estimates Pi using a quasi-Monte Carlo method.
randomtextwriter: A map/reduce program that writes 10GB of random textual data per node.
randomwriter: A map/reduce program that writes 10GB of random data per node.
secondarysort: An example defining a secondary sort to the reduce.
sort: A map/reduce program that sorts the data written by the random writer.
sudoku: A sudoku solver.
teragen: Generate data for the terasort
terasort: Run the terasort
teravalidate: Checking results of terasort
wordcount: A map/reduce program that counts the words in the input files.
wordmean: A map/reduce program that counts the average length of the words in the input files.
wordmedian: A map/reduce program that counts the median length of the words in the input files.
wordstandarddeviation: A map/reduce program that counts the standard deviation of the length of the words in the input files.
[root@nn mapreduce]#
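An end-to-end run of one of these examples against HDFS looks like this (data.txt and the directory names are illustrative; note that the output directory must not exist before the job runs, or the job fails):
[root@nn mapreduce]# hdfs dfs -mkdir -p /input
[root@nn mapreduce]# hdfs dfs -put data.txt /input
[root@nn mapreduce]# hadoop jar hadoop-mapreduce-examples-2.8.3.jar wordcount /input /output
[root@nn mapreduce]# hdfs dfs -cat /output/part-r-00000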
Personal notes
This is a quick record of the setup process, the necessary configuration, and some of the pitfalls I ran into; I hope it helps others.