
Hadoop
anoperA
Running the WordCount program
Goal: test the WordCount program. The current directory is $hadoop_home. 1. Upload $hadoop_home/etc/core-site.xml to /user/input on HDFS: bin/hdfs dfs -appendToFile /usr/hadoop/hadoop-2.7.2/etc/hadoop/core-site.xml /user/input/core-site.x...
Original · 2017-07-12 14:02:38 · 449 views · 0 comments
MR--InputSplit
InputSplit (Ref): InputSplit represents the data to be processed by an individual Mapper. Typically, it presents a byte-oriented view on the input and is the responsibility of RecordReader...
Original · 2017-11-27 15:28:14 · 324 views · 0 comments
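The byte-oriented view mentioned in the InputSplit excerpt can be illustrated with a small sketch. It uses only the Java standard library and is not Hadoop's implementation: `SplitSketch`, `ByteRange`, `split`, and `readLines` are hypothetical names, and the rule used here (a line belongs to the range that contains its first byte) is a simplification chosen for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class SplitSketch {

    /** Hypothetical stand-in for an input split: the byte range [start, start + length). */
    static final class ByteRange {
        final long start;
        final long length;

        ByteRange(long start, long length) {
            this.start = start;
            this.length = length;
        }
    }

    /** Cuts totalBytes into consecutive ranges of at most splitSize bytes, ignoring record boundaries. */
    static List<ByteRange> split(long totalBytes, long splitSize) {
        List<ByteRange> ranges = new ArrayList<>();
        for (long pos = 0; pos < totalBytes; pos += splitSize) {
            ranges.add(new ByteRange(pos, Math.min(splitSize, totalBytes - pos)));
        }
        return ranges;
    }

    /** Turns one byte range into whole lines: the record-oriented view of that range. */
    static List<String> readLines(byte[] data, ByteRange range) {
        int pos = (int) range.start;
        int end = (int) (range.start + range.length);
        // If the range starts mid-line, skip ahead: that line belongs to the previous range.
        while (pos > 0 && pos < data.length && data[pos - 1] != '\n') {
            pos++;
        }
        List<String> lines = new ArrayList<>();
        // Read every line that starts inside the range, even if it ends beyond the range.
        while (pos < end && pos < data.length) {
            int lineEnd = pos;
            while (lineEnd < data.length && data[lineEnd] != '\n') {
                lineEnd++;
            }
            lines.add(new String(data, pos, lineEnd - pos, StandardCharsets.UTF_8));
            pos = lineEnd + 1;
        }
        return lines;
    }

    public static void main(String[] args) {
        byte[] data = "alpha\nbeta\ngamma\n".getBytes(StandardCharsets.UTF_8);
        for (ByteRange r : split(data.length, 7)) {
            System.out.println("[" + r.start + ", " + (r.start + r.length) + ") -> " + readLines(data, r));
        }
    }
}
```

With a split size of 7 bytes, the 17-byte input is cut into three ranges, yet every line is read exactly once and none is cut in half, because the reader rather than the splitter takes care of line boundaries.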
MR--InputFormat
Hadoop 2.7.4 InputFormat. Summary: ...
Original · 2017-11-27 15:22:05 · 442 views · 0 comments
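Configuring a pseudo-distributed ZooKeeper cluster
Choose a dataDir, create the directories, and create a myid file in each one. Edit the three configuration files. Start the three servers.
# Use /var/zoo as the dataDir
# Create a dataDir for each of the 3 nodes
mkdir -p /var/zoo/zk1
mkdir -p /var/zoo/zk2
mkdir -p /var/zoo/zk3
# Create the myid files
echo '1' >> /var/zoo/zk1/myid
echo '...
Original · 2018-04-07 17:38:46 · 209 views · 0 comments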
Using the ZooKeeper Java client
import org.apache.zookeeper.*;
import org.apache.zookeeper.data.ACL;
import org.apache.zookeeper.data.Id;
import org.apache.zookeeper.server.auth.DigestAuthenticationProvider;
import java.io.IOExcept...
Original · 2018-04-08 13:07:33 · 220 views · 0 comments
Basic HBase shell commands
# Create a table
create '<table name>', '<column family>', '<column family>', ...
create 'emp', 'personal data', 'professional data'
# List all tables
list
# Disable a table
disable '<tableName>'
disable 't1'
# Enable a table
en...
Original · 2018-04-15 15:18:10 · 242 views · 0 comments
Setting up a CDH lab environment
Install three CentOS systems. Set the default mode to the command line:
vim /etc/inittab
# Change id:5:initdefault to id:3:initdefault
Original · 2018-04-05 01:21:00 · 312 views · 1 comment
CDH cluster installation 02--System preparation
# CentOS-Base.repo
#
# The mirror system uses the connecting IP address of the client and the
# update status of each mirror to pick mirrors that are updated to and
# geographically close to the clien...
Original · 2018-04-06 19:25:28 · 146 views · 0 comments
Flume Sink template code
public class SimpleSink extends AbstractSink implements Configurable {
    private static final Logger logger = LoggerFactory.getLogger(SimpleSink.class);

    @Override
    public synchronized void start() ...
Original · 2018-04-27 16:06:25 · 275 views · 0 comments
MR--RecordReader
RecordReader (Ref): The record reader breaks the data into key/value pairs for input to the Mapper. Summary: ...
Original · 2017-11-25 15:25:14 · 290 views · 0 comments
MR--Configuration
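Hadoop 2.7.4 API--Configuration. Features:
1. Provides methods for configuring Hadoop parameters.
2. Parameters can be configured in three ways: resource files (Resource), final parameters (Final Parameters), and variable expressions (Variable Expression).
3. Parameters are read with conf.get([parameterName]).
4. Deprecated...
Original · 2017-11-25 20:19:54 · 878 views · 0 comments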
MR-Job
/* The job submitter's view of the Job. It allows the user to configure the job, submit it, control its execution, and query the state. The set methods only work until the job is submitted, afterwards...
Original · 2017-11-25 20:17:14 · 378 views · 0 comments
MR--Annotated WordCount MapReduce program
The program was developed against Hadoop 2.7.4 and runs as written.
public class WordCount {
    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1...
Original · 2017-11-28 21:15:13 · 406 views · 0 comments
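MR--Annotated MaxTemperature MapReduce program
The program was developed against Hadoop 2.7.4 and runs as written. The weather data can be obtained from NCDC or from the website of the book Hadoop: The Definitive Guide.
public class MaxTemperature {
    public static class MaxTemperatureMapper extends Mapper<Object, Text, Text, IntWritable> {
        // Temperature 9999 ...
Original · 2017-11-28 21:19:53 · 304 views · 0 comments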
Article title
The fundamental idea of YARN is to split up the functionalities of resource management and job scheduling/monitoring into separate daemons. The idea is to have a global ResourceManager (RM) and per-app...
Original · 2017-11-29 16:58:46 · 185 views · 0 comments
YARN--Core components
Component relationship diagram (from the Hadoop YARN home page). Container: At the fundamental level, a container is a collection of physical resources such as RAM, CPU cores, and disks on a single node. There can be multiple containe...
Original · 2017-11-30 11:22:21 · 457 views · 0 comments
Hadoop--org.apache.hadoop.fs.FileSystem
The code is based on Hadoop 2.7.4. org.apache.hadoop.fs.FileSystem (Ref): An abstract base class for a fairly generic filesystem. It may be implemented as a distributed filesystem, or as a “local” one that reflects t...
Original · 2017-12-01 16:27:49 · 2533 views · 0 comments
Developing MapReduce with IDEA + Hadoop 2.7.4
Environment: 1. IntelliJ IDEA 2016 2. Hadoop 2.7.4
Because Hadoop is large, I add it directly as a local dependency. For smaller jars, I use the Maven repository.
pom.xml:
<dependencies>
    <!-- https://mvnrepository.com/artifact/commons-logging/commons-logging -->
    <dep...
Original · 2017-11-24 17:08:18 · 714 views · 0 comments
MR--Text
This class stores text using standard UTF8 encoding. It provides methods to serialize, deserialize, and compare texts at byte level. The type of length is integer and is serialized using zero-compresse...
Original · 2017-11-25 17:40:41 · 271 views · 0 comments
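To make "UTF8 encoding" and "compare texts at byte level" concrete, here is a minimal sketch that uses only the Java standard library. It does not use Hadoop's Text class: `Utf8TextSketch`, `encode`, and `compareBytes` are hypothetical names, and the comparison shown (unsigned, byte by byte, shorter prefix first) is an assumption about what a byte-level ordering typically looks like.

```java
import java.nio.charset.StandardCharsets;

final class Utf8TextSketch {

    /** Encodes a string as UTF-8; the array length is the length in bytes, not in characters. */
    static byte[] encode(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /** Compares two byte arrays lexicographically, treating each byte as unsigned. */
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        // All shared bytes are equal: the shorter array sorts first.
        return a.length - b.length;
    }

    public static void main(String[] args) {
        String word = "caf\u00e9"; // 4 characters, the last one outside ASCII
        System.out.println(word.length() + " chars, " + encode(word).length + " bytes");
        System.out.println(compareBytes(encode("apple"), encode("banana")) < 0);
    }
}
```

The point of the byte-level comparison is that two serialized values can be ordered without decoding them back into strings first.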
Checking whether a file is gzip: template code
package com.urun.flume.commons;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;
public class App {
    public static void main(String[] args) throws I...
Original · 2018-04-27 17:28:49 · 1716 views · 0 comments
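Since the excerpt is cut off before the actual check, the following is only one plausible way to do it, not necessarily the post's method: compare the first two bytes of the file with the gzip magic number that `java.util.zip.GZIPInputStream.GZIP_MAGIC` exposes. The class name `GzipMagicCheck` and its helper methods are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class GzipMagicCheck {

    /** Returns true when the given bytes start with the gzip magic number (0x1f, 0x8b). */
    static boolean isGzip(byte[] head) {
        if (head.length < 2) {
            return false;
        }
        // GZIP_MAGIC is stored as a little-endian 16-bit value, so the second byte is the high byte.
        int magic = (head[0] & 0xff) | ((head[1] & 0xff) << 8);
        return magic == GZIPInputStream.GZIP_MAGIC;
    }

    /** Reads the first two bytes of a file and checks them; shorter files are reported as not gzip. */
    static boolean isGzipFile(String path) throws IOException {
        try (InputStream in = new FileInputStream(path)) {
            int b1 = in.read();
            int b2 = in.read();
            if (b2 < 0) {
                return false;
            }
            return isGzip(new byte[] {(byte) b1, (byte) b2});
        }
    }

    /** Helper for the demo: compresses bytes in memory so there is something to detect. */
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        byte[] plain = "hello".getBytes(StandardCharsets.UTF_8);
        System.out.println(isGzip(gzip(plain))); // compressed bytes
        System.out.println(isGzip(plain));       // uncompressed bytes
    }
}
```

Checking the magic number only inspects the header. A file that passes can still be truncated or corrupt further in, which shows up only when it is actually decompressed.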