Installing protobuf on Ubuntu

This article walks through installing protobuf on Ubuntu, covering the download, the build, and the fix for the "shared library not found" error. It then covers installing and configuring Hadoop-LZO, including editing pom.xml, syncing the jar and native libraries to all nodes, and configuring Hadoop. Finally, it demonstrates working with LZO-compressed data in Hive.


1. Download protobuf

Download link: http://code.google.com/p/protobuf/downloads/list (Google Code has since shut down; protobuf releases now live on GitHub at https://github.com/protocolbuffers/protobuf)

2. Build protobuf
Unpack the downloaded zip, cd into the protobuf source directory, and run:

./configure
make
make check
make install
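Note that installing into /usr/local usually requires root privileges, so the last step may need to be run as:

sudo make install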

3. Verify the installation
protoc --version
On success it prints version information, e.g. libprotoc 2.5.0.
On failure it prints an error message.

4. Error and fix
protoc: error while loading shared libraries: libprotoc.so.8: cannot open shared object file: No such file or directory
Cause:
protobuf installs into /usr/local/lib by default, and /usr/local/lib is not on Ubuntu's default shared-library search path, so the loader cannot find the library.
Fix:
1. Create the file /etc/ld.so.conf.d/libprotobuf.conf containing:

/usr/local/lib

2. Run:

sudo ldconfig

After that, running protoc --version prints the version number normally.
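Equivalently, as a one-liner sketch (tee is used so the write to /etc runs as root):

echo "/usr/local/lib" | sudo tee /etc/ld.so.conf.d/libprotobuf.conf
sudo ldconfig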

Reposted from http://blog.youkuaiyun.com/xocoder/article/details/9155901

Install the LZO library and the lzop command-line tool:

sudo apt-get install liblzo2-dev
sudo apt-get install lzop
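To confirm the tools work, compress and verify a throwaway file (a minimal sketch; the file path is illustrative):

echo "hello lzo" > /tmp/demo.txt
lzop /tmp/demo.txt          # writes /tmp/demo.txt.lzo alongside the original
lzop -t /tmp/demo.txt.lzo   # tests the compressed file's integrity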

(5) Install Hadoop-LZO

One prerequisite, of course: Maven plus SVN or Git must already be set up (I used SVN). I won't cover that here; if you can't get those working, there is little point in continuing.

I use https://github.com/twitter/hadoop-lzo here.

Check out the code with SVN from https://github.com/twitter/hadoop-lzo/trunk, then modify part of the pom.xml file.
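The checkout itself looks like this (GitHub's Subversion bridge maps /trunk to the default branch; a plain git clone of the repository works just as well):

svn checkout https://github.com/twitter/hadoop-lzo/trunk hadoop-lzo
cd hadoop-lzo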

From:


<properties>  
  <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>  
  <hadoop.current.version>2.1.0-beta</hadoop.current.version>  
  <hadoop.old.version>1.0.4</hadoop.old.version>  
</properties>  

To:


<properties>
  <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  <hadoop.current.version>2.2.0</hadoop.current.version>
  <hadoop.old.version>1.0.4</hadoop.old.version>
</properties>

Then run, in order:


mvn clean package -Dmaven.test.skip=true    
tar -cBf - -C target/native/Linux-amd64-64/lib . | tar -xBvf - -C /home/hadoop/hadoop-2.2.0/lib/native/     
cp target/hadoop-lzo-0.4.20-SNAPSHOT.jar /home/hadoop/hadoop-2.2.0/share/hadoop/common/  

Next, sync /home/hadoop/hadoop-2.2.0/share/hadoop/common/hadoop-lzo-0.4.20-SNAPSHOT.jar and the directory /home/hadoop/hadoop-2.2.0/lib/native/ to all other Hadoop nodes, as sketched below. Note: make sure the native libraries under /home/hadoop/hadoop-2.2.0/lib/native/ are executable by whatever user runs Hadoop.
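A minimal sync sketch, assuming passwordless SSH and hypothetical worker hostnames slave1 through slave3:

for host in slave1 slave2 slave3; do   # hostnames are illustrative
  scp /home/hadoop/hadoop-2.2.0/share/hadoop/common/hadoop-lzo-0.4.20-SNAPSHOT.jar \
      ${host}:/home/hadoop/hadoop-2.2.0/share/hadoop/common/
  rsync -a /home/hadoop/hadoop-2.2.0/lib/native/ \
      ${host}:/home/hadoop/hadoop-2.2.0/lib/native/
done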

(6) Configure Hadoop

Append the following to $HADOOP_HOME/etc/hadoop/hadoop-env.sh:


export LD_LIBRARY_PATH=/usr/local/hadoop/lzo/lib  

Append the following to $HADOOP_HOME/etc/hadoop/core-site.xml:


<property>
  <name>io.compression.codecs</name>
  <value>org.apache.hadoop.io.compress.GzipCodec,
         org.apache.hadoop.io.compress.DefaultCodec,
         com.hadoop.compression.lzo.LzoCodec,
         com.hadoop.compression.lzo.LzopCodec,
         org.apache.hadoop.io.compress.BZip2Codec
  </value>
</property>
<property>
  <name>io.compression.codec.lzo.class</name>
  <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>

Append the following to $HADOOP_HOME/etc/hadoop/mapred-site.xml:


<property>    
    <name>mapred.compress.map.output</name>    
    <value>true</value>    
</property>    
<property>    
    <name>mapred.map.output.compression.codec</name>    
    <value>com.hadoop.compression.lzo.LzoCodec</value>    
</property>    
<property>    
    <name>mapred.child.env</name>  
    <value>LD_LIBRARY_PATH=/usr/local/hadoop/lzo/lib</value>    
</property>   

(7) Trying out LZO in Hive

A: First, create the table:


CREATE TABLE logs_app_nginx (  
ip STRING,  
user STRING,  
time STRING,  
request STRING,  
status STRING,  
size STRING,  
rt STRING,  
referer STRING,  
agent STRING,  
forwarded String  
)   
partitioned by (  
date string,  
host string  
)  
row format delimited   
fields terminated by '\t'  
STORED AS INPUTFORMAT "com.hadoop.mapred.DeprecatedLzoTextInputFormat"    
OUTPUTFORMAT "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat";   

B: Load the data


LOAD DATA Local INPATH '/home/hadoop/data/access_20131230_25.log.lzo' INTO TABLE logs_app_nginx PARTITION(date=20131229,host=25);  
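After the load, the .lzo file sits under the table's warehouse directory in HDFS. As a sanity check that the codecs are wired up, hadoop fs -text decompresses files with the configured codecs (the path below assumes Hive's default warehouse layout and is illustrative):

hadoop fs -text /user/hive/warehouse/logs_app_nginx/date=20131229/host=25/access_20131230_25.log.lzo | head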

The file /home/hadoop/data/access_20131219.log has the following format:

221.207.93.109 - [23/Dec/2013:23:22:38 +0800] "GET /ClientGetResourceDetail.action?id=318880&token=Ocm HTTP/1.1" 200 199 0.008 "xxx.com" "Android4.1.2/LENOVO/Lenovo A706/ch_lenovo/80" "-"

Simply running lzop /home/hadoop/data/access_20131219.log produces the LZO-compressed file /home/hadoop/data/access_20131219.log.lzo.

C: Index the LZO file (indexing is what makes an .lzo file splittable, so MapReduce can process it with multiple mappers):

$HADOOP_HOME/bin/hadoop jar /home/hadoop/hadoop-2.2.0/share/hadoop/common/hadoop-lzo-0.4.20-SNAPSHOT.jar com.hadoop.compression.lzo.DistributedLzoIndexer /user/hive/warehouse/logs_app_nginx

D: Now run a Hive query to kick off a MapReduce job (the sample below queries a table named nginx_lzo; substitute the table name created above, logs_app_nginx):


set hive.exec.reducers.max=10;   
set mapred.reduce.tasks=10;  
select ip,rt from nginx_lzo limit 10;  

If you see output like the following in the Hive console, everything is working:


hive> set hive.exec.reducers.max=10;   
hive> set mapred.reduce.tasks=10;  
hive> select ip,rt from nginx_lzo limit 10;  
Total MapReduce jobs = 1  
Launching Job 1 out of 1  
Number of reduce tasks is set to 0 since there's no reduce operator  
Starting Job = job_1388065803340_0009, Tracking URL = http://lrts216:8088/proxy/application_1388065803340_0009/  
Kill Command = /home/hadoop/hadoop-2.2.0/bin/hadoop job  -kill job_1388065803340_0009  
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0  
2013-12-27 09:13:39,163 Stage-1 map = 0%,  reduce = 0%  
2013-12-27 09:13:45,343 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 1.22 sec  
2013-12-27 09:13:46,369 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 1.22 sec  
MapReduce Total cumulative CPU time: 1 seconds 220 msec  
Ended Job = job_1388065803340_0009  
MapReduce Jobs Launched:   
Job 0: Map: 1   Cumulative CPU: 1.22 sec   HDFS Read: 63570 HDFS Write: 315 SUCCESS  
Total MapReduce CPU Time Spent: 1 seconds 220 msec  
OK  
221.207.93.109  "XXX.com"  
Time taken: 17.498 seconds, Fetched: 10 row(s)  