In core-site.xml, list the compression codecs that should be supported:
<property>
<name>io.compression.codecs</name>
<value>
org.apache.hadoop.io.compress.GzipCodec,
org.apache.hadoop.io.compress.DefaultCodec,
org.apache.hadoop.io.compress.BZip2Codec,
com.hadoop.compression.lzo.LzoCodec,
com.hadoop.compression.lzo.LzopCodec,
org.apache.hadoop.io.compress.Lz4Codec,
org.apache.hadoop.io.compress.SnappyCodec
</value>
</property>
Then, in mapred-site.xml, configure the compression that is actually used:
<!-- Whether to enable compression -->
<property>
<name>mapreduce.output.fileoutputformat.compress</name>
<value>true</value>
</property>
<!-- Compression codec -->
<property>
<name>mapreduce.output.fileoutputformat.compress.codec</name>
<value>org.apache.hadoop.io.compress.BZip2Codec</value>
</property>
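Compression configuration in HDFS
1. Input compression: the compression format of the files stored in HDFS
2. Intermediate compression
In the tables below, "(old)" marks a deprecated property and "(new)" marks the property that replaces it.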
| Property | Description | Default |
| --- | --- | --- |
| mapred.compress.map.output (old); mapreduce.map.output.compress (new) | Should the outputs of the maps be compressed before being sent across the network? Uses SequenceFile compression. | false |
| mapred.map.output.compression.codec (old); mapreduce.map.output.compress.codec (new) | If the map outputs are compressed, how should they be compressed? | org.apache.hadoop.io.compress.DefaultCodec |
<!-- Whether to enable compression -->
<property>
<name>mapreduce.map.output.compress</name>
<value>true</value>
</property>
<!-- Compression codec -->
<property>
<name>mapreduce.map.output.compress.codec</name>
<value>org.apache.hadoop.io.compress.SnappyCodec</value>
<description>
The codec used to compress map outputs when mapreduce.map.output.compress is true.
</description>
</property>
3. Final output compression
| Property | Description | Default |
| --- | --- | --- |
| mapred.output.compress (old); mapreduce.output.fileoutputformat.compress (new) | Should the job outputs be compressed? | false |
| mapred.output.compression.codec (old); mapreduce.output.fileoutputformat.compress.codec (new) | If the job outputs are compressed, how should they be compressed? | org.apache.hadoop.io.compress.DefaultCodec |
<!-- Whether to enable compression -->
<property>
<name>mapreduce.output.fileoutputformat.compress</name>
<value>true</value>
</property>
<!-- Compression codec -->
<property>
<name>mapreduce.output.fileoutputformat.compress.codec</name>
<value>org.apache.hadoop.io.compress.BZip2Codec</value>
</property>
Compression configuration in Hive
Official documentation: https://cwiki.apache.org/confluence/display/Hive/AdminManual+Configuration#AdminManualConfiguration-hive-site.xmlandhive-default.xml.template
Enabling compression
| Compression stage | Property | Description | Default |
| --- | --- | --- | --- |
| Final output | hive.exec.compress.output | Determines whether the output of the final map/reduce job in a query is compressed. | false |
| Intermediate | hive.exec.compress.intermediate | Determines whether the output of the intermediate map/reduce jobs in a query is compressed. | false |
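By analogy with the Hadoop snippets earlier in these notes, the two Hive switches could be turned on in hive-site.xml roughly as follows. This is only a sketch: setting both values to true is an assumption made for illustration, not a recommended default. According to Hive's own description of hive.exec.compress.intermediate, the codec and related options are taken from the Hadoop mapred.output.compress* settings, so no codec is specified here.

```xml
<!-- Sketch for hive-site.xml; the "true" values are illustrative assumptions -->
<!-- Compress the output of the final map/reduce job of a query -->
<property>
<name>hive.exec.compress.output</name>
<value>true</value>
</property>
<!-- Compress the output of the intermediate map/reduce jobs of a query -->
<property>
<name>hive.exec.compress.intermediate</name>
<value>true</value>
</property>
```

The same properties can also be set for a single session from the Hive CLI, for example with SET hive.exec.compress.output=true; which is convenient for trying a setting before making it permanent.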
</