Maven 3.5 + Hadoop 2.7.3: Computing KPI Metrics from Miqi Test Logs (Part 3)

This article shows how to process log files with Hadoop MapReduce: it counts how often each IP address appears in logs of a specific format, and routes IPs with certain prefixes (101.226.93, 112.17.244, and 218.26.54) into a separate output file. The analysis runs over a sample of 10,000 log lines.


This time we separate out certain client IPs from the log records, e.g. those starting with 101.226.93, 112.17.244, or 218.26.54, and write them to their own output file.

The sample is still the same 10,000-line, 2.5 MB log used as the statistics sample in the earlier posts.

(1) Counting how many times each IP appears was covered in an earlier post and is not repeated here: http://blog.youkuaiyun.com/cafebar123/article/details/73928303
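As a refresher, that counting step amounts to grouping log lines by their first space-separated field. A minimal local sketch (plain Java without Hadoop; the class name and sample lines are made up for illustration) looks like this:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class IpCountSketch {
    // Count how often each leading IP field appears in the given log lines.
    public static Map<String, Integer> countIps(String[] lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            String ip = line.split(" ")[0]; // first space-separated field is the IP
            counts.merge(ip, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] sample = {
            "101.226.93.1 - - GET /index",
            "218.26.54.7 - - GET /login",
            "101.226.93.1 - - GET /home"
        };
        System.out.println(countIps(sample)); // {101.226.93.1=2, 218.26.54.7=1}
    }
}
```

The MapReduce version distributes exactly this grouping: the mapper emits (ip, 1) and the reducer sums per key.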

(2) Route IPs starting with 101.226.93, 112.17.244, or 218.26.54 into a separate output file:

import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Partitioner;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class Kpi_IP_Provider {
	public static class IntSumMapper extends Mapper<Object, Text, Text, Kpi_IPCountBean> {
		private Kpi_IPCountBean bean = new Kpi_IPCountBean();
		private Text word = new Text();

		public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
			if (value.toString().indexOf("\\") == -1) {
				// filter out unsuccessful requests
				String line = StringHandleUtils.filterLog(value.toString());
				String[] fields = line.split(" ");
				String ip = fields[0];
				word.set(ip);
				Integer count = 1; // each occurrence of an IP counts as 1
				bean.setIpCount(ip, count);
				context.write(word, bean);
			}
			
		}
	}

	public static class IntSumReducer extends Reducer<Text, Kpi_IPCountBean, Text, Kpi_IPCountBean> {
		private Kpi_IPCountBean bean = new Kpi_IPCountBean();

		public void reduce(Text key, Iterable<Kpi_IPCountBean> values, Context context)
				throws IOException, InterruptedException {
			int sum = 0;
			for (Kpi_IPCountBean val : values) {
				sum += val.getIpcount();
			}
			bean.setIpCount("", sum);
			context.write(key, bean);
		}
	}
	public static class ServiceProviderPartitioner extends Partitioner<Text, Kpi_IPCountBean>{
		private static Map<String, Integer> providerMap = new HashMap<String, Integer>();
		static{
			providerMap.put("101.226.93", 1);
			providerMap.put("112.17.244", 1);
			providerMap.put("218.26.54", 1);
		}
		@Override
		public int getPartition(Text key, Kpi_IPCountBean value, int numPartitions) {
			String ip = key.toString();
			// Strip the last octet instead of taking a fixed-length substring:
			// the prefixes vary in length (218.26.54 is 9 characters, the others 10),
			// so substring(0, 10) would never match the shorter one.
			int lastDot = ip.lastIndexOf('.');
			String ipField = lastDot > 0 ? ip.substring(0, lastDot) : ip;
			Integer p = providerMap.get(ipField);
			if (p == null)
				p = 0;
			return p;
		}
		
	}
	
	public static void main(String[] args) throws Exception {		
		Configuration conf = new Configuration();
		
		Job job = Job.getInstance(conf, "ip count provider");
		job.setJarByClass(Kpi_IP_Provider.class);
		job.setMapperClass(IntSumMapper.class);
		job.setCombinerClass(IntSumReducer.class);
		job.setReducerClass(IntSumReducer.class);
		job.setOutputKeyClass(Text.class);
		job.setOutputValueClass(Kpi_IPCountBean.class);
		
		// use the custom partitioner instead of the default HashPartitioner
		job.setPartitionerClass(ServiceProviderPartitioner.class);
		// the number of reducers must cover the partitions used:
		// partition 1 for the listed prefixes, partition 0 for everything else
		job.setNumReduceTasks(2);
		
		FileInputFormat.addInputPath(job, new Path("hdfs://119.29.174.43:9000/user/hadoop/miqiLog10000Input"));
		FileOutputFormat.setOutputPath(job, new Path("hdfs://119.29.174.43:9000/user/hadoop/miqiLogOutProvider"));
		
			
		System.exit(job.waitForCompletion(true) ? 0 : 1);
	}
}
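The partition lookup can be exercised without a cluster. The sketch below (a hypothetical standalone class, not part of the original job) isolates the prefix-to-partition logic: it strips the last octet with `lastIndexOf('.')` so that prefixes of different lengths (218.26.54 is 9 characters, the others 10) all match:

```java
import java.util.HashMap;
import java.util.Map;

public class PrefixPartitionSketch {
    private static final Map<String, Integer> PROVIDER_MAP = new HashMap<>();
    static {
        PROVIDER_MAP.put("101.226.93", 1);
        PROVIDER_MAP.put("112.17.244", 1);
        PROVIDER_MAP.put("218.26.54", 1);
    }

    // Drop the last octet so the prefix length no longer matters.
    public static int partitionFor(String ip) {
        int lastDot = ip.lastIndexOf('.');
        String prefix = lastDot > 0 ? ip.substring(0, lastDot) : ip;
        Integer p = PROVIDER_MAP.get(prefix);
        return p == null ? 0 : p;
    }

    public static void main(String[] args) {
        System.out.println(partitionFor("218.26.54.17"));  // → 1 (selected prefix)
        System.out.println(partitionFor("123.125.71.38")); // → 0 (default partition)
    }
}
```

With two reduce tasks configured, keys mapped to partition 1 end up in `part-r-00001` and everything else in `part-r-00000`, which is how the selected prefixes land in their own output file.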




Result: (output screenshot from the original post not preserved)