Example Code for MapReduce in Hadoop

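This article walks through a complete MapReduce example on Hadoop: a word-count job made up of a mapper, a reducer, and a driver class, followed by the console output from running it on a cluster.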


package cn.itheima.bigdata.hadoop.mr.wordcount;

import java.io.IOException;

import org.apache.commons.lang.StringUtils;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Input: <byte offset of the line, line text>; output: <word, 1>
public class WordCountMapper extends Mapper<LongWritable, Text, Text, LongWritable> {

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {

        // Get the content of one line of the input file
        String line = value.toString();
        // Split the line into an array of words
        String[] words = StringUtils.split(line, " ");
        // Emit a <word, 1> pair for every word
        for (String word : words) {
            context.write(new Text(word), new LongWritable(1));
        }
    }
}
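
To see what the mapper emits without a cluster, the map step can be imitated in plain Java. This is only a sketch: `MapStepSketch` and `Pair` are hypothetical names that do not appear in the original code, and `StringTokenizer` stands in for `StringUtils.split`, which likewise treats adjacent separators as one and so produces no empty words.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Plain-Java imitation of the map step; no Hadoop classes are involved.
class MapStepSketch {

    // Hypothetical stand-in for one emitted <word, 1> pair.
    static final class Pair {
        final String word;
        final long count;

        Pair(String word, long count) {
            this.word = word;
            this.count = count;
        }
    }

    // Splits one line on spaces and emits <word, 1> for every token.
    static List<Pair> map(String line) {
        List<Pair> out = new ArrayList<>();
        StringTokenizer tokens = new StringTokenizer(line, " ");
        while (tokens.hasMoreTokens()) {
            out.add(new Pair(tokens.nextToken(), 1L));
        }
        return out;
    }

    public static void main(String[] args) {
        // The double space between "world" and "hello" yields no empty word.
        for (Pair p : map("hello world  hello")) {
            System.out.println(p.word + "\t" + p.count);
        }
    }
}
```

Each occurrence of a word produces its own pair, so "hello" appears twice in the output; merging the duplicates is left to the reducer.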

package cn.itheima.bigdata.hadoop.mr.wordcount;

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Input: <word, 1> pairs from the mapper; output: <word, total count>
public class WordCountReducer extends Reducer<Text, LongWritable, Text, LongWritable> {

    // key: hello, values: {1, 1, 1, 1, 1, ...}
    @Override
    protected void reduce(Text key, Iterable<LongWritable> values, Context context)
            throws IOException, InterruptedException {

        // Running total for this word
        long count = 0;
        for (LongWritable value : values) {
            count += value.get();
        }

        // Emit the <word, count> pair
        context.write(key, new LongWritable(count));
    }
}
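
Between the mapper and the reducer, the framework groups all values that share a key, which is why `reduce` receives something like `hello -> {1, 1, 1}`. The grouping and summing can be sketched in plain Java as follows. `ReduceStepSketch` and its methods are hypothetical names, and the `TreeMap` only imitates the sorted-by-key grouping; it is not how Hadoop implements the shuffle.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java imitation of the shuffle and reduce steps; no Hadoop classes are involved.
class ReduceStepSketch {

    // Groups one value of 1 per word occurrence under its word, with keys kept in sorted order.
    static Map<String, List<Long>> shuffle(List<String> words) {
        Map<String, List<Long>> grouped = new TreeMap<>();
        for (String word : words) {
            grouped.computeIfAbsent(word, k -> new ArrayList<>()).add(1L);
        }
        return grouped;
    }

    // Sums the values of one key, like the loop in WordCountReducer.reduce.
    static long reduce(Iterable<Long> values) {
        long count = 0;
        for (long value : values) {
            count += value;
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("hello", "world", "hello");
        for (Map.Entry<String, List<Long>> entry : shuffle(words).entrySet()) {
            System.out.println(entry.getKey() + "\t" + reduce(entry.getValue()));
        }
    }
}
```

In the sketch, as in the real job, `reduce` runs once per distinct word, so the number of output lines equals the number of distinct words rather than the number of input words.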

package cn.itheima.bigdata.hadoop.mr.wordcount;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

/**
 * Describes a job (which mapper class and reducer class to use, where the input files are,
 * where to put the output, ...) and then submits the job to the Hadoop cluster.
 * @author duanhaitao@itcast.cn
 */
// cn.itheima.bigdata.hadoop.mr.wordcount.WordCountRunner
public class WordCountRunner {

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job wcjob = Job.getInstance(conf);

        // Note: calling conf.set("mapreduce.job.jar", "wcount.jar") at this point would have
        // no effect, because Job.getInstance(conf) has already taken its own copy of conf.
        // The job jar is instead located through the class it contains:
        wcjob.setJarByClass(WordCountRunner.class);

        // Which mapper class wcjob uses
        wcjob.setMapperClass(WordCountMapper.class);
        // Which reducer class wcjob uses
        wcjob.setReducerClass(WordCountReducer.class);

        // Key/value types emitted by the mapper
        wcjob.setMapOutputKeyClass(Text.class);
        wcjob.setMapOutputValueClass(LongWritable.class);

        // Key/value types emitted by the reducer
        wcjob.setOutputKeyClass(Text.class);
        wcjob.setOutputValueClass(LongWritable.class);

        // Path holding the raw input data
        FileInputFormat.setInputPaths(wcjob, "hdfs://192.168.88.155:9000/wc/srcdata");

        // Path the results are written to
        FileOutputFormat.setOutputPath(wcjob, new Path("hdfs://192.168.88.155:9000/wc/output"));

        // Submit the job, print progress, and wait for it to finish
        boolean res = wcjob.waitForCompletion(true);

        System.exit(res ? 0 : 1);
    }
}

Package the classes as mr.jar, copy it to the Hadoop server, and run the job:

[root@hadoop02 ~]# hadoop jar /root/Desktop/mr.jar cn.itheima.bigdata.hadoop.mr.wordcount.WordCountRunner
Java HotSpot(TM) Client VM warning: You have loaded library /home/hadoop/hadoop-2.6.0/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
15/12/05 06:07:06 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/12/05 06:07:07 INFO client.RMProxy: Connecting to ResourceManager at hadoop02/192.168.88.155:8032
15/12/05 06:07:08 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
15/12/05 06:07:09 INFO input.FileInputFormat: Total input paths to process : 1
15/12/05 06:07:09 INFO mapreduce.JobSubmitter: number of splits:1
15/12/05 06:07:09 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1449322432664_0001
15/12/05 06:07:10 INFO impl.YarnClientImpl: Submitted application application_1449322432664_0001
15/12/05 06:07:10 INFO mapreduce.Job: The url to track the job: http://hadoop02:8088/proxy/application_1449322432664_0001/
15/12/05 06:07:10 INFO mapreduce.Job: Running job: job_1449322432664_0001
15/12/05 06:07:22 INFO mapreduce.Job: Job job_1449322432664_0001 running in uber mode : false
15/12/05 06:07:22 INFO mapreduce.Job: map 0% reduce 0%
15/12/05 06:07:32 INFO mapreduce.Job: map 100% reduce 0%
15/12/05 06:07:39 INFO mapreduce.Job: map 100% reduce 100%
15/12/05 06:07:40 INFO mapreduce.Job: Job job_1449322432664_0001 completed successfully
15/12/05 06:07:41 INFO mapreduce.Job: Counters: 49
        File System Counters
                FILE: Number of bytes read=635
                FILE: Number of bytes written=212441
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=338
                HDFS: Number of bytes written=223
                HDFS: Number of read operations=6
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=2
        Job Counters
                Launched map tasks=1
                Launched reduce tasks=1
                Data-local map tasks=1
                Total time spent by all maps in occupied slots (ms)=7463
                Total time spent by all reduces in occupied slots (ms)=4688
                Total time spent by all map tasks (ms)=7463
                Total time spent by all reduce tasks (ms)=4688
                Total vcore-seconds taken by all map tasks=7463
                Total vcore-seconds taken by all reduce tasks=4688
                Total megabyte-seconds taken by all map tasks=7642112
                Total megabyte-seconds taken by all reduce tasks=4800512
        Map-Reduce Framework
                Map input records=10
                Map output records=41
                Map output bytes=547
                Map output materialized bytes=635
                Input split bytes=114
                Combine input records=0
                Combine output records=0
                Reduce input groups=30
                Reduce shuffle bytes=635
                Reduce input records=41
                Reduce output records=30
                Spilled Records=82
                Shuffled Maps =1
                Failed Shuffles=0
                Merged Map outputs=1
                GC time elapsed (ms)=211
                CPU time spent (ms)=1350
                Physical memory (bytes) snapshot=221917184
                Virtual memory (bytes) snapshot=722092032
                Total committed heap usage (bytes)=137039872
        Shuffle Errors
                BAD_ID=0
                CONNECTION=0
                IO_ERROR=0
                WRONG_LENGTH=0
                WRONG_MAP=0
                WRONG_REDUCE=0
        File Input Format Counters
                Bytes Read=224
        File Output Format Counters
                Bytes Written=223
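
Several of the counters above can be cross-checked against one another. The 41 map output records all arrive at the reducer (Reduce input records=41, and Combine input records=0 shows that no combiner ran), and the 30 reduce input groups become 30 output records, one per distinct word. The "megabyte-seconds" figures also equal the task time in milliseconds multiplied by 1024. The sketch below redoes that arithmetic. The 1024 MB container size is an assumption inferred from the numbers, not something the log states, and `CounterCheck` is a hypothetical name.

```java
// Re-derives two counters from the job log with plain arithmetic.
class CounterCheck {

    // Task time multiplied by container memory; the 1024 MB used below is an assumption.
    static long memoryTime(long taskMillis, long containerMb) {
        return taskMillis * containerMb;
    }

    public static void main(String[] args) {
        long assumedContainerMb = 1024; // assumption: not stated anywhere in the log

        // Map tasks ran for 7463 ms; the log reports 7642112.
        System.out.println(memoryTime(7463, assumedContainerMb) == 7642112L);
        // Reduce tasks ran for 4688 ms; the log reports 4800512.
        System.out.println(memoryTime(4688, assumedContainerMb) == 4800512L);
    }
}
```

Both comparisons hold, which suggests that in this log the counter is computed from milliseconds rather than seconds despite its name.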


Article: Example Code for MapReduce in Hadoop
URL: http://chengdu.cdxwcx.cn/article/jepijp.html