Problems running a Pig script in mapreduce mode

I have a working Hadoop (2.6.0) cluster with 6 nodes (including the master node) and want to run a Pig (0.14.0) script in mapreduce mode. The script runs without error, but it only seems to run on the master node. While researching this, I tried several changes to the Hadoop config files, without success.

Can you help me figure out how to make Pig use the whole cluster?

Here's some information:

Configuration on each node:

General:

/etc/hosts

127.0.0.1       localhost
192.168.101.3   master
192.168.101.4   node1
192.168.101.5   node2
192.168.101.6   node3
192.168.101.7   node4
192.168.101.8   node5

      

Hadoop:

yarn-site.xml

<configuration>
<!-- Site specific YARN configuration properties -->
        <property>
                <name>yarn.resourcemanager.hostname</name>
                <value>master</value>
                <description>...</description>
        </property>
        <property>
                <name>yarn.resourcemanager.resource-tracker.address</name>
                <value>master:8025</value>
                <description>...</description>
        </property>
        <property>
                <name>yarn.resourcemanager.scheduler.address</name>
                <value>master:8030</value>
                <description>...</description>
        </property>
        <property>
                <name>yarn.resourcemanager.address</name>
                <value>master:8050</value>
                <description>...</description>
        </property>
        <property>
                <name>yarn.resourcemanager.admin.address</name>
                <value>master:8041</value>
                <description>...</description>
        </property>
        <property>
                <name>yarn.nodemanager.aux_services</name>
                <value>mapreduce_shuffle</value>
        </property>
        <property>
                <name>yarn.nodemanager.aux_services.mapreduce.shuffle.class</name>
                <value>org.apache.hadoop.mapred.ShuffleHandler</value>
        </property>
        <property>
                <name>yarn.log.server.url</name>
                <value>master:19888/jobhistory/logs/</value>
        </property>
</configuration>

      

core-site.xml

<configuration>
        <property>
                <name>hadoop.tmp.dir</name>
                <value>/app/hadoop/tmp</value>
                <description>A base for other temporary directories.</description>
        </property>
        <property>
                <name>fs.defaultFS</name>
                <value>hdfs://master:9000/</value>
                <description>...</description>
        </property>
</configuration>

      

mapred-site.xml

<configuration>
        <property>
                <name>mapreduce.jobtracker.address</name>
                <value>master:54311</value>
                <description>...</description>
        </property>
        <property>
                <name>mapred.framework.name</name>
                <value>yarn</value>
                <final>true</final>
        </property>
        <property>
                <name>mapreduce.jobhistory.address</name>
                <value>master:10020</value>
                <description>...</description>
        </property>
        <property>
                <name>mapreduce.jobhistory.webapp.address</name>
                <value>master:19888</value>
                <description>...</description>
        </property>

</configuration>

      

pig output:

15/01/09 13:12:54 INFO pig.ExecTypeProvider: Trying ExecType : LOCAL
15/01/09 13:12:54 INFO pig.ExecTypeProvider: Trying ExecType : MAPREDUCE
15/01/09 13:12:54 INFO pig.ExecTypeProvider: Picked MAPREDUCE as the ExecType
2015-01-09 13:12:54,845 [main] INFO  org.apache.pig.Main - Apache Pig version 0.14.0 (r1640057) compiled Nov 16 2014, 18:02:05
2015-01-09 13:12:54,845 [main] INFO  org.apache.pig.Main - Logging error messages to: /home/hduser/pig_1420805574843.log
2015-01-09 13:12:56,450 [main] INFO  org.apache.pig.impl.util.Utils - Default bootup file /home/hduser/.pigbootup not found
2015-01-09 13:12:56,876 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2015-01-09 13:12:56,886 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-01-09 13:12:56,886 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://master:9000/
2015-01-09 13:12:58,146 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: master:54311
2015-01-09 13:12:59,195 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-01-09 13:12:59,418 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-01-09 13:12:59,598 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-01-09 13:13:00,496 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: FILTER,UNION
2015-01-09 13:13:00,618 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-01-09 13:13:00,634 [main] INFO  org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2015-01-09 13:13:00,713 [main] INFO  org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, ConstantCalculator, GroupByConstParallelSetter, LimitOptimizer, LoadTypeCastInserter, MergeFilter, MergeForEach, PartitionFilterOptimizer, PredicatePushdownOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter]}
2015-01-09 13:13:00,987 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2015-01-09 13:13:01,037 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2015-01-09 13:13:01,038 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2015-01-09 13:13:01,079 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-01-09 13:13:01,103 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - session.id is deprecated. Instead, use dfs.metrics.session-id
2015-01-09 13:13:01,105 [main] INFO  org.apache.hadoop.metrics.jvm.JvmMetrics - Initializing JVM Metrics with processName=JobTracker, sessionId=
2015-01-09 13:13:01,149 [main] INFO  org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig script settings are added to the job
2015-01-09 13:13:01,161 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - mapred.job.reduce.markreset.buffer.percent is deprecated. Instead, use mapreduce.reduce.markreset.buffer.percent
2015-01-09 13:13:01,161 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2015-01-09 13:13:01,161 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - mapred.output.compress is deprecated. Instead, use mapreduce.output.fileoutputformat.compress
2015-01-09 13:13:01,167 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - This job cannot be converted run in-process
2015-01-09 13:13:19,222 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/hduser/pig-0.14.0/pig-0.14.0-core-h2.jar to DistributedCache through /tmp/temp-1277984423/tmp-918732110/pig-0.14.0-core-h2.jar
2015-01-09 13:13:20,063 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/hduser/pig-0.14.0/lib/automaton-1.11-8.jar to DistributedCache through /tmp/temp-1277984423/tmp883771618/automaton-1.11-8.jar
2015-01-09 13:13:20,621 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/hduser/pig-0.14.0/lib/antlr-runtime-3.4.jar to DistributedCache through /tmp/temp-1277984423/tmp-1372558595/antlr-runtime-3.4.jar
2015-01-09 13:13:26,600 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/local/hadoop/share/hadoop/common/lib/guava-11.0.2.jar to DistributedCache through /tmp/temp-1277984423/tmp-1556176302/guava-11.0.2.jar
2015-01-09 13:13:29,300 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/hduser/pig-0.14.0/lib/joda-time-2.1.jar to DistributedCache through /tmp/temp-1277984423/tmp145012374/joda-time-2.1.jar
2015-01-09 13:13:29,718 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2015-01-09 13:13:29,736 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Key [pig.schematuple] is false, will not generate code.
2015-01-09 13:13:29,736 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Starting process to move generated code to distributed cacche
2015-01-09 13:13:29,736 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Setting key [pig.schematuple.classes] with classes to deserialize []
2015-01-09 13:13:29,840 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2015-01-09 13:13:29,841 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker.http.address is deprecated. Instead, use mapreduce.jobtracker.http.address
2015-01-09 13:13:30,191 [JobControl] INFO  org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2015-01-09 13:13:30,384 [JobControl] INFO  org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-01-09 13:13:30,785 [JobControl] WARN  org.apache.hadoop.mapreduce.JobSubmitter - No job jar file set.  User classes may not be found. See Job or Job#setJar(String).
2015-01-09 13:13:30,949 [JobControl] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2015-01-09 13:13:30,949 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2015-01-09 13:13:31,250 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 52
2015-01-09 13:13:31,309 [JobControl] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2015-01-09 13:13:31,309 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2015-01-09 13:13:31,355 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 24
2015-01-09 13:13:31,378 [JobControl] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2015-01-09 13:13:31,379 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2015-01-09 13:13:31,394 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 6
2015-01-09 13:13:31,587 [JobControl] INFO  org.apache.hadoop.mapreduce.JobSubmitter - number of splits:82
2015-01-09 13:13:31,706 [JobControl] INFO  org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-01-09 13:13:32,475 [JobControl] INFO  org.apache.hadoop.mapreduce.JobSubmitter - Submitting tokens for job: job_local647507189_0001
2015-01-09 13:13:33,628 [JobControl] INFO  org.apache.hadoop.mapred.LocalDistributedCacheManager - Creating symlink: /app/hadoop/tmp/mapred/local/1420805612754/pig-0.14.0-core-h2.jar <- /home/hduser/pig-0.14.0-core-h2.jar
2015-01-09 13:13:33,758 [JobControl] INFO  org.apache.hadoop.mapred.LocalDistributedCacheManager - Localized hdfs://master:9000/tmp/temp-1277984423/tmp-918732110/pig-0.14.0-core-h2.jar as file:/app/hadoop/tmp/mapred/local/1420805612754/pig-0.14.0-core-h2.jar
2015-01-09 13:13:33,759 [JobControl] INFO  org.apache.hadoop.mapred.LocalDistributedCacheManager - Creating symlink: /app/hadoop/tmp/mapred/local/1420805612755/automaton-1.11-8.jar <- /home/hduser/automaton-1.11-8.jar
2015-01-09 13:13:33,770 [JobControl] INFO  org.apache.hadoop.mapred.LocalDistributedCacheManager - Localized hdfs://master:9000/tmp/temp-1277984423/tmp883771618/automaton-1.11-8.jar as file:/app/hadoop/tmp/mapred/local/1420805612755/automaton-1.11-8.jar
2015-01-09 13:13:33,772 [JobControl] INFO  org.apache.hadoop.mapred.LocalDistributedCacheManager - Creating symlink: /app/hadoop/tmp/mapred/local/1420805612756/antlr-runtime-3.4.jar <- /home/hduser/antlr-runtime-3.4.jar
2015-01-09 13:13:33,781 [JobControl] INFO  org.apache.hadoop.mapred.LocalDistributedCacheManager - Localized hdfs://master:9000/tmp/temp-1277984423/tmp-1372558595/antlr-runtime-3.4.jar as file:/app/hadoop/tmp/mapred/local/1420805612756/antlr-runtime-3.4.jar
2015-01-09 13:15:54,534 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/local/hadoop/share/hadoop/common/lib/guava-11.0.2.jar to DistributedCache through /tmp/temp206201348/tmp-1481268210/guava-11.0.2.jar
2015-01-09 13:15:56,233 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/hduser/pig-0.14.0/lib/joda-time-2.1.jar to DistributedCache through /tmp/temp206201348/tmp-1921418840/joda-time-2.1.jar
2015-01-09 13:15:56,340 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2015-01-09 13:15:56,366 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Key [pig.schematuple] is false, will not generate code.
2015-01-09 13:15:56,367 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Starting process to move generated code to distributed cacche
2015-01-09 13:15:56,368 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Setting key [pig.schematuple.classes] with classes to deserialize []
2015-01-09 13:15:56,483 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2015-01-09 13:15:56,486 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker.http.address is deprecated. Instead, use mapreduce.jobtracker.http.address
2015-01-09 13:15:56,505 [JobControl] INFO  org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2015-01-09 13:15:56,582 [JobControl] INFO  org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-01-09 13:15:56,695 [JobControl] WARN  org.apache.hadoop.mapreduce.JobSubmitter - No job jar file set.  User classes may not be found. See Job or Job#setJar(String).
2015-01-09 13:15:57,070 [JobControl] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2015-01-09 13:15:57,070 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2015-01-09 13:15:57,197 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 52
2015-01-09 13:15:57,227 [JobControl] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2015-01-09 13:15:57,228 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2015-01-09 13:15:57,263 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 24
2015-01-09 13:15:57,289 [JobControl] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2015-01-09 13:15:57,289 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2015-01-09 13:15:57,306 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 6
2015-01-09 13:15:57,393 [JobControl] INFO  org.apache.hadoop.mapreduce.JobSubmitter - number of splits:82
2015-01-09 13:15:57,416 [JobControl] INFO  org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-01-09 13:15:57,791 [JobControl] INFO  org.apache.hadoop.mapreduce.JobSubmitter - Submitting tokens for job: job_local561414911_0001
2015-01-09 13:15:58,741 [JobControl] INFO  org.apache.hadoop.mapred.LocalDistributedCacheManager - Creating symlink: /app/hadoop/tmp/mapred/local/1420805758017/pig-0.14.0-core-h2.jar <- /home/hduser/pig-0.14.0-core-h2.jar
2015-01-09 13:15:58,755 [JobControl] INFO  org.apache.hadoop.mapred.LocalDistributedCacheManager - Localized hdfs://master:9000/tmp/temp206201348/tmp1912320441/pig-0.14.0-core-h2.jar as file:/app/hadoop/tmp/mapred/local/1420805758017/pig-0.14.0-core-h2.jar
2015-01-09 13:15:58,757 [JobControl] INFO  org.apache.hadoop.mapred.LocalDistributedCacheManager - Creating symlink: /app/hadoop/tmp/mapred/local/1420805758018/automaton-1.11-8.jar <- /home/hduser/automaton-1.11-8.jar
2015-01-09 13:15:58,766 [JobControl] INFO  org.apache.hadoop.mapred.LocalDistributedCacheManager - Localized hdfs://master:9000/tmp/temp206201348/tmp-886499198/automaton-1.11-8.jar as file:/app/hadoop/tmp/mapred/local/1420805758018/automaton-1.11-8.jar
2015-01-09 13:15:58,768 [JobControl] INFO  org.apache.hadoop.mapred.LocalDistributedCacheManager - Creating symlink: /app/hadoop/tmp/mapred/local/1420805758019/antlr-runtime-3.4.jar <- /home/hduser/antlr-runtime-3.4.jar
2015-01-09 13:15:58,778 [JobControl] INFO  org.apache.hadoop.mapred.LocalDistributedCacheManager - Localized hdfs://master:9000/tmp/temp206201348/tmp1437387446/antlr-runtime-3.4.jar as file:/app/hadoop/tmp/mapred/local/1420805758019/antlr-runtime-3.4.jar
2015-01-09 13:15:58,779 [JobControl] INFO  org.apache.hadoop.mapred.LocalDistributedCacheManager - Creating symlink: /app/hadoop/tmp/mapred/local/1420805758020/guava-11.0.2.jar <- /home/hduser/guava-11.0.2.jar
2015-01-09 13:15:58,786 [JobControl] INFO  org.apache.hadoop.mapred.LocalDistributedCacheManager - Localized hdfs://master:9000/tmp/temp206201348/tmp-1481268210/guava-11.0.2.jar as file:/app/hadoop/tmp/mapred/local/1420805758020/guava-11.0.2.jar
2015-01-09 13:15:58,787 [JobControl] INFO  org.apache.hadoop.mapred.LocalDistributedCacheManager - Creating symlink: /app/hadoop/tmp/mapred/local/1420805758021/joda-time-2.1.jar <- /home/hduser/joda-time-2.1.jar
2015-01-09 13:15:58,795 [JobControl] INFO  org.apache.hadoop.mapred.LocalDistributedCacheManager - Localized hdfs://master:9000/tmp/temp206201348/tmp-1921418840/joda-time-2.1.jar as file:/app/hadoop/tmp/mapred/local/1420805758021/joda-time-2.1.jar
2015-01-09 13:15:58,953 [JobControl] INFO  org.apache.hadoop.mapred.LocalDistributedCacheManager - file:/app/hadoop/tmp/mapred/local/1420805758017/pig-0.14.0-core-h2.jar
2015-01-09 13:15:58,954 [JobControl] INFO  org.apache.hadoop.mapred.LocalDistributedCacheManager - file:/app/hadoop/tmp/mapred/local/1420805758018/automaton-1.11-8.jar
2015-01-09 13:15:58,955 [JobControl] INFO  org.apache.hadoop.mapred.LocalDistributedCacheManager - file:/app/hadoop/tmp/mapred/local/1420805758019/antlr-runtime-3.4.jar
2015-01-09 13:15:58,955 [JobControl] INFO  org.apache.hadoop.mapred.LocalDistributedCacheManager - file:/app/hadoop/tmp/mapred/local/1420805758020/guava-11.0.2.jar
2015-01-09 13:15:58,955 [JobControl] INFO  org.apache.hadoop.mapred.LocalDistributedCacheManager - file:/app/hadoop/tmp/mapred/local/1420805758021/joda-time-2.1.jar
2015-01-09 13:15:58,970 [JobControl] INFO  org.apache.hadoop.mapreduce.Job - The url to track the job: http://localhost:8080/
2015-01-09 13:15:58,973 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_local561414911_0001
2015-01-09 13:15:58,973 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases records_infobox,records_mappingbased,records_person,records_union,result_filter
2015-01-09 13:15:58,973 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: records_person[10,17],records_person[-1,-1],null[-1,-1],records_union[13,16],records_infobox[6,18],records_infobox[-1,-1],result_filter[16,16],records_mappingbased[8,23],records_mappingbased[-1,-1],null[-1,-1] C:  R: 
2015-01-09 13:15:58,990 [Thread-19] INFO  org.apache.hadoop.mapred.LocalJobRunner - OutputCommitter set in config null
2015-01-09 13:15:58,991 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2015-01-09 13:15:58,994 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_local561414911_0001]
2015-01-09 13:15:59,067 [Thread-19] INFO  org.apache.hadoop.conf.Configuration.deprecation - mapred.job.reduce.markreset.buffer.percent is deprecated. Instead, use mapreduce.reduce.markreset.buffer.percent
2015-01-09 13:15:59,069 [Thread-19] INFO  org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-01-09 13:15:59,069 [Thread-19] INFO  org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2015-01-09 13:15:59,094 [Thread-19] INFO  org.apache.hadoop.mapred.LocalJobRunner - OutputCommitter is org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter
2015-01-09 13:15:59,257 [Thread-19] INFO  org.apache.hadoop.mapred.LocalJobRunner - Waiting for map tasks
2015-01-09 13:15:59,258 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapred.LocalJobRunner - Starting task: attempt_local561414911_0001_m_000000_0
2015-01-09 13:15:59,459 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapred.Task -  Using ResourceCalculatorProcessTree : [ ]
2015-01-09 13:15:59,470 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapred.MapTask - Processing split: Number of splits :1
Total Length = 134217728
Input split[0]:
   Length = 134217728
   ClassName: org.apache.hadoop.mapreduce.lib.input.FileSplit
   Locations:

-----------------------

2015-01-09 13:15:59,522 [LocalJobRunner Map Task Executor #0] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader - Current split being processed hdfs://master:9000/wiki/infobox_properties_en.nt:0+134217728
2015-01-09 13:15:59,662 [LocalJobRunner Map Task Executor #0] INFO  org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2015-01-09 13:15:59,743 [LocalJobRunner Map Task Executor #0] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map - Aliases being processed per job phase (AliasName[line,offset]): M: records_person[10,17],records_person[-1,-1],null[-1,-1],records_union[13,16],records_infobox[6,18],records_infobox[-1,-1],result_filter[16,16],records_mappingbased[8,23],records_mappingbased[-1,-1],null[-1,-1] C:  R: 
2015-01-09 13:15:59,798 [LocalJobRunner Map Task Executor #0] WARN  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigHadoopLogger - org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject(ACCESSING_NON_EXISTENT_FIELD): Attempt to access field which was not found in the input
2015-01-09 13:15:59,815 [LocalJobRunner Map Task Executor #0] WARN  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigHadoopLogger - org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject(ACCESSING_NON_EXISTENT_FIELD): Attempt to access field which was not found in the input
2015-01-09 13:16:05,578 [communication thread] INFO  org.apache.hadoop.mapred.LocalJobRunner - map > map
2015-01-09 13:16:08,582 [communication thread] INFO  org.apache.hadoop.mapred.LocalJobRunner - map > map
2015-01-09 13:16:10,209 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapred.LocalJobRunner - map > map
2015-01-09 13:16:10,699 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapred.Task - Task:attempt_local561414911_0001_m_000000_0 is done. And is in the process of committing
2015-01-09 13:16:10,714 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapred.LocalJobRunner - map > map
2015-01-09 13:16:10,714 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapred.Task - Task attempt_local561414911_0001_m_000000_0 is allowed to commit now
2015-01-09 13:16:10,849 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - Saved output of task 'attempt_local561414911_0001_m_000000_0' to hdfs://master:9000/tmp/temp206201348/tmp-1297558267/_temporary/0/task_local561414911_0001_m_000000
2015-01-09 13:16:10,854 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapred.LocalJobRunner - map
2015-01-09 13:16:10,854 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapred.Task - Task 'attempt_local561414911_0001_m_000000_0' done.
2015-01-09 13:16:10,855 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapred.LocalJobRunner - Finishing task: attempt_local561414911_0001_m_000000_0
2015-01-09 13:16:10,855 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapred.LocalJobRunner - Starting task: attempt_local561414911_0001_m_000001_0
2015-01-09 13:16:10,877 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapred.Task -  Using ResourceCalculatorProcessTree : [ ]
2015-01-09 13:16:10,883 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapred.MapTask - Processing split: Number of splits :1
....

      



2 answers


Run:

    yarn application -list

and check whether the node can connect to the ResourceManager. In your case, that is master:8050.





I had a similar problem, though with a different mapred-site.xml, and I think the problem lies in that file.

YARN is the next version of MapReduce, so we need the following section in the file to make sure it's used with older programs:

    <property>
            <name>mapreduce.framework.name</name>
            <value>yarn</value>
            <final>true</final>
    </property>

      

However, if you use YARN, you don't need a JobTracker, since it was replaced by the ResourceManager in a sense (it was actually a complete redesign; you can read about it at http://blog.cloudera.com/blog/2013/11/migrating-to-mapreduce-2-on-yarn-for-operators/).



So, you need to remove the following lines:

    <property>
            <name>mapreduce.jobtracker.address</name>
            <value>master:54311</value>
            <description>...</description>
    </property>

      

from the file and Pig will be good to go.

(There is a related answer discussing this change in "Why does YARN have a mapreduce.jobtracker.address setting?")
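
Putting the two changes together, the question's mapred-site.xml might end up looking roughly like the sketch below. It assumes the Hadoop 2.x property name mapreduce.framework.name (the question's file spells it mapred.framework.name), and it keeps the question's job history addresses unchanged. The elided descriptions are left out. The job_local... IDs and LocalJobRunner entries in the posted log are what a job handled by the local runner looks like, which is consistent with the framework setting not being picked up.

```xml
<configuration>
        <!-- Assumption: Hadoop 2.x reads the execution framework from
             mapreduce.framework.name; its default value is "local". -->
        <property>
                <name>mapreduce.framework.name</name>
                <value>yarn</value>
                <final>true</final>
        </property>
        <!-- No mapreduce.jobtracker.address entry: with YARN, jobs are
             submitted to the ResourceManager instead of a JobTracker. -->
        <property>
                <name>mapreduce.jobhistory.address</name>
                <value>master:10020</value>
        </property>
        <property>
                <name>mapreduce.jobhistory.webapp.address</name>
                <value>master:19888</value>
        </property>
</configuration>
```

After updating the file on the machine that launches Pig (and on the other nodes, for consistency) and rerunning the script, job IDs without the "local" part, for example of the form job_<timestamp>_0001, would suggest that the job was submitted to YARN rather than run locally.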
