[HADOOP-11334] Mapreduce Job Failed due to failure fetching mapper output on the reduce side - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Implemented
Affects Version/s: 2.4.1
Fix Version/s: None
Component/s: io
Labels: None

Description

Running terasort with the following options:

hadoop jar hadoop-mapreduce-examples.jar terasort -Dio.native.lib.available=false -Dmapreduce.map.output.compress=true -Dmapreduce.map.output.compress.codec=org.apache.hadoop.io.compress.GzipCodec /tmp/tera-in /tmp/tera-out

The job failed because the reducer could not fetch the output from the mappers (see the stack trace below). The cause is that MAPREDUCE-1784 added support for null compressors, which default to non-compressed output. In this case, when io.native.lib.available is set to false, the compressor is null, so the map output is written uncompressed. The decompressor, however, has a Java implementation, so when the reducer reads the mapper output it still runs it through the decompressor, which fails because the output does not have the gzip header.
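The mismatch described above can be reproduced in miniature with the JDK's own java.util.zip classes. This is only a sketch under stated assumptions: GzipMismatchDemo, writeMapOutput, readMapOutput and hasGzipMagic are hypothetical names made up for illustration, and none of this is Hadoop's actual shuffle code. The writer falls back to plain bytes when no compressor is available, while the reader always decompresses.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical stand-in for the map/reduce sides; not Hadoop code.
public class GzipMismatchDemo {

    // True if the buffer starts with the gzip magic bytes 0x1f 0x8b (RFC 1952).
    public static boolean hasGzipMagic(byte[] data) {
        return data.length >= 2
                && (data[0] & 0xff) == 0x1f
                && (data[1] & 0xff) == 0x8b;
    }

    // "Map side": compresses only when a compressor is available,
    // otherwise silently writes the records uncompressed.
    public static byte[] writeMapOutput(byte[] records, boolean compressorAvailable)
            throws IOException {
        if (!compressorAvailable) {
            return records.clone();
        }
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
            gz.write(records);
        }
        return buf.toByteArray();
    }

    // "Reduce side": always decompresses, because a Java decompressor exists.
    public static byte[] readMapOutput(byte[] shuffled) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (InputStream gz = new GZIPInputStream(new ByteArrayInputStream(shuffled))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = gz.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] records = "key1\tvalue1\n".getBytes(StandardCharsets.UTF_8);

        // Compressor unavailable: the output has no gzip header.
        byte[] plain = writeMapOutput(records, false);
        System.out.println("gzip header present: " + hasGzipMagic(plain));
        try {
            readMapOutput(plain);
        } catch (IOException e) {
            // The reader rejects the data, analogous to "not a gzip file".
            System.out.println("read failed: " + e.getMessage());
        }

        // Compressor available: both sides agree and the round trip works.
        byte[] gz = writeMapOutput(records, true);
        System.out.println("round trip bytes: " + readMapOutput(gz).length);
    }
}
```

A check like hasGzipMagic shows how the two sides could detect the disagreement before decompressing; whether the real fix should live on the map side or the reduce side is not something this sketch settles.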

2014-11-25 10:39:48,108 WARN fetcher#9 org.apache.hadoop.mapreduce.task.reduce.Fetcher: Failed to shuffle output of attempt_1416875111322_0005_m_000002_0 from bdvs130:13562
java.io.IOException: not a gzip file
at org.apache.hadoop.io.compress.zlib.BuiltInGzipDecompressor.processBasicHeader(BuiltInGzipDecompressor.java:495)
at org.apache.hadoop.io.compress.zlib.BuiltInGzipDecompressor.executeHeaderState(BuiltInGzipDecompressor.java:256)
at org.apache.hadoop.io.compress.zlib.BuiltInGzipDecompressor.decompress(BuiltInGzipDecompressor.java:185)
at org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:91)
at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85)
at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:192)
at org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.shuffle(InMemoryMapOutput.java:97)
at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:434)
at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:341)
at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:165)

Issue Links

relates to: HADOOP-8642 Document that io.native.lib.available only controls native bz2 and zlib compression codecs (Closed)

People

Assignee: Yuanbo Liu
Reporter: Jinghui Wang
Votes: 0
Watchers: 6

Dates

Created: 25/Nov/14 20:37
Updated: 06/Apr/16 06:59
Resolved: 06/Apr/16 06:59