I'm trying to test a Hadoop .mapreduce Avro job using MRUnit. I am receiving a NullPointerException as seen below. I've attached a portion of the pom and source code. Any assistance would be appreciated.
Thanks
The error I'm getting is :
java.lang.NullPointerException
at org.apache.hadoop.mrunit.internal.io.Serialization.copy(Serialization.java:73)
at org.apache.hadoop.mrunit.internal.io.Serialization.copy(Serialization.java:91)
at org.apache.hadoop.mrunit.internal.io.Serialization.copyWithConf(Serialization.java:104)
at org.apache.hadoop.mrunit.TestDriver.copy(TestDriver.java:608)
at org.apache.hadoop.mrunit.MapDriverBase.setInputKey(MapDriverBase.java:64)
at org.apache.hadoop.mrunit.MapDriverBase.setInput(MapDriverBase.java:104)
at org.apache.hadoop.mrunit.MapDriverBase.withInput(MapDriverBase.java:218)
at org.lab41.project.mapreduce.ParseMetadataAsTextIntoAvroTest.testMap(ParseMetadataAsTextIntoAvroTest.java:115)
.....
pom snippet:
<dependency>
<groupId>org.apache.mrunit</groupId>
<artifactId>mrunit</artifactId>
<version>0.9.0-incubating</version>
<classifier>hadoop2</classifier>
<scope>test</scope>
</dependency>
<avro.version>1.7.4</avro.version>
<hadoop.version>2.0.0-mr1-cdh4.1.3</hadoop.version>
<dependency>
<groupId>org.apache.avro</groupId>
<artifactId>avro</artifactId>
<version>${avro.version}</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-client</artifactId>
<version>${hadoop.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-core</artifactId>
<version>${hadoop.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.avro</groupId>
<artifactId>avro-mapred</artifactId>
<version>${avro.version}</version>
<classifier>hadoop2</classifier>
</dependency>
Here is an excerpt of the test :
import static org.junit.Assert.*;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import org.apache.avro.mapred.AvroKey;
import org.apache.avro.hadoop.io.AvroSerialization;
import org.apache.avro.mapred.AvroValue;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mrunit.mapreduce.MapDriver;
import org.apache.hadoop.mrunit.types.Pair;
import org.junit.After;
import org.junit.AfterClass;
import org.junit.Before;
import org.junit.BeforeClass;
import org.junit.Test;
import org.lab41.project.domain.DataRecord;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
public class ParseMetadataAsTextIntoAvroTest {
Logger logger = LoggerFactory
.getLogger(ParseMetadataAsTextIntoAvroTest.class);
private MapDriver<LongWritable, Text, AvroKey<Long>, AvroValue<DataRecord>> mapDriver;
@BeforeClass
public static void setUpClass() {
}
@AfterClass
public static void tearDownClass() {
}
@Before
public void setUp() throws IOException {
ParseMetadataAsTextIntoAvroMapper mapper = new ParseMetadataAsTextIntoAvroMapper();
mapDriver = new MapDriver<LongWritable, Text, AvroKey<Long>, AvroValue<DataRecord>>();
mapDriver.setMapper(mapper);
mapDriver.getConfiguration().setStrings("io.serializations", new String[]{
AvroSerialization.class.getName()
});
}
@Test
public void testMap() throws ParseException, IOException {
Text testInputText = new Text(test0);
DataRecord record = new DataRecord();
….
AvroKey<Long> expectedPivot = new AvroKey<Long>(1L);
AvroValue<DataRecord> expectedRecord = new AvroValue<DataRecord>(record);
mapDriver.withInput(new Pair<LongWritable, Text>(new LongWritable(1), testInputText));
mapDriver.withOutput(new Pair<AvroKey<Long>, AvroValue<DataRecord>>(expectedPivot, expectedRecord));
mapDriver.runTest();
}
}
Answered here: https://issues.apache.org/jira/browse/MRUNIT-181 specifically: https://cwiki.apache.org/confluence/display/MRUNIT/MRUnit+with+Avro
You have to add
AvroSerialization
to the default serializations and configureAvroSerialization
.In order to get this to work you have add the
AvroSerializatio
to the default serailizations. You also have to configureAvroSerializationn
.This also solve the problem, with merits of shorter and more clear code.
It may be bug of mrunit, that don't set the io.serializations right Instead it should have been set by job.setInputFormatClass(AvroKeyInputFormat.class) I think.