I am using the Jackson JSON parser (1.9.5) in a Hadoop Java M/R program (0.20.205). Given the JSON example below:
{"id":23423423, "name":"abc", "location":{"displayName":"Florida, Rosario","objectType":"place"}, "price":1234.55}
Now, let's say I just want to parse out id, location.displayName, and price, so I created the following Java object, omitting the unwanted fields.
@JsonIgnoreProperties(ignoreUnknown = true)
public class Transaction {
private long id;
private Location location;
private double price;
private static final ObjectMapper mapper = new ObjectMapper();
// ... setter/getter methods for id, location, and price would be here
@JsonIgnoreProperties(ignoreUnknown = true)
public static class Location {
private String displayName;
public String getDisplayName() { return displayName; }
public void setDisplayName(String displayName) { this.displayName = displayName; }
}
public static final Transaction fromJsonDoc(String jsonDoc) throws IOException {
JsonNode rootNode = mapper.readTree(jsonDoc);
return mapper.treeToValue(rootNode, Transaction.class);
}
}
When I run this program in standalone mode (not in Hadoop distributed mode), all the fields I want are parsed correctly. However, as soon as I try to parse the data in a Hadoop map-only job, I only get the id field; location.displayName and price are not deserialized. It seems that the @JsonIgnoreProperties(ignoreUnknown = true) annotation is somehow not working properly when running in MapReduce, so the fields I want don't get deserialized (everything after id is unset). If I add all the fields, with their getters and setters, to my Transaction object and remove @JsonIgnoreProperties, then everything works fine.
Does anyone have a suggestion why this is happening? I just gave a simple example but in reality my JSON document is very complex and I don't want to deserialize all the fields out of it. Am I doing something wrong here?
This is how I am using Jackson, both in the main method and in the Java MapReduce program.
Transaction tran = Transaction.fromJsonDoc(jsonRec);
System.out.println("id: " + tran.getId()); //works in both
System.out.println("location: " + tran.getLocation().getDisplayName()); //works only in standalone execution but not in Map/Reduce
This might be due to a class loading problem, such as an old version of Jackson core being picked up. The tricky part with class loading and annotations is that the VM is apparently allowed to silently drop annotations it does not recognize. I don't know if this is causing the problem you have, but it may be worth checking. Hadoop used to bundle a rather old version of Jackson (1.1?), and since @JsonIgnoreProperties was added in 1.4, this might explain the problem.
How could this occur? You must be compiling against a more recent version (to see the annotation), but perhaps the runtime environment is using the old (1.1) version. Because your code does not actively use the annotation class (it is "only" attached to the class), the class loader would then drop the annotation because it cannot find it in the jar.
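One way to test this theory is to print, from inside the running task, which jar a class was actually loaded from and which annotations the JVM still sees on it. The following is only a rough sketch using plain JDK reflection; the class name ClassOrigin and its two helper methods are made up for illustration and are not part of Jackson or Hadoop.

```java
import java.lang.annotation.Annotation;
import java.security.CodeSource;

// Hypothetical diagnostic helper (name and methods are illustrative only).
public class ClassOrigin {

    // Returns "<class name> <- <jar or directory it was loaded from>".
    // Classes loaded by the bootstrap loader have no code source.
    public static String describe(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return type.getName() + " <- bootstrap/unknown";
        }
        return type.getName() + " <- " + source.getLocation();
    }

    // Returns the names of the annotations visible at runtime on the class,
    // separated by spaces. An annotation whose class is missing from the
    // runtime classpath will simply not show up here.
    public static String visibleAnnotations(Class<?> type) {
        StringBuilder names = new StringBuilder();
        for (Annotation annotation : type.getAnnotations()) {
            names.append(annotation.annotationType().getName()).append(' ');
        }
        return names.toString().trim();
    }

    public static void main(String[] args) {
        // Stand-in classes so the sketch runs without Jackson on the classpath.
        System.out.println(describe(ClassOrigin.class));
        System.out.println(describe(String.class));
        System.out.println("annotations: [" + visibleAnnotations(ClassOrigin.class) + "]");
    }
}
```

In the mapper you would call, for example, ClassOrigin.describe(org.codehaus.jackson.map.ObjectMapper.class) and ClassOrigin.visibleAnnotations(Transaction.class) and write the results to the task log. If the first points at a Jackson jar shipped with Hadoop rather than your 1.9.5 jar, or the second does not list org.codehaus.jackson.annotate.JsonIgnoreProperties, that would support the explanation above.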