SnakeYAML is instantiating ArrayList instead of Ha

2019-07-26 17:07发布

I need to parse the following YAML file.

arguments:
  - Database
  - Fold
  - MetaFeature
  - Algorithm
  - Config
processes:
  - id: MetaFeatureCalculator
    command: "python metaFeatCalc.py {Database} folds/{Fold} de/{MetaFeature}/{Algorithm}.csv"
    in: [Database, Fold]
    out: [MetaFeature, Algorithm]
    log: "mf/{Fold}/{MetaFeature}.out"
  - id: Tunner
    command: "java -jar tunner.jar {MetaFeature} alg/{Algorithm} {config}"
    in: [Metafeature, Algorithm]
    out: [Config]
    log: "mf/{Metafeature}/{Algorithm}.out"
recipeDefaults:
  - Database: ["D1"]
recipes:
  - id: Ex1
    uses:
      - Database: ["D1", "D2"]
      - MetaFeature: ["M1", "M2"]
      - Algorithm: ["A1", "A2"]
      - Config: ["C1", "C4"]
  - id: Ex2
    uses:
      - Folds: ["F1", "F2", "F5"]
      - MetaFeature: ["M1", "M2"]
      - Algorithm: ["A1", "A2"]
      - Config: ["C1", "C4"]

And I created the following POJOs to receive this data.

Repo: https://github.com/Pacheco95/ExperimentLoader

@Data
public class Experiment {
  private HashSet<String> arguments;
  private HashSet<Process> processes;
  private HashSet<HashMap<String, HashSet<String>>> recipeDefaults;
  private HashSet<Recipe> recipes;
}
@Data
public class Process {
  private String id;
  private String command;
  private HashSet<String> in;
  private HashSet<String> out;
  private String log;
}
@Data
public class Recipe {
  private String id;
  private HashSet<HashMap<String, HashSet>> uses;
}

And this class to test the parser:

public class ExperimentLoader {
  public static void main(String[] args) throws IOException {
    InputStream is = args.length == 0 ? System.in : Files.newInputStream(Paths.get(args[0]));
    Yaml yaml = new Yaml();
    Experiment experiment = yaml.loadAs(is, Experiment.class);
    Gson gson = new GsonBuilder().setPrettyPrinting().serializeNulls().create();
    System.out.println(gson.toJson(experiment));
  }
}

The parser seems to work well, but running this code in debugging mode some fields was instantiated with the correct type (HashSet) while others dont. They was instantiated as ArrayLists (I have no idea what kind of magic happened here).

This is a snapshot of degubbing window:

Bug snapshot[1]

Output from my test class:

{
  "arguments": [
    "Fold",
    "MetaFeature",
    "Config",
    "Database",
    "Algorithm"
  ],
  "processes": [
    {
      "id": "MetaFeatureCalculator",
      "command": "python metaFeatCalc.py {Database} folds/{Fold} de/{MetaFeature}/{Algorithm}.csv",
      "in": [
        "Fold",
        "Database"
      ],
      "out": [
        "MetaFeature",
        "Algorithm"
      ],
      "log": "mf/{Fold}/{MetaFeature}.out"
    },
    {
      "id": "Tunner",
      "command": "java -jar tunner.jar {MetaFeature} alg/{Algorithm} {config}",
      "in": [
        "Metafeature",
        "Algorithm"
      ],
      "out": [
        "Config"
      ],
      "log": "mf/{Metafeature}/{Algorithm}.out"
    }
  ],
  "recipeDefaults": [
    {
      "Database": [
        "D1"
      ]
    }
  ],
  "recipes": [
    {
      "id": "Ex2",
      "uses": [
        {
          "MetaFeature": [
            "M1",
            "M2"
          ]
        },
        {
          "Folds": [
            "F1",
            "F2",
            "F5"
          ]
        },
        {
          "Config": [
            "C1",
            "C4"
          ]
        },
        {
          "Algorithm": [
            "A1",
            "A2"
          ]
        }
      ]
    },
    {
      "id": "Ex1",
      "uses": [
        {
          "MetaFeature": [
            "M1",
            "M2"
          ]
        },
        {
          "Config": [
            "C1",
            "C4"
          ]
        },
        {
          "Database": [
            "D1",
            "D2"
          ]
        },
        {
          "Algorithm": [
            "A1",
            "A2"
          ]
        }
      ]
    }
  ]
}

My dependencies:

<dependencies>
  <dependency>
    <groupId>junit</groupId>
    <artifactId>junit</artifactId>
    <version>4.11</version>
    <scope>test</scope>
  </dependency>
  <dependency>
    <groupId>org.projectlombok</groupId>
    <artifactId>lombok</artifactId>
    <version>1.18.8</version>
    <scope>provided</scope>
  </dependency>
  <dependency>
    <groupId>org.yaml</groupId>
    <artifactId>snakeyaml</artifactId>
    <version>1.24</version>
  </dependency>
  <dependency>
    <groupId>com.google.code.gson</groupId>
    <artifactId>gson</artifactId>
    <version>2.8.5</version>
  </dependency>
</dependencies>

Has anyone had this problem? I cant find a solution.

1条回答
放我归山
2楼-- · 2019-07-26 17:27

Your problem is probably type erasure:

When type-safe (generic) collections are JavaBean properties SnakeYAML dynamically detects the required class. […]

It does not work if generic type is abstract class (interface). You have to put an explicit tag in the YAML or provide the explicit TypeDescription. TypeDescription serves the goal to collect more information and use it while loading/dumping.

While you do not use abstract classes or interfaces, I assume that SnakeYaml has a problem with discovering the nested generic types of HashSet<HashMap<String, HashSet>>. The documentation suggests adding a TypeDescription; however that does not solve your problem because the interface is designed so that you can only specify the type inside the outer HashSet, but not inside the inner HashMap. The fact that the interface does not expect nested containers also hints that this is your problem.

A workaround would be to add explicit tags inside your YAML to the sets that fail to load properly:

- Database: !!set ["D1"]
- MetaFeature: !!set ["M1", "M2"]

If you don't want to do that, you basically have two other options: Patch this feature into SnakeYAML, or use the low-level API and generate your types manually from the parser events.

查看更多
登录 后发表回答