Univocity - parse each TSV file row to different T

Posted 2019-06-02 07:57

Question:

I have a TSV file with a fixed set of rows, but each row maps to a different Java class.

For example:

recordType  recordValue1
recordType  recordValue1 recordValue2

For the first row I have the following class:

public class FirstRow implements ItsvRecord {

    @Parsed(index = 0)
    private String recordType;

    @Parsed(index = 1)
    private String recordValue1;

    public FirstRow() {
    }
}

And for the second row I have:

public class SecondRow implements ItsvRecord {

    @Parsed(index = 0)
    private String recordType;

    @Parsed(index = 1)
    private String recordValue1;

    // third column of the second sample row
    @Parsed(index = 2)
    private String recordValue2;

    public SecondRow() {
    }
}

I want to parse the TSV file directly into the respective objects, but I am running short of ideas.

Answer 1:

Use an InputValueSwitch. It matches a value in a particular column of each row to determine which RowProcessor to use. Example:

Create one processor for each type of record you need to handle (two in this case):

final BeanListProcessor<FirstRow> firstProcessor = new BeanListProcessor<FirstRow>(FirstRow.class);
final BeanListProcessor<SecondRow> secondProcessor = new BeanListProcessor<SecondRow>(SecondRow.class);

Create an InputValueSwitch:

// 0 means that the first column of each row holds the value
// that identifies the type of record you are dealing with
InputValueSwitch valueSwitch = new InputValueSwitch(0);

// assigns the first processor to rows whose first column contains the 'firstRowType' value
valueSwitch.addSwitchForValue("firstRowType", firstProcessor);

// assigns the second processor to rows whose first column contains the 'secondRowType' value
valueSwitch.addSwitchForValue("secondRowType", secondProcessor);
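
To make the idea of switching on a column value concrete, here is a minimal plain-Java sketch with no univocity dependency. It is not how the library is implemented; the ColumnValueDispatcher class and its method names are hypothetical and exist only for this illustration. Each line is split on tabs, and the value in the chosen column selects the handler that receives the row.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Plain-Java illustration of dispatching rows on a column value.
// Hypothetical helper, NOT univocity code.
class ColumnValueDispatcher {
    private final int columnIndex;
    private final Map<String, Consumer<String[]>> handlers = new HashMap<>();

    ColumnValueDispatcher(int columnIndex) {
        this.columnIndex = columnIndex;
    }

    // Registers the handler for rows whose chosen column equals 'value'.
    void addHandlerForValue(String value, Consumer<String[]> handler) {
        handlers.put(value, handler);
    }

    // Splits a tab-separated line and forwards it to the matching handler.
    void dispatch(String line) {
        String[] columns = line.split("\t", -1);
        if (columns.length <= columnIndex) {
            throw new IllegalArgumentException("Row has no column " + columnIndex);
        }
        Consumer<String[]> handler = handlers.get(columns[columnIndex]);
        if (handler == null) {
            throw new IllegalArgumentException("No handler for value: " + columns[columnIndex]);
        }
        handler.accept(columns);
    }
}

class DispatchDemo {
    public static void main(String[] args) {
        List<String[]> firstRows = new ArrayList<>();
        List<String[]> secondRows = new ArrayList<>();

        ColumnValueDispatcher dispatcher = new ColumnValueDispatcher(0);
        dispatcher.addHandlerForValue("firstRowType", firstRows::add);
        dispatcher.addHandlerForValue("secondRowType", secondRows::add);

        dispatcher.dispatch("firstRowType\trecordValue1");
        dispatcher.dispatch("secondRowType\trecordValue1\trecordValue2");

        System.out.println(firstRows.size() + " first-type row(s), "
                + secondRows.size() + " second-type row(s)");
    }
}
```

The InputValueSwitch plays the role of the dispatcher here, and each BeanListProcessor plays the role of a handler that also converts the row into a bean.
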

Parse as usual:

TsvParserSettings settings = new TsvParserSettings(); //configure...
// your row processor is the switch
settings.setProcessor(valueSwitch);

TsvParser parser = new TsvParser(settings);

Reader input = new StringReader(""+
        "firstRowType\trecordValue1\n" +
        "secondRowType\trecordValue1\trecordValue2");

parser.parse(input);

Get the parsed objects from your processors:

List<FirstRow> firstTypeObjects = firstProcessor.getBeans();
List<SecondRow> secondTypeObjects = secondProcessor.getBeans();

Printing the two lists will give*:

[FirstRow{recordType='firstRowType', recordValue1='recordValue1'}]

[SecondRow{recordType='secondRowType', recordValue1='recordValue1', recordValue2='recordValue2'}]
* Assuming you have a sane toString() implemented in your classes.
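
A toString() that produces the format shown above could look like the following sketch. The @Parsed annotations and the ItsvRecord interface are left out so the snippet compiles without univocity on the classpath, and the two-argument and three-argument constructors are hypothetical conveniences used only by the demo (the parser itself needs only the no-argument constructor).

```java
import java.util.Collections;
import java.util.List;

// Sketch only: annotations and the ItsvRecord interface are omitted here.
class FirstRow {
    private String recordType;
    private String recordValue1;

    FirstRow() {
    }

    // Hypothetical convenience constructor, used only by the demo below.
    FirstRow(String recordType, String recordValue1) {
        this.recordType = recordType;
        this.recordValue1 = recordValue1;
    }

    @Override
    public String toString() {
        return "FirstRow{recordType='" + recordType
                + "', recordValue1='" + recordValue1 + "'}";
    }
}

class SecondRow {
    private String recordType;
    private String recordValue1;
    private String recordValue2;

    SecondRow() {
    }

    // Hypothetical convenience constructor, used only by the demo below.
    SecondRow(String recordType, String recordValue1, String recordValue2) {
        this.recordType = recordType;
        this.recordValue1 = recordValue1;
        this.recordValue2 = recordValue2;
    }

    @Override
    public String toString() {
        return "SecondRow{recordType='" + recordType
                + "', recordValue1='" + recordValue1
                + "', recordValue2='" + recordValue2 + "'}";
    }
}

class ToStringDemo {
    public static void main(String[] args) {
        List<FirstRow> first = Collections.singletonList(
                new FirstRow("firstRowType", "recordValue1"));
        List<SecondRow> second = Collections.singletonList(
                new SecondRow("secondRowType", "recordValue1", "recordValue2"));

        // Printing a List wraps the elements' toString() results in brackets.
        System.out.println(first);
        System.out.println(second);
    }
}
```
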

If you want to manage associations among the objects that are parsed:

If your FirstRow should contain the elements parsed for records of type SecondRow, simply override the rowProcessorSwitched method:

InputValueSwitch valueSwitch = new InputValueSwitch(0) {
    @Override
    public void rowProcessorSwitched(RowProcessor from, RowProcessor to) {
        // the parser just moved away from SecondRow records
        if (from == secondProcessor) {
            List<FirstRow> firstRows = firstProcessor.getBeans();
            FirstRow mostRecentRow = firstRows.get(firstRows.size() - 1);

            // attach the collected SecondRow beans to the latest FirstRow, then reset
            mostRecentRow.addRowsOfOtherType(secondProcessor.getBeans());
            secondProcessor.getBeans().clear();
        }
    }
};
  • The above assumes your FirstRow class has an addRowsOfOtherType method that takes a list of SecondRow as its parameter.
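
One way such a method could be written is sketched below. Because the snippet above clears the processor's list right after passing it in, this version copies the elements instead of keeping a reference to the list it receives. The field name rowsOfOtherType and the getter are hypothetical, and the classes are stripped down (no @Parsed fields) so the example runs without the library.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Stripped-down stand-in for the SecondRow bean.
class SecondRow {
}

// Stripped-down stand-in for the FirstRow bean, showing only the association.
class FirstRow {
    // Hypothetical field holding the SecondRow records that follow this row.
    private final List<SecondRow> rowsOfOtherType = new ArrayList<>();

    // Copies the elements, so the caller may clear its own list afterwards.
    public void addRowsOfOtherType(List<SecondRow> rows) {
        rowsOfOtherType.addAll(rows);
    }

    public List<SecondRow> getRowsOfOtherType() {
        return Collections.unmodifiableList(rowsOfOtherType);
    }
}

class AssociationDemo {
    public static void main(String[] args) {
        // Stands in for the list returned by secondProcessor.getBeans().
        List<SecondRow> parsed = new ArrayList<>();
        parsed.add(new SecondRow());
        parsed.add(new SecondRow());

        FirstRow parent = new FirstRow();
        parent.addRowsOfOtherType(parsed);
        parsed.clear(); // mirrors secondProcessor.getBeans().clear()

        // The parent still holds both rows because it stored a copy.
        System.out.println(parent.getRowsOfOtherType().size());
    }
}
```

Had addRowsOfOtherType stored the list reference directly, the clear() call would have emptied the parent's rows as well.
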

And that's it!

You can even mix and match other types of RowProcessor. There's another example here that demonstrates this.

Hope this helps.



Tags: univocity