What's the best way of parsing a fixed-width f

2019-01-06 18:48发布

I've got a file from a vendor that has 115 fixed-width fields per line. What's the best way of parsing that file into the 115 fields so I can use them in my code?

My first thought is just to make constants for each field like NAME_START_POSITION and NAME_LENGTH and using substring. That just seems ugly so I'm curious if there's any other recommended ways of doing this. None of the couple of libraries a Google search turned up seemed any better either. Thanks

10条回答
霸刀☆藐视天下
2楼-- · 2019-01-06 19:46

Here is the plain java code to read fixedwidth file:

import java.io.File;
import java.io.FileNotFoundException;
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

public class FixedWidth {

    public static void main(String[] args) throws FileNotFoundException, IOException {
        // String S1="NHJAMES TURNER M123-45-67890004224345";
        String FixedLengths = "2,15,15,1,11,10";

        List<String> items = Arrays.asList(FixedLengths.split("\\s*,\\s*"));
        File file = new File("src/sample.txt");

        try (BufferedReader br = new BufferedReader(new FileReader(file))) {
            String line1;
            while ((line1 = br.readLine()) != null) {
                // process the line.

                int n = 0;
                String line = "";
                for (String i : items) {
                    // System.out.println("Before"+n);
                    if (i == items.get(items.size() - 1)) {
                        line = line + line1.substring(n, n + Integer.parseInt(i)).trim();
                    } else {
                        line = line + line1.substring(n, n + Integer.parseInt(i)).trim() + ",";
                    }
                    // System.out.println(
                    // S1.substring(n,n+Integer.parseInt(i)));
                    n = n + Integer.parseInt(i);
                    // System.out.println("After"+n);
                }
                System.out.println(line);
            }
        }

    }

}
查看更多
我欲成王,谁敢阻挡
3楼-- · 2019-01-06 19:48

I would use a flat file parser like flatworm instead of reinventing the wheel: it has a clean API, is simple to use, has decent error handling and a simple file format descriptor. Another option is jFFP but I prefer the first one.

查看更多
男人必须洒脱
4楼-- · 2019-01-06 19:49

uniVocity-parsers comes with a FixedWidthParser and FixedWidthWriter the can support tricky fixed-width formats, including lines with different fields, paddings, etc.

// creates the sequence of field lengths in the file to be parsed
FixedWidthFields fields = new FixedWidthFields(4, 5, 40, 40, 8);

// creates the default settings for a fixed width parser
FixedWidthParserSettings settings = new FixedWidthParserSettings(fields); // many settings here, check the tutorial.

//sets the character used for padding unwritten spaces in the file
settings.getFormat().setPadding('_');

// creates a fixed-width parser with the given settings
FixedWidthParser parser = new FixedWidthParser(settings);

// parses all rows in one go.
List<String[]> allRows = parser.parseAll(new File("path/to/fixed.txt")));

Here are a few examples for parsing all sorts of fixed-width inputs.

And here are some other examples for writing in general and other fixed-width examples specific to the fixed-width format.

Disclosure: I'm the author of this library, it's open-source and free (Apache 2.0 License)

查看更多
Animai°情兽
5楼-- · 2019-01-06 19:50

Most suitable for Scala, but probably you could use it in Java

I was so fed up with the fact that there is no proper library for fixed length format that I have created my own. You can check it out here: https://github.com/atais/Fixed-Length

A basic usage is that you create a case class and it's described as an HList (Shapeless):

case class Employee(name: String, number: Option[Int], manager: Boolean)

object Employee {

    import com.github.atais.util.Read._
    import cats.implicits._
    import com.github.atais.util.Write._
    import Codec._

    implicit val employeeCodec: Codec[Employee] = {
      fixed[String](0, 10) <<:
        fixed[Option[Int]](10, 13, Alignment.Right) <<:
        fixed[Boolean](13, 18)
    }.as[Employee]
}

And you can easily decode your lines now or encode your object:

import Employee._
Parser.decode[Employee](exampleString)
Parser.encode(exampleObject)
查看更多
登录 后发表回答