Parsing C# string representing a 'fixed-length

Posted 2019-02-21 06:24

Question:

I have a fixed-length string message that looks like this:

"\0\0\0j\0\0\0\vT3A1111        2999BOSH                          2100021        399APV                           2100022  "

I create this message by reading a byte[] into a StringBuilder to build the string.

In the string above, the portion "\0\0\0j\0\0\0\v" is supposed to hold the LENGTH and ID fields, each 4 bytes long. I am not sure how to extract these two values, but I can see that HEX 0j is 106 (1+1+8+9+30+9+9+30+9 = 106 in total length). I am not sure why the "v" is not "0v" above, but I know it is supposed to be a HEX value representing the message ID.

The first two fields, each 4 bytes long, are HEX; all the others are ASCII.
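
The two header fields are binary rather than text. When the bytes are shown as a C# string, `j` is the byte 0x6A (decimal 106) and `\v` is the single vertical-tab byte 0x0B (decimal 11), which is why no extra `0` appears in front of the `v`. Below is a minimal sketch of decoding both values directly from the received bytes. It assumes big-endian byte order, which is only inferred from the sample (the non-zero byte comes last); `HeaderReader` and `ReadInt32BigEndian` are illustrative names, not part of the original code.

```csharp
using System;

public static class HeaderReader
{
    // Combines the 4 bytes starting at 'offset' into a 32-bit integer,
    // most significant byte first (big-endian is an assumption here).
    public static int ReadInt32BigEndian(byte[] buffer, int offset)
    {
        if (buffer == null) throw new ArgumentNullException(nameof(buffer));
        if (offset < 0 || buffer.Length - offset < 4)
            throw new ArgumentOutOfRangeException(nameof(offset));

        return (buffer[offset] << 24)
             | (buffer[offset + 1] << 16)
             | (buffer[offset + 2] << 8)
             | buffer[offset + 3];
    }
}
```

With the header bytes `{ 0, 0, 0, 0x6A, 0, 0, 0, 0x0B }`, `ReadInt32BigEndian(header, 0)` gives 106 and `ReadInt32BigEndian(header, 4)` gives 11.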

This is not an EDI message (so I cannot use an EDI parser library). Unlike EDI messages, which have some kind of field identifier, I have only a stream of bytes, and I know only the lengths of the fields. The fields are:

4  byte long message length      ("\0\0\0j")
4  byte long message id          ("\0\0\0\v")
1  byte long message type        ("T")
1  byte long message sequence    ("3")
8  byte long car Id              ("A1111   ")  
9  byte long part-1 price        ("     2999")
30 byte long part-1 manufacturer ("BOSH                          ")
9  byte long part#               ("2100021  ")
9  byte long part-2 price        ("      399")
30 byte long part-2 manufacturer ("APV                           ")
9  byte long part#               ("2100022  ")

So, above I have 2 parts made by 2 manufacturers, but a real message could contain more than 2 parts:

Part 1, 29.99, made by Bosh, part# 2100021
Part 2, 3.99, made by APV, part# 2100022

I would like to get all the price and manufacturer fields out of this flat-file string into a List<Part>, where Part is

class Part
{
   public decimal Price {get; set;}
   public string Manufacturer {get; set;}
   public string PartNumber {get; set;}
}

So, my List<Part> would contain all parts with their prices and manufacturers.

Since I have the length of each field, I know I could loop through this string and pull out the Part-related data. But I wonder if there is a more elegant and easier way to do this.
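
The loop described above can be sketched as follows. It assumes the 8-byte binary header has already been removed, so that `body` starts at the message type, and that every part is one 48-character record (9 + 30 + 9). It also assumes prices carry two implied decimal places, based on the example where 2999 means 29.99. `PartParser` and `ParseParts` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public class Part
{
    public decimal Price { get; set; }
    public string Manufacturer { get; set; }
    public string PartNumber { get; set; }
}

public static class PartParser
{
    private const int FixedPrefix = 1 + 1 + 8;   // type, sequence, car id
    private const int PriceLen = 9;
    private const int MakerLen = 30;
    private const int NumberLen = 9;
    private const int RecordLen = PriceLen + MakerLen + NumberLen; // 48

    // 'body' is the ASCII text that follows the 8-byte binary header.
    public static List<Part> ParseParts(string body)
    {
        var parts = new List<Part>();

        // Step through the body one whole part record at a time.
        for (int pos = FixedPrefix; pos + RecordLen <= body.Length; pos += RecordLen)
        {
            long cents = long.Parse(body.Substring(pos, PriceLen).Trim(),
                                    CultureInfo.InvariantCulture);
            parts.Add(new Part
            {
                Price = cents / 100m,   // assumed: two implied decimals
                Manufacturer = body.Substring(pos + PriceLen, MakerLen).Trim(),
                PartNumber = body.Substring(pos + PriceLen + MakerLen, NumberLen).Trim()
            });
        }
        return parts;
    }
}
```

Because the loop only advances in whole records, it handles any number of parts, and a trailing fragment shorter than 48 characters is ignored rather than parsed.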

Or even better, is there an open-source library that would let me parse something like this?

I receive this message using this method:

private TcpClient clientSocket;
private NetworkStream serverStream;

private async System.Threading.Tasks.Task ReadResponseAsync()
{
    if (serverStream.CanRead)
    {
        byte[] readBuffer = new byte[1024];
        StringBuilder receivedMessage = new StringBuilder();
        int readSoFar = 0;

        do
        {
            readSoFar = await serverStream.ReadAsync(readBuffer, 0, readBuffer.Length);
            receivedMessage.AppendFormat("{0}", Encoding.ASCII.GetString(readBuffer, 0, readSoFar));
        } 
        while (serverStream.DataAvailable);

        string msg = receivedMessage.ToString();
    }
    else
    {
        Log("Error", "Cannot read from NetworkStream");
    }
}
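
Two details of the method above may matter for the binary header. `Encoding.ASCII.GetString` turns any byte above 0x7F into `?`, so a length or ID containing such a byte would be lost, and `DataAvailable` can be false while the rest of a message is still in transit. Below is a hedged sketch of reading one message by its length prefix instead. It assumes the length field counts only the bytes after the 8-byte header (consistent with the 106 computed earlier) and that both header fields are big-endian. `FramedReader`, `ReadExactAsync`, and `ReadMessageAsync` are illustrative names.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading.Tasks;

public static class FramedReader
{
    // Keeps reading until 'count' bytes have arrived or the stream ends.
    public static async Task<byte[]> ReadExactAsync(Stream stream, int count)
    {
        var buffer = new byte[count];
        int total = 0;
        while (total < count)
        {
            int n = await stream.ReadAsync(buffer, total, count - total);
            if (n == 0) throw new EndOfStreamException("Stream closed mid-message.");
            total += n;
        }
        return buffer;
    }

    // Reads one message and returns its id and its ASCII body.
    public static async Task<(int Id, string Body)> ReadMessageAsync(Stream stream)
    {
        byte[] header = await ReadExactAsync(stream, 8);
        int length = ToInt32BigEndian(header, 0);  // assumed: body length only
        int id = ToInt32BigEndian(header, 4);
        byte[] body = await ReadExactAsync(stream, length);
        return (id, Encoding.ASCII.GetString(body)); // body fields are ASCII
    }

    private static int ToInt32BigEndian(byte[] b, int o) =>
        (b[o] << 24) | (b[o + 1] << 16) | (b[o + 2] << 8) | b[o + 3];
}
```

Since `NetworkStream` derives from `Stream`, `serverStream` could be passed in directly; the header stays as raw bytes and only the body is decoded as text.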

@Enigmativity - I tried pasting your answer into LINQPad and running it (I had never used it before; I just downloaded and installed it), but I don't see the table-like structure you posted in your answer. How do you get that?

Here is what I get

Answer 1:

Perhaps try something like this:

void Main()
{
    var line = "00580011T3A1111        2999Bosh                                399APV                                2399MAG                           ";

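    // Field widths in message order: length, id, type, sequence, car id, then (price, manufacturer) x 3.
    var lengths = new[] { 4, 4, 1, 1, 8, 9, 30, 9, 30, 9, 30 };
    // The running sum of the widths gives each field's start offset (plus one trailing end offset).
    var starts = lengths.Aggregate(new[] { 0 }.ToList(), (a, x) => { a.Add(a.Last() + x); return a; });

    // Pair each start offset with its width and cut out the trimmed field.
    var fields = starts.Zip(lengths, (p, l) => line.Substring(p, l).Trim()).ToArray();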

    var message = new
    {
        message_length = int.Parse(fields[0]),
        message_id = int.Parse(fields[1]),
        message_type = fields[2],
        message_sequence = int.Parse(fields[3]),
        car_Id = fields[4],
        parts =
            Enumerable
                .Range(0, 3)
                .Select(x => x * 2 + 5)
                .Select(x => new Part
                {
                    Price = decimal.Parse(fields[x]),
                    Manufacturer = fields[x + 1]
                }).ToArray(),
    };
}

public class Part
{
    public decimal Price { get; set; }
    public string Manufacturer { get; set; }
}

On the sample data that I used (which I had to fix as it appears to be corrupted in your question even when I remove the | and replace the - with spaces), I get this result:



Answer 2:

You say "byte[] into a StringBuilder to build string", so I take it you have a string. Perhaps try using Substring(..), something like:

var length = int.Parse(message.Substring(0, 4));
var id = int.Parse(message.Substring(4, 4));

etc

Edit: If there are unwanted filler characters try

message = message.Replace('-', ' ');

Not elegant, but it will work.