Regex: Match the text before the end of line

Posted 2019-06-09 04:22

I have a file that looks like this:

J6      INT-00113G  227.905    5.994  180  ~!@#$%&^)
J3      INT-00113G  227.905 -203.244  180  12341341312315
U13     EXCLUDES    -42.210  181.294  180  QFP128
U3      IC-00276G     5.135  198.644  90   B%GA!@-48
U12     IC-00270G  -123.610 -201.594  0    SOP8_000
J1      INT-00112G  269.665  179.894  180  SOIC16_1
J2      INT-00112G  269.665  198.144  180  SOIC16-_2
..      ..........  .......  .......  ...  ................

I would like to match the final value on each line (the 6th column) so that I can remove it from a list. The value in the 6th column has no fixed length and can contain any character, so what I want to match is everything after the last space, up to the end of the line.


CODE:

        // Reads the lines in the file to format.
        var fileReader = File.OpenText(filePath + "\\Remove Package 1 Endings.txt");

        // Creates a list for the lines to be stored in.
        var fileList = new List<string>();

        // Adds each line in the file to the list.
        while (true)
        {
            var line = fileReader.ReadLine();
            if (line == null)
                break;

            fileList.Add(line);
        }

        var mainResult = new List<string>();
        var theResult = new List<string>();

        foreach (var mainLine in fileList)
            mainResult.Add(string.Join(" ", mainLine));

        foreach (var theLine in mainResult)
        {
            // PLACEMENT ONE Regex
            Match theRegex = Regex.Match(theLine, @"insert the regex here!");

            if (theRegex.Success)
                theResult.Add(string.Join(" ", theLine));
        }

        // Removes the matched values from both of the Regex used above.
        List<string> userResult = mainResult.Except(theResult).ToList();

        // Prints the proper values into the assigned RichTextBoxes.
        foreach (var line in userResult)
            richTextBox2.AppendText(line + "\n");

What I am trying to do is get the file to look like this:

J6      INT-00113G  227.905    5.994  180
J3      INT-00113G  227.905 -203.244  180
U13     EXCLUDES    -42.210  181.294  180
U3      IC-00276G     5.135  198.644  90
U12     IC-00270G  -123.610 -201.594  0
J1      INT-00112G  269.665  179.894  180
J2      INT-00112G  269.665  198.144  180

QUESTION:

  • Can anyone help come up with a regex for this?

EDIT:

ADDED CODE:

        var lines = new List<string>(File.ReadAllLines(filePath + "\\Remove Package 1 Endings.txt"));
        for (int i = 0; i < lines.Count; i++)
        {
            var idx = lines[i].LastIndexOf(" ");

            if (idx != -1)
                lines[i] = lines[i].Remove(idx);

            richTextBox1.AppendText(lines[i] + Environment.NewLine);
        }

3 Answers
我只想做你的唯一
#2 · 2019-06-09 04:25

Relying only on the fact that the columns are separated by spaces, you could use:

\s+([\S]*)$
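
As a rough sketch of how this pattern could be applied per line in C#: the class and method names below (`LastColumnRegex`, `StripLastColumn`) are made up for illustration and are not from the question. The `TrimEnd()` call is an added assumption: without it, a line with trailing spaces would match only those spaces and leave the last column in place.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LastColumnRegex
{
    // Hypothetical helper: removes the last whitespace-separated column.
    // \s+  : the run of whitespace in front of the final column
    // \S*  : the final column itself (any non-whitespace characters)
    // $    : anchored at the end of the line
    public static string StripLastColumn(string line)
    {
        // Trim first so trailing spaces are not mistaken for the column.
        return Regex.Replace(line.TrimEnd(), @"\s+\S*$", "");
    }
}
```

Because the pattern is replaced rather than merely matched, the rest of the line is kept with its original spacing. A line that contains a single column has no leading whitespace run to match and is returned unchanged.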
Emotional °昔
#3 · 2019-06-09 04:34

I think you're making this more complex than it needs to be. For instance, the following removes the last part of each line, provided the data is formatted as in your example. With a little tweaking, such as trimming (and, obviously, error handling), I'm sure it would suit:

var lines = new List<string>(File.ReadAllLines(path));
for (int i = 0; i < lines.Count; i++) 
{
    var idx = lines[i].LastIndexOf(" ");   
    if (idx != -1)
    {     
        lines[i] = lines[i].Remove(idx);
    }
}

Note that this reads all lines of the file in one fell swoop. That isn't always desirable, depending on the size of the file being loaded, but I see you're loading every line before processing anyway, in which case we can just make the whole thing more concise.
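
The "little tweaking, such as trimming" might look something like the sketch below. The names `LastColumnTrimmer` and `RemoveLastColumn` are hypothetical, and treating tabs as separators is an assumption that goes beyond the sample data, which only shows spaces.

```csharp
using System;

public static class LastColumnTrimmer
{
    // Hypothetical helper: drops everything after the last space or tab.
    public static string RemoveLastColumn(string line)
    {
        // Ignore trailing whitespace so it is not taken for the separator.
        string trimmed = line.TrimEnd();

        // Position of the separator directly in front of the last column.
        int idx = trimmed.LastIndexOfAny(new[] { ' ', '\t' });

        // No separator: the line has a single column, so leave it alone.
        if (idx == -1)
            return trimmed;

        // Cut at the separator, then drop the padding left in front of it.
        return trimmed.Remove(idx).TrimEnd();
    }
}
```

Called once per element of `lines`, this leaves the first five columns with their original alignment, which is the shape of the desired output in the question.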

【Aperson】
#4 · 2019-06-09 04:36

\S+$ should do it, with multiline functionality enabled. (I'm not sure exactly how you enable regex flags in... C#, is it? Prepending (?m) to the pattern works with some regex engines, though it's not the only way to do it.)

\S - matches any non-whitespace character
+ - indicates that the preceding regex element should be matched one or more times
$ - indicates matching to the end of the string, or end of a line if multiline is enabled.

EDIT: You're checking each line individually, so no need to worry about multiline stuff.

(Though as stated by others, going with regex for this is probably making things more complicated than necessary.)
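
For completeness, here is a sketch of the multiline variant in C#, where the flag is `RegexOptions.Multiline` (the inline `(?m)` form is also accepted by .NET). The pattern is adapted rather than taken verbatim: in .NET, `$` in multiline mode matches just before `\n`, so on files with `\r\n` line endings a bare `\S+$` would not match any line but the last. The lookahead `(?=\r?$)` is one way around that. `MultilineLastColumn` and `StripLastColumns` are illustrative names only.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MultilineLastColumn
{
    // [^\S\r\n]+ : spaces or tabs in front of the last column
    //              (whitespace, but not a line break)
    // \S+        : the last column
    // (?=\r?$)   : followed by end of line, with an optional \r for CRLF
    private static readonly Regex LastColumn =
        new Regex(@"[^\S\r\n]+\S+(?=\r?$)", RegexOptions.Multiline);

    // Hypothetical helper: strips the last column from every line of a
    // whole block of text, leaving the line endings untouched.
    public static string StripLastColumns(string text)
    {
        return LastColumn.Replace(text, "");
    }
}
```

This operates on the whole file contents as one string (for example the result of `File.ReadAllText`), so no per-line loop is needed.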
