Extracting specific part of a text file in C#

Posted 2019-09-21 17:52

I usually add strings from a text file to a list or array line by line, but I am now using "#" lines as separators in the text file. How can I read the two strings "softpedia.com" and "download.com" into a list, using the two "#" lines as the boundaries? Bear in mind that there might be more or fewer strings in between the two hashes.

e.g.

# Internal Hostnames
softpedia.com
download.com
# External Hostnames

Expected output:

softpedia.com
download.com

Tags: c# .net windows
3 answers
我命由我不由天
#2 · 2019-09-21 18:11

It sounds like you want to read all of the lines between a pair of lines that start with #. If so, try the following:

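// Requires: using System.Collections.Generic; using System.IO;
List<string> ReadLines(string filePath) {
  var list = new List<string>();
  var foundStart = false;
  foreach (var line in File.ReadAllLines(filePath)) {
    if (line.Length > 0 && line[0] == '#') {
      if (foundStart) {
        // Second '#' line: the block is complete.
        return list;
      }
      // First '#' line: start collecting from the next line.
      foundStart = true;
    } else if (foundStart) {
      list.Add(line);
    }
  }
  // No closing '#' line was found; return whatever was collected.
  return list;
}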
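
The same "skip to the first #, then take until the next #" idea can also be sketched with LINQ. This is only an illustrative variant, not the answer's own code: the class name HashBlock and the method name Between are hypothetical, and unlike the loop above it trims each line and drops blank ones.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class HashBlock
{
    // Hypothetical helper: returns the trimmed, non-blank lines that sit
    // between the first and the second line starting with '#'.
    public static List<string> Between(IEnumerable<string> lines)
    {
        return lines
            .SkipWhile(l => !l.StartsWith("#", StringComparison.Ordinal)) // skip everything before the first '#'
            .Skip(1)                                                      // skip the first '#' line itself
            .TakeWhile(l => !l.StartsWith("#", StringComparison.Ordinal)) // stop at the next '#'
            .Select(l => l.Trim())
            .Where(l => l.Length > 0)
            .ToList();
    }
}

// Possible usage, assuming the file is named test.txt:
// List<string> hosts = HashBlock.Between(System.IO.File.ReadLines("test.txt"));
```

Because File.ReadLines enumerates lazily, this stops reading the file once the second # line is reached. If the file contains no # line at all, the result is an empty list.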
"骚年 ilove
#3 · 2019-09-21 18:15

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

class Program
{
    static void Main()
    {
        using (var reader = File.OpenText("test.txt"))
        {
            foreach (var line in Parse(reader))
            {
                Console.WriteLine(line);
            }
        }
    }

    public static IEnumerable<string> Parse(StreamReader reader)
    {
        string line;
        bool first = false;
        while ((line = reader.ReadLine()) != null)
        {
            if (!line.StartsWith("#"))
            {
                if (first)
                {
                    yield return line;
                }
            }
            else if (!first)
            {
                first = true;
            }
            else
            {
                yield break;
            }
        }
    }
}

If you just want them in a list (ToList needs using System.Linq):

using (var reader = File.OpenText("test.txt"))
{
    List<string> hostnames = Parse(reader).ToList();
}
我欲成王,谁敢阻挡
#4 · 2019-09-21 18:16

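Read it into a buffer and let regex do the work. Note that the pattern below matches every non-blank line that does not start with #, so it assumes the hostnames you want are the only such lines in the input.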

string input = @"
# Internal Hostnames 
softpedia.com 
download.com
# External Hostnames    
";
string pattern = @"^(?!#)(?<Text>[^\r\s]+)(?:\s?)";

Regex.Matches(input, pattern, RegexOptions.Multiline)
     .OfType<Match>()
     .Select (mt => mt.Groups["Text"].Value)
     .ToList()
     .ForEach( site => Console.WriteLine (site));

/* Outputs
softpedia.com
download.com
*/
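
If the file holds entries under more than one # header, one option is to group the lines by the header they follow and then pick the section you need. The following is a minimal sketch under that assumption; the class name Sections and the method name Parse are hypothetical, and header text is taken to be whatever follows the leading # after trimming.

```csharp
using System;
using System.Collections.Generic;

public static class Sections
{
    // Hypothetical helper: maps each '#' header (without the '#') to the
    // non-blank lines that follow it, up to the next header.
    public static Dictionary<string, List<string>> Parse(IEnumerable<string> lines)
    {
        var result = new Dictionary<string, List<string>>();
        List<string> current = null; // lines before the first header are ignored

        foreach (var raw in lines)
        {
            var line = raw.Trim();
            if (line.Length == 0)
            {
                continue; // skip blank lines
            }

            if (line[0] == '#')
            {
                // Start (or resume) the section named by this header.
                var name = line.TrimStart('#').Trim();
                if (!result.TryGetValue(name, out current))
                {
                    current = new List<string>();
                    result[name] = current;
                }
            }
            else if (current != null)
            {
                current.Add(line);
            }
        }

        return result;
    }
}

// Possible usage, assuming the file is named test.txt:
// var hosts = Sections.Parse(System.IO.File.ReadLines("test.txt"))["Internal Hostnames"];
```

With the sample input from the question, the "Internal Hostnames" entry would hold softpedia.com and download.com, and "External Hostnames" would be an empty list.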