Parsing non-standard date formats with DateTime.Tr

Posted 2019-06-24 05:34

Question:

Hi, I'm trying to parse date strings like "1012012" and "1st January 2012".

  1. I read the API docs, which say to use d or %d where the day does not have a leading zero, but I can't get it working for dates like "1012012".

  2. I'm trying to use "d MMM yyyy" for "1st January 2012". What do I use so that 'st', 'th' and so on work?

    using System;
    using System.IO;
    using System.Globalization;
    
    namespace test
    {
      class Script
      {
        static public void Main(string [] args)
        {
    
            //String dateString = "9022011";  // q1
            String dateString = "9th February 2011";  //q2
            System.DateTime date = DateTime.MinValue;
            string[] format = { "ddMMyyyy", "d MMM yyyy" }; // what would be the correct format strings?
    
            if (DateTime.TryParseExact(dateString, format, new CultureInfo("en-AU"), DateTimeStyles.None, out date))
            {
                Console.Out.WriteLine(date.ToString());
            }
            else
            {
                Console.Out.WriteLine("cant convert");
            }
        }
      }
    
    }
    

Answer 1:

  1. I don't think this can be done with a format string alone. The parser processes your input left to right, so when it sees "1012012" it takes the day to be 10 and then fails because there aren't enough characters left for the month and year, even if the format string is "dMMyyyy". It would need some kind of backtracking to consider the possibility that the day is 1, and unfortunately it doesn't seem to do that.

    It is, however, fairly simple to parse this format with a custom regex. The regex engine does backtrack, so it will correctly consider both options:

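    // Requires: using System; using System.Text.RegularExpressions;
    string input = "1012012";
    // Day: 1-2 digits, month: 2 digits, year: 4 digits. Backtracking picks the split that fits.
    Match m = Regex.Match(input, @"^(?<day>\d{1,2})(?<month>\d{2})(?<year>\d{4})$");
    if( m.Success )
    {
        // Note: the DateTime constructor throws for impossible dates such as month 13.
        DateTime d = new DateTime(Convert.ToInt32(m.Groups["year"].Value),
                                  Convert.ToInt32(m.Groups["month"].Value),
                                  Convert.ToInt32(m.Groups["day"].Value));
    }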
    

    Another option would be to simply add a leading zero if the length of the string is seven:

    string input = "1012012";
    if( input.Length == 7 )
        input = "0" + input;
    DateTime d = DateTime.ParseExact(input, "ddMMyyyy", CultureInfo.CurrentCulture);
    
  2. Rather than attempting to do multiple find and replaces as in the other answers, you can use the fact that the exact format of the string is known. It starts with one or two digits, followed by two letters, followed by the month and the year. So you could extract the date like this:

    string input = "1st January 2012";
    int index = char.IsNumber(input, 1) ? 2 : 1;
    input = input.Substring(0, index) + input.Substring(index + 2);
    DateTime d = DateTime.ParseExact(input, "d MMMM yyyy", CultureInfo.InvariantCulture);
    

    Of course, this will accept dates that have pure nonsense in those positions, like "1xx January 2012", but I'm not sure if that's a problem in your case.

    Also be sure to pass the appropriate CultureInfo if the input can hold non-English month names.

If you can get either format without knowing in advance which you're getting, you'll need a simple check to see which method to use beforehand. Strings in the first format will always be 7 or 8 characters, and strings in the second format will always be longer, so this should be easy to test. Another method would be to check if the string contains any non-numeric characters (in which case it's the long format).
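
As a rough sketch of that check, the two approaches above could be combined into a single helper. The class name `FlexibleDateParser`, its signature, and the fallback to `CultureInfo.InvariantCulture` are assumptions made for illustration, not part of the original code:

```csharp
using System;
using System.Globalization;
using System.Linq;

// Hypothetical helper: the name, signature and default culture are assumptions.
public static class FlexibleDateParser
{
    public static bool TryParse(string input, out DateTime result, CultureInfo culture = null)
    {
        result = DateTime.MinValue;
        if (string.IsNullOrWhiteSpace(input))
            return false;

        input = input.Trim();
        culture = culture ?? CultureInfo.InvariantCulture; // assumes English month names

        // Compact format: the string consists of ASCII digits only.
        if (input.All(c => c >= '0' && c <= '9'))
        {
            // Pad a 7-digit string so that the fixed-width "ddMMyyyy" format applies.
            if (input.Length == 7)
                input = "0" + input;
            return DateTime.TryParseExact(input, "ddMMyyyy", culture,
                                          DateTimeStyles.None, out result);
        }

        // Long format: one or two day digits followed by a two-letter suffix.
        if (!char.IsDigit(input[0]))
            return false;
        int dayLength = input.Length > 1 && char.IsDigit(input[1]) ? 2 : 1;
        if (input.Length < dayLength + 2)
            return false;

        // Drop the two characters after the day ("st", "nd", "rd", "th").
        string withoutSuffix = input.Substring(0, dayLength) + input.Substring(dayLength + 2);
        return DateTime.TryParseExact(withoutSuffix, "d MMMM yyyy", culture,
                                      DateTimeStyles.None, out result);
    }
}
```

Calling `FlexibleDateParser.TryParse("9th February 2011", out var date)` takes the long-format branch, while `"9022011"` takes the compact one. Like the substring approach above, this does not check that the two dropped characters are a real ordinal suffix.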



Answer 2:

    var dateString = "1st February 2011";
    DateTime date;
    var replaced = dateString.Substring(0,4)
                             .Replace("nd","")
                             .Replace("th","")
                             .Replace("rd","")
                             .Replace("st","")
                             + dateString.Substring(4);

    DateTime.TryParseExact(replaced, "d MMMM yyyy",
                           new CultureInfo("en-us"), DateTimeStyles.AssumeLocal,
                           out date);

This should do the trick (sorry, the handling of 'th' is nasty). You have to take some care with the 'st' because it also appears in "August", so remove the suffix only from the first few characters of the string, as above.



Answer 3:

If you want to parse culture-specific date strings, you should use a matching culture. CultureInfo.InvariantCulture isn't a good idea, because it will only work with English strings.
However, what you are trying to do is not possible with format specifiers alone, because there is no format specifier for the day that can parse the "th", "st" etc. suffixes. You will have to remove them manually beforehand.
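
One possible way to remove the suffix before parsing is a regex anchored to the start of the string, with the culture passed in by the caller. This is only a sketch: the `OrdinalDateParser` name and its signature are hypothetical, and the pattern covers English suffixes only, so other languages would need their own pattern:

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper: the name and signature are illustrative only.
public static class OrdinalDateParser
{
    // A one- or two-digit day at the start of the string, followed by an
    // English ordinal suffix. Assumption: the input uses English suffixes.
    private static readonly Regex OrdinalDay =
        new Regex(@"^(\d{1,2})(st|nd|rd|th)\b", RegexOptions.IgnoreCase);

    public static bool TryParse(string input, CultureInfo culture, out DateTime result)
    {
        // Keep only the day digits. The "st" in "August" is untouched because
        // the pattern can only match at the very start of the string.
        string cleaned = OrdinalDay.Replace(input ?? string.Empty, "$1");
        return DateTime.TryParseExact(cleaned, "d MMMM yyyy", culture,
                                      DateTimeStyles.None, out result);
    }
}
```

Because the regex only matches real ordinal suffixes, an input such as "1xx January 2012" is left unchanged and then rejected by `TryParseExact`, rather than being silently accepted.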