Parsing an RFC 822 date with NSDateFormatter

Posted 2019-02-11 22:17

I'm using an NSDateFormatter to parse an RFC 822 date on the iPhone. However, there is no way to specify optional elements in the date format, and the RFC 822 specification has a couple of optional parts that break the date parser. If nothing works out, I'll probably have to write a custom parser that obeys the spec.

For example, the day name is optional in the spec. So both these dates are valid:

Tue, 01 Dec 2009 08:48:25 +0000 is parsed with the format EEE, dd MMM yyyy HH:mm:ss z
01 Dec 2009 08:48:25 +0000 is parsed with the format dd MMM yyyy HH:mm:ss z

This is what I am currently using:

+ (NSDateFormatter *)rfc822Formatter {
    static NSDateFormatter *formatter = nil;
    if (formatter == nil) {
        formatter = [[NSDateFormatter alloc] init];
        NSLocale *enUS = [[NSLocale alloc] initWithLocaleIdentifier:@"en_US"];
        [formatter setLocale:enUS];
        [enUS release];
        [formatter setDateFormat:@"EEE, dd MMM yyyy HH:mm:ss z"];
    }
    return formatter;
}

+ (NSDate *)dateFromRFC822:(NSString *)date {
    NSDateFormatter *formatter = [NSDate rfc822Formatter];
    return [formatter dateFromString:date];
}

And parsing the date as follows:

self.entry.published = [NSDate dateFromRFC822:self.currentString];

One way is to try both formats and take whichever returns a non-nil value. However, there are two optional parts in the spec (day name and seconds), so there would be 4 possible combinations. That's still not too bad, but it's a bit hacky.

4 answers
劳资没心,怎么记你
#2 · 2019-02-11 22:51

I've used the following method to parse RFC822 dates. I believe it originally was from MWFeedParser:

+ (NSDate *)dateFromRFC822String:(NSString *)dateString {

    // Create date formatter
    static NSDateFormatter *dateFormatter = nil;
    if (!dateFormatter) {
        NSLocale *en_US_POSIX = [[NSLocale alloc] initWithLocaleIdentifier:@"en_US_POSIX"];
        dateFormatter = [[NSDateFormatter alloc] init];
        [dateFormatter setLocale:en_US_POSIX];
        [dateFormatter setTimeZone:[NSTimeZone timeZoneForSecondsFromGMT:0]];
        [en_US_POSIX release];
    }

    // Process
    NSDate *date = nil;
    NSString *RFC822String = [[NSString stringWithString:dateString] uppercaseString];
    if ([RFC822String rangeOfString:@","].location != NSNotFound) {
        if (!date) { // Sun, 19 May 2002 15:21:36 GMT
            [dateFormatter setDateFormat:@"EEE, d MMM yyyy HH:mm:ss zzz"]; 
            date = [dateFormatter dateFromString:RFC822String];
        }
        if (!date) { // Sun, 19 May 2002 15:21 GMT
            [dateFormatter setDateFormat:@"EEE, d MMM yyyy HH:mm zzz"]; 
            date = [dateFormatter dateFromString:RFC822String];
        }
        if (!date) { // Sun, 19 May 2002 15:21:36
            [dateFormatter setDateFormat:@"EEE, d MMM yyyy HH:mm:ss"]; 
            date = [dateFormatter dateFromString:RFC822String];
        }
        if (!date) { // Sun, 19 May 2002 15:21
            [dateFormatter setDateFormat:@"EEE, d MMM yyyy HH:mm"]; 
            date = [dateFormatter dateFromString:RFC822String];
        }
    } else {
        if (!date) { // 19 May 2002 15:21:36 GMT
            [dateFormatter setDateFormat:@"d MMM yyyy HH:mm:ss zzz"]; 
            date = [dateFormatter dateFromString:RFC822String];
        }
        if (!date) { // 19 May 2002 15:21 GMT
            [dateFormatter setDateFormat:@"d MMM yyyy HH:mm zzz"]; 
            date = [dateFormatter dateFromString:RFC822String];
        }
        if (!date) { // 19 May 2002 15:21:36
            [dateFormatter setDateFormat:@"d MMM yyyy HH:mm:ss"]; 
            date = [dateFormatter dateFromString:RFC822String];
        }
        if (!date) { // 19 May 2002 15:21
            [dateFormatter setDateFormat:@"d MMM yyyy HH:mm"]; 
            date = [dateFormatter dateFromString:RFC822String];
        }
    }
    if (!date) NSLog(@"Could not parse RFC822 date: \"%@\" Possibly invalid format.", dateString);
    return date;

}
爱情/是我丢掉的垃圾
#3 · 2019-02-11 22:57

In case this is helpful to anyone else, here is an NSDate+RFC822String.swift extension based on Simucal's answer.

It also caches the last used date format that was successful, since setting the dateFormatter.dateFormat is expensive.

import Foundation

private let dateFormatter: NSDateFormatter = {
    let dateFormatter = NSDateFormatter()
    dateFormatter.locale = NSLocale(localeIdentifier: "en_US_POSIX")
    dateFormatter.timeZone = NSTimeZone(forSecondsFromGMT: 0)

    return dateFormatter
}()

private let dateFormatsWithComma = ["EEE, d MMM yyyy HH:mm:ss zzz", "EEE, d MMM yyyy HH:mm zzz", "EEE, d MMM yyyy HH:mm:ss", "EEE, d MMM yyyy HH:mm"]
private let dateFormatsWithoutComma = ["d MMM yyyy HH:mm:ss zzz", "d MMM yyyy HH:mm zzz", "d MMM yyyy HH:mm:ss", "d MMM yyyy HH:mm"]

private var lastUsedDateFormatString: String?

extension NSDate {
    class func dateFromRFC822String(RFC822String: String) -> NSDate? {
        let RFC822String = RFC822String.uppercaseString

        if lastUsedDateFormatString != nil {
            if let date = dateFormatter.dateFromString(RFC822String) {
                return date
            }
        }

        if RFC822String.containsString(",") {
            for dateFormat in dateFormatsWithComma {
                dateFormatter.dateFormat = dateFormat
                if let date = dateFormatter.dateFromString(RFC822String) {
                    lastUsedDateFormatString = dateFormat
                    return date
                }
            }
        } else {
            for dateFormat in dateFormatsWithoutComma {
                dateFormatter.dateFormat = dateFormat
                if let date = dateFormatter.dateFromString(RFC822String) {
                    lastUsedDateFormatString = dateFormat
                    return date
                }
            }
        }

        return nil
    }
}
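The extension above uses the Swift 2 era names (NSDateFormatter, dateFromString). As a rough sketch only, the same idea might look like this with the current Foundation names. The class name RFC822DateParser is hypothetical, the format lists are copied from the answer, and the uppercasing step is left out. Unlike the original, it re-applies the cached format explicitly, so the fast path does not depend on whichever format the formatter was last left with.

```swift
import Foundation

/// Hypothetical helper: tries a fixed list of RFC 822 variants in order
/// and remembers the last format that worked.
final class RFC822DateParser {
    private static let formatsWithComma = [
        "EEE, d MMM yyyy HH:mm:ss zzz", "EEE, d MMM yyyy HH:mm zzz",
        "EEE, d MMM yyyy HH:mm:ss", "EEE, d MMM yyyy HH:mm",
    ]
    private static let formatsWithoutComma = [
        "d MMM yyyy HH:mm:ss zzz", "d MMM yyyy HH:mm zzz",
        "d MMM yyyy HH:mm:ss", "d MMM yyyy HH:mm",
    ]

    private let formatter: DateFormatter
    private var lastSuccessfulFormat: String?

    init() {
        formatter = DateFormatter()
        formatter.locale = Locale(identifier: "en_US_POSIX")
        formatter.timeZone = TimeZone(secondsFromGMT: 0)
    }

    /// Parses with one format, touching dateFormat only when it changes.
    private func parse(_ string: String, using format: String) -> Date? {
        if formatter.dateFormat != format {
            formatter.dateFormat = format
        }
        return formatter.date(from: string)
    }

    func date(from string: String) -> Date? {
        // Fast path: feeds tend to use one variant throughout.
        if let format = lastSuccessfulFormat,
           let date = parse(string, using: format) {
            return date
        }
        let formats = string.contains(",")
            ? Self.formatsWithComma
            : Self.formatsWithoutComma
        for format in formats {
            if let date = parse(string, using: format) {
                lastSuccessfulFormat = format
                return date
            }
        }
        return nil
    }
}
```

Keeping the formatter and the cached format inside one object, rather than in file-level variables, also makes it easier to give each thread its own parser, since DateFormatter state is being mutated on every call.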
Root(大扎)
#4 · 2019-02-11 23:06

I believe RFC 822 specifies two optional components in the date time: day of week and the seconds past the hour.

As a hack, it is possible to override the symbols for the short days of the week so that each one includes the trailing comma:

NSArray *shortWeekSymbols = [NSArray arrayWithObjects:@"Sun,", @"Mon,", @"Tue,", @"Wed,", @"Thu,", @"Fri,", @"Sat,", nil];
[formatter setShortWeekdaySymbols:shortWeekSymbols];

If you then change the date format to EEEdd MMM yyyy HH:mm:ss z, you'll be able to parse times with or without the day of the week. This seems to allow a space after the comma too.

To be safe, you should not just blindly set the symbols like this. Instead, get the existing symbols using shortWeekdaySymbols and iterate over them, adding the comma at the end of each one. The reason is that they are potentially different for each locale and the first day might not be Sunday.
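A minimal Swift sketch of that safer variant, assuming the current Foundation names (DateFormatter, shortWeekdaySymbols). Both function names are hypothetical. It only shows how to derive the comma-suffixed symbols from the formatter itself; whether the resulting formatter accepts input with and without the day name is the observation reported above, not something this sketch verifies.

```swift
import Foundation

/// Returns the formatter's own short weekday symbols with a comma appended,
/// instead of hard-coding "Sun,", "Mon,", and so on.
func commaSuffixedWeekdaySymbols(of formatter: DateFormatter) -> [String] {
    let symbols: [String] = formatter.shortWeekdaySymbols ?? []
    return symbols.map { $0 + "," }
}

/// Builds a formatter configured as described in the answer:
/// comma-suffixed day names and the format "EEEdd MMM yyyy HH:mm:ss z".
func makeLooseDayNameFormatter() -> DateFormatter {
    let formatter = DateFormatter()
    formatter.locale = Locale(identifier: "en_US_POSIX")
    formatter.timeZone = TimeZone(secondsFromGMT: 0)
    formatter.shortWeekdaySymbols = commaSuffixedWeekdaySymbols(of: formatter)
    formatter.dateFormat = "EEEdd MMM yyyy HH:mm:ss z"
    return formatter
}
```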

Interestingly, the format EEE, dd MMM yyyy HH:mm:ss z will parse times without the day of week, but the comma must be there, for example: , 01 Dec 2009 08:48:25 +0000. Therefore, you could do something like Steve said, but strip off the day first and pass the rest through to the formatter. Not having the comma in the format does not seem to allow the day of the week to be optional. Strange.
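One way to sketch the "strip off the day" idea in Swift is shown below. Note that this is a variation: it removes the comma together with the day name and then uses formats without EEE, rather than relying on the leading-comma quirk described above. The function names are hypothetical, and the sketch assumes the input has at most one comma.

```swift
import Foundation

/// Drops an optional leading day name ("Tue, ") from the input.
/// Strings without a comma are returned unchanged.
func strippingDayName(_ string: String) -> String {
    guard let comma = string.firstIndex(of: ",") else { return string }
    let rest = string[string.index(after: comma)...]
    return rest.trimmingCharacters(in: .whitespaces)
}

/// With the day name gone, only two variants remain to try:
/// with seconds and without seconds.
func dateFromRFC822(_ string: String) -> Date? {
    let formatter = DateFormatter()
    formatter.locale = Locale(identifier: "en_US_POSIX")
    formatter.timeZone = TimeZone(secondsFromGMT: 0)

    let input = strippingDayName(string)
    for format in ["d MMM yyyy HH:mm:ss z", "d MMM yyyy HH:mm z"] {
        formatter.dateFormat = format
        if let date = formatter.date(from: input) {
            return date
        }
    }
    return nil
}
```

Creating a formatter on every call keeps the sketch short; in real code it would probably be cached, as in the earlier answers.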

Unfortunately, this still doesn't help with the optional :ss in the format. But it might allow you to have two formats rather than four.

Melony?
#5 · 2019-02-11 23:07

Count the number of salient characters before deciding which formatter to use. For example, the two dates you give have different numbers of commas and spaces. If no known format matches the counts, then you know not even to try parsing the string as a date.
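A rough Swift sketch of this counting idea follows. The function name rfc822Format(for:) is hypothetical, and the sketch assumes fields are separated by single spaces and that the time zone, when present, is a single token such as +0000 or GMT.

```swift
import Foundation

/// Picks a date format from the number of commas, colons and spaces,
/// or returns nil when the counts match no known RFC 822 variant.
func rfc822Format(for string: String) -> String? {
    let commas = string.filter { $0 == "," }.count
    let colons = string.filter { $0 == ":" }.count
    let spaces = string.filter { $0 == " " }.count

    // At most one comma (after the day name); one or two colons in the time.
    guard commas <= 1, (1...2).contains(colons) else { return nil }

    // "d MMM yyyy HH:mm" contains 3 spaces; the day name adds one more,
    // and so does a trailing time zone.
    let zoneFields = spaces - 3 - commas
    guard (0...1).contains(zoneFields) else { return nil }

    var format = commas == 1 ? "EEE, " : ""
    format += "d MMM yyyy HH:mm"
    if colons == 2 { format += ":ss" }
    if zoneFields == 1 { format += " zzz" }
    return format
}
```

The returned string can be assigned to the formatter's dateFormat once, so only a single parse attempt is needed per input.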
