How to parse dates in multiple formats using Simpl

2018-12-31 18:17发布

I am trying to parse some dates that are coming out of a document. It would appear users have entered these dates in a similar but not exact format.

here are the formats:

9/09
9/2009
09/2009
9/1/2009
9-1-2009 

What is the best way to go about trying to parse all of these? These seem to be the most common, but I guess what is hanging me up is that if i have a pattern of "M/yyyy" wont that always catch before "MM/yyyy" Do I have to set up my try/catch blocks nested in a least restrictive to most restrictive way? it seems like it sure is going to take a lot of code duplication to get this right.

9条回答
永恒的永恒
2楼-- · 2018-12-31 18:31

Here is the complete example (with main method) which can be added as a utility class in your project. All the format mentioned in SimpleDateFormate API is supported in the below method.

import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

import org.apache.commons.lang.time.DateUtils;

public class DateUtility {

    public static Date parseDate(String inputDate) {

        Date outputDate = null;
        String[] possibleDateFormats =
              {
                    "yyyy.MM.dd G 'at' HH:mm:ss z",
                    "EEE, MMM d, ''yy",
                    "h:mm a",
                    "hh 'o''clock' a, zzzz",
                    "K:mm a, z",
                    "yyyyy.MMMMM.dd GGG hh:mm aaa",
                    "EEE, d MMM yyyy HH:mm:ss Z",
                    "yyMMddHHmmssZ",
                    "yyyy-MM-dd'T'HH:mm:ss.SSSZ",
                    "yyyy-MM-dd'T'HH:mm:ss.SSSXXX",
                    "YYYY-'W'ww-u",
                    "EEE, dd MMM yyyy HH:mm:ss z", 
                    "EEE, dd MMM yyyy HH:mm zzzz",
                    "yyyy-MM-dd'T'HH:mm:ssZ",
                    "yyyy-MM-dd'T'HH:mm:ss.SSSzzzz", 
                    "yyyy-MM-dd'T'HH:mm:sszzzz",
                    "yyyy-MM-dd'T'HH:mm:ss z",
                    "yyyy-MM-dd'T'HH:mm:ssz", 
                    "yyyy-MM-dd'T'HH:mm:ss",
                    "yyyy-MM-dd'T'HHmmss.SSSz",
                    "yyyy-MM-dd",
                    "yyyyMMdd",
                    "dd/MM/yy",
                    "dd/MM/yyyy"
              };

        try {

            outputDate = DateUtils.parseDate(inputDate, possibleDateFormats);
            System.out.println("inputDate ==> " + inputDate + ", outputDate ==> " + outputDate);

        } catch (ParseException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }

        return outputDate;

    }

    public static String formatDate(Date date, String requiredDateFormat) {
        SimpleDateFormat df = new SimpleDateFormat(requiredDateFormat);
        String outputDateFormatted = df.format(date);
        return outputDateFormatted;
    }

    public static void main(String[] args) {

        DateUtility.parseDate("20181118");
        DateUtility.parseDate("2018-11-18");
        DateUtility.parseDate("18/11/18");
        DateUtility.parseDate("18/11/2018");
        DateUtility.parseDate("2018.11.18 AD at 12:08:56 PDT");
        System.out.println("");
        DateUtility.parseDate("Wed, Nov 18, '18");
        DateUtility.parseDate("12:08 PM");
        DateUtility.parseDate("12 o'clock PM, Pacific Daylight Time");
        DateUtility.parseDate("0:08 PM, PDT");
        DateUtility.parseDate("02018.Nov.18 AD 12:08 PM");
        System.out.println("");
        DateUtility.parseDate("Wed, 18 Nov 2018 12:08:56 -0700");
        DateUtility.parseDate("181118120856-0700");
        DateUtility.parseDate("2018-11-18T12:08:56.235-0700");
        DateUtility.parseDate("2018-11-18T12:08:56.235-07:00");
        DateUtility.parseDate("2018-W27-3");
    }

}
查看更多
琉璃瓶的回忆
3楼-- · 2018-12-31 18:32

This solution checks all the possible formats before throwing an exception. This solution is more convenient if you are trying to test for multiple date formats.

Date extractTimestampInput(String strDate){
    final List<String> dateFormats = Arrays.asList("yyyy-MM-dd HH:mm:ss.SSS", "yyyy-MM-dd");    

    for(String format: dateFormats){
        SimpleDateFormat sdf = new SimpleDateFormat(format);
        try{
            return sdf.parse(strDate);
        } catch (ParseException e) {
             //intentionally empty
        }
    }
        throw new IllegalArgumentException("Invalid input for date. Given '"+strDate+"', expecting format yyyy-MM-dd HH:mm:ss.SSS or yyyy-MM-dd.");

}
查看更多
谁念西风独自凉
4楼-- · 2018-12-31 18:42

If working in Java 1.8 you can leverage the DateTimeFormatterBuilder

public static boolean isTimeStampValid(String inputString)
{
    DateTimeFormatterBuilder dateTimeFormatterBuilder = new DateTimeFormatterBuilder()
            .append(DateTimeFormatter.ofPattern("" + "[yyyy-MM-dd'T'HH:mm:ss.SSSZ]" + "[yyyy-MM-dd]"));

    DateTimeFormatter dateTimeFormatter = dateTimeFormatterBuilder.toFormatter();

    try {
        dateTimeFormatter.parse(inputString);
        return true;
    } catch (DateTimeParseException e) {
        return false;
    }
}

See post: Java 8 Date equivalent to Joda's DateTimeFormatterBuilder with multiple parser formats?

查看更多
宁负流年不负卿
5楼-- · 2018-12-31 18:48

What about just defining multiple patterns? They might come from a config file containing known patterns, hard coded it reads like:

List<SimpleDateFormat> knownPatterns = new ArrayList<SimpleDateFormat>();
knownPatterns.add(new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'"));
knownPatterns.add(new SimpleDateFormat("yyyy-MM-dd'T'HH:mm.ss'Z'"));
knownPatterns.add(new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss"));
knownPatterns.add(new SimpleDateFormat("yyyy-MM-dd' 'HH:mm:ss"));
knownPatterns.add(new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ssXXX"));

for (SimpleDateFormat pattern : knownPatterns) {
    try {
        // Take a try
        return new Date(pattern.parse(candidate).getTime());

    } catch (ParseException pe) {
        // Loop on
    }
}
System.err.println("No known Date format found: " + candidate);
return null;
查看更多
牵手、夕阳
6楼-- · 2018-12-31 18:48

For the modern answer I am ignoring the requirement to use SimpleDateFormat. While using this class for parsing was a good idea in 2010 when this question was asked, it is now long outdated. The replacement, DateTimeFormatter, came out in 2014. The idea in the following is pretty much the same as in the accepted answer.

private static DateTimeFormatter[] parseFormatters = Stream.of("M/yy", "M/y", "M/d/y", "M-d-y")
        .map(DateTimeFormatter::ofPattern)
        .toArray(DateTimeFormatter[]::new);

public static YearMonth parseYearMonth(String input) {
    for (DateTimeFormatter formatter : parseFormatters) {
        try {
            return YearMonth.parse(input, formatter);
        } catch (DateTimeParseException dtpe) {
            // ignore, try next format
        }
    }
    throw new IllegalArgumentException("Could not parse " + input);
}

This parses each of the input strings from the question into a year-month of 2009-09. It’s important to try the two-digit year first since "M/y" could also parse 9/09, but into 0009-09 instead.

A limitation of the above code is it ignores the day-of-month from the strings that have one, like 9/1/2009. Maybe it’s OK as long as most formats have only month and year. To pick it up, we’d have to try LocalDate.parse() rather then YearMonth.parse() for the formats that include d in the pattern string. Surely it can be done.

查看更多
心情的温度
7楼-- · 2018-12-31 18:49

Matt's approach above is fine, but please be aware that you will run into problems if you use it to differentiate between dates of the format y/M/d and d/M/y. For instance, a formatter initialised with y/M/d will accept a date like 01/01/2009 and give you back a date which is clearly not what you wanted. I fixed the issue as follows, but I have limited time and I'm not happy with the solution for 2 main reasons:

  1. It violates one of Josh Bloch's quidelines, specifically 'don't use exceptions to handle program flow'.
  2. I can see the getDateFormat() method becoming a bit of a nightmare if you needed it to handle lots of other date formats.

If I had to make something that could handle lots and lots of different date formats and needed to be highly performant, then I think I would use the approach of creating an enum which linked each different date regex to its format. Then use MyEnum.values() to loop through the enum and test with if(myEnum.getPattern().matches(date)) rather than catching a dateformatexception.

Anway, that being said, the following can handle dates of the formats 'y/M/d' 'y-M-d' 'y M d' 'd/M/y' 'd-M-y' 'd M y' and all other variations of those which include time formats as well:

import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

public class DateUtil {
    private static final String[] timeFormats = {"HH:mm:ss","HH:mm"};
    private static final String[] dateSeparators = {"/","-"," "};

    private static final String DMY_FORMAT = "dd{sep}MM{sep}yyyy";
    private static final String YMD_FORMAT = "yyyy{sep}MM{sep}dd";

    private static final String ymd_template = "\\d{4}{sep}\\d{2}{sep}\\d{2}.*";
    private static final String dmy_template = "\\d{2}{sep}\\d{2}{sep}\\d{4}.*";

    public static Date stringToDate(String input){
    Date date = null;
    String dateFormat = getDateFormat(input);
    if(dateFormat == null){
        throw new IllegalArgumentException("Date is not in an accepted format " + input);
    }

    for(String sep : dateSeparators){
        String actualDateFormat = patternForSeparator(dateFormat, sep);
        //try first with the time
        for(String time : timeFormats){
        date = tryParse(input,actualDateFormat + " " + time);
        if(date != null){
            return date;
        }
        }
        //didn't work, try without the time formats
        date = tryParse(input,actualDateFormat);
        if(date != null){
        return date;
        }
    }

    return date;
    }

    private static String getDateFormat(String date){
    for(String sep : dateSeparators){
        String ymdPattern = patternForSeparator(ymd_template, sep);
        String dmyPattern = patternForSeparator(dmy_template, sep);
        if(date.matches(ymdPattern)){
        return YMD_FORMAT;
        }
        if(date.matches(dmyPattern)){
        return DMY_FORMAT;
        }
    }
    return null;
    }

    private static String patternForSeparator(String template, String sep){
    return template.replace("{sep}", sep);
    }

    private static Date tryParse(String input, String pattern){
    try{
        return new SimpleDateFormat(pattern).parse(input);
    }
    catch (ParseException e) {}
    return null;
    }


}
查看更多
登录 后发表回答