How would you handle different formats of dates?

Posted 2019-01-20 20:27

Question:

I have different types of dates formatting like:

  • 27 - 28 August 663 CE

  • 22 August 1945 19 May

  • May 4 1945 – August 22 1945

  • 5/4/1945

  • 2-7-1232

  • 03-4-1020

  • 1/3/1 (year 1)

  • 09/08/0 (year 0)

Note that they all differ in format and order: some have two months, some only one. I tried Moment.js and Datejs, but neither could parse them.

I tried to do some splitting:

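var dates = [];
dates.push({
    Time : []
});

// Find the infobox header cell whose text matches exactly.
// $wikiDOM is the jQuery-wrapped page, assumed to be loaded elsewhere.
function doSelect(text) {
  return $wikiDOM.find(".infobox th").filter(function() {
    return $(this).text() === text;
  });
}
// Split the "Date" cell on whitespace, then rebuild it with single spaces.
var dateText = doSelect("Date").siblings('td').text().split(/\s+/g);
var d = '';
for (var i = 0; i < dateText.length; i++) {
  d += dateText[i] + ' ';
}
dates[0].Time.push(d);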

But the result is:

"Time": [
            "27 - 28 August 663 CE ",

What I ultimately need to generate automatically is:

<ul class="Days">
  <li>27</li>
  <li>28</li>
</ul>

<ul class="Months">
  <li>August</li>
</ul>

<ul class="Year">
  <li>663</li>
</ul>

I also need a way to handle era markers such as CE, AD, or BC.

Ideally, I'd like to store the result in a multidimensional array:

time.push({
    Day : [], 
    Month : [],
    Year : [],
    Prefix : []
});

My idea is to treat a number of at most two digits as a day, match months against a list of names (January, February, March, ...), treat a number of three or four digits as the year, and handle the prefix with some conditionals. But what about year 1 or 2? What if the date is 02/9/1975, or uses dashes as separators, which would be yet another format? I think the logic is roughly there, but how would you split these dates into a multidimensional array like the one above, given that they all come in different formats?
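For the era part specifically, one option is to strip the marker off before classifying the remaining tokens. The following is only a minimal sketch: `splitEra` is a hypothetical helper name, and it assumes the only markers that occur are BCE, BC, CE, and AD, written in upper case as standalone words.

```javascript
// Hypothetical helper: separate an era marker from the rest of a date string.
// Assumption: only BCE, BC, CE and AD occur, in upper case, as standalone words.
function splitEra(text) {
  var eraPattern = /\b(BCE|BC|CE|AD)\b/;
  var match = eraPattern.exec(text);
  if (!match) {
    // No marker found: return the trimmed text unchanged.
    return { era: null, rest: text.trim() };
  }
  // Cut the marker out, then collapse the whitespace it leaves behind.
  var rest = (text.slice(0, match.index) + text.slice(match.index + match[0].length))
    .replace(/\s+/g, ' ')
    .trim();
  return { era: match[1], rest: rest };
}
```

With this sketch, `splitEra("27 - 28 August 663 CE")` gives `{ era: "CE", rest: "27 - 28 August 663" }`, so the remaining text can be parsed without worrying about the marker.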

Answer 1:

I'll keep updating this answer as I build new parsers. Feel free to contribute.

So for these formats, I'll do:

27 - 28 August 663 CE
22 August 1945 19 May
May 4 1945 – August 22 1945
5-10 February 1720

JS

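// Month names recognized by the parser.
var months = new Set(["January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November", "December"]);
// Rebuild the date text from the whitespace-split tokens (dateText comes from the question's code).
var d = '';
for (var i = 0; i < dateText.length; i++) {
  d += dateText[i] + ' ';
}
// Turn every separator (en dash, hyphen, comma, slash) into a space. A string
// pattern in replace() only replaces the first match, so use a global regex.
var words = d.replace(/[–\-,\/]/g, " ").split(' ');
// Drop the empty strings left behind by consecutive spaces.
var newArray = words.filter(function(v){return v!==''});
// spacetime[0].Time is assumed to exist with days, months, years and suffixes arrays.
for (const word of newArray) {
  if (months.has(word)) {
    spacetime[0].Time.months.push(word);
  } else if (+word < 32) {
    // Numbers below 32 are treated as days.
    spacetime[0].Time.days.push(+word);
  } else if (+word < 2200) {
    // Larger numbers are treated as years.
    spacetime[0].Time.years.push(+word);
  } else if (/\w+/.test(word)) {
    // Anything else (CE, times, time zones) ends up in suffixes.
    spacetime[0].Time.suffixes.push(word);
  }
}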
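The loop above only tells days from years by size, so it cannot pick out the month in all-numeric dates such as 5/4/1945 or 2-7-1232, and it files a year like 1 or 0 under days. A separate parser for those formats might look like the sketch below. `parseNumericDate` and its `dayFirst` flag are hypothetical names, and the sketch assumes the year is always the last component; the order of day and month cannot be inferred from the string itself, so the caller has to choose.

```javascript
// Hypothetical helper: split an all-numeric date such as "5/4/1945" or "2-7-1232".
// Assumption: the year is always the last component. dayFirst decides how the
// first two components are read, because "5/4/1945" is ambiguous on its own.
function parseNumericDate(text, dayFirst) {
  // One or two digits, a separator, one or two digits, a separator, 1-4 digit year.
  var pattern = /^\s*(\d{1,2})\s*[\/\-.]\s*(\d{1,2})\s*[\/\-.]\s*(\d{1,4})\b/;
  var match = pattern.exec(text);
  if (!match) {
    return null; // not an all-numeric date, leave it to another parser
  }
  var first = Number(match[1]);
  var second = Number(match[2]);
  return {
    day: dayFirst ? first : second,
    month: dayFirst ? second : first,
    year: Number(match[3]) // position-based, so year 0 or 1 is still a year
  };
}
```

Because the year is identified by position rather than by size, `parseNumericDate("1/3/1 (year 1)", true)` yields year 1, and a text date such as "27 - 28 August 663 CE" returns `null` so it can fall through to the word-based loop.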

JSON example:

        "Time": {
            "days": [
                22
            ],
            "months": [
                "August"
            ],
            "years": [
                1945
            ],
            "suffixes": [
                "10:25",
                "(UTC+1)"
            ]
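
Given a `Time` object shaped like the JSON example above, the lists the question asks for could be generated along these lines. This is a rough sketch using plain string building rather than jQuery; `renderList` and `renderTime` are hypothetical helper names, and the values are assumed to be plain numbers or month names that need no HTML escaping.

```javascript
// Hypothetical helper: render one <ul> with an <li> per value.
// Assumption: values are numbers or month names, so no HTML escaping is done.
function renderList(className, values) {
  var items = values.map(function (value) {
    return '  <li>' + value + '</li>';
  });
  return '<ul class="' + className + '">\n' + items.join('\n') + '\n</ul>';
}

// Hypothetical helper: build the Days, Months and Year lists from a Time object
// that has days, months and years arrays.
function renderTime(time) {
  return [
    renderList('Days', time.days),
    renderList('Months', time.months),
    renderList('Year', time.years)
  ].join('\n\n');
}
```

Calling `renderTime({ days: [27, 28], months: ["August"], years: [663] })` produces the three lists from the question; the resulting string could then be inserted into the page, for example with jQuery's `.html()`.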