Sorting date field in unix

2019-03-31 11:41发布

问题:

I have text file which contains hundreds of thousands of records. One of the fields is a date field. Is there is any way to sort the file based on the date field?

09-APR-12 04.08.43.632279000 AM
19-MAR-12 03.53.38.189606000 PM
19-MAR-12 03.56.27.933365000 PM
19-MAR-12 04.00.13.387316000 PM
19-MAR-12 04.04.45.168361000 PM
19-MAR-12 03.54.32.595348000 PM
27-MAR-12 10.28.14.797580000 AM
28-MAR-12 12.28.02.652969000 AM
27-MAR-12 07.28.02.828746000 PM

The Output should come as

19-MAR-12 03.53.38.189606000 PM
19-MAR-12 03.54.32.595348000 PM
19-MAR-12 03.56.27.933365000 PM
19-MAR-12 04.00.13.387316000 PM
19-MAR-12 04.04.45.168361000 PM
27-MAR-12 10.28.14.797580000 AM
27-MAR-12 07.28.02.828746000 PM
28-MAR-12 12.28.02.652969000 AM
09-APR-12 04.08.43.632279000 AM

I have tried the sort command to order the date (taking the date field as a string), but it is not giving the correct output.

回答1:

Chronicle's solution is close, but misses the AM/PM distinction, sorting 27-MAR-12 07.28.02.828746000 PM before 27-MAR-12 10.28.14.797580000 AM. This can be modified:

sort -t- -k 3.1,3.2 -k 2M -k 1n -k 3.23,3.24

But that is still very fragile. It would be much better to convert the dates to an epoch time and compare numerically.



回答2:

Try this :

Input.txt

09-APR-12 04.08.43.632279000 AM 
19-MAR-12 03.53.38.189606000 PM 
19-MAR-12 03.56.27.933365000 PM 
19-MAR-12 04.00.13.387316000 PM 
19-MAR-12 04.04.45.168361000 PM 
19-MAR-12 03.54.32.595348000 PM 
27-MAR-12 10.28.14.797580000 AM 
28-MAR-12 12.28.02.652969000 AM 
27-MAR-12 07.28.02.828746000 PM 

Code

 sort -t "-"  -k 3 -k 2M -nk 1 Input.txt

Output

19-MAR-12 03.53.38.189606000 PM
19-MAR-12 03.54.32.595348000 PM
19-MAR-12 03.56.27.933365000 PM
19-MAR-12 04.00.13.387316000 PM
19-MAR-12 04.04.45.168361000 PM
27-MAR-12 07.28.02.828746000 PM
27-MAR-12 10.28.14.797580000 AM
28-MAR-12 12.28.02.652969000 AM
09-APR-12 04.08.43.632279000 AM


回答3:

This script sorts by Epoch time with nanosecond resolution:

awk '{
  t = gensub(/\.([0-9]{2})\./, ":\\1:", 1, $0);
  command = "date +%s%N -d \x022" t "\x022";
  command | getline t;
  close(command);
  print t, $0;
}' unsorted.txt | sort -n -k 1 | cut -d ' ' -f 2- > sorted.txt


回答4:

You could use date, which is generally probably a decent idea, especially if you don't need to worry about the microseconds, otherwise you could probably clip the microseconds off and sort that as a secondary sorting field.

while read a; do   
grep "^${a}" input.txt; 
done < <(sed 's/\./:/;s/\./:/' input.txt | xargs -n3 -I{} date -d"{}" +%s | sort | xargs -n1 -I{} date -d @'{}' +'%d-%^h-%y %I.%M.%S')