I want to download a bunch of files named with ISO-8601 dates. Is there a simple way to generate such a range of dates using bash + GNU coreutils? (Or some trick to make wget/curl generate the list automatically, though I find that unlikely.)
Similar to this question, but not restricted to weekdays: How to generate a range of nonweekend dates using tools available in bash?
I guess that there is a simpler way to do it without that restriction.
Also related to How to generate date range for random data on bash, but not restricted to a single year.
If you have GNU date, you could use either a for loop (the {1..5} brace expansion needs Bash, ksh or zsh; it is not POSIX):
# with "for"
for i in {1..5}; do
    date -I -d "2014-06-28 +$i days"
done
or an until loop, this time using Bash's extended test [[:
# with "until"
d="2014-06-29"
until [[ $d > 2014-07-03 ]]; do
    echo "$d"
    d=$(date -I -d "$d + 1 day")
done
Note that non-ancient versions of sh will also do lexicographical comparison if you change the condition to [ "$d" \> 2014-07-03 ].
Output from either of those loops:
2014-06-29
2014-06-30
2014-07-01
2014-07-02
2014-07-03
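If you need this more than once, the until loop can be wrapped in a small function. The following is only a sketch: the name date_range is made up for illustration, it assumes Bash and GNU date, and it adds -u so that the arithmetic happens in UTC, which sidesteps time zones where a daylight-saving change falls on midnight.

```shell
#!/usr/bin/env bash
# date_range START END: print every ISO-8601 date from START to END, inclusive.
# Hypothetical helper name; requires Bash and GNU date.
date_range() {
    local d end
    # Normalize both arguments; GNU date exits non-zero on an invalid date.
    d=$(date -u -I -d "$1") || return 1
    end=$(date -u -I -d "$2") || return 1
    # ISO-8601 dates sort lexicographically, so a string comparison is enough.
    until [[ $d > $end ]]; do
        echo "$d"
        d=$(date -u -I -d "$d + 1 day") || return 1
    done
}
```

Called as date_range 2014-06-29 2014-07-03, it should print the same five dates as the loops above. If START is later than END, it prints nothing.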
For a more portable way to do the same thing, you could use a Perl script:
use strict;
use warnings;
use Time::Piece;
use Time::Seconds;
use File::Fetch;
# Parse the start and end dates given on the command line.
my ($t, $end) = map { Time::Piece->strptime($_, "%Y-%m-%d") } @ARGV;
while ($t <= $end) {
    my $url = "http://www.example.com/" . $t->strftime("%F") . ".log";
    my $ff = File::Fetch->new( uri => $url );
    my $where = $ff->fetch( to => '.' ); # download to current directory
    $t += ONE_DAY; # ONE_DAY is exported by Time::Seconds
}
Time::Piece, Time::Seconds and File::Fetch are all core modules. Save the script as wget.pl and use it like perl wget.pl 2014-06-29 2014-07-03.
Using GNU date and bash:
start=2014-12-29
end=2015-01-03
while ! [[ $start > $end ]]; do
    echo "$start"
    start=$(date -d "$start + 1 day" +%F)
done
Output:
2014-12-29
2014-12-30
2014-12-31
2015-01-01
2015-01-02
2015-01-03
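To get from a list of dates to the downloads the question asks about, one option is to print one URL per date and hand the whole list to wget, which reads URLs from standard input when given -i -. This is a sketch only: url_list is a made-up name, and the URL pattern is the example.com placeholder used elsewhere in this thread, not a real endpoint.

```shell
#!/usr/bin/env bash
# url_list START END: print one URL per date from START to END, inclusive.
# Hypothetical helper; the URL pattern is a placeholder. Requires GNU date.
url_list() {
    local d=$1 end=$2
    while ! [[ $d > $end ]]; do
        printf 'http://www.example.com/%s.log\n' "$d"
        # Stop instead of looping on garbage if date rejects the input.
        d=$(date -d "$d + 1 day" +%F) || return 1
    done
}

# Usage (not run here):
#   url_list 2014-12-29 2015-01-03 | wget -i -
```

Building the list first also makes it easy to inspect the URLs before anything is fetched.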
This is how I ended up doing it:
d=$(date -I)
while wget "http://www.example.com/$d.log"; do
    d=$(date -I -d "$d - 1 day")
done
This starts at today's date and works backwards one day at a time, stopping the first time wget fails (for example, on a 404).
I use this handy function to work with log files in the format yyyymmdd.log.gz:
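function datelist { for dt in $(seq -w "$1" "$2"); do date -d "$dt" +'%Y%m%d' 2>/dev/null; done; }
It accepts dates in the format yyyymmdd. seq generates every integer between the two arguments, and date silently rejects the ones that are not valid dates.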
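One thing to be aware of with the seq approach is that a range crossing a year boundary runs date on thousands of integers that are not dates. A variant that steps one day at a time avoids that. This is a sketch under the same assumptions (Bash, GNU date); datelist_step is a made-up name.

```shell
#!/usr/bin/env bash
# datelist_step START END: like datelist, but advances one day at a time.
# Hypothetical name; arguments are yyyymmdd. Requires Bash and GNU date.
datelist_step() {
    local d end
    # Convert to ISO-8601 internally so string comparison works.
    d=$(date -d "$1" +%F) || return 1
    end=$(date -d "$2" +%F) || return 1
    until [[ $d > $end ]]; do
        echo "${d//-/}"    # strip the dashes: 2014-12-31 becomes 20141231
        d=$(date -d "$d + 1 day" +%F) || return 1
    done
}
```

For example, datelist_step 20141230 20150102 should print the four dates from 20141230 through 20150102.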