To extract the DateTime from the name of a file

Posted 2019-07-27 06:42

Question:

I want to extract part of the file name into a DateTime column. My code is as follows:

@data =
    EXTRACT ...
            filename_date DateTime
    FROM "/input/vga_{filename_date}.txt"
    USING Extractors.Tsv(skipFirstNRows : 1);

The file name is vga_20171201.txt.

When I declare the column as string or int, it works for me.

Answer 1:

You have to specify .NET date format strings along with the virtual column name to get that behaviour, like this:

@data =
    EXTRACT someData string,
            filename_date DateTime
    FROM "/input/vga_{filename_date:yyyy}{filename_date:MM}{filename_date:dd}.txt"
    USING Extractors.Tsv(skipFirstNRows : 1);
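
To make the mapping concrete, here is a rough sketch in Python (not U-SQL, and not how U-SQL implements file sets) of what the pattern above amounts to: the yyyy, MM and dd parts each pick up a fixed number of digits from the name, and together they form one date. The helper name filename_to_date and the regular expression are assumptions made for illustration only.

```python
import re
from datetime import datetime

# Hypothetical illustration: mimics "vga_{d:yyyy}{d:MM}{d:dd}.txt" outside U-SQL.
# Four digits for the year, two for the month, two for the day.
FILENAME_PATTERN = re.compile(r"^vga_([0-9]{4})([0-9]{2})([0-9]{2})\.txt$")


def filename_to_date(filename):
    """Return the date encoded in the file name, or None if it does not fit."""
    match = FILENAME_PATTERN.match(filename)
    if match is None:
        return None
    year, month, day = (int(part) for part in match.groups())
    try:
        return datetime(year, month, day)
    except ValueError:
        # The digits matched the shape but are not a real date (e.g. month 13).
        return None


print(filename_to_date("vga_20171201.txt"))  # prints 2017-12-01 00:00:00
```

A name whose digits do not form a valid date, such as vga_20171301.txt, yields None in this sketch.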


Answer 2:

I have a series of files that are named like 1601.gz to represent January of 2016. {date:yyMM}.gz or {date:yy}{date:MM}.gz don't seem to