Loop for changing date format and generating graph

2019-08-21 05:00发布

问题:

I would like to loop through a number of csv files in order to change the format of date variables and generate graphs each with a title that is simply the name of the respective file.

Here is an example of one of the csv files created using dataex:

clear
input str32 eventname str10 scrapedate float(average thpercentile v5 v6)
"EventName" "2015-12-15"  136.9255     83.2 104.875    148.75
"EventName" "2015-12-16"  130.4555    78.55      99    138.22
"EventName" "2015-12-17" 123.66705     72.7   90.25     131.2
"EventName" "2015-12-18" 116.45757   64.855   78.55     119.5
"EventName" "2015-12-19" 108.63446 60.56333    72.7 119.07333
"EventName" "2015-12-20"  94.97125    55.15   69.77    112.48
end

The files are inside the "I:\Games CSVs" directory and they are all in csv format. I have ensured that the variable names in all files are the same.

I want to take each file, convert the scrapedate from a string to a date format. I then want to plot four lines represented by four variables average, thpercentile, v5, v6 and on the y-axis, ScrapeDate.

So far, based on a Google Search, I tried the code below:

local files : dir "I:\Games CSVs" files "*.csv"

cd "I:\Games CSVs"

foreach file in `files' {
    insheet using `file', comma clear

    /* Convert string to date format */
    gen ScrapeDate = date(scrapedate, "YMD") 
    format ScrapeDate %td #Convert string to date format

    /* X Axis: price, Y Axis: ScrapeDate; Plotting, averages, 25th, 50th and 75th percentile  */
    line average thpercentile v5 v6 ScrapeDate, legend(size(medsmall))  
}

The above code is contained in a do file. The problem is that when I do, File -> Do.. -> filename.do, I don't see any graphs, it simply shows in my window as end of do file. I do not see the graphs!

I'm hoping my code isn't wrong or anything because on the left panel, the do "path/filename.do" appears red in color. I have removed all blank lines in the do files, it has no semicolon contrary to what the Google Search answer had (the answer had semicolons).

I want to be able to access the graphs! But where are they stored?

I was able to somewhat accomplish what i wanted using just one file with the following code:

insheet using "I:\Games CSVs\oneofthecsvfiles.csv"
gen ScrapeDate = date(scrapedate, "YMD")
format ScrapeDate %td
line average thpercentile v5 v6 ScrapeDate, legend(size(medsmall))

Finally, the above code converts the date variable in numeric but changes the format from 2015-12-17 originally to 17dec2015.

Is there a way to get this to be 2015/12/17 or 2015-12-17?

回答1:

Assume you have the following toy datasets:

local foodir // INSERT YOUR DIRECTORY PATH HERE

clear
input str32 eventname str10 scrapedate float(average thpercentile v5 v6)
"EventName" "2015-12-15"  136.9255     83.2 104.875    148.75
"EventName" "2015-12-16"  130.4555    78.55      99    138.22
"EventName" "2015-12-17" 123.66705     72.7   90.25     131.2
"EventName" "2015-12-18" 116.45757   64.855   78.55     119.5
"EventName" "2015-12-19" 108.63446 60.56333    72.7 119.07333
"EventName" "2015-12-20"  94.97125    55.15   69.77    112.48
end

export delimited using "`foodir'one", replace

clear
input str32 eventname str10 scrapedate float(average thpercentile v5 v6)
"EventName" "2014-12-15"  236.9255     83.2 104.875    148.75
"EventName" "2014-12-16"  230.4555    78.55      99    138.22
"EventName" "2014-12-17" 223.66705     72.7   90.25     131.2
"EventName" "2014-12-18" 216.45757   64.855   78.55     119.5
"EventName" "2014-12-19" 208.63446 60.56333    72.7 119.07333
"EventName" "2014-12-20"  194.97125    55.15   69.77    112.48
end

export delimited using "`foodir'two", replace

clear
input str32 eventname str10 scrapedate float(average thpercentile v5 v6)
"EventName" "2013-12-15"  336.9255     83.2 104.875    148.75
"EventName" "2013-12-16"  330.4555    78.55      99    138.22
"EventName" "2013-12-17" 323.66705     72.7   90.25     131.2
"EventName" "2013-12-18" 316.45757   64.855   78.55     119.5
"EventName" "2013-12-19" 308.63446 60.56333    72.7 119.07333
"EventName" "2013-12-20"  294.97125    55.15   69.77    112.48
end

The following works for me:

local foodir // INSERT YOUR DIRECTORY PATH HERE

local files : dir "`foodir'" files "*.csv"
cd "`foodir'"

local i = 0
foreach file of local files {
    local ++i
    insheet using "`file'", comma clear

    generate ScrapeDate = daily(scrapedate, "YMD")
    format ScrapeDate %tdCCYY-NN-DD

    line average thpercentile v5 v6 ScrapeDate, name("graph`i'", replace) title("`file'") ///
    legend(size(medsmall))      
}

Just insert the full path in the local macro foodir and Stata will do the rest.

All the changes I have made in your original code are indicated in bold.