I have just found out about R, which seems to be an ideal tool for computing statistics on web server log files. I have looked into several libraries such as `zoo` and `plyr`, but none of them offers a straightforward way to aggregate timestamped data.
Is there an R library, tutorial, or piece of documentation that focuses on analyzing log-file-like data, with an emphasis on aggregating it into time slices?
Possible use cases:
- average request time per day
- average requests per session per day
- get the slowest requests this week
- ...
Processing timestamped data like this is actually quite a common task. Because your question is vague, my answer is limited to some pointers. For examples of aggregating time series, see the following (which, by the way, are all my own answers):
- How to create histogram in R with CSV time data?
- How to get a "events per month" bar plot in R
These answers all use the same strategy, combined with the `plyr` and `ggplot2` packages. This should get you started. Note that these are only the answers of mine that I could find in a couple of minutes. There is probably much more to find, especially if you search for more specific questions.
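To make this a bit more concrete, here is a minimal sketch of one common way to aggregate timestamped data: derive a time-slice column (here, the calendar day) from each timestamp, then summarise per slice. This is my assumption about a reasonable approach, not code copied from the linked answers. The data frame `logs` and its columns `timestamp` and `request_ms` are made up for illustration, and I use base R instead of `plyr` so that it runs without extra packages.

```r
# Hypothetical log data: one row per request, duration in milliseconds.
logs <- data.frame(
  timestamp = as.POSIXct(c("2012-03-01 08:15:00", "2012-03-01 17:40:00",
                           "2012-03-02 09:05:00", "2012-03-02 09:30:00",
                           "2012-03-02 23:10:00"), tz = "UTC"),
  request_ms = c(120, 80, 200, 100, 300)
)

# Step 1: derive the time slice (calendar day) from each timestamp.
logs$day <- as.Date(logs$timestamp, tz = "UTC")

# Step 2: aggregate per slice.
# Average request time per day.
avg_per_day <- aggregate(request_ms ~ day, data = logs, FUN = mean)

# Number of requests per day (count the rows in each slice).
requests_per_day <- aggregate(request_ms ~ day, data = logs, FUN = length)
names(requests_per_day)[2] <- "n"

# The two slowest requests in the whole data set.
slowest <- head(logs[order(-logs$request_ms), ], 2)

print(avg_per_day)
print(requests_per_day)
print(slowest)
```

With `plyr` loaded, the per-day average could probably be written as `ddply(logs, .(day), summarise, avg_ms = mean(request_ms))`, and the result passed to `ggplot2` for plotting. For other slice widths (hours, weeks, months), `cut()` on the timestamp or `format()` with a suitable pattern can replace `as.Date()` in step 1.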