I've been experimenting with the XBRL package in R, trying to write a function that cycles through companies and outputs their financial statements, ideally as a standard data frame. But I don't understand the output. When I run the function and view the result, all that appears is a running index in the leftmost column, with right-justified URLs of various XML/XBRL/C++ components on the right. I admit I have very little XBRL knowledge, so I must be missing something. How would I use the functions of this package to cycle through and log all XBRL statements, formatting them into something usable for an end user?
Running the example from the PDF guide is easy, but it prints out strangely, and I have no idea how to get the result into a proper data frame:
## Setting stringsAsFactors = FALSE is highly recommended
## to prevent data frames from converting character vectors to factors.
options(stringsAsFactors = FALSE)
## Load the library
library(XBRL)
## XBRL instance file to be analyzed, accessed
## directly from SEC website:
inst <- "http://www.sec.gov/Archives/edgar/data/21344/000002134413000050/ko-20130927.xml"
## Level 1: Function that does all work and returns
## a list of data frames with extracted information:
## Not run:
xbrl.vars <- xbrlDoAll(inst, verbose=TRUE)
Summary of this gives a bunch of lists of differing row lengths:
summary(xbrl.vars)
             Length Class      Mode
element       7     data.frame list
role          5     data.frame list
calculation  11     data.frame list
context      13     data.frame list
unit          4     data.frame list
fact          7     data.frame list
footnote      5     data.frame list
definition   11     data.frame list
label         5     data.frame list
presentation 11     data.frame list
This may be as simple as me not understanding a data.frame of lists (list of lists? list of data.frames?) in R. If so, I apologize for a stupid question (it could be stupid for other reasons). I tried the solution at the bottom of the answers to this question: "list of lists with different lengths to data.frame in R". So:
xbrl.vars2 <- as.data.frame(as.matrix(xbrl.vars))
That was dumb of me, because how can R make a matrix when the numbers of rows differ? It seemed to make R freeze.
Thank you for any help.
The result of xbrlDoAll is a list of data frames. You can get the financial statements from it, but not as directly as you expected. The data frames (fact, context, element, presentation, role, etc.) correspond to XBRL entities, so getting the data into the structure of a financial statement requires some manipulation.
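As a rough illustration of what "a list of data frames" means here, the sketch below builds a small toy list by hand instead of downloading a filing. The name xbrl.toy and the column names (elementId, contextId, fact, endDate) are assumptions made for the example, so check names() on your own result. Note that the Length column printed by summary() on such a list is the number of columns in each data frame, not the number of rows.

```r
# Toy stand-in for the list returned by xbrlDoAll(); the column names are
# assumed for illustration and may differ from the real output.
xbrl.toy <- list(
  fact = data.frame(
    elementId = c("us-gaap_Assets", "us-gaap_Liabilities"),
    contextId = c("c1", "c1"),
    fact      = c("100", "60"),
    stringsAsFactors = FALSE
  ),
  context = data.frame(
    contextId = "c1",
    startDate = NA_character_,
    endDate   = "2013-09-27",
    stringsAsFactors = FALSE
  )
)

tables <- names(xbrl.toy)        # names of the component data frames
rows   <- sapply(xbrl.toy, nrow) # number of rows in each table
cols   <- sapply(xbrl.toy, ncol) # number of columns (summary() calls this Length)

print(tables)
print(rows)
print(cols)

# Each component is an ordinary data frame, reachable with $ or [[ ]]
print(head(xbrl.toy$fact))
```

With the real result you would look at xbrl.vars$fact, xbrl.vars$context and so on in the same way, rather than trying to coerce the whole list into one matrix.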
The XBRL package conveniently converts the XML files, XLinks, and schemas into data frames with obvious relations, so the task is fairly easy, especially with tools like dplyr and tidyr. See the balance sheet example with an entity diagram and R code on GitHub.
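To give an idea of the kind of manipulation involved, here is a minimal sketch that joins facts to their contexts and labels and then spreads the periods into columns. It uses base R's merge() and reshape() rather than dplyr and tidyr so that it runs without extra packages, and it works on hand-made toy tables. The column names (elementId, contextId, fact, endDate, labelString) and all the values are assumptions for illustration, not data from a real filing. A real statement would also need the presentation and role tables to select and order the line items.

```r
# Toy tables shaped like the fact, context and label data frames
# (assumed column names, made-up values).
fact <- data.frame(
  elementId = c("us-gaap_Assets", "us-gaap_Liabilities", "us-gaap_Assets"),
  contextId = c("c2013", "c2013", "c2012"),
  fact      = c("100", "60", "90"),
  stringsAsFactors = FALSE
)
context <- data.frame(
  contextId = c("c2013", "c2012"),
  endDate   = c("2013-09-27", "2012-12-31"),
  stringsAsFactors = FALSE
)
label <- data.frame(
  elementId   = c("us-gaap_Assets", "us-gaap_Liabilities"),
  labelString = c("Total assets", "Total liabilities"),
  stringsAsFactors = FALSE
)

# Attach the reporting period to each fact, then the human-readable label
statement <- merge(fact, context, by = "contextId")
statement <- merge(statement, label, by = "elementId")

# In this toy table the values are stored as text, so convert them
statement$value <- as.numeric(statement$fact)

# One row per line item, one column per period end date
wide <- reshape(
  statement[, c("labelString", "endDate", "value")],
  idvar     = "labelString",
  timevar   = "endDate",
  direction = "wide"
)
print(wide)
```

Items that have no fact for a given period (here, liabilities in the earlier period) come out as NA in the wide table.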