I am trying to extract financial statement information based on the type of statement.
Let me explain in a little more detail.
I want to extract the income statement, balance sheet and cash flow statement from an XBRL instance – especially US GAAP.
For me, the perfect solution would be to have tags in the XML file in such a way that I can extract the income statement with tag <incomestatement>, balance sheet with <balancesheet> and cash flow with <cashflow>.
Please help me here. I am a novice and do not possess much background in XBRL.
As far as I recall, the right place to look is the user-friendly labels associated with the roles in the filing.
The SEC places restrictions on what these labels look like (e.g., paragraph 6.7.12 of the EDGAR Filer Manual), for example "02 - Statement - Balance Sheet". The income statement, cash flow statement and balance sheet are commonly found in labels with "Statement" (as opposed to "Disclosure", "Document" or "Schedule") between the two dashes.
The third part of the label itself will tell you where to find the income statement/cash flow statement/balance sheet, however the exact labels may vary between filers. Also, there are several kinds of these (consolidated vs. unconsolidated, classified vs. unclassified, etc.), and the complexity is further increased because the same filing may sometimes contain several versions (consolidated and unconsolidated), so you need some domain expertise to decide which one you need.
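To make the label convention above concrete, here is a minimal sketch in Python that splits a label into its three parts and keeps only the "Statement" ones. The function names and the keyword lists are my own assumptions for illustration, not something defined by the SEC; real filers word their titles differently, so expect to tune them.

```python
import re

# Matches the "{sort code} - {type} - {title}" shape, e.g. "02 - Statement - Balance Sheet".
ROLE_LABEL = re.compile(
    r"^\s*(?P<sort>\S+)\s+-\s+(?P<type>\w+)\s+-\s+(?P<title>.+?)\s*$"
)

# Hypothetical keyword lists (an assumption, not an official vocabulary).
# Checked in this order, so "cash flow" wins before the broader income keywords.
KEYWORDS = {
    "balance_sheet": ("balance sheet", "financial position", "financial condition"),
    "cash_flow": ("cash flow",),
    "income_statement": ("income", "operations", "earnings"),
}


def parse_label(label):
    """Return (sort_code, type, title), or None if the label has another shape."""
    match = ROLE_LABEL.match(label)
    if match is None:
        return None
    return match.group("sort"), match.group("type"), match.group("title")


def classify(label):
    """Guess the statement kind from a label; None if it is not a primary statement."""
    parsed = parse_label(label)
    if parsed is None or parsed[1].lower() != "statement":
        return None  # Disclosure, Document, Schedule, or malformed
    title = parsed[2].lower()
    if "parenthetical" in title:
        return None  # parenthetical sections are not the statement itself
    for kind, words in KEYWORDS.items():
        if any(word in title for word in words):
            return kind
    return None
```

Note that this heuristic is deliberately crude: for instance, a statement of comprehensive income would also be classified as an income statement, and it does not distinguish consolidated from unconsolidated versions.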
In a nutshell, you will need to do some trial and error on real filings in order to find the right algorithm to filter these labels.
What should help you though, is that Charles Hoffman has done some research on this, which for example can be found here (section 1.5).
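To start experimenting, you first need the labels themselves. Below is a rough sketch using only the Python standard library, assuming the labels are declared as link:definition elements inside link:roleType elements in the filing's extension schema (the .xsd file). The SAMPLE document, its example.com role URIs and the function names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace of the XBRL 2.1 linkbase elements (link:roleType, link:definition).
LINK_NS = "http://www.xbrl.org/2003/linkbase"

# Made-up schema fragment for illustration only; a real filing has many more roles.
SAMPLE = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:link="http://www.xbrl.org/2003/linkbase">
  <xs:annotation><xs:appinfo>
    <link:roleType roleURI="http://example.com/role/BalanceSheet" id="BalanceSheet">
      <link:definition>02 - Statement - Balance Sheet</link:definition>
      <link:usedOn>link:presentationLink</link:usedOn>
    </link:roleType>
    <link:roleType roleURI="http://example.com/role/IncomeTaxes" id="IncomeTaxes">
      <link:definition>10 - Disclosure - Income Taxes</link:definition>
      <link:usedOn>link:presentationLink</link:usedOn>
    </link:roleType>
  </xs:appinfo></xs:annotation>
</xs:schema>"""


def role_definitions(schema_xml):
    """Map each roleURI to the text of its link:definition."""
    root = ET.fromstring(schema_xml)
    roles = {}
    for role_type in root.iter(f"{{{LINK_NS}}}roleType"):
        definition = role_type.find(f"{{{LINK_NS}}}definition")
        if definition is not None and definition.text:
            roles[role_type.get("roleURI")] = definition.text.strip()
    return roles


def statement_roles(roles):
    """Keep only the roles whose label has "Statement" between the two dashes."""
    return {uri: text for uri, text in roles.items() if " - Statement - " in text}


if __name__ == "__main__":
    for uri, text in statement_roles(role_definitions(SAMPLE)).items():
        print(uri, "->", text)
```

The role URIs you get back are what the presentation linkbase uses to group the concepts of each statement, so they are the handle for pulling out the actual line items afterwards.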
Fortunately, it is not that difficult to extract financial statements. Here is how I was able to extract income statement info:
Replace the file="" parameter with your own path. You can also substitute a url parameter for the file parameter.