I'm trying to use the read_html function in the rvest package, but have come across a problem I am struggling with.
For example, if I were trying to read in the bottom table that appears on this page, I would use the following code:
library(rvest)
html_content <- read_html("https://projects.fivethirtyeight.com/2016-election-forecast/washington/#now")
By inspecting the HTML code in the browser, I can see that the content I would like is contained in a <table> tag (specifically, it is all contained within <table class="t-calc">). But when I try to extract this using:
tables <- html_nodes(html_content, xpath = '//table')
I retrieve the following:
> tables
{xml_nodeset (4)}
[1] <table class="tippingpointroi unexpanded">\n <tbody>\n <tr data-state="FL" class=" "> ...
[2] <table class="tippingpointroi unexpanded">\n <tbody>\n <tr data-state="NV" class=" "> ...
[3] <table class="scenarios">\n <tbody/>\n <tr data-id="1">\n <td class="description">El ...
[4] <table class="t-desktop t-polls">\n <thead>\n <tr class="th-row">\n <th class="t ...
This includes some of the table elements on the page, but not the one I am interested in.
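A quick way to confirm which tables rvest actually parsed is to list each table's class attribute and count the nodes matching the class of interest. The browser inspector shows the page after any scripts have run, so it can differ from the HTML that read_html receives. The following is only a minimal sketch: it uses a hypothetical inline HTML string (sample_html) as a stand-in for the downloaded page, so it illustrates the check rather than the real page's contents.

```r
library(rvest)

# Hypothetical stand-in for the downloaded page (an assumption for
# illustration): two tables are present, none has class "t-calc".
sample_html <- '<html><body>
  <table class="scenarios"><tr><td>a</td></tr></table>
  <table class="t-desktop t-polls"><tr><td>b</td></tr></table>
</body></html>'
page <- read_html(sample_html)

# List the class attribute of every table that was parsed
table_classes <- html_attr(html_nodes(page, xpath = "//table"), "class")
print(table_classes)

# Count nodes matching the table of interest; 0 means it is absent
# from the HTML that was parsed
n_calc <- length(html_nodes(page, css = "table.t-calc"))
print(n_calc)
```

Running the same two checks on html_content would show whether a table with class t-calc is present in the HTML that was downloaded.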
Any suggestions on where I am going wrong would be most appreciated!