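I want to scrape the table from this URL: "https://hutdb.net/17/players". I have spent a lot of time learning rvest and using SelectorGadget, but whenever I try to get any output I always get the same empty result, character(0).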
library(rvest)
library(magrittr)
# Read the page, then try to extract the text of every table cell
url <- read_html("https://hutdb.net/17/players")
table <- url %>%
  html_nodes("td") %>%
  html_text()
Any help would be greatly appreciated.
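The data is loaded dynamically, so it cannot be retrieved directly from the HTML. However, looking at, for example, the "Network" tab in Chrome DevTools, we can find nicely formatted JSON at https://hutdb.net/ajax/stats.php?year=17&page=0&selected=OVR&sort=DESC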
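To illustrate why the original attempt returns character(0), here is a minimal sketch using a made-up HTML snippet. The markup and the `load.js` script name are assumptions for illustration only, not the actual source of hutdb.net. The point is that `read_html()` only sees the static markup, and if the table rows are filled in later by JavaScript, there are no `td` nodes to select.

```r
library(rvest)

# Hypothetical static markup: a table shell whose body is meant to be
# filled in by a script after the page loads (not the real hutdb.net HTML).
static_html <- '<html><body>
  <table id="players">
    <thead><tr><th>Player</th></tr></thead>
    <tbody></tbody>
  </table>
  <script src="load.js"></script>
</body></html>'

page <- read_html(static_html)

# The header cell exists in the static markup ...
headers <- html_text(html_nodes(page, "th"))

# ... but there are no <td> cells, so the result is an empty character vector.
cells <- html_text(html_nodes(page, "td"))
print(cells)
```

Since rvest does not execute JavaScript, the selector itself can be correct and still match nothing.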
library(jsonlite)
dat <- fromJSON("https://hutdb.net/ajax/stats.php?year=17&page=0&selected=OVR&sort=DESC")
The output looks like this:
#    results aOVR    id League Year Card Team            Player Position Type Shoots  HGT
#  1    6308 6308  <NA>   <NA> <NA> <NA> <NA>              <NA>     <NA> <NA>   <NA> <NA>
#  2    <NA> 2030 11782    NHL   17  MOV  OTT     Erik Karlsson       RD  OFD  Right  6'0
#  3    <NA> 2060 11785    NHL   17  MOV  TBL     Victor Hedman       LD  TWD   Left  6'6
#  4    <NA> 2008 11791    NHL   17  MOV  CHI      Patrick Kane       RW  SNP   Left 5'11
#  5    <NA> 2058 13845    NHL   17  SCE  ANA      Ryan Getzlaf        C  PWF  Right  6'4
#  6    <NA> 2074 11824    NHL   17  MOV  BOS     Brad Marchand       LW  TWF   Left  5'9
#  7    <NA> 2008 11829    NHL   17  MOV  EDM    Connor McDavid        C  PLY   Left  6'2
#  8    <NA> 2048 11840    NHL   17  MOV  WSH Nicklas Backstrom        C  PLY   Left  6'1
#  9    <NA> 2058 11841    NHL   17  MOV  PIT     Sidney Crosby        C  PLY   Left 5'11
# 10    <NA> 2065 13644    NHL   17 TOTY  WPG      Patrik Laine       RW  TWF  Right  6'3
# 11    <NA> 2008 13645    NHL   17 TOTY  EDM    Connor McDavid        C  PLY   Left  6'2
# 12    <NA> 2039 13680    NHL   17 TOTY  LAK      Drew Doughty       RD  TWD  Right  6'1
# 13    <NA> 2063 13689    NHL   17 TOTY  BOS  Patrice Bergeron        C  TWF  Right  6'2
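In the output above, the first row appears to carry only a result count, while the remaining rows are players. A rough sketch of tidying that up follows. The helper names `split_stats` and `stats_url` are hypothetical, and two things are assumptions read off the URL and output rather than documented behaviour: that the non-NA `results` row is a total count, and that `page` is a zero-based page index.

```r
# Hypothetical helper: build the endpoint URL for a given page.
# Assumes `page` is a zero-based index, inferred only from `page=0` in the URL.
stats_url <- function(page = 0, year = 17, selected = "OVR", sort = "DESC") {
  sprintf(
    "https://hutdb.net/ajax/stats.php?year=%d&page=%d&selected=%s&sort=%s",
    as.integer(year), as.integer(page), selected, sort
  )
}

# Hypothetical helper: separate the count row from the player rows.
# Assumes the only row with a non-NA `results` value is the summary row.
split_stats <- function(dat) {
  is_summary <- !is.na(dat$results)
  players <- dat[!is_summary, setdiff(names(dat), "results"), drop = FALSE]
  rownames(players) <- NULL
  list(
    total = as.integer(dat$results[is_summary][1]),
    players = players
  )
}

# Small stand-in for `dat`, copied from the first rows of the output above
# (only a few columns, to keep the example short).
dat_example <- data.frame(
  results = c("6308", NA, NA),
  aOVR    = c("6308", "2030", "2060"),
  id      = c(NA, "11782", "11785"),
  Player  = c(NA, "Erik Karlsson", "Victor Hedman"),
  stringsAsFactors = FALSE
)

out <- split_stats(dat_example)
out$total
out$players
```

With the real data this would be `split_stats(dat)`, and, if the endpoint does paginate as assumed, `fromJSON(stats_url(1))` would fetch the next page.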
Source: Scraping a table from a website using R (Rvest).. or VBA if possible