Invalid 'path' argument with XLConnect

2019-02-26 09:49发布

问题:

I am trying and failing to get the following process to complete in R Version 3.1.2:

library(RCurl)
library(XLConnect)
yr <- substr(Sys.Date(), 1, 4)
mo <- as.character(as.numeric(substr(Sys.Date(), 6, 7)) - 1)
temp <- tempfile()
temp <- getForm("http://strikemap.clb.org.hk/strikes/api.v4/export",
  FromYear = "2011", FromMonth = "1", 
  ToYear = yr, ToMonth = mo,
  `_lang` = "en")
CLB <- readWorksheetFromFile(temp, sheet=1)
unlink(temp)

I have been able manually to export the requested data set and then read it into R from a local directory using the same readWorksheetFromFile syntax. My goal now is to do the whole thing in R. The call to the API seems to work (thanks to some earlier help), but the process fails at the next step, when I try to ingest the results. Here's what happens:

> CLB <- readWorksheetFromFile(temp, sheet=1)
Error in path.expand(filename) : invalid 'path' argument

Any thoughts on what I'm doing wrong or what's broken?

回答1:

Turns out the problem didn't lie with XLConnect at all. Based on Hadley's tip that I needed to save the results of my query to the API to a file before reading them back into R, I have managed (almost) to complete the process using the following code:

library(httr)
library(readxl)
yr <- substr(Sys.Date(), 1, 4)
mo <- as.character(as.numeric(substr(Sys.Date(), 6, 7)) - 1)
baseURL <- paste0("http://strikemap.clb.org.hk/strikes/api.v4/export?FromYear=2011&FromMonth=1&ToYear=", yr, "&ToMonth=", mo, "&_lang=en")
queryList <- parse_url(baseURL)
clb <- GET(build_url(queryList), write_disk("clb.temp.xlsx", overwrite=TRUE))
CLB <- read_excel("clb.temp.xlsx")

The object that creates, CLB, includes the desired data with one glitch: the dates in the first column are not being read properly. If I open "clb.temp.xlsx" in Excel, they show up as expected (e.g., 2015-06-30, or 6/30/2015 if I click on the cell). But read_excel() is reading them as numbers that don't track to those dates in an obvious way (e.g., 42185 for 2015-06-30). I tried fixing that by specifying that they were dates in the call to read_excel, but that produced a long string of warnings about expecting dates but getting those numbers.

If I use readWorkSheetFromFile() instead of read_excel at that last step, here's what happens:

> CLB <- readWorksheetFromFile("clb.temp.xlsx")
Error in (function (classes, fdef, mtable)  : unable to find an inherited method for function ‘readWorksheet’ for signature ‘"workbook", "missing"’

I will search for a solution to the problem using read_excel and will create a new question if I can't find one.