I wish to use read.csv to read a google doc spreadsheet.
I try using the following code:
data_url <- "http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv"
read.csv(data_url)
Which results in the following error:
Error in file(file, "rt") : cannot open the connection
I'm on windows 7. And the code was tried on R 2.12 and 2.13
I remember trying this a few months ago and it worked fine.
Any suggestion what might be causing this or how to solve it?
Thanks.
I ran into the same problem and eventually found a solution in a forum thread. Using my own public CSV file:
library(RCurl)
tt = getForm("https://spreadsheets.google.com/spreadsheet/pub",
hl ="en_US", key = "0Aonsf4v9iDjGdHRaWWRFbXdQN1ZvbGx0LWVCeVd0T1E",
output = "csv",
.opts = list(followlocation = TRUE, verbose = TRUE, ssl.verifypeer = FALSE))
holidays <- read.csv(textConnection(tt))
It might have something to do with the fact that Google is reporting a 302 temporarily moved response.
> download.file(data_url, "~/foo.csv", method = "wget")
--2011-04-29 18:01:01-- http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv
Resolving spreadsheets0.google.com... 74.125.230.132, 74.125.230.128, 74.125.230.130, ...
Connecting to spreadsheets0.google.com|74.125.230.132|:80... connected.
HTTP request sent, awaiting response... 302 Moved Temporarily
Location: https://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv [following]
--2011-04-29 18:01:01-- https://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv
Connecting to spreadsheets0.google.com|74.125.230.132|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/plain]
Saving to: `/home/gavin/foo.csv'
[ <=> ] 41 --.-K/s in 0s
2011-04-29 18:01:02 (1.29 MB/s) - `/home/gavin/foo.csv' saved [41]
> read.csv("~/foo.csv")
column1 column2
1 a 1
2 b 2
3 ds 3
4 d 4
5 f 5
6 ga 5
I'm not sure R's internal download code is capable of responding to such redirects:
> download.file(data_url, "~/foo.csv")
trying URL 'http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv'
Error in download.file(data_url, "~/foo.csv") :
cannot open URL 'http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv'
Check the solution on http://blog.forret.com/2011/07/google-docs-infamous-moved-temporarily-error-fixed/
So what is the solution: just add “&ndplr=1” to your URL and you will
skip the authentication redirect. I’m not sure what the NDPLR
parameter name stands for, let’s just call it: “Never Do Published
Link Redirection“.