How do I write data from R to PostgreSQL tables with an auto-incrementing primary key?

Posted 2020-02-12 02:35

I have a table in a PostgreSQL database that has a BIGSERIAL auto-incrementing primary key. Recreate it using:

CREATE TABLE foo
(
  "Id" bigserial PRIMARY KEY,
  "SomeData" text NOT NULL
);

I want to append some data to this table from R via the RPostgreSQL package. In R, the data doesn't include the Id column, because I want the database to generate those values.

dfr <- data.frame(SomeData = letters)

Here's the code I used to try to write the data:

library(RPostgreSQL)
conn <- dbConnect(
  "PostgreSQL", 
  user     = "yourname", 
  password = "your password",
  dbname   = "test"
)
dbWriteTable(conn, "foo", dfr, append = TRUE, row.names = FALSE)
dbDisconnect(conn)

Unfortunately, dbWriteTable throws an error:

## Error in postgresqlgetResult(new.con) : 
##   RS-DBI driver: (could not Retrieve the result : ERROR:  invalid input syntax for integer: "a"
## CONTEXT:  COPY foo, line 1, column Id: "a"
## )

The error message isn't completely clear, but I interpret this as R trying to pass the contents of the SomeData column to the first column in the database (which is Id).

How should I be passing the data to PostgreSQL so that the Id column is auto-generated?

Tags: r postgresql

2 Answers

够拽才男人 · 2020-02-12 02:52

From the thread in hrbrmstr's comment, I found a hack to make this work.

In the postgresqlWriteTable function in the RPostgreSQL package, you need to replace the line

sql4 <- paste("COPY", postgresqlTableRef(name), "FROM STDIN")

with

sql4 <- paste(
  "COPY ", 
  postgresqlTableRef(name), 
  "(", 
  paste(postgresqlQuoteId(names(value)), collapse = ","), 
  ") FROM STDIN"
)

Note that quoting the column names (not included in the original hack) is necessary to handle case-sensitive column names.
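For reference, here is what the patched line builds for the dfr data frame from the question. This is an illustrative sketch only; postgresqlTableRef and postgresqlQuoteId live in the RPostgreSQL namespace, hence the ::: access:

# Sketch: reproduce the column-qualified COPY statement the patched line builds.
library(RPostgreSQL)
dfr  <- data.frame(SomeData = letters)
name <- "foo"
paste(
  "COPY ", 
  RPostgreSQL:::postgresqlTableRef(name), 
  "(", 
  paste(RPostgreSQL:::postgresqlQuoteId(names(dfr)), collapse = ","), 
  ") FROM STDIN"
)
## Should give something like: COPY  "foo" ( "SomeData" ) FROM STDIN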

Here's a script to do that:

# Patch RPostgreSQL::postgresqlWriteTable in place so that the COPY statement
# it generates lists the data frame's columns explicitly (quoted with
# postgresqlQuoteId, so case-sensitive names like "Id" survive).
body_lines <- deparse(body(RPostgreSQL::postgresqlWriteTable))
new_body_lines <- sub(
  'postgresqlTableRef(name), "FROM STDIN")', 
  'postgresqlTableRef(name), "(", paste(postgresqlQuoteId(names(value)), collapse = ","), ") FROM STDIN")', 
  body_lines,
  fixed = TRUE
)
fn <- RPostgreSQL::postgresqlWriteTable
body(fn) <- parse(text = new_body_lines)
while ("RPostgreSQL" %in% search()) detach("package:RPostgreSQL")
assignInNamespace("postgresqlWriteTable", fn, "RPostgreSQL")
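With the patched function assigned back into the namespace, the dbWriteTable call from the question should succeed: only SomeData is sent, and PostgreSQL generates the Id values. A minimal sketch, reusing the question's connection details:

library(RPostgreSQL)
conn <- dbConnect(
  "PostgreSQL", 
  user     = "yourname", 
  password = "your password",
  dbname   = "test"
)
dfr <- data.frame(SomeData = letters)
dbWriteTable(conn, "foo", dfr, append = TRUE, row.names = FALSE)
# Check that the bigserial Id column was filled in by the database.
dbGetQuery(conn, 'SELECT * FROM foo ORDER BY "Id" LIMIT 5')
dbDisconnect(conn)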

萌系小妹纸 · 2020-02-12 03:05

I struggled with a very similar issue today and stumbled across this thread while trying out different approaches. As of this writing (02/12/2018), the patch recommended above appears to have been incorporated into the latest version of RPostgreSQL::postgresqlWriteTable, but I still kept getting an error indicating that the primary key R assigned to my new rows duplicated one already in the source table.

I ultimately worked around it by generating an incrementing primary key in R and appending it to the data I insert into the source table in my PostgreSQL database. For my purposes I only needed to insert one record at a time, so I doubt this is an optimal solution for inserting a batch of records that require a serially incremented primary key. Predictably, omitting append = TRUE from the script threw the error "table my_table exists in database: aborting assignTable"; however, that option did not automatically assign an incrementing primary key as I had hoped, even with the code patch described above.

library(RPostgreSQL)

drv <- dbDriver("PostgreSQL")
localdb <- dbConnect(drv, dbname = 'MyDatabase',
                     host     = 'localhost',
                     port     = 5432,
                     user     = 'postgres',
                     password = 'MyPassword')

# Next key = current row count + 1.
KeyPlusOne <- sum(dbGetQuery(localdb, "SELECT count(*) FROM my_table"), 1)

NewRecord <- t(c(KeyPlusOne, 'Var1', 'Var2', 'Var3', 'Var4'))
NewRecord <- as.data.frame(NewRecord)
NewRecord <- setNames(NewRecord, c("PK", "VarName1", "VarName2", "VarName3", "VarName4"))

postgresqlWriteTable(localdb, "my_table", NewRecord, append = TRUE, row.names = FALSE)
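
Note that deriving the key from count(*) + 1 can collide with an existing key if rows have ever been deleted. A hypothetical variant, assuming the same my_table and "PK" column as above, takes the current maximum key instead:

# Hypothetical alternative: base the next key on the largest existing "PK",
# so gaps left by deleted rows can't cause duplicate-key errors.
KeyPlusOne <- dbGetQuery(localdb, 'SELECT COALESCE(MAX("PK"), 0) + 1 AS next_pk FROM my_table')$next_pk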