I have successfully connected a local R 3.1.2 session (Win7 64-bit, RStudio) to a remote Hive server using RJDBC:
library(RJDBC)

# JVM options must be set before the JVM is initialised, or they are ignored
options(java.parameters = "-Xmx8g")
.jinit()

# put every Hive jar on the classpath
dir <- "E:/xxx/jars/hive/"
for (l in list.files(dir)) {
  .jaddClassPath(paste0(dir, l))
}

drv <- JDBC("org.apache.hadoop.hive.jdbc.HiveDriver",
            "E:/xxx/jars/hive/hive-jdbc-0.11.0.jar")
conn <- dbConnect(drv, "jdbc:hive://10.127.130.162:10002/default", "", "")
dbGetQuery(conn, "select * from test.test limit 10")
I can successfully read data from Hive, but I cannot write an R data frame back using dbWriteTable:
data(iris)
dbWriteTable(conn, iris, "test.dc_test")
The error returned:
Error in .jcall(md, "Ljava/sql/ResultSet;", "getTables", .jnull("java/lang/String"), :
method getTables with signature (Ljava/lang/String;Ljava/lang/String;[Ljava/lang/String;)Ljava/sql/ResultSet; not found
Is this my misuse, or is another method needed?
Update, years later: I still haven't found a full solution, but here is a partial one. It only works for writing small data frames, and how small varies with 32/64-bit and Mac/Windows.
First, convert the data frame into a character vector of value tuples; then use INSERT statements to write the rows into Hadoop (sketched below).
On my PC (Win7 64-bit, 16 GB), if the vector 'data2hodoop' grows beyond about 50 MB, I get the error "C stack usage xxx is too close to the limit"; on my Mac the limit is even lower, and I have not found a way to raise it.
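A minimal sketch of that approach, assuming the target table test.dc_test already exists with matching columns and that your Hive version accepts multi-row INSERT INTO ... VALUES (Hive 0.14+); 'data2hodoop' is the same vector named above:

data(iris)
df <- iris

# 1. turn each row into a "(v1, v2, ...)" tuple, quoting non-numeric columns
row_to_tuple <- function(i) {
  vals <- vapply(df[i, ], function(v) {
    if (is.numeric(v)) as.character(v) else paste0("'", as.character(v), "'")
  }, character(1))
  paste0("(", paste(vals, collapse = ", "), ")")
}
data2hodoop <- paste(vapply(seq_len(nrow(df)), row_to_tuple, character(1)),
                     collapse = ", ")

# 2. write all rows in one statement; it is this string that eventually
#    hits the C-stack limit once the data frame gets large
dbSendUpdate(conn, paste("INSERT INTO TABLE test.dc_test VALUES", data2hodoop))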
I have a partial answer. Your arguments to dbWriteTable are reversed: the pattern is dbWriteTable(connection, tableName, data), and the docs read

dbWriteTable(conn, name, value, ...)

That being said, I don't find that the 'correct' form works either (at least when using Amazon's JDBC driver for Hive): it fails because the CREATE TABLE statement generated to hold the inserted data does not parse as valid HiveQL. That error at least seems self-apparent; the fix, other than creating the table manually, I'm not sure about.
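For reference, the corrected call with the connection and iris data from the question; the unqualified table name here is an assumption, since dotted names like "test.dc_test" may be mangled by the driver when it quotes the identifier:

data(iris)

# correct DBI/RJDBC argument order: connection, table name, data frame;
# whether this succeeds still depends on the Hive JDBC driver in use
dbWriteTable(conn, "dc_test", iris)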