I have a large data.frame
(15 columns and 100,000 rows) in an existing R session that I want to send to a Q/KDB instance. From KDB's cookbook, the possible solutions are:
RServer for Q: use KDB to create a new R instance which shares its memory space. This doesn't work because my data is in an existing instance of R.
RServe: run an R server and use TCP/IP to communicate with a Q/KDB client. This does not work because, as per RServe's documentation, "every connection has a separate workspace and working directory", so I presume it does not see my existing data.
R Math Library: access R's functionality via a math library without needing an instance of R. This does not work because my data is already in an instance of R.
So any other ideas on how to send data from R to Q/KDB?
Open a port in Q. I start Q with a batch file:
load qserver.dll
Then use these
ex2 can take multiple arguments, so you can build queries with R variables and strings.
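For reference, opening a listening port in q can be done at startup or from inside a running session; a minimal sketch (port 5001 is an arbitrary choice, not from the original answer):

```q
/ start q listening on port 5001 from the command line:  q -p 5001
/ or set the port from inside a running q session:
\p 5001
```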
Edit: that's for calling R from Q; here's R to Q:
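As a hedged sketch of the R-to-Q direction, assuming q is listening on port 5001: the `open_connection`/`execute`/`close_connection` helpers below are from the rkdb package, which is not part of the original answer, and the q lambda and table name `mytab` are illustrative only.

```r
# Sketch only: assumes the rkdb package is installed and q is listening on 5001
library(rkdb)

h <- open_connection("localhost", 5001)   # open an IPC handle to the q process

df <- data.frame(sym = c("a", "b"), px = c(1.5, 2.5),
                 stringsAsFactors = FALSE)

# pass df as an argument to a q lambda, which stores it server-side as `mytab
execute(h, "{`mytab set x}", df)

close_connection(h)
```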
2nd Edit: improved algo:
This will work for a matrix; for a data frame, save a vector containing the type of each column, convert the data frame to a matrix, import the matrix into Q, and then cast the columns back to their types.
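The data-frame step described above might look like this in R (illustrative sketch; the import-and-cast step on the q side is omitted):

```r
# Remember each column's type, then flatten the data frame to a matrix
df <- data.frame(a = 1:3, b = c(1.5, 2.5, 3.5), c = c("x", "y", "z"),
                 stringsAsFactors = FALSE)

types <- vapply(df, function(col) class(col)[1], character(1))

m <- as.matrix(df)   # coerces every column to one common type (character here)

# after importing m into q, use `types` to cast each column back
```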
Note that this algo is approximately O(rows * cols^1.1), so you'll need to chop the columns up into multiple matrices if you have more than about 20 in order to keep it O(rows * cols). But for your example of 100,000 rows and 15 columns it takes 10 seconds, so further optimization may not be necessary.
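The column-chopping mentioned above could be sketched as follows (hypothetical helper code, using 20 columns per chunk per the note; `df` stands for the wide data frame):

```r
# Split column indices into groups of at most 20 and send each piece separately
chunk_size <- 20
idx_groups <- split(seq_len(ncol(df)),
                    ceiling(seq_len(ncol(df)) / chunk_size))
pieces <- lapply(idx_groups, function(ix) df[, ix, drop = FALSE])

# import each piece as its own matrix, then join the resulting tables in q
```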