Using revoscalepy to insert data into a database

Published 2019-08-26 08:26

Question:

Ahoi there,

Is there a way to use the revoscalepy package to insert values into a table?

I would expect something along the lines of:

import pandas as pd
from revoscalepy import rx_write_to_db, RxOdbcData

a_df = pd.DataFrame([[0, 1], [2, 3]], columns=[...])

rx_write_to_db(RxOdbcData(connection_string=con_str, ...), data=a_df)

But I couldn't find anything like this. The closest option appears to be rx_write_object, which dumps the dataframe into the table as a binary object. More information about its usage can be found on the R-package site. This, however, does not solve my issue, as I want the data stored as regular rows and columns rather than in one binary blob.

Some context on the problem: During feature generation I create multiple features which I want to store inside the database for later use. In theory I could build one final dataframe containing all my features and the metadata and use some triggers to dump the data into the right tables, but before I do that, I would rather install pymssql.

Any clues?

PS: If anyone knows the correct tags for a question like this, let me know...

Answer 1:

I think what you are looking for is rx_featurize from the microsoftml package (installed alongside revoscalepy).

Once you have your data frame, create an RxSqlServerData or RxOdbcData object with the connection string and table name arguments.

Then simply call rx_featurize, passing the data frame as input and the Rx...Data object as output (and specifying whether you want to overwrite the table).

http://docs.microsoft.com/en-us/machine-learning-server/python-reference/microsoftml/rx-featurize

import pandas as pd
from revoscalepy import RxOdbcData
from microsoftml import rx_featurize

a_df = pd.DataFrame([[0, 1], [2, 3]], columns=[...])

rx_featurize(data=a_df, output_data=RxOdbcData(connection_string=con_str, table=tablename), overwrite=True)
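
For comparison, the fallback mentioned in the question (a plain database driver such as pymssql) comes down to a parameterized INSERT executed once per row, so every row of the data frame lands as its own table row instead of one binary blob. The following is only a rough sketch under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for SQL Server so that it runs anywhere, and the table name `features`, the column names `a` and `b`, and the helper `insert_rows` are made up for illustration. With pymssql the placeholder would be `%s` rather than `?`.

```python
import sqlite3


def insert_rows(conn, table, columns, rows):
    """Insert every tuple in `rows` as a separate table row.

    `table` and `columns` are interpolated into the SQL text, so they must
    come from trusted code, never from user input. The values themselves
    are passed as bound parameters.
    """
    col_list = ", ".join(columns)
    # sqlite3 uses "?" placeholders; pymssql would use "%s" here.
    placeholders = ", ".join("?" for _ in columns)
    sql = f"INSERT INTO {table} ({col_list}) VALUES ({placeholders})"
    cur = conn.cursor()
    cur.executemany(sql, rows)
    conn.commit()


# In-memory database standing in for the real server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE features (a INTEGER, b INTEGER)")

# Same values as the data frame in the question. With pandas, the rows
# could be obtained via a_df.itertuples(index=False, name=None).
rows = [(0, 1), (2, 3)]
insert_rows(conn, "features", ["a", "b"], rows)

stored = conn.execute("SELECT a, b FROM features ORDER BY a").fetchall()
```

Here `stored` holds the rows read back from the table, which makes it easy to check that each data frame row became one table row.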