I have some monthly weather data that I want to insert into an Oracle database table but I want to insert the corresponding records in a batch in order to be more efficient. Can anyone advise as to how I'd go about doing this in Python?
For example let's say my table has four fields: a station ID, a date, and two value fields. The records are uniquely identified by the station ID and date fields (composite key). The values I'll have to insert for each station will be kept in a list with X number of full years worth of data, so for example if there are two years of values then the value lists will contain 24 values.
I assume that below is the way I'd do this if I wanted to insert the records one at a time:
connection_string = "scott/tiger@testdb"
connection = cx_Oracle.Connection(connection_string)
cursor = cx_Oracle.Cursor(connection)
station_id = 'STATION_1'
start_year = 2000
temps = [ 1, 3, 5, 7, 9, 1, 3, 5, 7, 9, 1, 3 ]
precips = [ 2, 4, 6, 8, 2, 4, 6, 8, 2, 4, 6, 8 ]
number_of_years = len(temps) / 12
for i in range(number_of_years):
for j in range(12):
# make a date for the first day of the month
date_value = datetime.date(start_year + i, j + 1, 1)
index = (i * 12) + j
sql_insert = 'insert into my_table (id, date_column, temp, precip) values (%s, %s, %s, %s)', (station_id, date_value, temps[index], precips[index]))
cursor.execute(sql_insert)
connection.commit()
Is there a way to do what I'm doing above but in a way that performs a batch insert in order to increase efficiency? BTW my experience is with Java/JDBC/Hibernate so if someone can give an explanation/example which compares to the Java approach then it'd be especially helpful.
EDIT: Perhaps I need to use cursor.executemany() as described here?
Thanks in advance for any suggestions, comments, etc.
fyi my test result:
I insert into 5000 rows. 3 columns per row.
python code, which setup the sql like insert all into t(a,b,c) select :1, :2, :3 from dual union all select :4, :5: :6 from daul...
The python code to setup this long sql, it cost 0.145329 seconds.
I test my code on a very old sun machine. cpu: 1415 MH.
in the third case, I checked the database side, the wait event is "SQL*Net more data from client". which means the server is waiting for more data from client.
The result of the third method is unbelievable for me without the test.
so the short suggestion from me is just to use executemany.
Use
Cursor.prepare()
andCursor.executemany()
.From the cx_Oracle documentation:
Thus, using the above two functions, your code becomes:
Also see Oracle's Mastering Oracle+Python series of articles.
I would create a large SQL insert statement using union:
You can build the string in python and give it to oracle as one statement to execute.
Here's what I've come up with which appears to work well (but please comment if there's a way to improve this):
As one of the comments says, consider using
INSERT ALL
. Supposedly it'll be significantly faster than usingexecutemany()
.For example:
http://www.techonthenet.com/oracle/questions/insert_rows.php