I'm reading a CSV file like this
Date,Open,High,Low,Close,Volume,Adj Close
2000-12-29,30.88,31.31,28.69,29.06,31702200,27.57
2000-12-28,30.56,31.62,30.38,31.06,25053600,29.46
2000-12-27,30.38,31.06,29.38,30.69,26437500,29.11
2000-12-26,31.50,32.19,30.00,30.94,20589500,29.34
2000-12-22,30.38,31.98,30.00,31.88,35568200,30.23
2000-12-21,27.81,30.25,27.31,29.50,46719700,27.98
2000-12-20,28.06,29.81,27.50,28.50,54440500,27.03
2000-12-19,31.81,33.12,30.12,30.62,58653700,29.05
...
2000-01-13,108.50,109.88,103.50,105.06,55779200,24.91
2000-01-12,112.25,112.25,103.69,105.62,83443600,25.05
2000-01-11,112.62,114.75,109.50,112.38,86585200,26.65
2000-01-10,108.00,116.00,105.50,115.75,91518000,27.45
2000-01-07,95.00,103.50,93.56,103.38,91755600,24.51
2000-01-06,100.16,105.00,94.69,96.00,109880000,22.76
2000-01-05,101.62,106.38,96.00,102.00,166054000,24.19
2000-01-04,115.50,118.62,105.00,107.69,116824800,25.54
2000-01-03,124.62,125.19,111.62,118.12,98114800,28.01
Full data can be download using
python -c "from pyalgotrade.tools import yahoofinance; yahoofinance.download_daily_bars('orcl', 2000, 'orcl-2000.csv')"
see http://gbeced.github.io/pyalgotrade/docs/v0.15/html/tutorial.html
I try to put this CSV data to a MySQL database using Python, Pandas, SQLAlchemy, read_csv
and to_sql
:
filename = "orcl-2000.csv"
df = pd.read_csv(filename, sep=',')
db_uri = "mysql+mysqlconnector://{user}:{password}@{host}:{port}/{db}" # or without mysqlconnector (need MySQLdb)
db_uri = db_uri.format(
user = "root",
password = "123456",
host = "127.0.0.1",
db = "test",
port = 3306
)
engine = sqlalchemy.create_engine(db_uri)
df["Date"] = pd.to_datetime(df["Date"])
df = df.set_index("Date")
print(df)
print(df.dtypes)
print(type(df.index), df.index.dtype)
print(type(df.index[0]))
df.to_sql("test_table", engine, flavor="mysql", if_exists="replace")
(see full code here)
I get the following output:
$ python main.py
Date Open High Low Close Volume Adj Close
0 2000-12-29 30.88 31.31 28.69 29.06 31702200 27.57
1 2000-12-28 30.56 31.62 30.38 31.06 25053600 29.46
2 2000-12-27 30.38 31.06 29.38 30.69 26437500 29.11
3 2000-12-26 31.50 32.19 30.00 30.94 20589500 29.34
4 2000-12-22 30.38 31.98 30.00 31.88 35568200 30.23
5 2000-12-21 27.81 30.25 27.31 29.50 46719700 27.98
6 2000-12-20 28.06 29.81 27.50 28.50 54440500 27.03
7 2000-12-19 31.81 33.12 30.12 30.62 58653700 29.05
8 2000-12-18 30.00 32.44 29.94 32.00 61640100 30.35
9 2000-12-15 29.44 30.08 28.19 28.56 120004000 27.09
10 2000-12-14 29.25 29.94 27.25 27.50 45894400 26.08
11 2000-12-13 31.94 32.00 28.25 28.38 37933600 26.91
12 2000-12-12 31.88 32.50 30.41 30.75 26481200 29.17
13 2000-12-11 30.50 32.25 30.00 31.94 50279700 30.29
14 2000-12-08 30.06 30.62 29.25 30.06 40052600 28.51
15 2000-12-07 29.62 29.94 28.12 28.31 41088300 26.85
16 2000-12-06 31.19 31.62 29.31 30.19 42125600 28.63
17 2000-12-05 29.44 31.50 28.88 31.50 59754700 29.88
18 2000-12-04 26.25 28.88 26.19 28.19 40710400 26.74
19 2000-12-01 26.38 27.88 25.50 26.44 48663500 25.08
20 2000-11-30 21.75 27.62 21.50 26.50 84386200 25.14
21 2000-11-29 23.19 23.62 21.81 22.88 75409600 21.70
22 2000-11-28 23.50 23.81 22.25 22.66 43075300 21.49
23 2000-11-27 25.44 25.81 22.88 23.12 45665200 21.93
24 2000-11-24 23.31 24.25 23.12 24.12 22443900 22.88
25 2000-11-22 23.62 24.06 22.06 22.31 53315300 21.16
26 2000-11-21 24.81 25.62 23.50 23.88 58647400 22.65
27 2000-11-20 24.31 25.88 24.00 24.75 89778400 23.48
28 2000-11-17 26.94 29.25 25.25 28.81 59636000 27.33
29 2000-11-16 28.75 29.81 27.25 27.38 37986600 25.96
.. ... ... ... ... ... ... ...
222 2000-02-14 60.88 62.25 58.62 62.19 37599800 29.49
223 2000-02-11 62.50 64.75 58.75 59.69 55774000 28.31
224 2000-02-10 60.00 62.62 58.00 62.31 45288600 29.55
225 2000-02-09 60.06 61.31 58.81 59.94 52471600 28.43
226 2000-02-08 60.75 61.44 59.00 59.56 55718000 28.25
227 2000-02-07 59.31 60.00 58.88 59.94 44691200 28.43
228 2000-02-04 57.62 58.25 56.81 57.81 40916000 27.42
229 2000-02-03 55.38 57.00 54.25 56.69 55533200 26.88
230 2000-02-02 54.94 56.00 54.00 54.31 63933000 25.76
231 2000-02-01 51.25 54.31 50.00 54.00 57105600 25.61
232 2000-01-31 47.94 50.12 47.06 49.95 68148000 23.69
233 2000-01-28 51.50 51.94 46.62 47.38 86394000 22.47
234 2000-01-27 55.81 56.69 50.00 51.81 61054000 24.57
235 2000-01-26 56.75 58.94 55.00 55.06 47569200 26.11
236 2000-01-25 55.06 57.50 54.88 56.44 53059200 26.77
237 2000-01-24 60.25 60.38 54.00 54.19 50022400 25.70
238 2000-01-21 61.50 61.50 59.00 59.69 50891000 28.31
239 2000-01-20 59.00 60.25 58.12 59.25 54526800 28.10
240 2000-01-19 56.12 58.25 54.00 57.12 49198400 27.09
241 2000-01-18 107.88 114.50 105.62 111.25 66780000 26.38
242 2000-01-14 109.00 111.38 104.75 106.81 57078000 25.33
243 2000-01-13 108.50 109.88 103.50 105.06 55779200 24.91
244 2000-01-12 112.25 112.25 103.69 105.62 83443600 25.05
245 2000-01-11 112.62 114.75 109.50 112.38 86585200 26.65
246 2000-01-10 108.00 116.00 105.50 115.75 91518000 27.45
247 2000-01-07 95.00 103.50 93.56 103.38 91755600 24.51
248 2000-01-06 100.16 105.00 94.69 96.00 109880000 22.76
249 2000-01-05 101.62 106.38 96.00 102.00 166054000 24.19
250 2000-01-04 115.50 118.62 105.00 107.69 116824800 25.54
251 2000-01-03 124.62 125.19 111.62 118.12 98114800 28.01
[252 rows x 7 columns]
Date datetime64[ns]
Open float64
High float64
Low float64
Close float64
Volume int64
Adj Close float64
dtype: object
(<class 'pandas.tseries.index.DatetimeIndex'>, dtype('<M8[ns]'))
<class 'pandas.tslib.Timestamp'>
Traceback (most recent call last):
File "main.py", line 28, in <module>
main()
File "main.py", line 25, in main
df.to_sql("test_table", engine, flavor="mysql", if_exists="replace")
File "/usr/local/lib/python2.7/dist-packages/pandas/core/generic.py", line 950, in to_sql
index_label=index_label)
File "/usr/local/lib/python2.7/dist-packages/pandas/io/sql.py", line 475, in to_sql
index_label=index_label)
File "/usr/local/lib/python2.7/dist-packages/pandas/io/sql.py", line 842, in to_sql
table.insert()
File "/usr/local/lib/python2.7/dist-packages/pandas/io/sql.py", line 611, in insert
self.pd_sql.execute(ins, data_list)
File "/usr/local/lib/python2.7/dist-packages/pandas/io/sql.py", line 810, in execute
return self.engine.execute(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1614, in execute
return connection.execute(statement, *multiparams, **params)
File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 662, in execute
params)
File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 761, in _execute_clauseelement
compiled_sql, distilled_params
File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 874, in _execute_context
context)
File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1024, in _handle_dbapi_exception
exc_info
File "/usr/lib/python2.7/dist-packages/sqlalchemy/util/compat.py", line 196, in raise_from_cause
reraise(type(exception), exception, tb=exc_tb)
File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 856, in _execute_context
context)
File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/default.py", line 321, in do_executemany
cursor.executemany(statement, parameters)
File "/usr/lib/python2.7/dist-packages/mysql/connector/cursor.py", line 557, in executemany
values.append(fmt % self._process_params(params))
File "/usr/lib/python2.7/dist-packages/mysql/connector/cursor.py", line 344, in _process_params
return self._process_params_dict(params)
File "/usr/lib/python2.7/dist-packages/mysql/connector/cursor.py", line 335, in _process_params_dict
"Failed processing pyformat-parameters; %s" % err)
sqlalchemy.exc.ProgrammingError: (ProgrammingError) Failed processing pyformat-parameters; 'MySQLConverter' object has no attribute '_timestamp_to_mysql' u'INSERT INTO test_table (`index`, `Date`, `Open`, `High`, `Low`, `Close`, `Volume`, `Adj Close`) VALUES (%(index)s, %(Date)s, %(Open)s, %(High)s, %(Low)s, %(Close)s, %(Volume)s, %(Adj Close)s)' ({'index': 0, 'High': 31.31, 'Adj Close': 27.57, 'Volume': 31702200, 'Low': 28.69, 'Date': Timestamp('2000-12-29 00:00:00'), 'Close': 29.06, 'Open': 30.88}, {'index': 1, 'High': 31.62, 'Adj Close': 29.46, 'Volume': 25053600, 'Low': 30.38, 'Date': Timestamp('2000-12-28 00:00:00'), 'Close': 31.06, 'Open': 30.56}, {'index': 2, 'High': 31.06, 'Adj Close': 29.11, 'Volume': 26437500, 'Low': 29.38, 'Date': Timestamp('2000-12-27 00:00:00'), 'Close': 30.69, 'Open': 30.38}, {'index': 3, 'High': 32.19, 'Adj Close': 29.34, 'Volume': 20589500, 'Low': 30.0, 'Date': Timestamp('2000-12-26 00:00:00'), 'Close': 30.94, 'Open': 31.5}, {'index': 4, 'High': 31.98, 'Adj Close': 30.23, 'Volume': 35568200, 'Low': 30.0, 'Date': Timestamp('2000-12-22 00:00:00'), 'Close': 31.88, 'Open': 30.38}, {'index': 5, 'High': 30.25, 'Adj Close': 27.98, 'Volume': 46719700, 'Low': 27.31, 'Date': Timestamp('2000-12-21 00:00:00'), 'Close': 29.5, 'Open': 27.81}, {'index': 6, 'High': 29.81, 'Adj Close': 27.03, 'Volume': 54440500, 'Low': 27.5, 'Date': Timestamp('2000-12-20 00:00:00'), 'Close': 28.5, 'Open': 28.06}, {'index': 7, 'High': 33.12, 'Adj Close': 29.05, 'Volume': 58653700, 'Low': 30.12, 'Date': Timestamp('2000-12-19 00:00:00'), 'Close': 30.62, 'Open': 31.81} ... displaying 10 of 252 total bound parameter sets ... {'index': 250, 'High': 118.62, 'Adj Close': 25.54, 'Volume': 116824800, 'Low': 105.0, 'Date': Timestamp('2000-01-04 00:00:00'), 'Close': 107.69, 'Open': 115.5}, {'index': 251, 'High': 125.19, 'Adj Close': 28.01, 'Volume': 98114800, 'Low': 111.62, 'Date': Timestamp('2000-01-03 00:00:00'), 'Close': 118.12, 'Open': 124.62})
Column Date
type is datetime64[ns]
.
SQLAlchemy doesn't seems to like this kind of Numpy type so it raises:
sqlalchemy.exc.ProgrammingError: (ProgrammingError) Failed processing pyformat-parameters; 'MySQLConverter' object has no attribute '_timestamp_to_mysql'
How can I cleanly avoid this kind of error ?
I found by simply reformatting the dates as strings solves the problem.
MySQL still recognized it as a date rather than VarChar. I ran into the exact same problem at work, and this solution has been working.
I'm not very familiar with MySQL Connector, but according to this, you should be able to add a datetime64 converter using something like
Since all datetime64 dtypes are 8-bytes, they can be viewed and stored as 8-byte integers. I'm not sure what facilities MySQL Connector provides for pulling the data back out as
datetime64
s. But if all else fails, you can convert the 8-byte integers back intodatetime64[ns]
like this: