I'm trying to get zipline working with non-US, intraday data, that I've loaded into a pandas DataFrame:
BARC HSBA LLOY STAN
Date
2014-07-01 08:30:00 321.250 894.55 112.105 1777.25
2014-07-01 08:32:00 321.150 894.70 112.095 1777.00
2014-07-01 08:34:00 321.075 894.80 112.140 1776.50
2014-07-01 08:36:00 321.725 894.80 112.255 1777.00
2014-07-01 08:38:00 321.675 894.70 112.290 1777.00
I've followed moving-averages tutorial here, replacing "AAPL" with my own symbol code, and the historical calls with "1m" data instead of "1d".
Then I do the final call using algo_obj.run(DataFrameSource(mydf))
, where mydf
is the dataframe above.
However there are all sorts of problems arising related to TradingEnvironment. According to the source code:
# This module maintains a global variable, environment, which is
# subsequently referenced directly by zipline financial
# components. To set the environment, you can set the property on
# the module directly:
# from zipline.finance import trading
# trading.environment = TradingEnvironment()
#
# or if you want to switch the environment for a limited context
# you can use a TradingEnvironment in a with clause:
# lse = TradingEnvironment(bm_index="^FTSE", exchange_tz="Europe/London")
# with lse:
# the code here will have lse as the global trading.environment
# algo.run(start, end)
However, using the context doesn't seem to fully work. I still get errors, for example stating that my timestamps are before the market open (and indeed, looking at trading.environment.open_and_close
the times are for the US market.
My question is, has anybody managed to use zipline with non-US, intra-day data? Could you point me to a resource and ideally example code on how to do this?
n.b. I've seen there are some tests on github that seem related to the trading calendars (tradincalendar_lse.py, tradingcalendar_tse.py , etc) - but this appears to only handle data at the daily level. I would need to fix:
- open/close times
- reference data for the benchmark
- and probably more ...
I've got this working after fiddling around with the tutorial notebook. Code sample below. It's using the DF
mid
, as described in the original question. A few points bear mentioning:Trading Calendar I create one manually and assign to
trading.environment
, by using non_working_days in tradingcalendar_lse.py. Alternatively you could create one that fits your data exactly (however could be a problem for out-of-sample data). There are two fields that you need to define:trading_days
andopen_and_closes
.sim_params There is a problem with the default start/end values because they aren't timezone aware. So you must create a sim_params object and pass start/end parameters with a timezone.
Also,
run()
must be called with the argument overwrite_sim_params=False ascalculate_first_open
/close
raise timestamp errors.I should mention that it's also possible to pass pandas Panel data, with fields open,high,low,close,price and volume in the minor_axis. But in this case, the former fields are mandatory - otherwise errors are raised.
Note that this code only produces a daily summary of the performance. I'm sure there must be a way to get the result at a minute resolution (I thought this was set by
emission_rate
, but apparently it's not). If anybody knows please comment and I'll update the code. Also, not sure what the api call is to call 'analyze' (i.e. when using%%zipline
magic in IPython, as in the tutorial, theanalyze()
method gets automatically called. How do I do this manually?)@Luciano
You can add
analyze(None, perf_manual)
at the end of your code for automatically running the analyze process.