How to use a custom calendar in a custom zipline b

Posted 2019-04-02 05:01

Question:

I have the following code in my viacsv.py file that aims to allow a custom bundle to be ingested:

#
# Ingest stock csv files to create a zipline data bundle

import os

import numpy  as np
import pandas as pd
import datetime

boDebug=True # Set True to get trace messages

from zipline.utils.cli import maybe_show_progress

def viacsv(symbols,start=None,end=None):

    # Materialize this in memory so that we can iterate over it more than once.
    # (Because it could be a generator and they live only once)
    tuSymbols = tuple(symbols)

    if boDebug:
        print "entering viacsv.  tuSymbols=",tuSymbols

    # Define our custom ingest function
    def ingest(environ,
               asset_db_writer,
               minute_bar_writer,  # unused
               daily_bar_writer,
               adjustment_writer,
               calendar,
               cache,
               show_progress,
               output_dir,
               # pass these as defaults to make them 'nonlocal' in py2
               start=start,
               end=end):

        if boDebug:
            print "entering ingest and creating blank dfMetadata"

        dfMetadata = pd.DataFrame(np.empty(len(tuSymbols), dtype=[
            ('start_date', 'datetime64[ns]'),
            ('end_date', 'datetime64[ns]'),
            ('auto_close_date', 'datetime64[ns]'),
            ('symbol', 'object'),
        ]))

        if boDebug:
            print "dfMetadata",type(dfMetadata)
            print dfMetadata.describe
            print

        # We need to feed something that is iterable - like a list or a generator -
        # that is a tuple with an integer for sid and a DataFrame for the data to
        # daily_bar_writer

        liData=[]
        iSid=0
        for S in tuSymbols:
            IFIL="~/notebooks/csv/"+S+".csv"
            if boDebug:
               print "S=",S,"IFIL=",IFIL
            dfData=pd.read_csv(IFIL,index_col='Date',parse_dates=True).sort_index()
            if boDebug:
               print "read_csv dfData",type(dfData),"length",len(dfData)
               print
            dfData.rename(
                columns={
                    'Open': 'open',
                    'High': 'high',
                    'Low': 'low',
                    'Close': 'close',
                    'Volume': 'volume',
                    'Adj Close': 'price',
                },
                inplace=True,
            )
            dfData['volume']=dfData['volume']/1000
            liData.append((iSid,dfData))

            # the start date is the date of the first trade
            start_date = dfData.index[0]
            if boDebug:
                print "start_date",type(start_date),start_date

            # the end date is the date of the last trade
            end_date = dfData.index[-1]
            if boDebug:
                print "end_date",type(end_date),end_date

            # The auto_close date is the day after the last trade.
            ac_date = end_date + pd.Timedelta(days=1)
            if boDebug:
                print "ac_date",type(ac_date),ac_date

            # Update our meta data
            dfMetadata.iloc[iSid] = start_date, end_date, ac_date, S

            iSid += 1

        if boDebug:
            print "liData",type(liData),"length",len(liData)
            print liData
            print
            print "Now calling daily_bar_writer"

        daily_bar_writer.write(liData, show_progress=False)

        # Hardcode the exchange to "YAHOO" for all assets and (elsewhere)
        # register "YAHOO" to resolve to the NYSE calendar, because these are
        # all equities and thus can use the NYSE calendar.
        dfMetadata['exchange'] = "YAHOO"

        if boDebug:
            print "returned from daily_bar_writer"
            print "calling asset_db_writer"
            print "dfMetadata",type(dfMetadata)
            print dfMetadata
            print

        # Not sure why symbol_map is needed
        symbol_map = pd.Series(dfMetadata.symbol.index, dfMetadata.symbol)
        if boDebug:
            print "symbol_map",type(symbol_map)
            print symbol_map
            print

        asset_db_writer.write(equities=dfMetadata)

        if boDebug:
            print "returned from asset_db_writer"
            print "calling adjustment_writer"

        adjustment_writer.write()

        if boDebug:
            print "returned from adjustment_writer"
            print "now leaving ingest function"

    if boDebug:
       print "about to return ingest function"
    return ingest

My problem is that the data I am feeding in is not US data but Australian equity data, so it follows Australian holidays rather than US holidays. The code above seems to default to a US trading calendar: it tells me I cannot pass in data for days when US markets are meant to be closed, and vice versa. How can I tweak the code above to take a custom calendar? To ingest the bundle, I run the following command at my terminal:

zipline ingest -b CBA.csv

Thoughts?

Answer 1:

You need to define your own calendar in zipline/utils/calendars: just copy one of the existing files (say, exchange_calendar_nyse.py) and edit it to use the required holidays. Let's say that you call this file my_own_calendar.py and the class MyOwnCalendar.
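
As a rough idea of what "edit it to use the required holidays" can look like, here is a minimal sketch using plain pandas holiday rules, which is the style the existing calendar files use. The class name IllustrativeAUHolidays is made up for this example, and the rule list is only an illustrative subset, not a verified list of Australian market holidays; check the exchange's published calendar before relying on it.

```python
import pandas as pd
from pandas.tseries.holiday import (
    AbstractHolidayCalendar,
    EasterMonday,
    GoodFriday,
    Holiday,
    next_monday,
)
from pandas.tseries.offsets import CustomBusinessDay


class IllustrativeAUHolidays(AbstractHolidayCalendar):
    # Hypothetical, incomplete rule set for illustration only.
    rules = [
        # If 26 January falls on a weekend, observe it on the next Monday.
        Holiday("Australia Day", month=1, day=26, observance=next_monday),
        GoodFriday,
        EasterMonday,
        Holiday("Anzac Day", month=4, day=25),
        Holiday("Christmas Day", month=12, day=25),
    ]


# A business-day offset that skips weekends and the holidays above.
au_day = CustomBusinessDay(calendar=IllustrativeAUHolidays())

# Sessions around Australia Day 2019 (26 January 2019 was a Saturday).
sessions = pd.date_range("2019-01-21", "2019-01-31", freq=au_day)
holidays_2019 = IllustrativeAUHolidays().holidays("2019-01-01", "2019-12-31")
```

Here sessions skips Monday 28 January 2019, the observed holiday, while keeping the other weekdays in the range.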

Please note that there are 2 (or 3) other steps you need to take:

1) Register your calendar in zipline/utils/calendars/calendar_utils.py: you can do it by adding an entry to _default_calendar_factories and, if you need an alias, to _default_calendar_aliases. For example, to map MyOwnCalendar (from my_own_calendar.py) to 'OWN' with an alias 'MY_CALENDAR':

_default_calendar_factories = {
    'NYSE': NYSEExchangeCalendar,
    'CME': CMEExchangeCalendar,
    ...
    'OWN': MyOwnCalendar
}

_default_calendar_aliases = {
    'NASDAQ': 'NYSE',
    ...
    'MY_CALENDAR': 'OWN'
}
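
To make the roles of the two dictionaries concrete, here is a toy model of a factories-plus-aliases lookup. It is not zipline's actual implementation; ToyCalendarRegistry and PlaceholderCalendar are hypothetical names that only illustrate the idea that an alias is followed to a canonical name, and the canonical name maps to a class that is instantiated on first use.

```python
class PlaceholderCalendar(object):
    """Stands in for a real calendar class such as MyOwnCalendar."""
    name = "OWN"


class ToyCalendarRegistry(object):
    """Toy lookup: alias -> canonical name -> factory -> cached instance."""

    def __init__(self, factories, aliases):
        self._factories = dict(factories)
        self._aliases = dict(aliases)
        self._cache = {}

    def resolve(self, name):
        # Follow aliases until a canonical name is reached.
        seen = set()
        while name in self._aliases:
            if name in seen:
                raise ValueError("alias cycle involving %r" % name)
            seen.add(name)
            name = self._aliases[name]
        return name

    def get(self, name):
        canonical = self.resolve(name)
        if canonical not in self._cache:
            if canonical not in self._factories:
                raise KeyError("no calendar registered as %r" % canonical)
            # Instantiate lazily and reuse the instance afterwards.
            self._cache[canonical] = self._factories[canonical]()
        return self._cache[canonical]


registry = ToyCalendarRegistry(
    factories={"OWN": PlaceholderCalendar},
    aliases={"MY_CALENDAR": "OWN"},
)
```

With this toy registry, registry.get("MY_CALENDAR") and registry.get("OWN") return the same PlaceholderCalendar instance, which is the behaviour the alias entry is meant to give you.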

2) You need to edit .zipline/extension.py (you will find .zipline in your home directory; to see your home directory under Windows, open a DOS shell and type echo %USERPROFILE%):

from zipline.data.bundles import register
# Also import viacsv from wherever you saved viacsv.py

# List the tickers of the market you defined
tickers_of_interest = {'TICKER1', 'TICKER2', ...}

register('my_market', viacsv(tickers_of_interest), calendar_name="OWN")

With those steps, you should be able to ingest your bundle simply by typing zipline ingest -b my_market.
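
Before running the ingest, it may help to check that the dates in a CSV actually line up with the sessions your calendar expects. The sketch below uses only pandas; compare_dates_to_sessions is a hypothetical helper, and the example dates are hand-picked for illustration (a weekday calendar that is closed on 21 January 2019, compared with data that has no bar on 28 January 2019).

```python
import pandas as pd


def compare_dates_to_sessions(data_dates, sessions):
    """Return (unexpected, missing) dates.

    unexpected: dates present in the data that are not calendar sessions.
    missing: calendar sessions inside the data's date range with no data.
    """
    data_dates = pd.DatetimeIndex(data_dates).normalize()
    sessions = pd.DatetimeIndex(sessions).normalize()

    unexpected = data_dates.difference(sessions)

    # Only sessions between the first and last data date can be "missing".
    first, last = data_dates.min(), data_dates.max()
    in_range = sessions[(sessions >= first) & (sessions <= last)]
    missing = in_range.difference(data_dates)
    return unexpected, missing


weekdays = pd.date_range("2019-01-14", "2019-02-01", freq="B")

# Calendar sessions: weekdays with one closed day (illustrative).
sessions = weekdays.drop(pd.Timestamp("2019-01-21"))

# Data dates: weekdays with a different day absent (illustrative).
data_dates = weekdays.drop(pd.Timestamp("2019-01-28"))

unexpected, missing = compare_dates_to_sessions(data_dates, sessions)
```

In this example, unexpected holds 21 January 2019 and missing holds 28 January 2019. Both results should be empty once the calendar matches the data.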

3) The problem I personally had was that I needed even more control over the trading calendar, because the superclass TradingCalendar assumes that Saturdays and Sundays are non-trading days, and this is not true for every market or asset class. A wrong calendar definition will cause an exception at ingestion time. For example, to get a calendar for a market that trades around the clock, 7 days a week, I hacked the calendar as follows:

from datetime import time
from pytz import timezone
from pandas import date_range
from .trading_calendar import TradingCalendar, HolidayCalendar

from zipline.utils.memoize import lazyval

from pandas.tseries.offsets import CustomBusinessDay

class MyOwnCalendar(TradingCalendar):
    """
    Round the clock calendar: 7/7, 24/24
    """

    @property
    def name(self):
        return "OWN"

    @property
    def tz(self):
        return timezone("Europe/London")

    @property
    def open_time(self):
        return time(0)

    @property
    def close_time(self):
        return time(23, 59)

    @property
    def regular_holidays(self):
        return []

    @property
    def special_opens(self):
        return []

    def sessions_in_range(self, start_session, last_session):
        return date_range(start_session, last_session)

    @lazyval
    def day(self):
        return CustomBusinessDay(
            holidays=self.adhoc_holidays,
            calendar=self.regular_holidays,
            weekmask="Mon Tue Wed Thu Fri Sat Sun",
        )
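
The effect of the weekmask argument can be seen with pandas alone, outside zipline. This is a small sketch with arbitrary example dates, comparing the default Monday-to-Friday offset with the seven-day one used in the day property above.

```python
import pandas as pd
from pandas.tseries.offsets import CustomBusinessDay

# Default weekmask: Monday to Friday only.
weekdays_only = CustomBusinessDay()

# Seven-day weekmask: Saturdays and Sundays count as sessions too.
every_day = CustomBusinessDay(weekmask="Mon Tue Wed Thu Fri Sat Sun")

# 29 March 2019 was a Friday, so this range spans a weekend.
start, end = "2019-03-29", "2019-04-02"
weekday_sessions = pd.date_range(start, end, freq=weekdays_only)
all_sessions = pd.date_range(start, end, freq=every_day)
```

Here weekday_sessions leaves out 30 and 31 March, while all_sessions contains every calendar day in the range, which is what a market that trades around the clock needs.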