Is it possible to override __new__ in an enum to parse strings to an instance?

Posted 2019-01-23 18:50

Question:

I want to parse strings into Python enums. Normally one would implement a parse method to do so. A few days ago I spotted the __new__ method, which is capable of returning different instances based on a given parameter.

Here is my code, which does not work:

import enum
class Types(enum.Enum):
  Unknown = 0
  Source = 1
  NetList = 2

  def __new__(cls, value):
    if (value == "src"):  return Types.Source
#    elif (value == "nl"): return Types.NetList
#    else:                 raise Exception()

  def __str__(self):
    if (self == Types.Unknown):     return "??"
    elif (self == Types.Source):    return "src"
    elif (self == Types.NetList):   return "nl"

When I execute my Python script, I get this message:

[...]
  class Types(enum.Enum):
File "C:\Program Files\Python\Python 3.4.0\lib\enum.py", line 154, in __new__
  enum_member._value_ = member_type(*args)
TypeError: object() takes no parameters

How can I return a proper instance of an enum value?

Edit 1:

This Enum is used in URI parsing, in particular for parsing the schema. So my URI would look like this:

nl:PoC.common.config
<schema>:<namespace>[.<subnamespace>*].entity

So after a simple string.split operation I would pass the first part of the URI to the enum creation.

type = Types(splitList[0])

type should now contain a value of the enum Types with 3 possible values (Unknown, Source, NetList)

If I allowed aliases in the enum's member list, it would not be possible to iterate over the enum's values alias-free.

Answer 1:

Yes, you can override the __new__() method of an enum subclass to implement a parse method if you're careful, but in order to avoid specifying the integer encoding in two places, you'll need to define the method separately, after the class, so you can reference the symbolic names defined by the enumeration.

Here's what I mean:

import enum

class Types(enum.Enum):
    Unknown = 0
    Source = 1
    NetList = 2

    def __str__(self):
        if (self == Types.Unknown):     return "??"
        elif (self == Types.Source):    return "src"
        elif (self == Types.NetList):   return "nl"
        else:                           raise TypeError(self)

def _Types_parser(cls, value):
    if not isinstance(value, str):
        # forward call to Types' superclass (enum.Enum)
        return super(Types, cls).__new__(cls, value)
    else:
        # map strings to enum values, default to Unknown
        return { 'nl': Types.NetList,
                'ntl': Types.NetList,  # alias
                'src': Types.Source,}.get(value, Types.Unknown)

setattr(Types, '__new__', _Types_parser)

print("Types('nl') ->",  Types('nl'))   # Types('nl') -> nl
print("Types('ntl') ->", Types('ntl'))  # Types('ntl') -> nl
print("Types('wtf') ->", Types('wtf'))  # Types('wtf') -> ??
print("Types(1) ->",     Types(1))      # Types(1) -> src

Update

Here's a more table-driven version that helps eliminate some of the repetitious coding that would otherwise be involved:

from collections import OrderedDict
import enum

class Types(enum.Enum):
    Unknown = 0
    Source = 1
    NetList = 2
    __str__ = lambda self: Types._value_to_str.get(self)

# define after Types class
Types.__new__ = lambda cls, value: (cls._str_to_value.get(value, Types.Unknown)
                                    if isinstance(value, str) else
                                    super(Types, cls).__new__(cls, value))
# define look-up table and its inverse
Types._str_to_value = OrderedDict((( '??', Types.Unknown),
                                   ('src', Types.Source),
                                   ('ntl', Types.NetList),  # alias
                                   ( 'nl', Types.NetList),))
Types._value_to_str = {val: key for key, val in Types._str_to_value.items()}
Types._str_to_value = dict(Types._str_to_value) # convert to regular dict (optional)

if __name__ == '__main__':
    print("Types('nl')  ->", Types('nl'))   # Types('nl')  -> nl
    print("Types('ntl') ->", Types('ntl'))  # Types('ntl') -> nl
    print("Types('wtf') ->", Types('wtf'))  # Types('wtf') -> ??
    print("Types(1)     ->", Types(1))      # Types(1)     -> src

    print()
    print(list(Types))  # iterate values

    import pickle  # demonstrate picklability
    print(pickle.loads(pickle.dumps(Types.NetList)) == Types.NetList)  # -> True


Answer 2:

The __new__ method on your enum.Enum type is used for creating new instances of the enum values, that is, the Types.Unknown, Types.Source, etc. singleton instances. The enum call (e.g. Types('nl')) is handled by EnumMeta.__call__, which you could subclass.
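To make that distinction concrete, here is a minimal sketch. The calls list and the two variable names at the bottom are illustrative and not part of the question's code. It records every invocation of __new__ and shows that the calls happen while the class is being defined, not when Types(1) is evaluated later.

```python
import enum

calls = []  # illustrative: values passed to __new__, in call order


class Types(enum.Enum):
    Unknown = 0
    Source = 1
    NetList = 2

    def __new__(cls, value):
        # Runs once per member while the class body is being processed.
        calls.append(value)
        obj = object.__new__(cls)
        obj._value_ = value
        return obj


calls_after_definition = list(calls)  # snapshot taken right after class creation
member = Types(1)                     # by-value lookup of an existing member

print(calls_after_definition)   # [0, 1, 2]
print(calls)                    # [0, 1, 2]  (the lookup added no call)
print(member is Types.Source)   # True
```

This is why the question's __new__, which tries to return Types.Source before any member exists, cannot work as a parser.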

Using name aliases fits your use case

Overriding __call__ is perhaps overkill for this situation. Instead, you can easily use name aliases:

class Types(enum.Enum):
    Unknown = 0

    Source = 1
    src = 1

    NetList = 2
    nl = 2

Here Types.nl is an alias and will return the same object as Types.NetList. You then access members by name (using Types[..] index access); so Types['nl'] works and returns Types.NetList.

Your assertion that it won't be possible to iterate the enum's values alias free is incorrect. Iteration explicitly doesn't include aliases:

Iterating over the members of an enum does not provide the aliases

Aliases are part of the Enum.__members__ ordered dictionary, if you still need access to these.

A demo:

>>> import enum
>>> class Types(enum.Enum):
...     Unknown = 0
...     Source = 1
...     src = 1
...     NetList = 2
...     nl = 2
...     def __str__(self):
...         if self is Types.Unknown: return '??'
...         if self is Types.Source:  return 'src'
...         if self is Types.NetList: return 'nl'
... 
>>> list(Types)
[<Types.Unknown: 0>, <Types.Source: 1>, <Types.NetList: 2>]
>>> list(Types.__members__)
['Unknown', 'Source', 'src', 'NetList', 'nl']
>>> Types.Source
<Types.Source: 1>
>>> str(Types.Source)
'src'
>>> Types.src
<Types.Source: 1>
>>> str(Types.src)
'src'
>>> Types['src']
<Types.Source: 1>
>>> Types.Source is Types.src
True
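If you need the alias names on their own, one way is to compare each key of __members__ with the name of the member it maps to. This is a small sketch; the variable names canonical and aliases are illustrative.

```python
import enum


class Types(enum.Enum):
    Unknown = 0
    Source = 1
    src = 1      # name alias for Source
    NetList = 2
    nl = 2       # name alias for NetList


# Iteration yields only the canonical members.
canonical = [member.name for member in Types]

# A key in __members__ is an alias when it differs from the member's own name.
aliases = [name for name, member in Types.__members__.items()
           if member.name != name]

print(canonical)  # ['Unknown', 'Source', 'NetList']
print(aliases)    # ['src', 'nl']
```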

The only thing missing here is translating unknown schemas to Types.Unknown; I'd use exception handling for that:

try:
    scheme = Types[scheme]
except KeyError:
    scheme = Types.Unknown

Overriding __call__

If you want to treat your strings as values, and use calling instead of item access, this is how you override the __call__ method of the metaclass:

class TypesEnumMeta(enum.EnumMeta):
    def __call__(cls, value, *args, **kw):
        if isinstance(value, str):
            # map strings to enum values, defaults to Unknown
            value = {'nl': 2, 'src': 1}.get(value, 0)
        return super().__call__(value, *args, **kw)

class Types(enum.Enum, metaclass=TypesEnumMeta):
    Unknown = 0
    Source = 1
    NetList = 2

Demo:

>>> class TypesEnumMeta(enum.EnumMeta):
...     def __call__(cls, value, *args, **kw):
...         if isinstance(value, str):
...             value = {'nl': 2, 'src': 1}.get(value, 0)
...         return super().__call__(value, *args, **kw)
... 
>>> class Types(enum.Enum, metaclass=TypesEnumMeta):
...     Unknown = 0
...     Source = 1
...     NetList = 2
... 
>>> Types('nl')
<Types.NetList: 2>
>>> Types('?????')
<Types.Unknown: 0>

Note that we translate the string value to integers here and leave the rest to the original Enum logic.
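As an aside that goes beyond the answer above: newer Python 3 releases give Enum a _missing_ classmethod hook, which is called only when the normal by-value lookup fails. Assuming such a version is available, a similar effect can be sketched without a metaclass. The aliases dict here is illustrative.

```python
import enum


class Types(enum.Enum):
    Unknown = 0
    Source = 1
    NetList = 2

    @classmethod
    def _missing_(cls, value):
        # Enum calls this hook only after the regular by-value lookup failed.
        if isinstance(value, str):
            aliases = {"src": cls.Source, "nl": cls.NetList}
            return aliases.get(value, cls.Unknown)
        # Non-string values: returning None lets Enum raise its ValueError.
        return None


print(repr(Types("nl")))
print(repr(Types("?????")))
print(repr(Types(1)))
```

Because members are returned directly, the integer encoding does not have to be repeated in the lookup table.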

Fully supporting value aliases

So, enum.Enum supports name aliases, but you appear to want value aliases. Overriding __call__ can offer a facsimile, but we can do better than that still by putting the definition of the value aliases into the enum class itself. What if specifying duplicate names gave you value aliases, for example?

You'll have to provide a subclass of enum._EnumDict too, as it is that class that prevents names from being re-used. We'll assume that the first enum value is the default:

class ValueAliasEnumDict(enum._EnumDict):
    def __init__(self):
        super().__init__()
        self._value_aliases = {}

    def __setitem__(self, key, value):
        if key in self:
            # register a value alias
            self._value_aliases[value] = self[key]
        else:
            super().__setitem__(key, value)

class ValueAliasEnumMeta(enum.EnumMeta):
    @classmethod
    def __prepare__(metacls, cls, bases):
        return ValueAliasEnumDict()

    def __new__(metacls, cls, bases, classdict):
        enum_class = super().__new__(metacls, cls, bases, classdict)
        enum_class._value_aliases_ = classdict._value_aliases
        return enum_class

    def __call__(cls, value, *args, **kw):
        if value not in cls._value2member_map_:
            # fall back to a registered alias, else to the first member's value
            value = cls._value_aliases_.get(value, next(iter(cls)).value)
        return super().__call__(value, *args, **kw)

This then lets you define aliases and a default in the enum class:

class Types(enum.Enum, metaclass=ValueAliasEnumMeta):
    Unknown = 0

    Source = 1
    Source = 'src'

    NetList = 2
    NetList = 'nl'

Demo:

>>> class Types(enum.Enum, metaclass=ValueAliasEnumMeta):
...     Unknown = 0
...     Source = 1
...     Source = 'src'
...     NetList = 2
...     NetList = 'nl'
... 
>>> Types.Source
<Types.Source: 1>
>>> Types('src')
<Types.Source: 1>
>>> Types('?????')
<Types.Unknown: 0>


Answer 3:

I think by far the easiest solution to your problem is to use the functional API of the Enum class, which gives more freedom when it comes to choosing names since we specify them as strings:

from enum import Enum

Types = Enum(
    value='Types',
    names=[
        ('??', 0),
        ('Unknown', 0),
        ('src', 1),
        ('Source', 1),
        ('nl', 2),
        ('NetList', 2),
    ]
)

This creates an enum with name aliases. Mind the order of the entries in the names list: the first name given for a value becomes the canonical member (and is what name returns), while later names for the same value are treated as aliases. Both can be used:

>>> Types.src
<Types.src: 1>
>>> Types.Source
<Types.src: 1>
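A short sketch of the consequence for iteration, using the same names list as above. The variable canonical_names is illustrative. Only the first name per value shows up when iterating, while every name still resolves through index access.

```python
from enum import Enum

Types = Enum(
    value='Types',
    names=[
        ('??', 0),
        ('Unknown', 0),
        ('src', 1),
        ('Source', 1),
        ('nl', 2),
        ('NetList', 2),
    ]
)

# Iteration skips aliases, so only the first name for each value appears.
canonical_names = [member.name for member in Types]

print(canonical_names)                   # ['??', 'src', 'nl']
print(Types['Source'] is Types['src'])   # True
print(Types['??'] is Types.Unknown)      # True
```

Names such as '??' are not valid Python identifiers, so they are reachable only through Types['??'], not through attribute access.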

To use the name property as a return value for str(Types.src) we replace the default version from Enum:

>>> Types.__str__ = lambda self: self.name
>>> Types.__format__ = lambda self, _: self.name
>>> str(Types.Unknown)
'??'
>>> '{}'.format(Types.Source)
'src'
>>> Types['src']
<Types.src: 1>

Note that we also replace the __format__ method which is called by str.format().



Answer 4:

I don't have enough rep to comment on the accepted answer, but in Python 2.7 with the enum34 package the following error occurs at run-time:

"unbound method <lambda>() must be called with instance MyEnum as first argument (got EnumMeta instance instead)"

I was able to correct this by changing:

# define after Types class
Types.__new__ = lambda cls, value: (cls._str_to_value.get(value, Types.Unknown)
                                    if isinstance(value, str) else
                                    super(Types, cls).__new__(cls, value))

to the following, wrapping the lambda in staticmethod():

# define after Types class
Types.__new__ = staticmethod(
    lambda cls, value: (cls._str_to_value.get(value, Types.Unknown)
                        if isinstance(value, str) else
                        super(Types, cls).__new__(cls, value)))

This code tested correctly in both Python 2.7 and 3.6.



Answer 5:

Is it possible to override __new__ in a python enum to parse strings to an instance?

In a word, yes. As martineau illustrates, you can replace the __new__ method after the class has been created (his original code):

class Types(enum.Enum):
    Unknown = 0
    Source = 1
    NetList = 2
    def __str__(self):
        if (self == Types.Unknown):     return "??"
        elif (self == Types.Source):    return "src"
        elif (self == Types.NetList):   return "nl"
        else:                           raise TypeError(self) # completely unnecessary

def _Types_parser(cls, value):
    if not isinstance(value, str):
        raise TypeError(value)
    else:
        # map strings to enum values, default to Unknown
        return { 'nl': Types.NetList,
                'ntl': Types.NetList,  # alias
                'src': Types.Source,}.get(value, Types.Unknown)

setattr(Types, '__new__', _Types_parser)

and also as his demo code illustrates, if you are not extremely careful you will break other things such as pickling, and even basic member-by-value lookup:

--> print("Types(1) ->", Types(1))  # doesn't work
Traceback (most recent call last):
  ...
TypeError: 1
--> import pickle
--> pickle.loads(pickle.dumps(Types.NetList))
Traceback (most recent call last):
  ...
TypeError: 2

Martijn showed a clever way of enhancing EnumMeta to get what we want:

class TypesEnumMeta(enum.EnumMeta):
    def __call__(cls, value, *args, **kw):
        if isinstance(value, str):
            # map strings to enum values, defaults to Unknown
            value = {'nl': 2, 'src': 1}.get(value, 0)
        return super().__call__(value, *args, **kw)

class Types(enum.Enum, metaclass=TypesEnumMeta):
    ...

but this leaves us with duplicate code, and working against the Enum type.

The only thing lacking in basic Enum support for your use-case is the ability to have one member be the default, but even that can be handled gracefully in a normal Enum subclass by creating a new class method.

The class that you want is:

class Types(enum.Enum):
    Unknown = 0
    Source = 1
    src = 1
    NetList = 2
    nl = 2
    def __str__(self):
        if self is Types.Unknown:
            return "??"
        elif self is Types.Source:
            return "src"
        elif self is Types.NetList:
            return "nl"
    @classmethod
    def get(cls, name):
        try:
            return cls[name]
        except KeyError:
            return cls.Unknown

and in action:

--> for obj in Types:
...   print(obj)
... 
??
src
nl

--> Types.get('PoC')
<Types.Unknown: 0>
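Applied to the URI format from the question, the class above could be used roughly like this. The parse_uri helper and its return shape are assumptions made for illustration, not something the question specifies.

```python
import enum


class Types(enum.Enum):
    Unknown = 0
    Source = 1
    src = 1
    NetList = 2
    nl = 2

    @classmethod
    def get(cls, name):
        # Look up by member name (aliases included); default to Unknown.
        try:
            return cls[name]
        except KeyError:
            return cls.Unknown


def parse_uri(uri):
    """Hypothetical helper: split '<schema>:<dotted.path>' into its parts."""
    scheme, _, path = uri.partition(":")
    return Types.get(scheme), path.split(".")


print(parse_uri("nl:PoC.common.config"))
print(parse_uri("xyz:PoC.common.config"))
```

Using str.partition rather than str.split keeps the helper from raising when the URI contains no colon; in that case the whole string is treated as the schema and falls back to Types.Unknown.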

If you really need value aliases, even that can be handled without resorting to metaclass hacking:

class Types(enum.Enum):
    Unknown = 0,
    Source  = 1, 'src'
    NetList = 2, 'nl'
    def __new__(cls, int_value, *value_aliases):
        obj = object.__new__(cls)
        obj._value_ = int_value
        for alias in value_aliases:
            cls._value2member_map_[alias] = obj
        return obj

print(list(Types))
print(Types(1))
print(Types('src'))

which gives us:

[<Types.Unknown: 0>, <Types.Source: 1>, <Types.NetList: 2>]
Types.Source
Types.Source