How to implement the composition pattern? I have a class Container which has an attribute object Contained. I would like to redirect/allow access to all methods of the Contained class from Container by simply calling my_container.some_contained_method(). Am I doing the right thing in the right way?
I use something like:
class Container:
    def __init__(self):
        self.contained = Contained()

    def __getattr__(self, item):
        if item in self.__dict__:  # some overridden
            return self.__dict__[item]
        else:
            return self.contained.__getattr__(item)  # redirection
Background:
I am trying to build a class (Indicator) that adds to the functionality of an existing class (pandas.DataFrame). Indicator will have all the methods of DataFrame. I could use inheritance, but I am following the "favor composition over inheritance" advice (see, e.g., the answers in: python: inheriting or composition). One reason not to inherit is that the base class is not serializable and I need to serialize.
I have found this, but I am not sure if it fits my needs.
Caveats:
- DataFrames have a lot of attributes. If a DataFrame attribute is a number, you probably just want to return that number. But if the DataFrame attribute is a DataFrame, you probably want to return a Container. What should we do if the DataFrame attribute is a Series or a descriptor? To implement Container.__getattr__ properly, you really have to write unit tests for each and every attribute.
- Unit testing is also needed for __getitem__.
- You'll also have to define and unit test __setattr__, __setitem__, __iter__, __len__, etc.
- Pickling is a form of serialization, so if DataFrames are picklable, I'm not sure how Containers really help with serialization.
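Two of these caveats can be made concrete with a minimal sketch that uses a plain list as a stand-in for the DataFrame (the class name ListContainer is hypothetical). Python looks up special methods such as __len__ and __iter__ on the type when they are invoked implicitly, so __getattr__ never sees them and each one has to be forwarded explicitly. The guard inside __getattr__ is an assumption about what a pickle-friendly wrapper needs: unpickling builds the object without calling __init__ and then probes it for __setstate__, and at that point an unguarded self.contained would re-enter __getattr__ again and again.

```python
import pickle


class ListContainer:
    """Hypothetical stand-in: wraps a list instead of a DataFrame."""

    def __init__(self, items):
        self.contained = items

    def __getattr__(self, name):
        # Only reached when normal lookup fails. Refuse special names and
        # 'contained' itself, so probes on a half-built instance (as done
        # by copy/pickle) raise AttributeError instead of recursing.
        if name == "contained" or (name.startswith("__") and name.endswith("__")):
            raise AttributeError(name)
        return getattr(self.contained, name)

    # Implicit special-method calls (len(), iter(), obj[i]) bypass
    # __getattr__, so each one needs an explicit forwarding method.
    def __len__(self):
        return len(self.contained)

    def __iter__(self):
        return iter(self.contained)

    def __getitem__(self, index):
        return self.contained[index]


c = ListContainer([3, 1, 2])
print(len(c), list(c), c[0])   # 3 [3, 1, 2] 3
print(c.count(1))              # 1, delegated through __getattr__

restored = pickle.loads(pickle.dumps(c))
print(restored.contained)      # [3, 1, 2]
```

A real DataFrame wrapper would need the same treatment for every special method it relies on, which is the unit-testing burden the list above describes.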
Some comments:
__getattr__ is only called if normal attribute lookup fails, for example when the attribute is not in self.__dict__. So you do not need if item in self.__dict__ in your __getattr__.
self.contained.__getattr__(item) calls self.contained's __getattr__ method directly. That is usually not what you want to do, because it circumvents the whole Python attribute lookup mechanism. For example, it ignores the possibility that the attribute could be in self.contained.__dict__, or in the __dict__ of one of the bases of self.contained.__class__, or that item refers to a descriptor. Instead use getattr(self.contained, item).
import pandas
import numpy as np


def tocontainer(func):
    # Wrap func so that whatever it returns is re-wrapped in a Container.
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        return Container(result)
    return wrapper


class Container(object):
    def __init__(self, df):
        self.contained = df

    def __getitem__(self, item):
        result = self.contained[item]
        # Re-wrap only if indexing returned the same type as the contained object.
        if isinstance(result, type(self.contained)):
            result = Container(result)
        return result

    def __getattr__(self, item):
        # getattr applies the normal attribute lookup to the contained object.
        result = getattr(self.contained, item)
        if callable(result):
            result = tocontainer(result)
        return result

    def __repr__(self):
        return repr(self.contained)
Here is some random code to test if -- at least superficially -- Container delegates to DataFrames properly and returns Containers:
df = pandas.DataFrame(
    [(1, 2), (1, 3), (1, 4), (2, 1), (2, 2)], columns=['col1', 'col2'])
df = Container(df)
df['col1'][3] = 0
print(df)
#    col1  col2
# 0     1     2
# 1     1     3
# 2     1     4
# 3     2     1
# 4     2     2
gp = df.groupby('col1').aggregate(np.count_nonzero)
print(gp)
#       col2
# col1
# 1        3
# 2        2
print(type(gp))
# <class '__main__.Container'>
print(type(gp[gp.col2 > 2]))
# <class '__main__.Container'>
tf = gp[gp.col2 > 2].reset_index()
print(type(tf))
# <class '__main__.Container'>
result = df[df.col1 == tf.col1]
print(type(result))
# <class '__main__.Container'>
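One thing this test does not exercise: tocontainer wraps the result of every callable attribute, so a method that returns a plain number would also come back as a Container. A possible variant, sketched here with a hypothetical Table class standing in for DataFrame, re-wraps a result only when it has the same type as the contained object, mirroring what __getitem__ already does. The _wrap helper and the guard on the name contained are illustrative additions, not part of the code above.

```python
import functools


class Table:
    """Hypothetical stand-in for DataFrame: a thin wrapper around a list."""

    def __init__(self, rows):
        self.rows = list(rows)

    def head(self, n):
        return Table(self.rows[:n])   # returns another Table

    def total(self):
        return sum(self.rows)         # returns a plain number


class Container:
    def __init__(self, obj):
        self.contained = obj

    def _wrap(self, result):
        # Re-wrap only results of the contained object's own type.
        if isinstance(result, type(self.contained)):
            return Container(result)
        return result

    def __getattr__(self, item):
        if item == "contained":       # guard against half-initialised instances
            raise AttributeError(item)
        result = getattr(self.contained, item)
        if callable(result):
            @functools.wraps(result)
            def wrapper(*args, **kwargs):
                return self._wrap(result(*args, **kwargs))
            return wrapper
        return self._wrap(result)


c = Container(Table([1, 2, 3, 4]))
print(type(c.head(2)).__name__)   # Container
print(c.total())                  # 10
print(c.head(2).total())          # 3
```

Whether Series results should be wrapped as well is the open question raised in the caveats; this sketch leaves anything that is not of the contained type untouched.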
I found unbutbu's answer very useful for my own application, but I ran into issues displaying it properly in a Jupyter notebook. I found that adding the following methods to the class solved the issue.
    def _repr_html_(self):
        return self.contained._repr_html_()

    def _repr_latex_(self):
        return self.contained._repr_latex_()
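A likely reason these explicit methods help, offered as a guess rather than a confirmed diagnosis: without them, _repr_html_ is resolved through __getattr__, is callable, and therefore gets wrapped by tocontainer, so calling it returns a Container instead of a string. The sketch below reproduces that with a hypothetical FakeFrame class in place of DataFrame; HtmlContainer is likewise just an illustrative name.

```python
def tocontainer(func):
    def wrapper(*args, **kwargs):
        return Container(func(*args, **kwargs))
    return wrapper


class FakeFrame:
    """Hypothetical stand-in for DataFrame with an HTML representation."""

    def _repr_html_(self):
        return "<table><tr><td>1</td></tr></table>"


class Container:
    def __init__(self, obj):
        self.contained = obj

    def __getattr__(self, item):
        result = getattr(self.contained, item)
        if callable(result):
            result = tocontainer(result)
        return result


class HtmlContainer(Container):
    def _repr_html_(self):
        # Defined on the class, so __getattr__ and its wrapping are bypassed.
        return self.contained._repr_html_()


plain = Container(FakeFrame())._repr_html_()
fixed = HtmlContainer(FakeFrame())._repr_html_()
print(type(plain).__name__)   # Container
print(type(fixed).__name__)   # str
```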