Customizing Classes with Special Methods

We covered two important aspects of methods in preceding sections of this chapter. The first is that methods must be bound (to an instance of their corresponding class) before they can be invoked. The second is that two special methods provide the functionality of constructors and destructors, namely __init__() and __del__(), respectively. In fact, __init__() and __del__() are part of a larger set of special methods which can be implemented. Some have a predefined default behavior of inaction, while others do not and should be implemented where needed. These special methods allow for a powerful form of extending classes in Python. In particular, they allow for emulating standard types and overloading operators.
Special methods enable classes to emulate standard types by overloading standard operators such as +, *, and even the slicing subscript and mapping operator [ ]. As with most other special reserved identifiers, these methods begin and end with a double underscore ( __ ). Table 13.4 presents a list of all special methods and their descriptions.
The "Core" group of special methods denotes the basic set of special methods which can be implemented without emulating any specific type. The "Attributes" group helps manage instance attributes of your class. The "Numeric Types" set of special methods can be used to emulate various numeric operations, including those of the standard (unary and binary) operators, conversion, base representation, and coercion. There are also special methods to emulate sequence and mapping types. Implementing some of these special methods overloads operators so that they work with instances of your class type.

Numeric binary operators annotated in the table with a wildcard asterisk in their names come in several versions whose names differ slightly. The asterisk stands either for no additional character or for a single "r" indicating a right-hand-side operation. Without the "r," the method handles expressions of the form self OP obj; with the "r," it handles obj OP self. For example, __add__(self, obj) is called for self + obj, and __radd__(self, obj) would be invoked for obj + self.

Augmented assignment, new in Python 2.0, introduces the notion of "in-place" operations. An "i" in place of the asterisk denotes a left-hand-side operation combined with an assignment, as in self OP= obj, which has the effect of self = self OP obj. For example, __iadd__(self, obj) is called for self += obj.

Simple Class Customization Example (oPair)

For our first example, let us create a simple class consisting of an ordered pair (x, y) of numbers. We will represent this data in our class as a 2-tuple.
In the code snippet below, we define the class with a constructor that takes a pair of values and stores them as the data attribute of our oPair class:

    class oPair:                            # ordered pair
        def __init__(self, obj1, obj2):     # constructor
            self.data = (obj1, obj2)        # assign attribute

Using this class, we can instantiate our objects:

    >>> myPair = oPair(6, -4)    # create instance
    >>> myPair                   # calls repr()
    <oPair instance at 92bb50>
    >>> print myPair             # calls str()
    <oPair instance at 92bb50>

Unfortunately, neither print (using str()) nor the actual object's string representation (using repr()) reveals much about our object. One good idea would be to implement either __str__() or __repr__(), or both, so that we can "see" what our object looks like. In other words, when you want to display your object, you actually want to see something meaningful rather than the generic Python object string (<object type at id>). We want to see an ordered pair (tuple) with the current data values in our object. Without further ado, let us implement __str__() so that the ordered pair is displayed:

        def __str__(self):               # str() string representation
            return str(self.data)        # convert tuple to string
        __repr__ = __str__               # repr() string representation

Since we also want to use the same piece of code for __repr__(), rather than copying the code verbatim, we use our sense of code reusability and simply create an alias to __str__(). Now our output has been greatly improved:

    >>> myPair = oPair(-5, 9)    # create instance
    >>> myPair                   # repr() calls __repr__()
    (-5, 9)
    >>> print myPair             # str() calls __str__()
    (-5, 9)

What is the next step? Let us say we want our objects to interact. For example, we can define the addition operation of two oPair objects, (x1, y1) and (x2, y2), to be the sum of each individual component. Therefore, the "sum" of two oPair objects is defined as a new object with the values (x1 + x2, y1 + y2).
We implement the __add__() special method in such a way that we calculate the individual sums first, then call the class constructor to return a new object. Finally, we alias __add__ as __radd__ since the order does not matter; in other words, numeric addition is commutative. The definitions of __add__ and __radd__ are featured below:

        def __add__(self, other):        # add two oPair objects
            return self.__class__(self.data[0] + other.data[0],
                                  self.data[1] + other.data[1])
        __radd__ = __add__               # alias: addition is commutative

The new object is created by invoking the class as in any normal situation. The only difference is that from within the class, you typically would not invoke the class name directly. Rather, you take the __class__ attribute of self, which is the class from which self was instantiated, and invoke that. Because self.__class__ is the same as oPair, calling self.__class__() is the same as calling oPair().

Now we can perform additions with our newly overloaded operators. Reloading our updated module, we create a pair of oPair objects and "add" them, producing the sum you see below:

    >>> pair1 = oPair(6, -4)
    >>> pair2 = oPair(-5, 9)
    >>> pair1 + pair2
    (1, 5)

A TypeError exception occurs when attempting to use an operator for which the corresponding special methods have not been implemented yet:

    >>> pair1 * pair2
    Traceback (innermost last):
      File "<stdin>", line 1, in ?
        pair1 * pair2
    TypeError: __mul__ nor __rmul__ defined for these operands

Obviously, our result would have been similar if we had not implemented __add__ and __radd__. The final example is related to existing data which we may want to use.
Let us say that we have some 2-tuples floating around in our system. To create oPair objects from them at present, we have to split them up into individual components:

    aTuple = (-3, -1)
    pair3 = oPair(aTuple[0], aTuple[1])

Rather than splitting up the tuple ourselves, wouldn't it be nice if we could just feed the tuple into our constructor and let it handle the details? We can, but not by overloading the constructor as we might in other object-oriented programming languages. Python does not support overloading of callables, so the only way to work around this problem is to perform some manual introspection with the type() built-in function.

In our update to __init__() below, we add an initial check to see if what we have is a tuple. If it is, then we just assign it directly to the data attribute. Otherwise, this is a "regular" instantiation, meaning that we expect a pair of numbers to be passed.

        def __init__(self, obj1, obj2=None):    # constructor
            if type(obj1) == type(()):          # tuple type
                self.data = obj1
            else:
                if obj2 is None:                # pair of values
                    raise TypeError, \
                        'oPair() requires tuple or numeric pair'
                self.data = (obj1, obj2)

Note in the above code that we needed to give obj2 a default value of None. This allows a single object to be passed in when it is a tuple. What we do not want is to allow the creation of an oPair from a single non-tuple value with no second value, hence our additional check in the else clause to see if obj2 is None. We can now make our call in a more straightforward manner:

    >>> aTuple = (-3, -1)
    >>> pair3 = oPair(aTuple)
    >>> pair3
    (-3, -1)
    >>> pair3 + pair1
    (3, -5)

Hopefully, you now have a better understanding of operator overloading, why you would want to do it, and how you can implement special methods to accomplish that task. If you are interested in a more complex customization, continue with the optional section below.
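Before moving on, the oPair pieces developed above can be collected into one class. The following is only a sketch, written in Python 3 syntax so that it runs on a current interpreter (print is a function, and raise takes a call rather than a comma). The isinstance() check stands in for the type() comparison, and the __iadd__() method is an assumption added here to illustrate the "in-place" form described earlier; it is not part of the original example.

```python
class oPair:
    """Ordered pair sketch in Python 3 syntax, mirroring the text's oPair."""

    def __init__(self, obj1, obj2=None):
        if isinstance(obj1, tuple):          # a ready-made 2-tuple
            self.data = obj1
        else:
            if obj2 is None:                 # a lone non-tuple value
                raise TypeError('oPair() requires tuple or numeric pair')
            self.data = (obj1, obj2)

    def __str__(self):
        return str(self.data)
    __repr__ = __str__                       # alias, as in the text

    def __add__(self, other):
        # Build the result through self.__class__, as in the text.
        return self.__class__(self.data[0] + other.data[0],
                              self.data[1] + other.data[1])
    __radd__ = __add__                       # addition is commutative

    def __iadd__(self, other):
        # Assumption (not in the original): update this object in place
        # for "pair += other" instead of creating a new instance.
        self.data = (self.data[0] + other.data[0],
                     self.data[1] + other.data[1])
        return self


pair1 = oPair(6, -4)
pair2 = oPair((-5, 9))                       # tuple form of the constructor
print(pair1 + pair2)                         # prints (1, 5)
pair1 += pair2                               # calls __iadd__
print(pair1)                                 # prints (1, 5)
```

Returning self from __iadd__() is what keeps the name bound to the same object after the augmented assignment.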
*More Complex Class Customization Example (NumStr)

Let us create another new class, NumStr, consisting of a number-string ordered pair, called n and s, respectively, using integers as our number type. Although the "proper" notation of an ordered pair is (n, s), we choose to represent our pair as [n :: s] just to be different. Regardless of the notation, these two data elements are inseparable as far as our model is concerned. We want to set up our new class, called NumStr, with the following characteristics:

Initialization

The class should be initialized with both the number and string; if either (or both) is missing, then 0 and the empty string should be used, i.e., n=0 and s='', as defaults.

Addition

We define the addition operator functionality as adding the numbers together and concatenating the strings; the tricky part is that the strings must be concatenated in the correct order. For example, let NumStr1 = [n1 :: s1] and NumStr2 = [n2 :: s2]. Then NumStr1 + NumStr2 is performed as [n1 + n2 :: s1 + s2] where + represents addition for numbers and concatenation for strings.

Multiplication

Similarly, we define the multiplication operator functionality as multiplying the number and repeating the string by an integer n, i.e., NumStr1 * n = [n1 * n :: s1 * n].

False Value

This entity has a false value when the number has a numeric value of zero and the string is empty, i.e., when NumStr = [0 :: ''].

Comparisons

Comparing a pair of NumStr objects, i.e., [n1 :: s1] vs. [n2 :: s2], we find 9 different combinations (i.e., n1 > n2 and s1 < s2, n1 == n2 and s1 > s2, etc.). We use the normal numeric and lexicographic compares for numbers and strings, respectively, i.e., the ordinary comparison of cmp(obj1, obj2) will return an integer less than zero if obj1 < obj2, greater than zero if obj1 > obj2, or equal to zero if the objects have the same value. The solution for our class is to add both of these values and return the result.
The interesting thing is that cmp() does not like to return values other than -1, 0, 1, so even if the sum turns out to be -2 or 2, cmp() will still return -1 or 1, respectively. A value of 0 is returned if both sets of numbers and strings are the same, or if the comparisons offset each other, i.e., (n1 < n2) and (s1 > s2) or vice versa.

Given the above criteria, we present the code below for numstr.py:

Example 13.2. Emulating Types with Classes (numstr.py)

    1   #!/usr/bin/env python
    2
    3   class NumStr:
    4
    5       def __init__(self, num=0, string=''):   # constr.
    6           self.__num = num
    7           self.__string = string
    8
    9       def __str__(self):                  # define for str()
    10          return '[%d :: %s]' % \
    11              (self.__num, `self.__string`)
    12      __repr__ = __str__
    13
    14      def __add__(self, other):           # define for s+o
    15          if isinstance(other, NumStr):
    16              return self.__class__(self.__num + \
    17                  other.__num, \
    18                  self.__string + other.__string)
    19          else:
    20              raise TypeError, \
    21      'illegal argument type for built-in operation'
    22
    23      def __radd__(self, other):          # define for o+s
    24          if isinstance(other, NumStr):
    25              return self.__class__(other.__num + \
    26                  self.__num, other.__string + self.__string)
    27          else:
    28              raise TypeError, \
    29      'illegal argument type for built-in operation'
    30
    31      def __mul__(self, num):             # define for o*n
    32          if type(num) == type(0):
    33              return self.__class__(self.__num * num,\
    34                  self.__string * num)
    35          else:
    36              raise TypeError, \
    37      'illegal argument type for built-in operation'
    38
    39      def __nonzero__(self):              # reveal tautology
    40          return self.__num or len(self.__string)
    41
    42      def __norm_cval(self, cmpres):      # normalize cmp()
    43          return cmp(cmpres, 0)
    44
    45      def __cmp__(self, other):           # define for cmp()
    46          nres = self.__norm_cval(cmp(self.__num, \
    47              other.__num))
    48          sres = self.__norm_cval(cmp(self.__string, \
    49              other.__string))
    50
    51          if not (nres or sres): return 0     # both 0
    52          sum = nres + sres
    53          if not sum: return 0                # one <, one >
    54          return sum

Here is an example execution of how this class works:

    >>> a = NumStr(3, 'foo')
    >>> b = NumStr(3, 'goo')
    >>> c = NumStr(2, 'foo')
    >>> d = NumStr()
    >>> e = NumStr(string='boo')
    >>> f = NumStr(1)
    >>> a
    [3 :: 'foo']
    >>> b
    [3 :: 'goo']
    >>> c
    [2 :: 'foo']
    >>> d
    [0 :: '']
    >>> e
    [0 :: 'boo']
    >>> f
    [1 :: '']
    >>> a < b
    1
    >>> b < c
    0
    >>> a == a
    1
    >>> b * 2
    [6 :: 'googoo']
    >>> a * 3
    [9 :: 'foofoofoo']
    >>> e + b
    [3 :: 'boogoo']
    >>> if d: 'not false'
    ...
    >>> if e: 'not false'
    ...
    'not false'
    >>> cmp(a,b)
    -1
    >>> cmp(a,c)
    1
    >>> cmp(a,a)
    0

Line-by-line Explanation

Lines 3–7

The constructor __init__() function sets up our instance, initializing it with the values passed in to the class instantiator NumStr(). If either value is missing, the attribute takes on the default false value of either zero or the empty string, depending on the argument.

One significant oddity is the use of double underscores to name our attributes. As we will find out in the next section, this is used to enforce a level, albeit elementary, of privacy. Programmers importing our module will not have straightforward access to our data elements. We are attempting to enforce one of the encapsulation properties of OO design by permitting access only through accessor functionality. If this syntax appears odd or uncomfortable to you, you can remove all double underscores from the instance attributes, and the examples will still work in exactly the same manner.

All attributes which begin with a double underscore ( __ ) are "mangled" so that these names are not as easily accessible during run-time. They are not, however, mangled in such a way that they cannot be easily reverse-engineered. In fact, the mangling pattern is fairly well-known and easy to spot.
The main point is to prevent the name from being accidentally used when being imported by an external module where conflicts may arise. The name is changed to a new identifier name containing the class name to ensure that it does not get "stepped on" unintentionally. For more information, check out Section 13.14 on privacy.

Lines 9–12

We choose the string representation of our ordered pair to be "[num :: 'str']" so it is up to __str__() to provide that representation whenever str() is applied to our instance and when the instance appears in a print statement. Because we want to emphasize that the second element is a string, it is more visually convincing if the users view the string surrounded by quotation marks. To that end, we call repr() using the single back quotation marks to give the evaluatable version of a string, which does have the quotation marks:

    >>> print a
    [3 :: 'foo']

Not calling repr() on self.__string (leaving the back quotations off) would result in the string quotations being absent. For the sake of argument, let us effect this change for learning purposes. Removing the backquotes, we edit the return statement so that it now looks like this:

    return '[%d :: %s]' % (self.__num, self.__string)

Now calling print again on an instance results in:

    >>> print a
    [3 :: foo]

How does that look without the quotations? Not as convincing that "foo" is a string, is it? It looks more like a variable. The author is not as convinced either. (We quickly and quietly back out of that change and pretend we never even touched it.)

The first line of code after the __str__() function is the assignment of that function to another special method name, __repr__. We made a decision that an evaluatable string representation of our instance should be the same as the printable string representation. Rather than defining an entirely new function which is a duplicate of __str__(), we just create an alias, copying the reference.
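The effect of the quoting and of the alias can be tried out with a small sketch. Two caveats apply: it is written in Python 3, where the backquote syntax no longer exists, so the %r format code is used to apply repr() instead; and the class names NumStrView and PlainView are hypothetical, chosen here only for illustration. The subclass overrides __str__() without re-aliasing __repr__, which shows that the alias copies a reference to the original function rather than tracking whatever __str__ later becomes.

```python
class NumStrView:
    """Hypothetical stand-in for NumStr's display logic (Python 3 syntax)."""

    def __init__(self, num=0, string=''):
        self.num = num
        self.string = string

    def __str__(self):
        # %r applies repr() to the string, so the quotation marks appear.
        return '[%d :: %r]' % (self.num, self.string)
    __repr__ = __str__          # alias: copies the reference


class PlainView(NumStrView):
    def __str__(self):
        # %s applies str(), so the quotation marks are absent.
        return '[%d :: %s]' % (self.num, self.string)
    # No new alias here: the inherited __repr__ still refers to
    # NumStrView's original function.


a = NumStrView(3, 'foo')
b = PlainView(3, 'foo')
print(str(a))     # [3 :: 'foo']
print(str(b))     # [3 :: foo]
print(repr(b))    # [3 :: 'foo']
```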
When you implement __str__(), it is the code that is called by the interpreter if you ever apply the str() built-in function using that object as an argument. The same goes for __repr__() and repr().

How would our execution differ if we chose not to implement __repr__()? If the assignment is removed, only the print statement (which calls str()) will show us the contents of our object. The evaluatable string representation defaults to the Python standard of <…some_object_information…>.

    >>> print a    # calls str(a)
    [3 :: 'foo']
    >>> a          # calls repr(a)
    <NumStr.NumStr instance at 122640>

Lines 14–29

One feature we would like to add to our class is the addition operation, which we described earlier. One of Python's features as far as customizing classes goes is the fact that we can overload operators to make these types of customizations more "realistic." Invoking a function such as "add(obj1, obj2)" to "add" objects obj1 and obj2 may seem like addition, but is it not more compelling to be able to invoke that same operation using the plus sign ( + ) like this?

    obj1 + obj2

Overloading the plus sign requires the implementation of two functions, __add__() and __radd__(), as explained in more detail in the previous section. The __add__() function takes care of the SELF + OTHER case, but we need to define __radd__() to handle the OTHER + SELF scenario. The numeric addition is not affected as much as the string concatenation is because order matters.

The addition operation adds each of the two components, with the pair of results forming a new object. That object is created by passing the results to a call to self.__class__() (again, as explained previously). Any object other than a like type should result in a TypeError exception, which we raise in such cases.

Lines 31–37

We also overload the asterisk [by implementing __mul__()] so that both numeric multiplication and string repetition are performed, resulting in a new object, again created via instantiation.
Since repetition allows only an integer to the right of the operator, we must enforce this restriction as well. We also do not define __rmul__() for the same reason.

Lines 39–40

Python objects have a concept of having a Boolean value at any time. For the standard types, objects have a false value when they are either a numeric equivalent of zero or an empty sequence or mapping. For our class, we have chosen that both its numeric value must be zero and the string empty in order for any such instance to have a false value. We override the __nonzero__() method for this purpose. Other objects, such as those which strictly emulate sequence or mapping types, use a length of zero as a false value. In those cases, you would implement the __len__() method to effect that functionality.

Lines 42–54

__norm_cval() is not a special method. Rather, it is a helper function to our overriding of __cmp__(); its sole purpose is to convert all positive return values of cmp() to 1, and all negative values to -1. cmp() normally returns arbitrary positive or negative values (or zero) based on the result of the comparison, but for our purposes, we need to restrict the return values to only -1, 0, and 1. Calling cmp() with the comparison result and the integer 0 gives us the result we need, being equivalent to the following snippet of code:

        def __norm_cval(self, cmpres):
            if cmpres < 0:
                return -1
            elif cmpres > 0:
                return 1
            else:
                return 0

The actual comparison of two like objects consists of comparing the numbers and the strings, and returning the sum of the comparisons.

You may have noticed in the code above that we prepended a double underscore ( __ ) to the names of our data attributes. This directive provides a light form of privacy.
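Both the sum-of-comparisons rule and the double-underscore naming can be explored with the following sketch. It rests on several assumptions: it is written in Python 3, which has neither cmp() nor __cmp__(), so the helper functions compare() and sign() and the method name compare_to() are hypothetical stand-ins for cmp(), __norm_cval(), and __cmp__(). Unlike the built-in cmp() call in the session shown earlier, compare_to() returns the raw sum, so values of -2 and 2 are visible.

```python
def compare(a, b):
    """Hypothetical stand-in for Python 2's cmp(): returns -1, 0, or 1."""
    return (a > b) - (a < b)


def sign(value):
    """Normalize any number to -1, 0, or 1 (the job of __norm_cval())."""
    return (value > 0) - (value < 0)


class NumStr:
    def __init__(self, num=0, string=''):
        self.__num = num            # stored as _NumStr__num
        self.__string = string      # stored as _NumStr__string

    def compare_to(self, other):    # hypothetical; plays the role of __cmp__()
        nres = sign(compare(self.__num, other.__num))
        sres = sign(compare(self.__string, other.__string))
        return nres + sres          # 0 when equal or when the results offset


a, b, c = NumStr(3, 'foo'), NumStr(3, 'goo'), NumStr(2, 'foo')
print(a.compare_to(b))                      # -1: numbers equal, 'foo' < 'goo'
print(b.compare_to(c))                      # 2: both components are greater
print(NumStr(2, 'goo').compare_to(a))       # 0: the two results offset
print(sorted(vars(a)))                      # ['_NumStr__num', '_NumStr__string']
```

The last line shows the mangled names under which the attributes are actually stored; inside the class body, self.__num and other.__num are rewritten to those names automatically, which is why the comparison can still reach the other instance's data.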
© 2002, O'Reilly & Associates, Inc.