< BACKMake Note | BookmarkCONTINUE >
156135250194107072078175030179198180024228156016206217188240240204172044044179211038032227

String-only Operators

Format Operator (%)

One of Python's coolest features is the string format operator. This operator is unique to strings and makes up for the pack of having functions from C's printf() family. In fact, it even uses the same symbol, the percent sign (%), and supports all the printf() formatting codes.

The syntax for using the format operator is as follows:

						
format_string % (arguments_to_convert)

					

The format_string on the left-hand side is what you would typically find as the first argument to printf(), the format string with any of the embedded % codes. The set of valid codes is given in Table6.4. The arguments_to_convert parameter matches the remaining arguments you would send to printf(), namely the set of variables to convert and display.

Table 6.4. Format Operator Conversion Symbols
Format Symbol Conversion
%c character
%s string conversion via str() prior to formatting
%i signed decimal integer
%d signed decimal integer
%u unsigned decimal integer
%o octal integer
%x hexadecimal integer (lowercase letters)
%X hexadecimal integer (UPPERcase letters)
%e exponential notation (with lowercase 'e')
%E exponential notation (with UPPERcase 'E')
%f floating point real number
%g the shorter of %f and %e
%G the shorter of %f and %E

Python supports two formats for the input arguments. The first is a tuple (introduced in Section 2.8, formally in 6.15), which is basically the set of arguments to convert, just like for C's printf(). The second format which Python supports is a dictionary (Chapter 7). A dictionary is basically a set of hashed key-value pairs. The keys are requested in the format_string, and the corresponding values are provided when the string is formatted.

Converted strings can either be used in conjunction with the print statement to display out to the user or saved into a new string for future processing or displaying to a graphical user interface.

Other supported symbols and functionality are listed in Table 6.5.

Table 6.5. Format Operator Auxiliary Directives
Symbol Functionality
* argument specifies width or precision
- left justification
+ display the sign
<sp> leave a blank space before a positive number
# add the octal leading zero ( '0' ) or hexadecimal leading '0x' or '0X', depending on whether 'x' or 'X' were used.
0 pad from left with zeros (instead of spaces)
% '%%' leaves you with a single literal '%'
(var) mapping variable (dictionary arguments)
m.n. m is the minimum total width and n is the number of digits to display after the decimal point (if appl.)

As with C's printf(), the asterisk symbol (*) may be used to dynamically indicate the width and precision via a value in argument tuple. Before we get to our examples, one more word of caution: long integers are more than likely too large for conversion to standard integers, so we recommend using exponential notation to get them to fit.

Here are some examples using the string format operator:

Hexadecimal Output
							
>>> "%x" % 108
'6c'
>>>
>>> "%X" % 108
'6C'
>>>
>>> "%#X" % 108
'0X6C'
>>>
>>> "%#x" % 108
'0x6c'

						
Floating Point and Exponential Notation Output
							
>>>
>>> '%f' % 1234.567890
'1234.567890'
>>>
>>> '%.2f' % 1234.567890
'1234.57'
>>>
>>> '%E' % 1234.567890
'1.234568E+03'
>>>
>>> '%e' % 1234.567890
'1.234568e+03'
>>>
>>> '%g' % 1234.567890
'1234.57'
>>>
>>> '%G' % 1234.567890
'1234.57'
>>>
>>> "%e" % (1111111111111111111111L)
'1.111111e+21'

						
Integer and String Output
							
>>> "%+d" % 4
'+4'
>>>
>>> "%+d" % -4
'-4'
>>>
>>> "we are at %d%%" % 100
'we are at 100%'
>>>
>>> 'Your host is: %s' % 'earth'
'Your host is: earth'
>>>
>>> 'Host: %s\tPort: %d' % ('mars', 80)
'Host: marsPort: 80'
>>>
>>> num = 123
>>> 'dec: %d/oct: %#o/hex: %#X' % (num, num, num)
'dec: 123/oct: 0173/hex: 0X7B'
>>>
>>> "MM/DD/YY = %02d/%02d/%d" % (2, 15, 67)
'MM/DD/YY = 02/15/67'
>>>
>>> w, p = 'web', 'page'
>>> 'http://xxx.yyy.zzz/%s/%s.html' % (w, p)
'http://xxx.yyy.zzz/web/page.html'

						

The previous examples all use tuple arguments for conversion. Below, we show how to use a dictionary argument for the format operator:

							
>>> 'There are %(howmany)d %(lang)s Quotation Symbols' % \
…     {'lang': 'Python', 'howmany': 3}
'There are 3 Python Quotation Symbols'

						
Amazing Debugging Tool

The string format operator is not only a cool, easy-to-use, and familiar feature, but a great and useful debugging tool as well. Practically all Python objects have a string presentation (either evaluatable from repr() or '', or printable from str()). The print statement automatically invokes the str() function for an object. This gets even better. When you are defining your own objects, there are hooks for you to create string representations of your object such that repr() and str() (and '' and print) return an appropriate string as output. And if worse comes to worst and neither repr() or str() is able to display an object, the Pythonic default is to at least give you something of the format:

							
<… something that is useful …>.

						

Raw String Operator ( r / R )

The purpose of raw strings, introduced to Python in version 1.5, is to counteract the behavior of the special escape characters that occur in strings (see the subsection below on what some of these characters are). In raw strings, all characters are taken verbatim with no translation to special or non-printed characters.

This feature makes raw strings absolutely convenient when such behavior is desired, such as when composing regular expressions (see the re module documentation). Regular expressions (REs) are strings which define advanced search patterns for strings and usually consist of special symbols to indicate characters, grouping and matching information, variable names, and character classes. The syntax for REs contains enough symbols already, but when you have to insert additional symbols to make special characters act like normal characters, you end up with a virtual "alphanumersymbolic" soup! Raw strings lend a helping hand by not requiring all the normal symbols needed when composing RE patterns.

The syntax for raw strings is exactly the same as for normal strings with the exception of the raw string operator, the letter "r," which precedes the quotation marks. The "r" can be lowercase (r) or uppercase (R) and must be placed immediately preceding the first quote mark.

						
>>> print r'\n'
\n
>>>
>>>print '\n'
>>>
>>>print r'werbac'
werbac
>>>
>>>print r'webbac\n'
webbac\n
>>>
>>> print r'fglkjfg\][123=091'
fglkjfg\\][123=091
>>>
>>> import re
>>> aFloatRE = re.compile(R'([+-]?\d+(\.\d*)?([eE][+-
]?\d+)?))
>>> match = aFloatRE.search('abcde')
>>> print "our RE matched:", match.group(1)
''
>>> match = aFloatRE.search('-1.23e+45')
>>> print 'our RE matched:', match.group(1)
'-1.23e+45'

					

Unicode String Operator (u/U)

The Unicode string operator, uppercase (U) and lowercase (u), introduced with Unicode string support in Python 1.6, takes standard strings or strings with Unicode characters in them and converts them to a full Unicode string object. More details on Unicode strings are available in Section 6.7.4. In addition, Unicode support is available in the new string methods (Section 6.6) and the new regular expression engine. Here are some examples:

						
u'abc'          U+0061 U+0062 U+0063
u'\u1234'       U+1234
u'abc\u1234\n'  U+0061 U+0062 U+0063 U+1234 U+0012

					

The Unicode operator can also accept raw Unicode strings if used in conjunction with the raw string operator discussed in the previous section. The Unicode operator must precede the raw string operator.

						
ur 'Hello\nWorld!'

					


Last updated on 9/14/2001
Core Python Programming, © 2002 Prentice Hall PTR

< BACKMake Note | BookmarkCONTINUE >

© 2002, O'Reilly & Associates, Inc.