
File Built-in Methods

Once open() has completed successfully and returned a file object, all subsequent access to the file transpires with that "handle." File methods come in four different categories: input, output, movement within a file, which we will call "intra-file motion," and miscellaneous. A summary of all file methods can be found in Table 9.3. We will now go over each category.

Input

The read() method is used to read bytes directly into a string, reading at most the number of bytes indicated. If no size is given, the default value is set to -1, meaning that the file is read to the end. The readline() method reads one line of the open file (reads all bytes until a NEWLINE character is encountered). The NEWLINE character is retained in the returned string. The readlines() method is similar, but reads all remaining lines as strings and returns a list containing the read set of lines. The readinto() method reads the given number of bytes into a writable buffer object, the same type of object returned by the unsupported buffer() built-in function. (Since buffer() is not supported, neither is readinto() ).
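As a minimal sketch of the input methods described above, the following builds a small scratch file and reads it back in three different ways. The temporary file and the variable names are hypothetical, chosen only for illustration, and the code is written so that it should also run on a current Python interpreter, which is an assumption beyond the text.

```python
import os
import tempfile

# Hypothetical scratch file, created only for this demonstration.
fd, path = tempfile.mkstemp()
os.close(fd)
f = open(path, 'w')
f.write('line one\nline two\nline three\n')
f.close()

f = open(path, 'r')
first_four = f.read(4)       # at most 4 characters
rest_of_line = f.readline()  # rest of the current line, NEWLINE retained
remaining = f.readlines()    # list of all remaining lines
at_eof = f.read()            # nothing is left, so an empty string
f.close()
os.remove(path)

print(repr(first_four))
print(repr(rest_of_line))
print(repr(remaining))
print(repr(at_eof))
```

Note that each call picks up where the previous one stopped, so the three methods can be mixed freely on the same file object.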

Output

The write() built-in method has the opposite functionality of read() and readline(). It takes a string, which can consist of one or more lines of text data or a block of bytes, and writes the data to the file. writelines() is the output counterpart of readlines(): it takes a list of strings and writes them out to a file. NEWLINE characters are not inserted between each line, so if desired, they must be added to the end of each line before writelines() is called.

This is easily accomplished in Python 2.0 with a list comprehension:

						
>>> output=['1stline', '2ndline', 'the end']
>>> [x + '\n' for x in output]
['1stline\012', '2ndline\012', 'the end\012']

					

Note that there is no "writeline()" method since it would be equivalent to calling write() with a single line string terminated with a NEWLINE character.
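To make the point about missing NEWLINE characters concrete, here is a small sketch that writes the same list twice, once as is and once with a NEWLINE appended to each string, and reads the result back each time. The scratch file and the names run_together and separated are hypothetical.

```python
import os
import tempfile

output = ['1stline', '2ndline', 'the end']

# Hypothetical scratch file, created only for this demonstration.
fd, path = tempfile.mkstemp()
os.close(fd)

# writelines() adds nothing between the strings.
f = open(path, 'w')
f.writelines(output)
f.close()
f = open(path, 'r')
run_together = f.read()
f.close()

# Append a NEWLINE to each string first, then write.
f = open(path, 'w')
f.writelines([x + '\n' for x in output])
f.close()
f = open(path, 'r')
separated = f.readlines()
f.close()
os.remove(path)

print(repr(run_together))
print(repr(separated))
```

In the first case the three strings come back as one unbroken run of text; in the second, readlines() returns the three lines we started with, each ending in a NEWLINE.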

Intra-file Motion

The seek() method (analogous to the fseek() function in C) moves the file pointer to different positions within the file. The offset in bytes is given along with a relative offset location called whence. A value of 0 indicates distance from the beginning of a file (note that a position measured from the beginning of a file is also known as the absolute offset), a value of 1 indicates movement from the current location in the file, and a value of 2 indicates that the offset is from the end of the file. If you have used fseek() as a C programmer, the values 0, 1, and 2 correspond directly to the constants SEEK_SET, SEEK_CUR, and SEEK_END, respectively. Use of the seek() method comes into play when opening a file for read and write access.

tell() is a complementary method to seek(); it tells you the current location within the file, in bytes from the beginning of the file.
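The three whence values can be sketched as follows. The scratch file and variable names are hypothetical, and the file is opened in binary mode as an assumption on our part: newer Python versions restrict relative seeks on text-mode files, whereas byte offsets in a binary file behave exactly as described above.

```python
import os
import tempfile

# Hypothetical scratch file, created only for this demonstration.
fd, path = tempfile.mkstemp()
os.close(fd)

f = open(path, 'wb+')        # read and write access, binary mode
f.write(b'0123456789')       # ten bytes, offsets 0 through 9

f.seek(3, 0)                 # whence 0: 3 bytes from the beginning
pos_from_start = f.tell()

f.seek(2, 1)                 # whence 1: 2 bytes past the current location
pos_from_current = f.tell()

f.seek(-4, 2)                # whence 2: 4 bytes back from the end
pos_from_end = f.tell()
tail = f.read()              # everything from that position to the end

f.close()
os.remove(path)

print(pos_from_start)
print(pos_from_current)
print(pos_from_end)
print(repr(tail))
```

After each seek(), tell() reports the resulting absolute offset, which is a handy way to check that a relative move landed where you expected.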

Others

The close() method completes access to a file by closing it. The Python garbage collection routine will also close a file when the reference count of the file object has decreased to zero. One way this can happen is when only one reference exists to a file, say, fp = open(), and fp is reassigned to another file object before the original file is explicitly closed. Good programming style suggests closing the file before reassignment to another file object.

The fileno() method passes back the file descriptor of the open file. This is an integer that can be used in lower-level operations such as those featured in the os module. The flush() method flushes the internal buffer for the file. isatty() is a Boolean built-in method that returns 1 if the file is a tty-like device and 0 otherwise. The truncate() method truncates the file to the given size in bytes; if no size is given, the file is truncated at the current file position.
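The following sketch exercises these miscellaneous methods on a scratch file. The file and the variable names are hypothetical; os.fstat() is used here only as one example of a lower-level os call that accepts a file descriptor.

```python
import os
import tempfile

# Hypothetical scratch file, created only for this demonstration.
fd, path = tempfile.mkstemp()
os.close(fd)

f = open(path, 'w+')
f.write('0123456789')
f.flush()                       # push buffered data out to the file

descriptor = f.fileno()         # integer file descriptor
size_before = os.fstat(descriptor).st_size

on_terminal = f.isatty()        # a regular file is not a tty-like device

f.truncate(4)                   # keep only the first 4 bytes
f.flush()
size_after = os.fstat(descriptor).st_size

f.close()
os.remove(path)

print(size_before)
print(on_terminal)
print(size_after)
```

Calling flush() before asking the operating system for the file size matters here: until the buffer is flushed, the data may exist only inside the file object.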

File Method Miscellany

We will now reprise our first file example from Chapter 2:

						
filename = raw_input('Enter file name: ')
file = open(filename,  'r')
allLines = file.readlines()
file.close()
for eachLine in allLines:
    print eachLine,

					

We originally described how this program differs from most standard file access in that all the lines are read ahead of time before any display to the screen occurs. Obviously, this is not advantageous if the file is large. In those cases, it may be a good idea to go back to the tried-and-true way of reading and displaying one line at a time:

						
filename = raw_input('Enter file name: ')
file = open(filename,  'r')
done = 0
while not done:
    aLine = file.readline()
    if aLine != "":
        print aLine,
    else:
        done = 1
file.close()

					

In this example, we do not know when we will reach the end of the file, so we create a Boolean flag done, which is initially set to false. When we reach the end of the file, we will reset this value to true so that the while loop will exit. We change from using readlines() to read all lines to readline(), which reads only a single line. readline() will return an empty string if the end of the file has been reached. Otherwise, the line is displayed to the screen.

We anticipate a burning question you may have: "Wait a minute! What if I have a blank line in my file? Will Python stop and think it has reached the end of my file?" The answer is, of course, no. A blank line in your file will not come back as an empty string. Recall that every line has one or more line separator characters at the end, so a "blank line" consists of a NEWLINE character or whatever your system uses. So even if the line in your text file is "blank," the string that is read is not empty, meaning your application will not terminate until it reaches the end-of-file.
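A short sketch makes the difference visible. It writes a scratch file whose second line is blank, then collects every value readline() returns, including the final one at end-of-file. The file and the name results are hypothetical.

```python
import os
import tempfile

# Hypothetical scratch file with a blank second line.
fd, path = tempfile.mkstemp()
os.close(fd)
f = open(path, 'w')
f.write('first\n\nthird\n')
f.close()

f = open(path, 'r')
results = []
while True:
    aLine = f.readline()
    results.append(aLine)
    if aLine == '':          # empty string: end of file reached
        break
f.close()
os.remove(path)

print(repr(results))
```

The blank line shows up as a string holding a single NEWLINE, while only the final read, at the end of the file, yields the empty string.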

NOTE

One of the inconsistencies of operating systems is the line separator character which their file systems support. On Unix, the line separator is the NEWLINE (\n) character. For the Macintosh, it is the RETURN (\r), and DOS and Windows use both (\r\n). Check your operating system to determine what your line separator(s) are.

Other differences include the file pathname separator (Unix uses '/', DOS and Windows use '\', and the Macintosh uses ':'), the separator used to delimit a set of file pathnames, and the denotations for the current and parent directories.

These inconsistencies are a recurring irritation when creating applications that run on all three platforms (and more if more architectures and operating systems are supported). Fortunately, the designers of the os module in Python have thought of this for us. The os module has five attributes which you may find useful. They are listed below in Table 9.2.

Table 9.2. os Module Attributes to Aid in Multi-platform Development
os Module Attribute    Description
linesep                string used to separate lines in a file
sep                    string used to separate file pathname components
pathsep                string used to delimit a set of file pathnames
curdir                 string name for current working directory
pardir                 string name for parent (of current working directory)

Regardless of your platform, these variables will be set to the correct values when you import the os module. One less headache to worry about.
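As a quick illustration, the sketch below displays the five attributes for whatever platform it runs on and then uses two of them to build a relative pathname. The directory and file names 'data' and 'notes.txt' are hypothetical and need not exist.

```python
import os

attributes = ['linesep', 'sep', 'pathsep', 'curdir', 'pardir']

# Show each attribute's value on the current platform.
for name in attributes:
    print('%-8s %r' % (name, getattr(os, name)))

# Build a relative pathname without hard-coding '/', '\' or ':'.
# 'data' and 'notes.txt' are made-up names used only for illustration.
example = os.sep.join([os.curdir, 'data', 'notes.txt'])
print(example)
```

The values printed differ from one operating system to the next, which is exactly the point: the code itself does not have to change.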


We would also like to remind you that the comma placed at the end of the print statement suppresses the NEWLINE character that print normally adds at the end of output. This is needed because every line from the text file already contains a NEWLINE: readline() and readlines() do not strip off any whitespace characters in your line (see exercises). If we omitted the comma, your text file display would be double-spaced: one NEWLINE that is part of the input and another added by the print statement.

Before moving on to the next section, we will show two more examples, the first highlighting output to files (rather than input), and the second performing both file input and output as well as using the seek() and tell() methods for file positioning.

						
filename = raw_input('Enter file name: ')
file = open(filename,  'w')
done = 0
while not done:
  aLine = raw_input("Enter a line ('.' to quit): ")
  if aLine != ".":
      file.write(aLine + '\n')
  else:
      done = 1
file.close()

					

This piece of code is practically the opposite of the previous. Rather than reading one line at a time and displaying it, we ask the user for one line at a time and send it out to the file. Our call to the write() method must contain a NEWLINE because raw_input() does not preserve it from the user input. Because it may not be easy to generate an end-of-file character from the keyboard, the program uses the period ( . ) as its end-of-file character, which, when entered by the user, will terminate input and close the file.

Our final example opens a file for read and write, creating the file from scratch (after perhaps truncating an already-existing file). After writing data to the file, we move around within the file using seek(). We also use the tell() method to show our movement.

						
>>> f = open('/tmp/x', 'w+')
>>> f.tell()
0
>>> f.write('test line 1\n')    # add 12-char string [0–11]
>>> f.tell()
12
>>> f.write('test line 2\n')    # add 12-char string [12–23]
>>> f.tell()                    # tell us current file location (end)
24
>>> f.seek(-12, 1)              # move back 12 bytes
>>> f.tell()                    # to beginning of line 2
12
>>> f.readline()
'test line 2\012'
>>> f.seek(0, 0)                # move back to beginning
>>> f.readline()
'test line 1\012'
>>> f.tell()                    # back to line 2 again
12
>>> f.readline()
'test line 2\012'
>>> f.tell()                    # at the end again
24
>>> f.close()                   # close file

					

Table 9.3 lists all the built-in methods for file objects:

Table 9.3. Methods for File Objects
File Object Method               Operation
file.close()                     close file
file.fileno()                    return integer file descriptor (FD) for file
file.flush()                     flush internal buffer for file
file.isatty()                    return 1 if file is a tty-like device, 0 otherwise
file.read(size=-1)               read all or size bytes of file as a string and return it
file.readinto(buf, size)[a]      read size bytes from file into buffer buf
file.readline()                  read and return one line from file (includes trailing "\n")
file.readlines()                 read and return all lines from file as a list (includes all trailing "\n" characters)
file.seek(off, whence)           move to a location within file, off bytes offset from whence (0 == beginning of file, 1 == current location, or 2 == end of file)
file.tell()                      return current location within file
file.truncate(size=file.tell())  truncate file to size bytes (default: the current file position)
file.write(str)                  write string str to file
file.writelines(list)            write list of strings to file

[a] unsupported method introduced in Python 1.5.2 (other implementations of file-like objects do not include this method)


Last updated on 9/14/2001
Core Python Programming, © 2002 Prentice Hall PTR


© 2002, O'Reilly & Associates, Inc.