Understanding File Handling in Python, with Examples

    Ini Arthur

    In this article on handling files with Python, you’ll learn how to use the pathlib and os modules to navigate through local files and directories. You’ll also learn how to open, read, write and close files in Python.

    File handling is a good way to persist data after a program terminates. Data from a computer program is saved to a file and can be accessed later. Python, like many other programming languages, provides handy methods for creating, opening, reading and writing data to a file.

    Contents:

    1. File Handling in Python: Files and File Paths
    2. File Handling in Python: Reading and Writing Data

    File Handling in Python: Files and File Paths

    Files are a simple way to persist data from a computer program. Random-access memory (RAM) stores data only temporarily: its contents are lost as soon as the computer is turned off. Files are preferred because they offer more permanent storage. A file is a named location on a disk where data is stored. It has two essential properties: a filename and a path.

    Using the pathlib module

    Python provides the pathlib module for performing operations like navigating through directories and files, creating folders, identifying file paths, and so on. The pathlib module has been available since Python 3.4, and it provides an object-oriented approach to handling filesystem paths.

    The pathlib module has a Path class for navigating filesystem paths. You import it into your program like so:

    from pathlib import Path
    

    How to get the current working directory

    You can get the current working directory (“cwd”) in Python using the cwd() class method of the Path class. It returns a Path object representing the directory we’re currently working in, and printing that object displays the path, as shown in the code snippet below:

    from pathlib import Path
    
    p = Path.cwd()
    print(p)
    

    Here’s the output of that code:

    /home/ini/Dev/Tutorial/sitepoint
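
    As a minimal sketch of what cwd() hands back, the snippet below inspects the returned value. The variable name current_dir is illustrative only, and the printed path will differ from machine to machine:

```python
from pathlib import Path

# Path.cwd() is a class method, so no Path instance is needed to call it.
current_dir = Path.cwd()

# The result is a Path object; wrap it in str() when a plain string is required.
print(type(current_dir).__name__)   # PosixPath on Linux/macOS, WindowsPath on Windows
print(str(current_dir))

# Path objects expose useful attributes, such as the final path component.
print(current_dir.name)
```

    Keeping the value as a Path object is usually more convenient than converting it to a string, because it can be joined with other path components using the / operator.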
    

    Absolute vs relative paths

    File paths can be specified in two ways: as an absolute path or as a relative path. Both can point to the same file location.

    The absolute path of a file declares its path, beginning with the root folder. An absolute path looks like this:

    /home/ini/Dev/Tutorial/sitepoint/filehandling.py
    

    The root folder on a Linux OS is /. In the path above, home is the first directory beneath it.

    The relative path of a file declares its path in relation to the current working directory. Let’s see an example:

    ./sitepoint/filehandling.py
    

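    The code above shows the relative path for the Python file filehandling.py. It resolves to the absolute path shown earlier when the current working directory is /home/ini/Dev/Tutorial.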
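
    To make the distinction concrete, here is a small sketch that checks whether a path is absolute and builds an absolute path from a relative one. The relative path is hypothetical and doesn’t need to exist for the code to run:

```python
from pathlib import Path

# A hypothetical relative path; it is interpreted relative to the current working directory.
relative = Path("sitepoint/filehandling.py")
print(relative.is_absolute())   # False

# Joining the relative path onto the current working directory gives an absolute path.
absolute = Path.cwd() / relative
print(absolute.is_absolute())   # True
print(absolute)
```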

    How to create a directory in Python

    The Path class has a mkdir() method for creating new folders or directories in Python. The mkdir() method accepts three arguments: mode, parents and exist_ok.

    Both parents and exist_ok are set to False by default. When parents is False, mkdir() raises a FileNotFoundError exception if a parent in the specified path doesn’t exist. When exist_ok is False, mkdir() raises a FileExistsError if the target directory already exists.

    Here’s an example:

    from pathlib import Path
    
    path = '/home/ini/Dev/Tutorial/sitepoint/projects'
    p = Path(path)
    p.mkdir()
    

    In the example above, we have a path variable setting the location for a new directory. A Path object p has been created with a path string argument, and the mkdir() method is called on p. If we check through the sitepoint folder now, we’ll find the newly created projects folder.
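
    The effect of the parents and exist_ok arguments can be sketched as follows. A temporary directory stands in for a real project folder here, and the folder names are made up for illustration:

```python
import tempfile
from pathlib import Path

# A temporary directory stands in for a real project folder (an assumption for this sketch).
with tempfile.TemporaryDirectory() as tmp:
    nested = Path(tmp) / "projects" / "python" / "demo"

    # parents=True creates the missing "projects" and "python" folders as well.
    nested.mkdir(parents=True)
    created = nested.is_dir()

    # exist_ok=True makes a repeated call a no-op instead of raising FileExistsError.
    nested.mkdir(parents=True, exist_ok=True)

    # Without exist_ok, calling mkdir() on an existing directory raises FileExistsError.
    try:
        nested.mkdir()
        raised = False
    except FileExistsError:
        raised = True

print(created, raised)  # True True
```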

    How to change the current working directory

    The pathlib module doesn’t support changing directories. If we want to do that, we have to use the os module, another built-in module for interacting with our computer’s operating system.

    To use the os module, we import it into our program as shown below:

    import os
    

    To switch between directories, use the chdir() function of the os module. The new path is passed in as an argument, and it becomes the current working directory.

    After creating a new folder in the previous code sample, we can change the directory to the projects folder:

    import os
    
    os.chdir('/home/ini/Dev/Tutorial/sitepoint/projects')
    

    To confirm the change in the directory, use the Path class’s cwd() method, which now returns a Path object for the new working directory: /home/ini/Dev/Tutorial/sitepoint/projects.
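
    The round trip of changing directory and confirming the change might look like the sketch below. It uses a temporary directory instead of the projects folder so that it runs anywhere, and it switches back afterwards:

```python
import os
import tempfile
from pathlib import Path

original = Path.cwd()                  # remember where we started

with tempfile.TemporaryDirectory() as tmp:
    os.chdir(tmp)                      # switch to the temporary directory
    # resolve() normalises symlinks so the two paths compare reliably.
    changed = Path.cwd().resolve() == Path(tmp).resolve()
    os.chdir(original)                 # switch back before the directory is removed

print(changed)                         # True
print(Path.cwd() == original)
```

    Restoring the original directory matters in longer programs, because every relative path used later is interpreted against whatever the current working directory happens to be.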

    How to delete files or directories in Python

    We can delete files and directories in Python using the Path object’s unlink() and rmdir() methods respectively.

    To delete files in Python, instantiate the file’s Path object and apply the unlink() method. (When deleting files, the program will raise a FileNotFoundError exception if the file doesn’t exist, unless missing_ok=True is passed, which is supported from Python 3.8.)

    Let’s consider a code example:

    from pathlib import Path
    
    path = '/home/ini/Dev/Tutorial/sitepoint/file_handling.txt'
    p = Path(path)
    p.unlink()
    

    The file_handling.txt file is deleted from the sitepoint directory. If we try to delete the file again, we’ll get the FileNotFoundError exception.

    To delete or remove a directory, apply the rmdir() method to the directory’s Path object. The directory must be empty, otherwise an OSError is raised:

    from pathlib import Path
    
    path = '/home/ini/Dev/Tutorial/projects'
    p = Path(path)
    p.rmdir()
    

    The projects folder is deleted from the Tutorial folder.
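
    The sketch below combines both methods and shows that rmdir() only works on an empty directory. It runs inside a temporary directory, the file and folder names are illustrative, and missing_ok requires Python 3.8 or later:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp) / "projects"
    folder.mkdir()
    note = folder / "notes.txt"
    note.write_text("temporary", encoding="UTF-8")

    # rmdir() only removes empty directories, so this attempt fails.
    try:
        folder.rmdir()
        rmdir_failed = False
    except OSError:
        rmdir_failed = True

    note.unlink()                  # delete the file first
    note.unlink(missing_ok=True)   # Python 3.8+: no error if the file is already gone
    folder.rmdir()                 # the directory is now empty and can be removed
    folder_exists = folder.exists()

print(rmdir_failed, folder_exists)  # True False
```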

    How to list files and directories in Python

    To get an overview of all the content of a directory, use the iterdir() method on the directory’s Path object. This method returns a generator object. We can iterate over the generator object to get all the existing files and directories in that particular folder:

    from pathlib import Path
    
    path = '/home/ini/Dev/Tutorial/sitepoint'
    p = Path(path)
    for file in p.iterdir():
        print(file)
    

    Here’s the output of the code above:

    /home/ini/Dev/Tutorial/sitepoint/array.py
    /home/ini/Dev/Tutorial/sitepoint/unittesting.py
    /home/ini/Dev/Tutorial/sitepoint/code
    /home/ini/Dev/Tutorial/sitepoint/search_replace.py
    /home/ini/Dev/Tutorial/sitepoint/__pycache__
    /home/ini/Dev/Tutorial/sitepoint/pangram.txt
    /home/ini/Dev/Tutorial/sitepoint/random.txt
    /home/ini/Dev/Tutorial/sitepoint/.pytest_cache
    /home/ini/Dev/Tutorial/sitepoint/exception.py
    /home/ini/Dev/Tutorial/sitepoint/files.py
    /home/ini/Dev/Tutorial/sitepoint/regex.py
    /home/ini/Dev/Tutorial/sitepoint/filehandling.py
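
    Since iterdir() yields files and folders alike, it’s often useful to tell them apart. Here’s a minimal sketch using the is_file() and is_dir() methods, run against a temporary directory populated with a few made-up entries:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "code").mkdir()
    (base / "files.py").write_text("", encoding="UTF-8")
    (base / "random.txt").write_text("", encoding="UTF-8")

    # iterdir() yields entries in arbitrary order, so sort for stable output.
    files = sorted(entry.name for entry in base.iterdir() if entry.is_file())
    folders = sorted(entry.name for entry in base.iterdir() if entry.is_dir())

print(files)    # ['files.py', 'random.txt']
print(folders)  # ['code']
```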
    

    File Handling in Python: Reading and Writing Data

    File handling in Python is simple, and less complicated than it sometimes is in other programming languages. There are different file access modes to choose from when opening a file in Python:

    • r: opens a file for reading. An error (FileNotFoundError) is raised if the file doesn’t exist.

    • r+: opens a file for both reading and writing. An error is raised if the file doesn’t exist.

    • w: opens a file for writing. The write mode overwrites existing data, and creates a new file if one doesn’t exist.

    • w+: opens a file for both reading and writing. Existing data is overwritten, and a new file is created if one doesn’t exist.

    • a: the append mode appends to a file if the file exists, and creates a new file otherwise. It doesn’t overwrite existing data.

    • a+: opens a file for appending and reading data. A new file is created if one doesn’t exist.

    • x: the create mode creates a new file for writing. An error (FileExistsError) is raised if the file already exists.

    Adding b to any of the access modes changes it from the default text format to a binary format (for example, rb, rb+, wb, and so on).
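
    The difference between the w, a and x modes can be seen in the short sketch below. It writes to a hypothetical modes.txt file inside a temporary directory so that no real file is touched:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "modes.txt"   # hypothetical file name

    with open(target, "w", encoding="UTF-8") as f:   # creates the file
        f.write("first\n")
    with open(target, "w", encoding="UTF-8") as f:   # truncates the file, then writes
        f.write("second\n")
    with open(target, "a", encoding="UTF-8") as f:   # keeps existing data and appends
        f.write("third\n")

    with open(target, "r", encoding="UTF-8") as f:
        content = f.read()

    # "x" refuses to touch a file that already exists.
    try:
        open(target, "x", encoding="UTF-8").close()
        x_failed = False
    except FileExistsError:
        x_failed = True

print(repr(content))  # 'second\nthird\n'
print(x_failed)       # True
```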

    How to open a file in Python

    To open a file in Python, we can use the open() function. It requires one argument, the file path, and usually takes a second, the mode. It returns a file object. By default, a file is opened for reading in text mode, but we can specify binary mode instead.

    Here’s some simple code for opening a file:

    f = open('random.txt', mode='r', encoding='UTF-8')
    

    After this step, as seen in the code above, we can begin our read and write operations on the file object. Text mode is the default. The mode and encoding arguments are optional, but stating them makes the intent explicit.

    How to close a file in Python

    After a file object is opened and file processing operations have been carried out, we need to close the file. It’s often the last step in reading or writing files in Python. The file object’s close() method is used to close previously opened files.

    Closing files in Python looks like this:

    f = open('random.txt', 'r', encoding='UTF-8')
    # file operations: reading, writing or appending
    f.close()
    

    The with statement

    It’s standard practice to close files after they’ve been opened and file operations have been carried out. However, it’s easy to forget to close a file, and an exception raised midway can skip the close() call entirely.

    The with statement automatically closes files after the last file handling operation is completed in its scope. For example:

    with open('random.txt', 'r', encoding='UTF-8') as f:
        print(f.read())
    

    Here’s the output of the code above:

    Hello, World!
    Hello, World!
    

    As seen in the code snippet above, the with statement implicitly closes the file after the print statement. (We’ll be using the with statement for the remainder of this tutorial.)
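
    One way to verify this behavior is to check the file object’s closed attribute. The sketch below does so inside and after a with block, and also after a simulated error; the file lives in a temporary directory so the example is self-contained:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "random.txt"
    target.write_text("Hello, World!", encoding="UTF-8")

    with open(target, "r", encoding="UTF-8") as f:
        inside = f.closed          # False: the file is still open here
        text = f.read()
    after = f.closed               # True: closed when the block ended

    # The file is closed even if the block exits with an exception.
    try:
        with open(target, "r", encoding="UTF-8") as g:
            raise ValueError("simulated failure")
    except ValueError:
        closed_after_error = g.closed

print(inside, after, closed_after_error)  # False True True
```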

    How to read a file in Python

    There are a couple of ways to read data from a file in Python. We can read a file’s contents using the read(), readline(), and readlines() methods.

    The read() method

    The read() method returns a string containing all the characters in the file being read. When a file is opened for reading, the pointer is placed at the start of its content. By default, read() reads from the current position to the end of the file, unless a number of characters is specified.

    Take a look at the code snippet below:

    with open('random.txt', 'r', encoding='UTF-8') as f:
        print(f.read())
    

    Here’s the output of that code:

    This is some random text.
    Here is the second line.
    The sky is blue.
    Roses are red. 
    

    We can specify how many characters to read from the text file. Simply pass the number of characters as an argument to the read() method:

    with open('random.txt', 'r', encoding='UTF-8') as f:
        print(f.read(12))
    

    Here’s the output of that code:

    This is some
    

    As seen in the example above, the interpreter reads only 12 characters from the entire file.

    The readline() method

    This method reads one line from a file at a time. It reads from the current position (initially the beginning of the file) and stops after the next newline character, which is included in the returned string. See the code example below:

    with open('random.txt', 'r', encoding='UTF-8') as f:
        print(f.readline())
    

    The code above outputs the following:

    This is some random text.
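
    Successive readline() calls continue from where the previous one stopped, and a file object can also be looped over directly. The following sketch shows both, using a two-line sample file created in a temporary directory:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "random.txt"
    target.write_text(
        "This is some random text.\nHere is the second line.\n", encoding="UTF-8"
    )

    with open(target, "r", encoding="UTF-8") as f:
        first = f.readline()       # includes the trailing "\n"
        second = f.readline()      # continues from where the first call stopped

    # Iterating over the file object yields one line at a time.
    with open(target, "r", encoding="UTF-8") as f:
        stripped = [line.rstrip("\n") for line in f]

print(repr(first))   # 'This is some random text.\n'
print(stripped)      # ['This is some random text.', 'Here is the second line.']
```

    Looping over the file object is a common choice for large files, since it avoids holding the whole file in memory at once.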
    

    The readlines() method

    This method returns a list of all lines from the existing file being read. See the code snippet below:

    with open('random.txt', 'r', encoding='UTF-8') as f:
        print(f.readlines())
    

    Here’s the output:

    ['This is some random text.\n', 'Here is the second line.\n', 'The sky is blue.\n', 'Roses are red.']
    

    Note: the reading methods return an empty value once the end of the file is reached: read() and readline() return an empty string, and readlines() returns an empty list. Calling seek(0) returns the file cursor to the beginning of the file.
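
    Here’s a short sketch of that end-of-file behavior, followed by a seek(0) call to rewind. The sample text and file location are assumptions made for the example:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "random.txt"
    target.write_text("The sky is blue.\nRoses are red.", encoding="UTF-8")

    with open(target, "r", encoding="UTF-8") as f:
        first_pass = f.read()          # reads everything, cursor is now at the end
        at_end = f.read()              # nothing left to read
        lines_at_end = f.readlines()   # nothing left here either

        f.seek(0)                      # move the cursor back to the start
        second_pass = f.read()

print(repr(at_end))               # ''
print(lines_at_end)               # []
print(first_pass == second_pass)  # True
```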

    How to write to a file in Python

    The write() method in Python is useful when attempting to write data to a file. To write to an opened file, the access mode should be set to one of the following: w, w+, a, a+, and so on. Once again, the default is text mode rather than binary.

    The write() method

    Pass a string argument to this method to write data to a text file. Remember that the write mode will overwrite existing data if the file exists:

    with open('random.txt', 'w', encoding='UTF-8') as f:
        f.write("Hello, World!")
    

    The new string argument passed to the write() method will overwrite our previous text in the random.txt file, and we’ll be left with this:

    Hello, World!
    

    The writelines() method

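    This method helps us write several strings to a text file at once. We can write multiple lines to a file in one go by passing a list of strings as the method’s argument. Note that writelines() doesn’t add line separators, so each string needs its own newline character where one is wanted: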

    words = ['The sky is blue.\n', 'Roses are red.']
    with open('random.txt', 'w', encoding='UTF-8') as f: 
        f.writelines(words)
    

    The code above shows the random.txt file being opened and a list of strings being inserted into it. The contents of random.txt will now look like this:

    The sky is blue.
    Roses are red.
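
    Because writelines() writes the strings exactly as given, a list without newline characters ends up on a single line. The sketch below contrasts the two cases, writing to a temporary file and appending the newline in a generator expression:

```python
import tempfile
from pathlib import Path

words = ["The sky is blue.", "Roses are red."]   # note: no newline characters

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "random.txt"

    with open(target, "w", encoding="UTF-8") as f:
        f.writelines(words)                       # strings are written back to back
    joined = target.read_text(encoding="UTF-8")

    with open(target, "w", encoding="UTF-8") as f:
        # Add the newline to each string while writing.
        f.writelines(f"{word}\n" for word in words)
    separated = target.read_text(encoding="UTF-8")

print(repr(joined))     # 'The sky is blue.Roses are red.'
print(repr(separated))  # 'The sky is blue.\nRoses are red.\n'
```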
    

    Conclusion

    A file has two important attributes: its filename and its path. The pathlib and os modules help us navigate through directories and perform certain operations on them. Python file handling involves using several methods to open, create, read, and write data on file objects.

    It’s also important to understand file handling in Python if we want to interact with content inside text or binary files. Always make sure to close files after carrying out operations on them. The with statement makes it easier to perform file handling in Python, as it implicitly closes file objects after we’re done.
