Glob — Working with Files in Python (64/100 Days of Python)

Martin Mirakyan
2 min read · Mar 6, 2023


Day 64 of the “100 Days of Python” blog post series, covering the glob module and the filesystem.

glob is a module in the Python standard library that finds files and directories whose paths match a specified pattern. It lets you use wildcard characters to match entries with similar names or extensions. In this tutorial, we will explore how to use the glob module.

Basic Usage of the glob module in Python

Glob can be used to list or search for specific files in the file system. As a basic example, suppose the my_folder directory contains several files:

my_folder/
    file1.txt
    file2.txt
    file3.jpg
    file4.py

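To get a list of everything in this directory, you can use the following code. Note that the * pattern matches subdirectories as well as files, and it skips entries whose names start with a dot: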

import glob

files = glob.glob('my_folder/*')
print(files)

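This will output a list like the following (glob does not guarantee the order of the results, so wrap the call in sorted() if you need a stable order):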

['my_folder/file1.txt', 'my_folder/file2.txt', 'my_folder/file3.jpg', 'my_folder/file4.py']
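To try this without touching real files, here is a minimal sketch that recreates my_folder inside a temporary directory. The sub_folder directory and the .hidden file are hypothetical additions that are not part of the original example; they are there only to show that * also matches directories and skips dot-prefixed names.

```python
import glob
import os
import tempfile

# Build a throwaway copy of my_folder so the example is reproducible.
with tempfile.TemporaryDirectory() as tmp:
    folder = os.path.join(tmp, 'my_folder')
    # 'sub_folder' and '.hidden' are hypothetical extras, not in the original listing.
    os.makedirs(os.path.join(folder, 'sub_folder'))
    for name in ['file1.txt', 'file2.txt', 'file3.jpg', 'file4.py', '.hidden']:
        open(os.path.join(folder, name), 'w').close()

    # glob.escape protects any special characters in the temporary path itself.
    matches = glob.glob(os.path.join(glob.escape(folder), '*'))

    # Separate regular files from directories and sort for a stable order.
    files = sorted(os.path.basename(p) for p in matches if os.path.isfile(p))
    dirs = sorted(os.path.basename(p) for p in matches if os.path.isdir(p))

print(files)  # ['file1.txt', 'file2.txt', 'file3.jpg', 'file4.py']
print(dirs)   # ['sub_folder']
```

Filtering with os.path.isfile is one simple way to keep only regular files when a directory may also contain subdirectories.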

Wildcard Characters

Glob supports the use of wildcard characters to match files with similar names or extensions. Here are the most commonly used wildcard characters:

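  • *: Matches any string of characters within a single file or directory name, including an empty string. It does not cross the path separator.
  • ?: Matches any single character.
  • [ ]: Matches any single character listed inside the brackets. Ranges such as [0-9] are allowed.
  • [! ]: Matches any single character not listed inside the brackets.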

For example, let’s say you want to get a list of all files in the my_folder directory that have a .txt extension. You can use the * wildcard character to match any string of characters before the .txt extension:

import glob

files = glob.glob('my_folder/*.txt')
print(files)

The resulting list contains the two .txt files:

['my_folder/file1.txt', 'my_folder/file2.txt']

You can also use the ? wildcard character to match exactly one character. For example, to get a list of all files in the my_folder directory whose name is exactly five characters long, followed by a .txt extension, you can use the following code:

import glob

files = glob.glob('my_folder/?????.txt')
print(files)

This will print all the files in my_folder that have a five-character name ending in .txt:

['my_folder/file1.txt', 'my_folder/file2.txt']
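The bracket wildcards from the list above work in the same way. The following is a small sketch rather than part of the original example set: it recreates the four files in a temporary directory, and the match helper is a hypothetical convenience function that returns sorted base names.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    for name in ['file1.txt', 'file2.txt', 'file3.jpg', 'file4.py']:
        open(os.path.join(tmp, name), 'w').close()

    def match(pattern):
        # Hypothetical helper: sorted base names of entries matching the pattern.
        full_pattern = os.path.join(glob.escape(tmp), pattern)
        return sorted(os.path.basename(p) for p in glob.glob(full_pattern))

    one_or_three = match('file[13].*')   # digit is 1 or 3
    two_to_four = match('file[2-4].*')   # digit is in the range 2 to 4
    not_one = match('file[!1].*')        # digit is anything except 1

print(one_or_three)  # ['file1.txt', 'file3.jpg']
print(two_to_four)   # ['file2.txt', 'file3.jpg', 'file4.py']
print(not_one)       # ['file2.txt', 'file3.jpg', 'file4.py']
```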

Find Files Recursively with glob

By default, glob() only looks at the directory levels spelled out in the pattern, so a pattern such as my_folder/*.txt does not descend into subdirectories. However, you can use the ** pattern to perform a recursive search that includes all subdirectories. Let’s say you have the following directory structure:

my_folder/
    file1.txt
    sub_folder1/
        file2.txt
        sub_sub_folder/
            file3.txt
    sub_folder2/
        file4.txt

To get a list of all the .txt files in the my_folder directory and its subdirectories, you can use the following code:

import glob

files = glob.glob('my_folder/**/*.txt', recursive=True)
print(files)

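Notice the recursive=True flag. It’s necessary to tell glob to treat ** as “this directory and any number of nested subdirectories”; without it, ** behaves like a plain *. The code above will output all the files that have a .txt extension (the order of the results may vary):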

['my_folder/file1.txt', 'my_folder/sub_folder1/file2.txt', 'my_folder/sub_folder1/sub_sub_folder/file3.txt', 'my_folder/sub_folder2/file4.txt']
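To see the effect of the flag, here is a hedged sketch that rebuilds the directory tree above in a temporary directory and runs the same pattern with and without recursive=True. It uses glob.iglob, which takes the same arguments as glob.glob but returns an iterator instead of a list; the names helper is a hypothetical convenience function.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    root = os.path.join(tmp, 'my_folder')
    os.makedirs(os.path.join(root, 'sub_folder1', 'sub_sub_folder'))
    os.makedirs(os.path.join(root, 'sub_folder2'))
    tree = [
        ('file1.txt',),
        ('sub_folder1', 'file2.txt'),
        ('sub_folder1', 'sub_sub_folder', 'file3.txt'),
        ('sub_folder2', 'file4.txt'),
    ]
    for parts in tree:
        open(os.path.join(root, *parts), 'w').close()

    pattern = os.path.join(glob.escape(root), '**', '*.txt')

    def names(paths):
        # Hypothetical helper: sorted base names, so the output is stable.
        return sorted(os.path.basename(p) for p in paths)

    # With recursive=True, ** matches zero or more directory levels.
    recursive_names = names(glob.iglob(pattern, recursive=True))
    # Without it, ** acts like *, so exactly one directory level is matched.
    flat_names = names(glob.glob(pattern))

print(recursive_names)  # ['file1.txt', 'file2.txt', 'file3.txt', 'file4.txt']
print(flat_names)       # ['file2.txt', 'file4.txt']
```

The iterator form can be handy for large trees, since paths are produced one at a time instead of being collected into a list first.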
