Sets in Python (24/100 Days of Python)

Martin Mirakyan
Jan 25, 2023

Day 24 of the “100 Days of Python” blog post series covering sets

A Python set is an unordered collection of unique elements. Sets are very useful for data manipulation and have many real-world use cases. In this post, we will explore how to create and modify sets in Python, discuss some operations that can be performed on sets, and touch on set comprehension.

Creating a Set in Python

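To create a set in Python, you can use the built-in set() function or list the elements inside curly braces, like {1, 2, 3}. Note that empty braces {} create a dictionary, so an empty set has to be created with set().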

numbers = set()        # Creating an empty set
numbers.add(1) # {1}
numbers.add(2) # {1, 2}
numbers.add(3) # {1, 2, 3}
numbers.add(2) # {1, 2, 3}

numbers = {1, 2, 3} # Creating a set with elements

Note that a set will always store only unique elements. So, if we try to add an element that is already there, the set stays the same and the element still appears only once. In the example above, when we add the number 2 the second time, the set is not modified as 2 was already present in the set.
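
The creation rules above are easy to check directly. Here is a small sketch (the variable names are just for illustration) showing what empty braces produce and how set() builds a set from any iterable:

```python
empty_braces = {}    # empty braces create a dict, not a set
empty_set = set()    # set() is how to create an empty set

print(type(empty_braces).__name__)  # dict
print(type(empty_set).__name__)     # set

# set() accepts any iterable and keeps one copy of each element
letters = set('hello')
numbers = set([3, 1, 3, 2, 1])

print(len(letters))          # 4 (the letters h, e, l, o)
print(numbers == {1, 2, 3})  # True
```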

A great use case for sets is removing duplicates from a list. Duplicate data is a common problem when working with lists in Python. One way to remove duplicates is by converting the list to a set and then back to a list. Keep in mind that the original order of the elements is not guaranteed to survive the round trip:

emails = ['anna@gmail.com', 'bob@gmail.com', 'anna@gmail.com']
unique_emails = list(set(emails))
print(unique_emails) # e.g. ['bob@gmail.com', 'anna@gmail.com'] (order may vary)
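
If the order of the list matters, converting through a set is not enough. As a sketch of two common alternatives (not the only ones), we can rely on the fact that dictionary keys are unique and keep insertion order in Python 3.7+, or simply sort the resulting set:

```python
emails = ['anna@gmail.com', 'bob@gmail.com', 'anna@gmail.com']

# dict keys are unique and keep insertion order (Python 3.7+),
# so this removes duplicates while preserving first-seen order
unique_in_order = list(dict.fromkeys(emails))
print(unique_in_order)  # ['anna@gmail.com', 'bob@gmail.com']

# Sorting the set is another way to get a predictable order
unique_sorted = sorted(set(emails))
print(unique_sorted)    # ['anna@gmail.com', 'bob@gmail.com']
```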

Removing Elements from a Set

To remove an element from a set in Python, you can use the discard() method or the remove() method. Both remove the specified element if it is present. They differ only when the element is missing: remove() raises a KeyError, while discard() does nothing and raises no error:

# Use remove() to remove an element from a set
numbers = {1, 2, 3} # {1, 2, 3}
numbers.remove(2) # {1, 3}
numbers.remove(2) # KeyError: 2
numbers.remove(10) # KeyError: 10

# Use discard() to discard an element from a set
numbers = {1, 2, 3} # {1, 2, 3}
numbers.discard(2) # {1, 3}
numbers.discard(2) # {1, 3} - nothing happened
numbers.discard(10) # {1, 3} - nothing happened
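
In a real program, an unhandled KeyError would stop the script, so remove() is usually paired with a try/except block when the element might be missing. A minimal sketch of both styles:

```python
numbers = {1, 2, 3}

# remove() signals a missing element with KeyError, which can be handled
try:
    numbers.remove(10)
except KeyError:
    print('10 was not in the set')

# discard() is the quieter option when a missing element is not an error
numbers.discard(10)

print(numbers == {1, 2, 3})  # True - the set was never modified
```

Which one to pick depends on whether a missing element is a bug worth noticing (remove()) or an expected situation (discard()).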

Checking if an Element is Present in a Set

A common task when working with sets is to check if a specific element is present in the set or not. You can use the in operator to check for membership in a set. Imagine we have a set of blocked users and would like to proceed only if the current user is not blocked:

blocked_users = {'user1', 'user2'}   # usernames `user1` and `user2` are blocked
user = input() # Get the username from the input

# Check if the username is blocked
if user in blocked_users:
    print(f'{user} is blocked!')
else:
    print(f'Welcome, {user}!')

You can also use the not in operator to check if an element is not a member of a set:

numbers = {1, 2, 3}     # Define a set of numbers
n = int(input()) # Get the current number from the user

# Check if the number entered is not in the set
if n not in numbers:
    print(f'{n} is not in the set.')
else:
    print(f'{n} is in the set.')

Keep in mind that checking membership in a set is much faster than checking membership in a list. With a list, the program has to go through the elements one by one and compare each of them to the value we are looking for, so the time grows with the length of the list. A set stores its elements in a hash table, so a lookup takes roughly constant time on average, no matter how large the set is (we’ll discuss the time complexity of some data structures later in the series). So, if you have a collection of items and need to check membership many times, a set is the better choice.
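
To get a feeling for the difference, we can time the same lookup on a list and on a set with the standard timeit module. This is only a rough sketch: the collection size and the number of repetitions are arbitrary choices, and the exact timings depend on the machine.

```python
import timeit

items_list = list(range(100_000))
items_set = set(items_list)
missing = -1  # not present, so the list has to be scanned to the end

# Run each membership check 100 times and measure the total time
list_time = timeit.timeit(lambda: missing in items_list, number=100)
set_time = timeit.timeit(lambda: missing in items_set, number=100)

print(f'list: {list_time:.5f}s, set: {set_time:.5f}s')
```

On a typical machine the set lookup should come out far ahead, and the gap widens as the collection grows.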

Intersection of Sets

One of the most powerful features of sets is the ability to perform mathematical operations like union and intersection. The intersection of two sets is a new set that contains only the elements that are common to both sets. You can perform the intersection operation using the intersection() method or the & operator. Here’s an example of how to find the common customers who made a purchase both last month and this month:

prev_month_customers = {'customer1', 'customer2', 'customer3'}
this_month_customers = {'customer2', 'customer3', 'customer4'}
recurring_customers = prev_month_customers.intersection(this_month_customers)
print(recurring_customers) # {'customer2', 'customer3'}

It’s also possible to use the & operator to get the intersection of two sets:

prev_month_customers = {'customer1', 'customer2', 'customer3'}
this_month_customers = {'customer2', 'customer3', 'customer4'}
recurring_customers = prev_month_customers & this_month_customers
print(recurring_customers) # {'customer2', 'customer3'}
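
The method and the operator are not completely interchangeable. The intersection() method accepts several arguments and any iterable, while & needs a set on both sides. Here is a sketch with a third, made-up month stored as a list (january, february and march_orders are hypothetical names for this example):

```python
january = {'customer1', 'customer2', 'customer3'}
february = {'customer2', 'customer3', 'customer4'}
march_orders = ['customer3', 'customer5', 'customer3']  # hypothetical list, with a repeat

# intersection() accepts several arguments, and they can be any iterable
loyal_customers = january.intersection(february, march_orders)
print(loyal_customers)  # {'customer3'}

# The & operator requires sets on both sides, so convert the list first
same_result = january & february & set(march_orders)
print(same_result == loyal_customers)  # True
```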

Union of Sets

The union of two sets is a new set that contains all the elements from both sets, with each element appearing only once. You can perform the union operation using the union() method or the | operator. You can use the union operation to find all the customers that have bought products from your store. If you have different branches and a set of customers for each branch, you can use union() to get all the unique customers, and len() on the result to count them:

branch1 = {'customer1', 'customer2', 'customer3'}
branch2 = {'customer2', 'customer3', 'customer4'}

# Using the union() method
customers = branch1.union(branch2)
print(customers) # {'customer1', 'customer2', 'customer3', 'customer4'}

# Using the `|` operator
customers = branch1 | branch2
print(customers) # {'customer1', 'customer2', 'customer3', 'customer4'}
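
With more than two branches, the same idea still works. One possible way, sketched below with a hypothetical list of branch sets, is to unpack all of them into a single union() call and then count the result with len():

```python
# Hypothetical data: one set of customers per branch
branches = [
    {'customer1', 'customer2', 'customer3'},
    {'customer2', 'customer3', 'customer4'},
    {'customer4', 'customer5'},
]

# set().union(*branches) merges any number of sets into one new set
all_customers = set().union(*branches)

print(len(all_customers))     # 5
print(sorted(all_customers))
# ['customer1', 'customer2', 'customer3', 'customer4', 'customer5']
```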

Difference of Sets

The difference of two sets is a new set that contains only the elements that are present in the first set but not in the second set. You can perform the difference operation using the difference() method or the - operator. Given the customers of the previous month and of the current month, we can find the new customers that we got this month:

prev_month_customers = {'customer1', 'customer2', 'customer3'}
this_month_customers = {'customer2', 'customer3', 'customer4'}

# Using the difference() method
new_customers = this_month_customers.difference(prev_month_customers)
print(new_customers) # {'customer4'}

# Using the `-` operator
new_customers = this_month_customers - prev_month_customers
print(new_customers) # {'customer4'}
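
Unlike union and intersection, the difference depends on the order of the operands. Swapping the two sets answers a different question: which customers we had last month but not this month. A short sketch (lost_customers is a name introduced just for this example):

```python
prev_month_customers = {'customer1', 'customer2', 'customer3'}
this_month_customers = {'customer2', 'customer3', 'customer4'}

# In this month's set but not in last month's
new_customers = this_month_customers - prev_month_customers

# In last month's set but not in this month's
lost_customers = prev_month_customers - this_month_customers

print(new_customers)   # {'customer4'}
print(lost_customers)  # {'customer1'}
```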

Set Comprehension

Set comprehension is a concise way to create a new set in Python. It is similar to list comprehension, but it creates a set instead of a list. Set comprehension uses curly braces {} and includes an expression followed by a for loop. We can compute squares of numbers and store that in a set:

squares = {x**2 for x in range(10)}
print(squares) # {0, 1, 64, 4, 36, 9, 16, 49, 81, 25}

Similar to a list comprehension, you can use set comprehension with conditional statements:

evens = {x for x in range(10) if x % 2 == 0}
print(evens) # {0, 2, 4, 6, 8}
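
Because the result is a set, a comprehension also removes duplicates along the way. As an illustration with made-up data, here is one way to collect the unique email domains from a list of addresses:

```python
# Hypothetical sample data
emails = ['anna@gmail.com', 'Bob@Example.com', 'carl@gmail.com', 'dana@example.com']

# Take the part after '@', lowercase it, and let the set drop the repeats
domains = {email.split('@')[1].lower() for email in emails}

print(sorted(domains))  # ['example.com', 'gmail.com']
```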

Set comprehension can also be used with nested loops. Here’s an example of how to create a set of all possible full user names from a list of names and a list of surnames:

names = ['Olivia', 'Ethan', 'Isabella']
surnames = ['Smith', 'Johnson', 'Garcia']

full_names = {(name, surname) for name in names for surname in surnames}
print(full_names)
# {('Isabella', 'Smith'), ('Isabella', 'Johnson'), ('Olivia', 'Smith'),
# ('Olivia', 'Johnson'), ('Ethan', 'Garcia'), ('Isabella', 'Garcia'),
# ('Ethan', 'Smith'), ('Ethan', 'Johnson'), ('Olivia', 'Garcia')}

Note that the elements are not printed in the order they were added. That’s because a set in Python is not ordered. It keeps all the elements in a hash table, which does not maintain the order of the elements added to it.
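
When a predictable order is needed, for example when showing results to a user, one simple option is to pass the set to the built-in sorted() function, which returns a new sorted list and leaves the set untouched:

```python
squares = {x**2 for x in range(10)}

# sorted() returns a list; the set itself stays unordered
ordered_squares = sorted(squares)
print(ordered_squares)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```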

Set comprehension is a concise and readable way to create sets in Python, especially when you need to have only unique elements in the final set.
