How to do Set Intersection in Python


This tutorial will go through how to get the intersection between sets in Python with the help of some code examples.


What is a Set?

A set is one of Python's four built-in collection types, alongside lists, tuples and dictionaries. A set is unordered, not indexed, and contains no duplicate elements. The set itself is mutable, so we can add and remove elements, but every element must be hashable (frozenset is the immutable counterpart). We can use sets for membership testing and for removing duplicates. For further reading on the use of sets for removing duplicates, go to the article How to Get Unique Values from List in Python. Set objects also support mathematical operations like union, intersection, difference and symmetric difference. Union and intersection are the components of Jaccard similarity, a widely used similarity measure in statistics.
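As a rough illustration of the uses mentioned above, the sketch below removes duplicates, tests membership, and computes a Jaccard similarity. The variable names and the jaccard_similarity helper are hypothetical and written for this example only; they do not come from a library. Returning 1.0 for two empty sets is an assumption made here, since the ratio is otherwise undefined.

```python
# Illustrative data; the names in this block are hypothetical.
readings = [3, 6, 6, 9, 12, 12, 3]

# Building a set from a list drops the duplicates (order is not preserved).
unique_readings = set(readings)

# Membership testing with the `in` operator.
has_nine = 9 in unique_readings
has_ten = 10 in unique_readings

# A set can be modified in place, although its elements must be hashable.
unique_readings.add(15)


def jaccard_similarity(a, b):
    """Return |A ∩ B| / |A ∪ B|; assumed to be 1.0 when both sets are empty."""
    union = a | b
    if not union:
        return 1.0
    return len(a & b) / len(union)


similarity = jaccard_similarity({3, 6, 9, 12}, {6, 12, 14, 16})

print(sorted(unique_readings))
print(has_nine, has_ten)
print(similarity)
```

Sorting before printing gives a stable display order, because a set does not guarantee one.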

What is Set Intersection?

Figure: Set Intersection

The intersection of two sets is the set containing exactly the elements common to both sets. The definition extends to more than two sets: the intersection then contains the elements present in every set. We can find the intersection between sets in Python using the intersection() method:

set_1.intersection(set_2, set_3, ..., set_n)

We can pass any number of sets, or other iterables such as lists and tuples, to the intersection() method. The method returns a new set containing the elements common to all of them. If we do not pass an argument to intersection(), it returns a shallow copy of the set.
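The short sketch below illustrates two details of the method: intersection() also accepts iterables that are not sets, and calling it with no arguments gives back a copy rather than the original object. The variable names are made up for this illustration.

```python
set_a = {3, 6, 9, 12}

# intersection() accepts any iterable, not only sets.
common_with_list = set_a.intersection([6, 12, 14, 16])
common_with_many = set_a.intersection(range(0, 10), (3, 6, 7))

# With no arguments, intersection() returns a shallow copy of the set.
copy_of_a = set_a.intersection()
copy_of_a.add(99)  # changing the copy leaves set_a untouched

print(sorted(common_with_list))
print(sorted(common_with_many))
print(99 in set_a)
```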

Example #1: Using the Intersection Method

Let’s look at an example of using the intersection() method with three sets. We will find the intersection of every pair of sets and then the intersection of all three sets.

set_x = {3, 6, 9, 12}
set_y = {6, 12, 14, 16}
set_z = {1, 3, 6, 7, 16}

# Intersection between two sets
x_intersection_y = set_x.intersection(set_y)
y_intersection_z = set_y.intersection(set_z)
x_intersection_z = set_x.intersection(set_z)

# Intersection between all three sets
x_y_z = set_x.intersection(set_y, set_z)

print('set_x intersection set_y: ', x_intersection_y)
print('set_y intersection set_z: ', y_intersection_z)
print('set_x intersection set_z: ', x_intersection_z)
print('set_x intersection set_y intersection set_z:  ', x_y_z)

Let’s run the code to get the result. Sets are unordered, so the elements may print in a different order on your machine:

set_x intersection set_y:  {12, 6}
set_y intersection set_z:  {16, 6}
set_x intersection set_z:  {3, 6}
set_x intersection set_y intersection set_z:   {6}
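When the sets are held in a list whose length is not known in advance, one common idiom is to unpack the list into set.intersection(). The sketch below reuses the three sets from this example; the list name all_sets is hypothetical. It assumes the list contains at least one set, because calling set.intersection() with no arguments at all raises a TypeError.

```python
# The three sets from the example, collected in a list (hypothetical name).
all_sets = [
    {3, 6, 9, 12},
    {6, 12, 14, 16},
    {1, 3, 6, 7, 16},
]

# Unpacking passes the first set as the receiver and the rest as arguments.
# This requires all_sets to be non-empty.
common_to_all = set.intersection(*all_sets)

print(common_to_all)
```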

Example #2: Using the Intersection Operator &

We can also use the intersection operator & to get the intersection between sets. Let’s look at an example of using the intersection operator with three sets. We will find the intersection between all possible set pairs and then the intersection between all three sets.

set_x = {3, 6, 9, 12}
set_y = {6, 12, 14, 16}
set_z = {1, 3, 6, 7, 16}

# Intersection between two sets using the intersection operator
x_intersection_y = set_x & set_y
y_intersection_z = set_y & set_z
x_intersection_z = set_x & set_z

# Intersection between all three sets using the intersection operator
x_y_z = set_x & set_y & set_z

print('set_x intersection set_y: ', x_intersection_y)
print('set_y intersection set_z: ', y_intersection_z)
print('set_x intersection set_z: ', x_intersection_z)
print('set_x intersection set_y intersection set_z:  ', x_y_z)

Let’s run the code to get the result:

set_x intersection set_y:  {12, 6}
set_y intersection set_z:  {16, 6}
set_x intersection set_z:  {3, 6}
set_x intersection set_y intersection set_z:   {6}
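One practical difference between the two approaches is that the & operator requires both operands to be sets (or frozensets), whereas the intersection() method accepts any iterable. The sketch below shows this, along with the in-place form &=. The variable names are chosen for illustration only.

```python
set_x = {3, 6, 9, 12}
values = [6, 12, 14, 16]  # a list, not a set

# The & operator raises a TypeError when the other operand is not a set.
try:
    set_x & values
except TypeError as error:
    operator_error = str(error)
else:
    operator_error = None

# Either use the method, or convert the list to a set first.
via_method = set_x.intersection(values)
via_operator = set_x & set(values)

# The in-place operator &= updates a set to keep only the common elements.
working = set(set_x)  # copy, so set_x itself is left unchanged
working &= {6, 12, 14, 16}

print(operator_error is not None)
print(sorted(via_method), sorted(via_operator), sorted(working))
```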

Example #3: Using Symmetric Difference

The symmetric difference is the counterpart of the intersection. The symmetric_difference() method returns a set containing the elements that are in exactly one of the two sets, that is, everything in their union except the elements in their intersection. Unlike intersection(), the symmetric_difference() method accepts exactly one argument.

Let’s look at an example of using the symmetric_difference() method with three sets. We will find the symmetric difference of every pair of sets:

set_x = {3, 6, 9, 12}
set_y = {6, 12, 14, 16}
set_z = {1, 3, 6, 7, 16}

# Symmetric difference between two sets
x_symdiff_y = set_x.symmetric_difference(set_y)
y_symdiff_z = set_y.symmetric_difference(set_z)
x_symdiff_z = set_x.symmetric_difference(set_z)

print('set_x symmetric difference set_y: ', x_symdiff_y)
print('set_y symmetric difference set_z: ', y_symdiff_z)
print('set_x symmetric difference set_z: ', x_symdiff_z)

Let’s run the code to get the result:

set_x symmetric difference set_y:  {3, 9, 14, 16}
set_y symmetric difference set_z:  {1, 3, 7, 12, 14}
set_x symmetric difference set_z:  {16, 1, 7, 9, 12}
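The relationship between symmetric difference, union and intersection can be checked directly. The sketch below also uses the ^ operator, which is the operator form of symmetric_difference(), and shows what happens when it is chained across three sets: the result holds the elements that appear in an odd number of the sets, which is not the same as the elements outside the three-way intersection.

```python
set_x = {3, 6, 9, 12}
set_y = {6, 12, 14, 16}
set_z = {1, 3, 6, 7, 16}

# The ^ operator is the operator form of symmetric_difference().
via_method = set_x.symmetric_difference(set_y)
via_operator = set_x ^ set_y

# Symmetric difference equals the union minus the intersection.
via_identity = (set_x | set_y) - (set_x & set_y)

# Chaining keeps the elements found in an odd number of the sets,
# so 6 (present in all three sets) stays in the result.
x_y_z = set_x ^ set_y ^ set_z

print(sorted(via_operator))
print(via_method == via_operator == via_identity)
print(sorted(x_y_z))
```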

Summary

Congratulations on reading to the end of this tutorial. You can get the intersection between sets in Python using the intersection() method or the intersection operator &. Either approach works for any number of sets. The counterpart of set intersection is the symmetric difference, which keeps the elements that are in exactly one of two sets. We can compute it with the symmetric_difference() method, which takes a single argument and so applies to one pair of sets at a time.

Go to the online courses page on Python to learn more about Python for data science and machine learning.

Have fun and happy researching!

Suf is a senior advisor in data science with deep expertise in Natural Language Processing, Complex Networks, and Anomaly Detection. Formerly a postdoctoral research fellow, he applied advanced physics techniques to tackle real-world, data-heavy industry challenges. Before that, he was a particle physicist at the ATLAS Experiment of the Large Hadron Collider. Now, he’s focused on bringing more fun and curiosity to the world of science and research online.