Learn Python Sets with Examples

Python provides many data structures, but many of us know only lists, tuples and dictionaries. Python provides other data structures that can make our job easier, and the set is one of them. In this tutorial, you will learn about the Python set and its methods, such as intersection, difference, symmetric_difference and more, with examples.

Set

A set is a collection that is unordered and unindexed. You cannot access set elements by index because sets have no defined order. For the same reason, the order in which elements are displayed can change.

How to create a set in Python?

Sets can be created in different ways. Below I am going to show you the different ways of creating a set.

Using the set function

We can create a set using Python's built-in set function. Just pass a list to the set function, and the returned object will be a set.

s1=set([1,2,3,4,5,5])
print(s1)
print(type(s1))

###OUTPUT###
{1, 2, 3, 4, 5}
<class 'set'>

You can see that the output is in curly brackets and its data type is <class 'set'>. Also notice that I passed 5 twice in the list, but the set contains only one 5. This is because duplicate items are removed in a set.
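The set function is not limited to lists. As a small sketch (the variable names here are only illustrative), it should accept any iterable, such as a tuple, a string or a range:

```python
# set() accepts any iterable, not only lists.
from_tuple = set((1, 2, 2, 3))   # tuple with a repeated 2
from_string = set("hello")       # each character becomes one element
from_range = set(range(5))       # the numbers 0 through 4

print(sorted(from_tuple))               # [1, 2, 3]
print(len(from_string))                 # 4, because 'l' is kept only once
print(from_range == {0, 1, 2, 3, 4})    # True
```

I print with sorted() here so that the displayed order does not depend on how the set stores its items.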

Using { } brackets

We can also create a set using {} brackets. This time we do not need to pass a list. Just put all the elements inside {} and it will create a set.

s2={1,2,3,3,4,5}
print(s2)
print(type(s2))

###OUTPUT###
{1, 2, 3, 4, 5}
<class 'set'>
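One thing worth knowing when writing elements inside {}: every element of a set has to be hashable. The sketch below (with illustrative names) shows that numbers, strings and tuples are fine, while a list is rejected:

```python
# Elements of a set must be hashable (for example int, str, tuple).
mixed = {1, "two", (3, 4)}
print(len(mixed))   # 3

# A list is mutable and therefore unhashable, so this raises TypeError.
try:
    bad = {[1, 2], 3}
except TypeError as exc:
    print("TypeError:", exc)
```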

How to create an Empty Set?

Sometimes you may need to create an empty set. So, how do you create one? You may be thinking of something like a={}. Let's see what happens when we try to create an empty set using empty {}.

s3={}
print(s3)

###OUTPUT###
{}

You can see that we have an empty {}. But is this a set? Let's check its data type.

s3={}
print(type(s3))

###OUTPUT###
<class 'dict'>

Notice that the data type is dict. This means that if we try to define an empty set this way, we get an empty dictionary instead. So, how do we create an empty set? We need to use the set function with no arguments.

s2=set()
print(s2)
print(type(s2))

###OUTPUT###
set() 
<class 'set'>

Now we have an empty set.
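If you want to check this in your own code, a minimal sketch like the following may help (empty_set and empty_dict are just example names):

```python
empty_set = set()   # the only way to write an empty set
empty_dict = {}     # empty braces always mean a dictionary

print(isinstance(empty_set, set))     # True
print(isinstance(empty_dict, dict))   # True
print(len(empty_set))                 # 0

# An empty set is falsy, so it can be tested directly in a condition.
if not empty_set:
    print("the set is empty")
```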

How to access elements of a set?

Sets are unordered collections of items. This means you cannot access elements by index, because the order in which they are stored can change.

s4=set([1,2,3,4,5,6,7,8,9,0])
print(s4)

###OUTPUT###
{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}

From the above code, you can see that I created a set with 0 as the last item of the list. But when I print set s4, 0 is displayed first. An index would therefore be meaningless, so Python does not support indexing on sets. So, how do we access set elements? We can use a for loop.

for i in s4:
    print(i,end=' ')


###OUTPUT###
0 1 2 3 4 5 6 7 8 9
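If you need the elements in a predictable order, or you really need positional access, one possible approach is to build a sorted list from the set first. This is only a sketch with made-up values:

```python
s = {30, 10, 20}

# sorted() returns a new list in ascending order,
# whatever order the set uses internally.
for item in sorted(s):
    print(item, end=' ')
print()

# A list supports indexing, so convert when position matters.
as_list = sorted(s)
print(as_list[0])   # 10, the smallest element
```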

Operations on set

in operator on set

The in operator is used to check whether an element is present in a set. If the element is present, it returns True; otherwise it returns False.

s4=set([1,2,3,4,5,6,7,8,9,0])
print(1 in s4)
print(11 in s4)


###OUTPUT###
True
False
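The opposite check is written with not in. Here is a small sketch that uses membership tests to count the vowels in a word (the names vowels, word and count are only for this example):

```python
vowels = set("aeiou")

print("e" in vowels)       # True
print("z" not in vowels)   # True

# Count the characters of a word that are vowels.
word = "education"
count = sum(1 for ch in word if ch in vowels)
print(count)   # 5
```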

Adding elements to a set

To add an item to a set, we need to use the add method.

s4.add(11)
print(11 in s4)

###OUTPUT###
True

Let's try to add several items to set s4 in a single add call.

s4.add(11,12,13,14,15)

###ERROR###
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: add() takes exactly one argument (5 given)

It raised an error. If you read the error, it clearly says that add() takes exactly one argument, but we provided 5. This means we can add only one item at a time with add. Is there another method that can add multiple items to a set at once? Of course, yes.

Adding multiple items to a set at a time

To add multiple items to a set, we need to use the update method instead of the add method. We pass a list (or any other iterable) of items to the update method.

s4.update([12,13,14,15])


print(s4)

###OUTPUT###
{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 12, 13, 14, 15}
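The update method can also take more than one iterable in a single call, and the |= operator gives the same result for sets. A short sketch with example values:

```python
s = {1, 2}

# update() accepts several iterables at once: here a list, a tuple and a string.
s.update([3, 4], (5, 6), "ab")
print(len(s))   # 8: the numbers 1 to 6 plus 'a' and 'b'

# |= is the in-place union operator; both sides must be sets.
s |= {7, 8}
print(7 in s, 8 in s)   # True True
```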

Remove and Discard

The remove and discard methods are both used to remove elements from a set, but there is a difference between the two. Let's see this with an example.

print(s4)
###OUTPUT###
{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 12, 13, 14, 15}

s4.remove(0)
print(s4)
###OUTPUT###
{1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 12, 13, 14, 15} # 0 removed

s4.discard(1)
print(s4)
###OUTPUT###
{2, 3, 4, 5, 6, 7, 8, 9, 11, 12, 13, 14, 15} # 1 removed

###TRYING TO REMOVE 20 from s4###

s4.remove(20) ##20 not present in s4

###ERROR###
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 20

###TRYING TO DISCARD 20 from s4###

s4.discard(20) ##20 not present in s4

###NO OUTPUT,NO ERROR###

So, discard and remove are both used to remove an element from a set. The difference is that when you try to remove an element which is not present in the set, Python raises a KeyError. But if you discard an element which is not present, Python does not raise an error.
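If you prefer remove but are not sure the element exists, one option is to catch the KeyError yourself. The following is a minimal sketch with an example set:

```python
s = {1, 2, 3}

# remove() raises KeyError for a missing element, so guard it when unsure.
try:
    s.remove(20)
except KeyError:
    print("20 was not in the set")

# discard() handles the missing element silently.
s.discard(20)
print(s == {1, 2, 3})   # True, the set is unchanged
```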

pop method on set

The pop method is also used to remove elements from a set. But pop removes and returns an arbitrary element. As sets are unordered, you will not know in advance which element will be removed.

s5=set([1,2,3,4,5,6,7,8,9,10])
print(s5)

###OUTPUT###
{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}

s5.pop()
###OUTPUT###
1
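A common pattern is to keep popping until the set is empty. The sketch below (example names only) also shows that calling pop on an empty set raises a KeyError:

```python
s = {"a", "b", "c"}
popped = []

# Pop until the set is empty; do not rely on the order of removal.
while s:
    popped.append(s.pop())

print(sorted(popped))   # ['a', 'b', 'c']
print(len(s))           # 0

# pop() on an empty set raises KeyError.
try:
    s.pop()
except KeyError:
    print("cannot pop from an empty set")
```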

Set Operations in python

If you know set theory, then you may know about set operations like union, intersection, difference, etc. I will show you how to perform these operations on Python sets.

Union

Union is used to combine data from multiple sets.

s6=set([1,2,3])
s7=set([4,5,6])
s8=set([7,8,9])

s67=s6.union(s7) ##Combining set 6 and 7
print(s67)
###OUTPUT###
{1, 2, 3, 4, 5, 6}

s678=s6.union(s7,s8) ##Combining set 6,7,8
print(s678)

###OUTPUT###
{1, 2, 3, 4, 5, 6, 7, 8, 9}
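Union can also be written with the | operator. As a rough sketch with two example sets, note that the method accepts any iterable while the operator expects sets on both sides:

```python
a = {1, 2, 3}
b = {3, 4, 5}

# The operator and the method give the same result for two sets.
print(a | b == a.union(b))   # True
print(sorted(a | b))         # [1, 2, 3, 4, 5]

# union() also accepts a plain list; a | [9, 10] would raise TypeError.
print(sorted(a.union([9, 10])))   # [1, 2, 3, 9, 10]
```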

Intersection

Intersection is used to find the common elements from the sets.

s9=set([1,2,3,4])
s10=set([4,5,6,7])
s11=set([7,8,9,10])

print(s9.intersection(s10)) ## 4 is common between s9 and s10
{4}

print(s9.intersection(s11)) ##Nothing common between s9 and s11
set()

print(s9.intersection(s10,s11)) ##Nothing common to all three sets
set()

The output of the last line, print(s9.intersection(s10,s11)), is an empty set because nothing is common to all three sets. 4 is common to s9 and s10, but 4 is not in s11. 7 is common to s10 and s11, but 7 is not in s9. So the output is an empty set. Intersection returns only the elements which are common to all the sets on which we apply it.
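The same intersections can be written with the & operator, and isdisjoint tells you directly whether two sets share nothing. A short sketch that reuses the same values under new example names:

```python
a = {1, 2, 3, 4}
b = {4, 5, 6, 7}
c = {7, 8, 9, 10}

print(a & b)       # {4}
print(a & b & c)   # set()

# isdisjoint() returns True when two sets have no common element.
print(a.isdisjoint(c))   # True
print(a.isdisjoint(b))   # False, because 4 is in both
```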

Difference

The difference method is like the minus operator in SQL. It returns the elements which are present in one set but not in the other. While taking the difference of sets, the order in which they are specified matters.

s9=set([1,2,3,4])
s10=set([4,5,6,7])

print(s9.difference(s10))
###OUTPUT###
{1, 2, 3}

print(s10.difference(s9))
###OUTPUT###
{5, 6, 7}

Look carefully at the above code and notice that although I took the difference of s9 and s10 both times, the results are different. Why so? Remember, while taking the difference of sets, the order in which difference is applied matters.

s9.difference(s10) returns all the elements which are present in s9 but not in s10, while s10.difference(s9) returns all the elements which are present in s10 but not in s9. Hence, the results are different.
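The - operator does the same job as the difference method. Here is a small sketch with example names that also passes more than one argument to difference:

```python
a = {1, 2, 3, 4}
b = {4, 5, 6, 7}

print(sorted(a - b))   # [1, 2, 3]
print(sorted(b - a))   # [5, 6, 7]

# difference() can subtract several sets in one call.
print(sorted(a.difference(b, {1})))   # [2, 3]
```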

Symmetric difference

symmetric_difference returns the elements that are in exactly one of the two sets. It does not care about the order in which the sets are specified.

s9=set([1,2,3,4])
s10=set([4,5,6,7])

print(s9.symmetric_difference(s10))
###OUTPUT###
{1, 2, 3, 5, 6, 7}

print(s10.symmetric_difference(s9))
###OUTPUT###
{1, 2, 3, 5, 6, 7}

From the above output, you can see that both outputs are identical although we changed the order of s9 and s10.

s9.symmetric_difference(s10) returns all the elements which are in s9 but not in s10, and it also returns all the elements which are in s10 but not in s9. So it does not matter whether you write s9.symmetric_difference(s10) or s10.symmetric_difference(s9).
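To make this definition concrete, the sketch below uses the ^ operator and checks it against the other set operations (a, b and sym are example names):

```python
a = {1, 2, 3, 4}
b = {4, 5, 6, 7}

sym = a ^ b   # operator form of a.symmetric_difference(b)
print(sorted(sym))   # [1, 2, 3, 5, 6, 7]

# Two equivalent ways of building the same result.
print(sym == (a - b) | (b - a))   # True
print(sym == (a | b) - (a & b))   # True
print(a ^ b == b ^ a)             # True, order does not matter
```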

Real World Use cases of Set

Example 1. Removing duplicates from list

Suppose you have a list in which lots of items are repeated and you want to remove the duplicates. You could iterate over all the elements and check whether each one appears in the list more than once. Obviously that works, but is it an optimised way to do it?

You can solve this problem using a set in just two lines. First, convert the list into a set. Doing this removes duplicates automatically, though the original order of the items is not guaranteed to be preserved. Then, if you want, convert it back to a list. You can do the same thing in one line too.

l1=[1,2,3,4,5,6,7,8,9,9,9,8,8,1,2,11,12]
print(l1)

###OUTPUT###
[1, 2, 3, 4, 5, 6, 7, 8, 9, 9, 9, 8, 8, 1, 2, 11, 12]

s=set(l1)
l1=list(s)
print(l1)

###OUTPUT###
[1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 12]

print(list(set(l1)))  ##ONE LINER
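Keep in mind that a set does not promise to keep the original order of the list. If the order matters, one alternative (not a set technique, but a closely related trick) is dict.fromkeys, since dictionaries keep insertion order in Python 3.7 and later. A sketch with an example list:

```python
items = ["pear", "apple", "pear", "fig", "apple"]

# dict keys are unique and keep insertion order (Python 3.7+),
# so the first occurrence of each item wins.
unique_in_order = list(dict.fromkeys(items))
print(unique_in_order)   # ['pear', 'apple', 'fig']

# The set version has the same elements, but its order may differ.
print(set(items) == set(unique_in_order))   # True
```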

Example 2. Using set operations in the real world

Let's see how set operations can make complicated problems very easy. Suppose you have 3 lists: football, cricket and basketball. These lists contain the names of people who love to play that particular sport.

football=['James','John','Robert','Michael','David']
basketball=['John','William','Joseph','David','Paul']
cricket=['David','Joseph','Mark','Donald']

Now, let's answer some questions.

Suppose we want to know which people like playing all 3 sports. In this case, we can take the intersection of all the lists after converting them to sets.

print(set(football).intersection(set(basketball),set(cricket)))

###OUTPUT###
{'David'}

From the lists you can verify that David is the only person who is in all three lists.

Suppose we want to know which people like football but do not like cricket or basketball.

print(set(football).difference(set(basketball),set(cricket)))
###OUTPUT###
{'Robert', 'James', 'Michael'}

Again, you can verify that Robert, James and Michael like playing football but do not like basketball or cricket. Notice that although David is present in all the lists, he is not in the output, because the question is about people who like to play only football. David likes to play all the sports, so he is not printed in the output. John is left out for a similar reason: he also plays basketball.
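The same data can answer a couple more questions. The sketch below repeats the three lists as sets so that it runs on its own; the names everyone and both_not_cricket are mine, not part of the example above:

```python
football = {'James', 'John', 'Robert', 'Michael', 'David'}
basketball = {'John', 'William', 'Joseph', 'David', 'Paul'}
cricket = {'David', 'Joseph', 'Mark', 'Donald'}

# Everyone who plays at least one sport: union of all three sets.
everyone = football | basketball | cricket
print(len(everyone))   # 10

# People who play both football and basketball, but not cricket.
# Parentheses matter here, because - binds tighter than &.
both_not_cricket = (football & basketball) - cricket
print(both_not_cricket)   # {'John'}
```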

So, this was all about the Python set, its methods and its usage. I hope you understood this topic well. If you have any doubts or suggestions, feel free to comment below.

Thank You.

Amarjeet

About Amarjeet

Amarjeet, BE in CS, loves to code in Python and is passionate about Machine Learning and Data Science. Expertsteaching.com is just a medium to share what I have learned so far with the world.