Prerequisite: Generating Word Cloud in Python | Set – 1
Word Cloud is a data visualization technique used for representing text data in which the size of each word indicates its frequency or importance. Significant textual data points can be highlighted using a word cloud. Word clouds are widely used for analyzing data from social network websites.
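To make the frequency idea concrete, here is a minimal sketch that counts how often each word occurs in a string. The helper name `word_frequencies` and the sample sentence are hypothetical and exist only for illustration; the wordcloud library does its own, more involved, tokenization before sizing words.

```python
from collections import Counter
import re


def word_frequencies(text):
    """Count case-insensitive word occurrences in a string."""
    # Lowercase the text and keep runs of letters/apostrophes as words.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


# Hypothetical sample text, not taken from the article's CSV file.
sample = "Python makes plotting easy and Python makes text analysis easy"
freqs = word_frequencies(sample)

# Words with higher counts would be drawn larger in a word cloud.
for word, count in freqs.most_common(3):
    print(word, count)
```

In a word cloud, the words with the highest counts in such a table are the ones drawn largest.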
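To generate a word cloud in Python, the modules needed are matplotlib, pandas and wordcloud. The examples below read the CSV file with Python's built-in csv module, so pandas is not required for them. To install these packages, run the following commands:
pip install matplotlib
pip install pandas
pip install wordcloud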
The examples below read their text from a sample CSV file; replace the path in the code with the location of your own file.
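Each example below flattens a CSV file into one long string before passing it to WordCloud(). As a rough sketch of that step on its own, the snippet here uses an in-memory file so it runs without any file on disk. The helper name `csv_to_text` and the sample contents are assumptions made for illustration.

```python
import csv
import io


def csv_to_text(file_obj):
    """Join every cell of a CSV file into one space-separated string."""
    reader = csv.reader(file_obj)
    # str.join avoids building many intermediate strings in a loop.
    return " ".join(cell for row in reader for cell in row)


# In-memory stand-in for a CSV file (hypothetical contents).
sample_csv = io.StringIO("Python,Matplotlib\nGeeks,Python\n")
text = csv_to_text(sample_csv)
print(text)
```

With a real file, the same helper could be called inside a `with open(path, newline="") as f:` block.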
Code #1 : Number of words
It is possible to set a maximum number of words to display in the word cloud. For this purpose, use the max_words keyword argument of the WordCloud() function.
Python3
# importing the necessary modules
from wordcloud import WordCloud
import matplotlib.pyplot as plt
import csv

# open the CSV file and read it into a list of rows
# (each row is a list of strings)
with open(r"C:/Users/user/Documents/sample.csv", newline="") as file_ob:
    reader_contents = list(csv.reader(file_ob))

# start with an empty string
text = ""

# iterate through the list of rows
for row in reader_contents:
    # iterate through the words in the row
    for word in row:
        # concatenate the words
        text = text + " " + word

# show only 10 words in the word cloud
wordcloud = WordCloud(width=480, height=480, max_words=10).generate(text)

# plot the WordCloud image
plt.figure()
plt.imshow(wordcloud, interpolation="bilinear")
plt.axis("off")
plt.margins(x=0, y=0)
plt.show()
Output:
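The effect of limiting the word count can be approximated without the library. The sketch below keeps only the most frequent words of a string; the function name `top_words` and the sample text are hypothetical, and the simple whitespace split is an assumption that stands in for the library's own text processing.

```python
from collections import Counter


def top_words(text, max_words):
    """Return the max_words most frequent words with their counts."""
    counts = Counter(text.lower().split())
    # most_common sorts by count, highest first, and truncates the list.
    return dict(counts.most_common(max_words))


# Hypothetical sample text for illustration only.
text = "data data data cloud cloud word"
kept = top_words(text, 2)
print(kept)
```

Words outside the top group are simply left out, which is why a small limit produces a sparser image.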
Code #2 : Remove some words
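Words that we don't want to show can be removed. For this purpose, pass those words to the stopwords keyword argument of the WordCloud() function. Note that supplying this argument replaces the library's built-in stopword list rather than adding to it.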
Python3
# importing the necessary modules
from wordcloud import WordCloud
import matplotlib.pyplot as plt
import csv

# open the CSV file and read it into a list of rows
# (each row is a list of strings)
with open(r"C:/Users/user/Documents/sample.csv", newline="") as file_ob:
    reader_contents = list(csv.reader(file_ob))

# start with an empty string
text = ""

# iterate through the list of rows
for row in reader_contents:
    # iterate through the words in the row
    for word in row:
        # concatenate the words
        text = text + " " + word

# remove the words Python, Matplotlib and Geeks from the word cloud
wordcloud = WordCloud(width=480, height=480,
                      stopwords={"Python", "Matplotlib", "Geeks"}).generate(text)

# plot the WordCloud image
plt.figure()
plt.imshow(wordcloud, interpolation="bilinear")
plt.axis("off")
plt.margins(x=0, y=0)
plt.show()
Output:
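The filtering idea behind stopwords can be sketched in plain Python. The helper name `remove_stopwords` and the sample sentence are hypothetical, and the case-insensitive comparison is an assumption made for this sketch rather than a statement about the library's internals.

```python
def remove_stopwords(words, stopwords):
    """Drop every word that appears in stopwords, ignoring case."""
    blocked = {s.lower() for s in stopwords}
    return [w for w in words if w.lower() not in blocked]


# Hypothetical sample sentence for illustration only.
words = "Python and Matplotlib make Geeks happy".split()
kept = remove_stopwords(words, ["Python", "Matplotlib", "Geeks"])
print(kept)
```

Only the words that survive this kind of filter are counted and drawn.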
Code #3 : Change background
We can change the background color of the word cloud. For this purpose, use the background_color keyword argument of the WordCloud() function.
Python3
# importing the necessary modules
from wordcloud import WordCloud
import matplotlib.pyplot as plt
import csv

# open the CSV file and read it into a list of rows
# (each row is a list of strings)
with open(r"C:/Users/user/Documents/sample.csv", newline="") as file_ob:
    reader_contents = list(csv.reader(file_ob))

# start with an empty string
text = ""

# iterate through the list of rows
for row in reader_contents:
    # iterate through the words in the row
    for word in row:
        # concatenate the words
        text = text + " " + word

# draw the word cloud on a pink background
wordcloud = WordCloud(width=480, height=480,
                      background_color="pink").generate(text)

# plot the WordCloud image
plt.figure()
plt.imshow(wordcloud, interpolation="bilinear")
plt.axis("off")
plt.margins(x=0, y=0)
plt.show()
Output:
Code #4 : Change color of words
We can change the color of the words using the colormap keyword argument of the WordCloud() function, which takes the name of a matplotlib colormap.
Python3
# importing the necessary modules
from wordcloud import WordCloud
import matplotlib.pyplot as plt
import csv

# open the CSV file and read it into a list of rows
# (each row is a list of strings)
with open(r"C:/Users/user/Documents/sample.csv", newline="") as file_ob:
    reader_contents = list(csv.reader(file_ob))

# start with an empty string
text = ""

# iterate through the list of rows
for row in reader_contents:
    # iterate through the words in the row
    for word in row:
        # concatenate the words
        text = text + " " + word

# color the words with the reversed "Oranges" matplotlib colormap
wordcloud = WordCloud(width=480, height=480,
                      colormap="Oranges_r").generate(text)

# plot the WordCloud image
plt.figure()
plt.imshow(wordcloud, interpolation="bilinear")
plt.axis("off")
plt.margins(x=0, y=0)
plt.show()
Output:
Code #5 : Maximum and minimum font size
We can control the minimum and maximum font size used in the word cloud. For this purpose, use the max_font_size and min_font_size keyword arguments of the WordCloud() function.
Python3
# importing the necessary modules
from wordcloud import WordCloud
import matplotlib.pyplot as plt
import csv

# open the CSV file and read it into a list of rows
# (each row is a list of strings)
with open(r"C:/Users/user/Documents/sample.csv", newline="") as file_ob:
    reader_contents = list(csv.reader(file_ob))

# start with an empty string
text = ""

# iterate through the list of rows
for row in reader_contents:
    # iterate through the words in the row
    for word in row:
        # concatenate the words
        text = text + " " + word

# keep every word between font sizes 10 and 20
wordcloud = WordCloud(width=480, height=480,
                      max_font_size=20, min_font_size=10).generate(text)

# plot the WordCloud image
plt.figure()
plt.imshow(wordcloud, interpolation="bilinear")
plt.axis("off")
plt.margins(x=0, y=0)
plt.show()
Output: