Python Script 16: Generating a word cloud image of a text using Python


A word cloud is an image composed of the words used in a particular text or subject, in which the size of each word indicates its frequency or importance.

In this Python script, we will generate a word cloud image from the text of a news article on CNN.


Dependencies:

- wordcloud 1.5.0
- matplotlib 3.0.3

Create a virtual environment, activate it, and install the dependencies in it.
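
To confirm that the active environment actually has these packages, a small check like the one below may help. This is only a sketch: it assumes Python 3.8 or newer (for importlib.metadata), and the helper name installed_versions is made up for illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # Report the two dependencies listed above.
    for name, ver in installed_versions(["wordcloud", "matplotlib"]).items():
        print(name, ver or "not installed")
```

The versions printed may differ from the ones listed above; the script in this article only uses the basic WordCloud API.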


Image configurations:

The image we want to generate will have the following configuration.

# image configurations
background_color = "#101010"
height = 720
width = 1080


I have copied the content of the news article into a text file. Read the file and store its words in a list.

# Read a text file and split its content into a list of words
with open("/tmp/sample_text.txt", "r") as f:
    words = f.read().split()
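
Splitting on whitespace leaves punctuation attached to words, so "said," and "said" would be counted separately. If that matters, one possible alternative is to extract words with a regular expression. The sketch below is an assumption-laden variant, not the script's method: the helper name read_words is hypothetical, the file is assumed to be UTF-8, and only ASCII letters are matched.

```python
import re
from pathlib import Path


def read_words(path):
    """Read a text file and return its words, lowercased and without punctuation."""
    text = Path(path).read_text(encoding="utf-8")
    # Match runs of letters, optionally with one internal apostrophe (e.g. "don't").
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
```

For a file containing `Hello, world! Don't stop.` this returns `['hello', 'world', "don't", 'stop']`.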


Now generate a dictionary with words as keys and their frequencies as values. We will ignore the stop words.

data = dict()

for word in words:
    word = word.lower()
    # skip stop words (stop_words is loaded in the complete script below)
    if word in stop_words:
        continue

    data[word] = data.get(word, 0) + 1
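
The same counting can also be written with collections.Counter from the standard library. The following is a minimal sketch of that alternative; the function name word_frequencies and the sample sentence are made up for illustration.

```python
from collections import Counter


def word_frequencies(words, stop_words):
    """Count how often each word occurs, ignoring case and skipping stop words."""
    stop_words = {w.lower() for w in stop_words}  # a set makes membership tests fast
    return Counter(w.lower() for w in words if w.lower() not in stop_words)


sample = "The cat saw the cat and the hat".split()
freq = word_frequencies(sample, ["the", "and"])
print(freq.most_common(1))  # [('cat', 2)]
```

Counter is a dict subclass, so the result should be usable wherever the data dictionary is used.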


You can get the list of stop words from the nltk library or from a resource available online.

import nltk
from nltk.corpus import stopwords

nltk.download('stopwords')  # one-time download of the stop word corpus
stop_words = set(stopwords.words('english'))
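
If nltk may not be installed, or its stop word corpus may not have been downloaded, a fallback can keep the script running. The sketch below illustrates one way to do that; load_stop_words and FALLBACK_STOP_WORDS are hypothetical names, and the fallback set is deliberately tiny and far from complete.

```python
# A very small, illustrative fallback list; not a complete set of English stop words.
FALLBACK_STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "it", "on", "for"}


def load_stop_words():
    """Return English stop words from nltk if available, else the small fallback set."""
    try:
        from nltk.corpus import stopwords
        return set(stopwords.words("english"))
    except (ImportError, LookupError):
        # ImportError: nltk is not installed.
        # LookupError: the stopwords corpus has not been downloaded yet.
        return set(FALLBACK_STOP_WORDS)
```

Returning a set rather than a list keeps the `word in stop_words` check fast for long texts.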


Now create a WordCloud object and initialize it with the image configuration.

from wordcloud import WordCloud

word_cloud = WordCloud(
    background_color=background_color,
    width=width,
    height=height
)

word_cloud.generate_from_frequencies(data)
word_cloud.to_file('image.png')


Call the generate_from_frequencies method with the data dictionary as input, then call to_file to render the image and save it to a file.
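
matplotlib is listed as a dependency, and one use for it is previewing the image in a window instead of (or in addition to) saving it. The sketch below assumes both packages are installed; the function name show_word_cloud and its default arguments are illustrative, with the defaults copied from the image configuration above.

```python
def show_word_cloud(frequencies, width=1080, height=720, background_color="#101010"):
    """Render a word cloud from a {word: count} mapping and display it in a window."""
    # Imported here so the third-party packages are only needed when the function runs.
    import matplotlib.pyplot as plt
    from wordcloud import WordCloud

    cloud = WordCloud(width=width, height=height, background_color=background_color)
    cloud.generate_from_frequencies(frequencies)

    # to_array() returns the rendered image as a numpy array that imshow can draw.
    plt.imshow(cloud.to_array(), interpolation="bilinear")
    plt.axis("off")  # hide the axes; only the image matters
    plt.show()
```

Calling show_word_cloud(data) with the dictionary built earlier should open a window with the rendered cloud.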


Code is available on GitHub.


Complete Script:

"""
Python script to generate word cloud image.
Author - Anurag Rana
Read more on - https://www.pythoncircle.com
"""

from wordcloud import WordCloud

# image configurations
background_color = "#101010"
height = 720
width = 1080

with open("stopwords.txt", "r") as f:
    stop_words = f.read().split()

# Read a text file and calculate frequency of words in it
with open("/tmp/sample_text.txt", "r") as f:
    words = f.read().split()

data = dict()

for word in words:
    word = word.lower()
    if word in stop_words:
        continue

    data[word] = data.get(word, 0) + 1

word_cloud = WordCloud(
    background_color=background_color,
    width=width,
    height=height
)

word_cloud.generate_from_frequencies(data)
word_cloud.to_file('image.png')


For more details, visit the official documentation of wordcloud.

© 2017-2019 Python Circle