Word Cloud Generator

Algorithm Overview

This Word Cloud Generator attempts to place the w most common words from an input text file, f, on an image of size NxN. Below is a summary of my implementation and some of the considerations behind it:

1. Collect the input parameters w, f, and N.
2. Build a dictionary of stopwords: uninteresting words that are blacklisted from the cloud.
3. Build the vocabulary dictionary, which holds the words contained in f that are NOT in the stopwords dictionary.
4. Generate an NxN grid to narrow down where words can be placed: the generator only tries to place words at points where two grid lines of the same colour/thickness intersect perpendicularly.
5. Attempt to place each word on a square in the NxN grid. A placement is valid when both of the following hold:
   1. The collision box around the word does NOT collide with any other placed word's collision box. For this I used the does_overlap function from the Axis Aligned Bounding Box (AABB) Trees library, where the tree contains the coordinates of the word box for each placed word. This allows an efficient check of whether a word placement would overlap with an already placed word.
   2. The collision box around the word lies within the NxN grid's coordinates.
6. Upon successful execution, save the output Word Cloud image to output/wordcloud.png.
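
The two placement rules in step 5 can be sketched in a few lines of Python. This is a minimal illustration, not the repository's code: it swaps the AABB tree for a plain list scan so that it runs without third-party packages, and the names Box, boxes_overlap, and is_valid_placement are hypothetical. Treating boxes that only share an edge as non-overlapping is also an assumption.

```python
from typing import List, NamedTuple


class Box(NamedTuple):
    """Axis-aligned collision box in pixel coordinates (hypothetical helper)."""
    x_min: int
    y_min: int
    x_max: int
    y_max: int


def boxes_overlap(a: Box, b: Box) -> bool:
    # Two axis-aligned boxes overlap only if their extents overlap on both axes.
    # Assumption: boxes that merely share an edge do not count as overlapping.
    return (a.x_min < b.x_max and b.x_min < a.x_max
            and a.y_min < b.y_max and b.y_min < a.y_max)


def is_valid_placement(candidate: Box, placed: List[Box], n: int) -> bool:
    # Rule 2: the box must lie entirely inside the N x N image.
    inside = (0 <= candidate.x_min and 0 <= candidate.y_min
              and candidate.x_max <= n and candidate.y_max <= n)
    if not inside:
        return False
    # Rule 1: the box must not collide with any already placed word.
    # A linear scan stands in for the AABB tree query used by the project.
    return not any(boxes_overlap(candidate, other) for other in placed)


placed = [Box(10, 10, 60, 30)]
print(is_valid_placement(Box(50, 20, 90, 40), placed, 100))   # False: collides
print(is_valid_placement(Box(60, 10, 90, 30), placed, 100))   # True: only touches
print(is_valid_placement(Box(80, 80, 120, 95), placed, 100))  # False: leaves the image
```

The list scan costs one comparison per placed word for every candidate position. A tree-based structure such as the AABB tree mentioned above is the usual way to avoid checking every placed word when many are already on the image.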

Sample Outputs

The following Word Cloud was generated from George Orwell's novel 1984:

Leftmost image: the generated NxN grid described in Algorithm Overview step 4.
Middle image: a visualization of the word collision boxes described in Algorithm Overview step 5.1.
Rightmost image: the generated Word Cloud image described in Algorithm Overview step 6.

Sample Execution

Clone the repository, install the dependencies via pip, then from the terminal run: python main.py

Upon execution, you will be prompted for three pieces of information:

Input text (.txt) file f, which must be in the input directory
Number of words w
Image dimension N
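
As a rough illustration of how f and w are used together, the sketch below counts the words of a text that are not stopwords and keeps the w most common. It is not the repository's code: the function name most_common_words, the regular-expression tokenizer, and the tiny stopword set are assumptions made for this example.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple


def most_common_words(text: str, stopwords: Iterable[str], w: int) -> List[Tuple[str, int]]:
    """Return the w most frequent words in text that are not stopwords."""
    blocked = {s.lower() for s in stopwords}
    # Assumption: a word is a run of letters, optionally with one inner apostrophe.
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    counts = Counter(t for t in tokens if t not in blocked)
    return counts.most_common(w)


sample = "War is peace. The war, the peace, the WAR and freedom."
print(most_common_words(sample, {"the", "is", "and"}, 2))
# [('war', 3), ('peace', 2)]
```

In practice the text would be read from the chosen file in the input directory before being passed to a function like this one.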
