Web crawler and R code to download, parse, and visualize popular baby names

johnistan/ssa-baby-names


While I was taking the Google Developers Python course, one of the exercises used Social Security baby name data. I thought it would be a good exercise to fetch that data myself. So here is a Python web scraper to get the data, plus some quick ggplots, in the R file, to visualize changes in rankings over time. The R file has straightforward instructions for changing the appropriate variables if you want to examine the declining popularity of the nomenclature attached to your parents' greatest mistake.
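To make the "rankings over time" idea concrete, here is a minimal sketch in Python 3 of the kind of series the plots show: one rank per year for a chosen name. It is not the code in this repository. The function name `rank_over_time`, the dictionary layout, and the placeholder names are all assumptions made for illustration.

```python
# Sketch only: the data layout and names below are illustrative assumptions,
# not taken from this repository's scraper or R script.

def rank_over_time(rankings_by_year, name):
    """Return [(year, rank)] for `name`; rank is None when the name is unlisted.

    `rankings_by_year` maps a year to a list of names ordered from most to
    least popular, so a name's rank is its list position plus one.
    """
    series = []
    for year in sorted(rankings_by_year):
        names = rankings_by_year[year]
        rank = names.index(name) + 1 if name in names else None
        series.append((year, rank))
    return series


# Placeholder rankings, made up for illustration (not real SSA figures).
example_rankings = {
    2000: ["Alpha", "Beta", "Gamma"],
    1990: ["Beta", "Alpha", "Gamma"],
    2010: ["Gamma", "Alpha"],
}

if __name__ == "__main__":
    for year, rank in rank_over_time(example_rankings, "Beta"):
        print(year, rank)
```

Changing the second argument plays the same role as changing the name variable in the R file: it picks which name's trajectory you look at.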

There are walkthroughs on my site: http://notebookonthewebs.tumblr.com/post/13114811173/popular-baby-names-walk-through-part-2-graphing-the

After I wrote this scraper, Hadley Wickham pointed me to an identical exercise in R/Ruby (alas, the first rule of coding: find someone else who has solved the same problem and steal their work). It can be found here: https://github.com/hadley/data-baby-names. He also has more R backflips to use (being the inventor of R backflips). Take a look.

Written with:
Python 2.7
	BeautifulSoup
	mechanize

R 2.13
	ggplot2

GitHub user electrum pointed out that the raw data is also available from the SSA website: http://www.ssa.gov/oact/babynames/limits.html
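If you start from the raw download instead of the scraper, the ranks have to be derived from counts. The sketch below, in Python 3, assumes each line of a raw file has the form `name,sex,count`; check that against the files you actually download. The function names, the alphabetical tie-break, and the sample lines are assumptions for illustration, not part of this repository.

```python
import csv
from collections import defaultdict


def parse_records(lines):
    """Parse 'name,sex,count' lines into (name, sex, count) tuples.

    The three-column layout is an assumption about the raw files.
    """
    records = []
    for row in csv.reader(lines):
        if len(row) != 3:
            continue  # skip blank or malformed lines
        name, sex, count = row
        records.append((name, sex, int(count)))
    return records


def rank_names(records):
    """Rank names within each sex by descending count.

    Returns {(name, sex): rank}. Ties are broken alphabetically, which is
    a choice made for this sketch, not a documented SSA rule.
    """
    by_sex = defaultdict(list)
    for name, sex, count in records:
        by_sex[sex].append((name, count))
    ranks = {}
    for sex, entries in by_sex.items():
        entries.sort(key=lambda entry: (-entry[1], entry[0]))
        for position, (name, _count) in enumerate(entries, start=1):
            ranks[(name, sex)] = position
    return ranks


# Placeholder lines, made up for illustration (not real SSA figures).
sample_lines = ["Alpha,F,30", "Beta,F,50", "Alpha,M,5", "", "Gamma,F,10"]

if __name__ == "__main__":
    print(rank_names(parse_records(sample_lines)))
```

To rank a real file, pass its open file handle to `parse_records` in place of `sample_lines`.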
