A web crawler that indexes shows and episodes from the TWiT netcast network and can generate an RSS feed containing every episode of a show.

Ralle/TwitSpider

Description:

A Python scraper for Leo Laporte's online podcast / netcast network, TWiT. It indexes shows and their episodes and can generate RSS feeds from them. Indexing every episode takes quite a while, as there are more than 4000, so I recommend either indexing a single show or letting it run overnight.
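To give a rough idea of what a generated feed contains, here is a minimal sketch of building a podcast-style RSS 2.0 document with Python 3's standard library. It is not TwitSpider's actual code: the function name `build_feed`, the episode dictionary keys, and the example URLs are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def build_feed(show_title, show_url, episodes):
    """Return an RSS 2.0 document (as a string) for one show.

    `episodes` is a list of dicts with the hypothetical keys
    "title", "description", "media_url" and "size" (bytes).
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 requires title, link and description on the channel.
    ET.SubElement(channel, "title").text = show_title
    ET.SubElement(channel, "link").text = show_url
    ET.SubElement(channel, "description").text = "All episodes of " + show_title

    for ep in episodes:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = ep["title"]
        ET.SubElement(item, "description").text = ep["description"]
        # The enclosure element is what podcast clients download.
        ET.SubElement(
            item,
            "enclosure",
            url=ep["media_url"],
            length=str(ep["size"]),
            type="audio/mpeg",
        )
    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    # Made-up episode data, only to show the call shape.
    example = [
        {
            "title": "Episode 1",
            "description": "Placeholder description",
            "media_url": "http://example.com/ep1.mp3",
            "size": 12345,
        }
    ]
    print(build_feed("Example Show", "http://example.com/show", example))
```

The `enclosure` element carries the media link, which is why the episode index needs media links before a useful feed can be produced.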

Reason:

So I can listen to the Security Now backlog without manually downloading each episode. If you host the generated RSS file on any HTTP server (I use my Public Dropbox folder), you can subscribe to it in iTunes.

Usage:

Index the list of shows. This must be done first; otherwise the scraper does not know which shows exist.

python main.py showlist

Discover all episodes in all shows. This indexes the list of episodes for every show, but does not fetch media links or descriptions.

python main.py shows

Index all episodes in all shows

python main.py episodes

If you are only interested in a single show (for example, Security Now), you can use the following command:

python main.py show "Security Now"
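The commands above can also be chained from a small Python 3 wrapper. This is only a sketch: it assumes the steps are meant to run in the order listed, that `show` is run after `showlist`, and that it is started from the directory containing `main.py`. The helper names `build_commands` and `run_all` are hypothetical and not part of TwitSpider.

```python
import subprocess


def build_commands(show=None):
    """Return the main.py invocations in the order they should run.

    With `show` set, only that show is indexed; otherwise all shows are.
    """
    # The show list always comes first, as the usage notes require.
    commands = [["python", "main.py", "showlist"]]
    if show is not None:
        commands.append(["python", "main.py", "show", show])
    else:
        commands.append(["python", "main.py", "shows"])
        commands.append(["python", "main.py", "episodes"])
    return commands


def run_all(show=None):
    """Run each step in turn, stopping at the first failure."""
    for cmd in build_commands(show):
        subprocess.run(cmd, check=True)
```

Calling `run_all("Security Now")` would index just that show, while `run_all()` would walk through the full, slower sequence.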

Dependencies:
