Home
- What MediaCat is and what it is trying to achieve (use cases)
- A document with evolving notes about the next release of MediaCat is available here on Google Docs
- A diagram of the evolving architecture is available
- The Slack for this project is teammediacat.slack.com
- MediaCat code is stored in three repositories. Each repository contains information about how to run and manage its component of the MediaCat stack.
MediaCat-twitter-API-crawler takes in a scope document in a prescribed format and crawls Twitter handles, bringing back the contents of tweets. The end result is one or more .csv files containing all the tweets for the target Twitter users. Detailed information on how to run and troubleshoot this application is available in the repository at:
mediacat-domain-crawler takes in a scope document in a prescribed format and crawls domains, bringing back the HTML contents of individual domains. Detailed information on how to run and troubleshoot this application is available in the repository at:
Post-processor takes in the results from both the Twitter and domain crawlers and produces a .csv file in a prescribed format, from which a user can determine citational practices and approaches between in-scope Twitter and news media sources. Detailed information on how to run and troubleshoot this application is available in the repository at:
- Additional developer documentation is linked in the README
- All students who worked on the project and the grant funding that supported it