Lucas Charles edited this page Jul 28, 2013 · 4 revisions

Stuff for Nerds:

  • Limitations of the Hacker News API require a post to be older than 24 hours before it appears in our dataset.
  • Reddit has absolutely no URL validation (go submit something with http:@#$@$, seriously, try it), so we query Reddit twice for every URL searched: once with a trailing slash and once without. This lets us merge the two otherwise disparate result sets and return the true 'best' discussions of a single article.
  • Slashdot has absolutely no API to speak of, so we scrape and cache the site, creating Posting and Url objects. A Posting may have many Urls, and a Url may have many Postings. This way we can return any Slashdot results live through our API architecture with a call to our concurrent web client.
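
For the curious, the Reddit double-query idea can be sketched roughly as follows. This is a minimal illustration in Python, not the project's actual code: the function names and the "id" and "score" fields are assumptions made up for the example, and the network calls to Reddit are left out entirely.

```python
def url_variants(url):
    """Return the URL without and with a trailing slash (hypothetical helper)."""
    stripped = url.rstrip("/")
    return [stripped, stripped + "/"]


def merge_discussions(*result_sets):
    """Merge several result lists into one, best discussion first.

    Each result is assumed to be a dict with an "id" and a "score" key;
    these field names are illustrative, not taken from Reddit's API.
    """
    merged = {}
    for results in result_sets:
        for post in results:
            # The same post can show up under both URL variants; keep one copy.
            merged[post["id"]] = post
    return sorted(merged.values(), key=lambda p: p["score"], reverse=True)


# Stand-in data for the two queries (with and without the trailing slash).
without_slash = [{"id": "a", "score": 5}, {"id": "b", "score": 12}]
with_slash = [{"id": "b", "score": 12}, {"id": "c", "score": 7}]
best_first = merge_discussions(without_slash, with_slash)
```

In a real client each variant from url_variants would be sent to Reddit as its own query, and the two responses passed to merge_discussions.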