Merge pull request #630 from podverse/develop
Release v4.13.3
mitchdowney authored Jul 7, 2023
2 parents 0deb1a6 + 24fbeef commit d24b4b6
Showing 6 changed files with 268 additions and 119 deletions.
140 changes: 30 additions & 110 deletions README.md
@@ -2,29 +2,30 @@

Data API, database migration scripts, and backend services for the Podverse ecosystem

## Getting started

### Local Development and Deployment

This repo contains steps for running podverse-api locally for development.

For stage/prod deployment instructions, please refer to the
[podverse-ops docs](https://github.com/podverse/podverse-ops).
- [Getting started](#getting-started)
* [NPM or Yarn](#npm-or-yarn)
* [Setup environment variables](#setup-environment-variables)
* [Install node_modules](#install-node-modules)
* [Start dev server](#start-dev-server)
* [Populate database](#populate-database)
* [Add podcast categories to the database](#add-podcast-categories-to-the-database)
* [Sync podcast data with Podcast Index API](#sync-podcast-data-with-podcast-index-api)
* [Matomo page tracking and analytics](#matomo-page-tracking-and-analytics)
* [More info](#more-info)

<small><i><a href='http://ecotrust-canada.github.io/markdown-toc/'>generated with markdown-toc</a></i></small>

### Prereqs
## Getting started

Before you can run podverse-api you will need a local Postgres version 11.5 database running.
If you are looking to run this app or contribute to Podverse for the first time, please read the sections that are relevant to you in our [CONTRIBUTING.md](https://github.com/podverse/podverse-ops/blob/master/CONTRIBUTING.md) file in the podverse-ops repo. Among other things, that file contains instructions for running a local instance of the Podverse database.

You can setup your own database, or go to the
[podverse-ops repo](https://github.com/podverse/podverse-ops), add the podverse-db-local.env file as explained in the docs, then run this command:
### NPM or Yarn

```bash
docker-compose -f docker-compose.local.yml up -d podverse_db
```
We use yarn and maintain a `yarn.lock` file, but using yarn is not a requirement. This documentation uses npm in its examples, though we generally run the yarn equivalents of those commands.

### Setup environment variables

For local development, environment variables are provided by a local .env file. Duplicate the .env.example file, rename it to .env, and update all of the environment variables to match what is needed for your environment.
For local development, environment variables are provided by a local `.env` file. You can find a link to example `.env` files in the [CONTRIBUTING.md](https://github.com/podverse/podverse-ops/blob/master/CONTRIBUTING.md) file.

### Install node_modules

@@ -38,113 +39,32 @@ npm install
npm run dev
```

### Sample database data
### Populate database

**TODO: Sample db instructions are out of date**
The [podverse-ops repo](https://github.com/podverse/podverse-ops) contains the qa-database.sql file to help you get started quickly with a development database. You can clone the podverse-ops repo, then run the following command after the Postgres database is running:

```bash
psql -h 0.0.0.0 -p 5432 -U postgres -W -f ./sample-database/qa-database.sql
```

The password for the .sql file is: mysecretpw
Instructions for this can be found in the [podverse-ops CONTRIBUTING.md file](https://github.com/podverse/podverse-ops/blob/master/CONTRIBUTING.md).

### Add podcast categories to the database

```bash
npm run dev:seeds:categories
```

### Add feed urls to the database

To add podcasts to the database, you first need to add feed urls to the
database, and then run the podcast parser with those feed urls.

You can pass multiple feed urls as a comma-delimited string parameter to the
`npm run dev:scripts:addFeedUrls` command.

A list of sample podcast feed urls can be found in
[podverse-api/docs/sampleFeedUrls.txt](https://github.com/podverse/podverse-api/tree/deploy/docs/sampleFeedUrls.txt).

```bash
npm run dev:scripts:addFeedUrls <feed urls>
```

### Parse feed urls to add podcasts and episodes to the database

Orphan feed urls do not have a podcast associated with them.

```bash
npm run dev:scripts:parseOrphanFeedUrls
```

To parse all non-orphan and public feed urls, you can run:

```bash
npm run dev:scripts:parsePublicFeedUrls
```

### Use SQS to add feed urls to a queue, then parse them

This project uses AWS SQS for its remote queue.

```bash
npm run dev:scripts:addAllOrphanFeedUrlsToPriorityQueue
```

or:

```bash
npm run dev:scripts:addAllPublicFeedUrlsToQueue
```

or:

```bash
npm run dev:scripts:addNonPodcastIndexFeedUrlsToPriorityQueue
```

or, to add all public feeds that were recently updated (according to Podcast Index) to the priority queue:
If you are creating a database from scratch, and are not using the `populateDatabase` command explained in the CONTRIBUTING.md file, then you will need to populate the database with categories.

```bash
yarn dev:scripts:addRecentlyUpdatedFeedUrlsToPriorityQueue
```

After you have added feed urls to a queue, you can retrieve and then parse
the feed urls by running:

```bash
npm run dev:scripts:parseFeedUrlsFromQueue <restartTimeOut> <queueType>
# restartTimeOut is in milliseconds; queueType is optional, and its only accepted value is "priority"
npm run dev:seeds:categories
```

We also have a self-managed parsing queue, where we manually mark podcasts to be added to a separate queue for parsing at a regular cadence. The `Podcast.parsingPriority` property holds a value between 0 and 5: 0 is the default and means the podcast should not be added to the self-managed queue; 1 is parsed most frequently, and 5 least frequently.
### Sync podcast data with Podcast Index API

At the time of writing, 3 is the value we use the most; it adds the feed to the queue every 30 minutes.
Podverse maintains its own podcast directory, and parses RSS feeds to populate it with data.

The `offset` value is optional, and probably not needed.
However, in prod Podverse syncs its database with the [Podcast Index API](https://podcastindex.org/), the world's largest open podcast directory and maintainer of the "Podcasting 2.0" RSS spec.

```bash
npm run dev:scripts:addFeedsToQueueByPriority <parsingPriority> <offset>
```
We run scripts on a cron interval that request from the PI API a list of all the podcasts it has detected updates in over the past X minutes, then add those podcast IDs to an Amazon SQS queue for parsing. Our parser containers, which run continuously, pull items from the queue, run our parser logic over them, and save the parsed data to our database.
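Under stated assumptions, that pipeline can be sketched in TypeScript. Every name below (`PIFeed`, `selectRecentlyUpdated`, the SQS call) is illustrative, not the real podverse-api code:

```typescript
// Sketch of the Podcast Index sync pipeline described above.
// All names here are illustrative, not the actual podverse-api implementation.

interface PIFeed {
  id: number;
  lastUpdateTime: number; // unix seconds, as reported by Podcast Index
}

// Pure helper: pick the feeds Podcast Index reports as updated
// within the last `windowSeconds`, relative to `nowSeconds`.
function selectRecentlyUpdated(feeds: PIFeed[], windowSeconds: number, nowSeconds: number): number[] {
  return feeds
    .filter((f) => nowSeconds - f.lastUpdateTime <= windowSeconds)
    .map((f) => f.id);
}

// The cron job would then enqueue those ids to SQS (sketched, not runnable here):
// for (const id of selectRecentlyUpdated(feeds, 15 * 60, nowSeconds)) {
//   await sqs.sendMessage({ QueueUrl: queueUrl, MessageBody: String(id) }).promise();
// }
```

The parser containers on the other side of the queue would pull message batches, parse each feed, and write the results to Postgres.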

Then to parse from the self-managed queue call:
If you'd like to run your own full instance of Podverse and want a thorough explanation of the processes involved, please contact us and we can document it.

```bash
npm run dev:scripts:parseFeedUrlsFromQueue
```
### Matomo page tracking and analytics

### Request Google Analytics pageview data and save to database
TODO: explain the Matomo setup

Below are sample commands for requesting unique pageview data from Google
Analytics, which is used throughout the site for sorting by popularity (not a
great/accurate system for popularity sorting...).

```bash
npm run dev:scripts:queryUniquePageviews -- clips month
npm run dev:scripts:queryUniquePageviews -- episodes week
npm run dev:scripts:queryUniquePageviews -- podcasts allTime
```
### More info

See the [podverse-ops repo](https://github.com/podverse/podverse-ops) for a sample
cron configuration for querying the Google API on a timer.
We used to have a more detailed README file, but I removed most of the content, since it is unnecessary for most local development workflows, and the information in it was getting out of date. If you're looking for more info though, you can try digging through our [old README file here](https://github.com/podverse/podverse-api/blob/develop/docs/old/old-readme.md).
150 changes: 150 additions & 0 deletions docs/old/old-readme.md
@@ -0,0 +1,150 @@
# podverse-api

Data API, database migration scripts, and backend services for the Podverse ecosystem

## Getting started

### Local Development and Deployment

This repo contains steps for running podverse-api locally for development.

For stage/prod deployment instructions, please refer to the
[podverse-ops docs](https://github.com/podverse/podverse-ops).

### Prereqs

Before you can run podverse-api you will need a local Postgres version 11.5 database running.

You can setup your own database, or go to the
[podverse-ops repo](https://github.com/podverse/podverse-ops), add the podverse-db-local.env file as explained in the docs, then run this command:

```bash
docker-compose -f docker-compose.local.yml up -d podverse_db
```

### Setup environment variables

For local development, environment variables are provided by a local .env file. Duplicate the .env.example file, rename it to .env, and update all of the environment variables to match what is needed for your environment.

### Install node_modules

```bash
npm install
```

### Start dev server

```bash
npm run dev
```

### Sample database data

**TODO: Sample db instructions are out of date**
The [podverse-ops repo](https://github.com/podverse/podverse-ops) contains the qa-database.sql file to help you get started quickly with a development database. You can clone the podverse-ops repo, then run the following command after the Postgres database is running:

```bash
psql -h 0.0.0.0 -p 5432 -U postgres -W -f ./sample-database/qa-database.sql
```

The password for the .sql file is: mysecretpw

### Add podcast categories to the database

```bash
npm run dev:seeds:categories
```

### Add feed urls to the database

To add podcasts to the database, you first need to add feed urls to the
database, and then run the podcast parser with those feed urls.

You can pass multiple feed urls as a comma-delimited string parameter to the
`npm run dev:scripts:addFeedUrls` command.

A list of sample podcast feed urls can be found in
[podverse-api/docs/sampleFeedUrls.txt](https://github.com/podverse/podverse-api/tree/deploy/docs/sampleFeedUrls.txt).

```bash
npm run dev:scripts:addFeedUrls <feed urls>
```
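As a sketch of the contract described above — `<feed urls>` is a single comma-delimited string — a splitter might look like this (hypothetical helper, not the actual script):

```typescript
// Split a comma-delimited feed url argument into individual urls,
// trimming whitespace and dropping empty entries.
function parseFeedUrlArg(arg: string): string[] {
  return arg
    .split(",")
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}
```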

### Parse feed urls to add podcasts and episodes to the database

Orphan feed urls do not have a podcast associated with them.

```bash
npm run dev:scripts:parseOrphanFeedUrls
```

To parse all non-orphan and public feed urls, you can run:

```bash
npm run dev:scripts:parsePublicFeedUrls
```

### Use SQS to add feed urls to a queue, then parse them

This project uses AWS SQS for its remote queue.

```bash
npm run dev:scripts:addAllOrphanFeedUrlsToPriorityQueue
```

or:

```bash
npm run dev:scripts:addAllPublicFeedUrlsToQueue
```

or:

```bash
npm run dev:scripts:addNonPodcastIndexFeedUrlsToPriorityQueue
```

or, to add all public feeds that were recently updated (according to Podcast Index) to the priority queue:

```bash
yarn dev:scripts:addRecentlyUpdatedFeedUrlsToPriorityQueue
```

After you have added feed urls to a queue, you can retrieve and then parse
the feed urls by running:

```bash
npm run dev:scripts:parseFeedUrlsFromQueue <restartTimeOut> <queueType>
# restartTimeOut is in milliseconds; queueType is optional, and its only accepted value is "priority"
```
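A minimal sketch of the worker loop implied by that command, assuming `restartTimeOut` caps how long one worker process runs before a supervisor restarts it (all names here are illustrative, not the real podverse-api code):

```typescript
// Has this worker run long enough that it should exit and be restarted?
function shouldRestart(startedAtMs: number, nowMs: number, restartTimeOutMs: number): boolean {
  return nowMs - startedAtMs >= restartTimeOutMs;
}

// Pull feed urls from a queue and parse each one, stopping when the queue
// drains or when restartTimeOutMs has elapsed. Returns the number parsed.
async function parseFromQueue(
  pullNext: () => Promise<string | null>, // returns a feed url, or null when empty
  parse: (feedUrl: string) => Promise<void>,
  restartTimeOutMs: number,
  now: () => number = Date.now
): Promise<number> {
  const startedAt = now();
  let parsed = 0;
  for (;;) {
    if (shouldRestart(startedAt, now(), restartTimeOutMs)) break;
    const feedUrl = await pullNext();
    if (feedUrl === null) break; // queue drained
    await parse(feedUrl);
    parsed++;
  }
  return parsed;
}
```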

We also have a self-managed parsing queue, where we manually mark podcasts to be added to a separate queue for parsing at a regular cadence. The `Podcast.parsingPriority` property holds a value between 0 and 5: 0 is the default and means the podcast should not be added to the self-managed queue; 1 is parsed most frequently, and 5 least frequently.

At the time of writing, 3 is the value we use the most; it adds the feed to the queue every 30 minutes.

The `offset` value is optional, and probably not needed.

```bash
npm run dev:scripts:addFeedsToQueueByPriority <parsingPriority> <offset>
```
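The `parsingPriority` cadence can be pictured as a lookup like the following. Only two values come from this document — 0 ("never auto-queued") and 3 ("every 30 minutes") — the rest are assumptions for illustration:

```typescript
// Illustrative mapping from Podcast.parsingPriority to a re-parse cadence.
// Only priorities 0 and 3 are documented; the other cadences are assumed.
function parsingCadenceMinutes(parsingPriority: number): number | null {
  switch (parsingPriority) {
    case 0: return null;  // default: not added to the self-managed queue
    case 1: return 10;    // assumed: most frequent
    case 2: return 20;    // assumed
    case 3: return 30;    // per the docs: queued every 30 minutes
    case 4: return 60;    // assumed
    case 5: return 120;   // assumed: least frequent
    default: throw new Error(`invalid parsingPriority: ${parsingPriority}`);
  }
}
```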

Then to parse from the self-managed queue call:

```bash
npm run dev:scripts:parseFeedUrlsFromQueue
```

### Request Google Analytics pageview data and save to database

Below are sample commands for requesting unique pageview data from Google
Analytics, which is used throughout the site for sorting by popularity (not a
great/accurate system for popularity sorting...).

```bash
npm run dev:scripts:queryUniquePageviews -- clips month
npm run dev:scripts:queryUniquePageviews -- episodes week
npm run dev:scripts:queryUniquePageviews -- podcasts allTime
```
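Conceptually, the saved pageview counts then drive the site's popularity sorting, something like this (illustrative shape, not the real schema):

```typescript
// An item with a stored unique-pageview count (illustrative shape).
interface RankedItem {
  id: string;
  uniquePageviews: number;
}

// Sort items by stored unique pageview counts, most popular first.
// Copies before sorting so the input array is left untouched.
function sortByPopularity<T extends RankedItem>(items: T[]): T[] {
  return [...items].sort((a, b) => b.uniquePageviews - a.uniquePageviews);
}
```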

See the [podverse-ops repo](https://github.com/podverse/podverse-ops) for a sample
cron configuration for querying the Google API on a timer.
4 changes: 2 additions & 2 deletions package.json
@@ -1,6 +1,6 @@
{
"name": "podverse-api",
"version": "4.13.2",
"version": "4.13.3",
"description": "Data API, database migration scripts, and backend services for all Podverse models.",
"contributors": [
"Mitch Downey"
@@ -156,7 +156,7 @@
"class-validator": "0.14.0",
"clean-webpack-plugin": "3.0.0",
"cookie": "0.4.0",
"crypto-js": "~3.1.9-1",
"crypto-js": "~3.2.1",
"csvtojson": "^2.0.10",
"date-fns": "2.8.1",
"docker-cli-js": "2.9.0",
41 changes: 41 additions & 0 deletions src/controllers/podcast.ts
@@ -46,6 +46,45 @@ const getPodcastByPodcastIndexId = async (podcastIndexId, includeRelations = tru
return podcast
}

const getPodcastByPodcastGuid = async (podcastGuid: string, includeRelations?: boolean) => {
const repository = getRepository(Podcast)
const podcast = await repository.findOne(
{
podcastGuid,
isPublic: true
},
{
relations: includeRelations ? ['authors', 'categories', 'feedUrls'] : []
}
)

if (!podcast) {
throw new createError.NotFound('Podcast not found')
}

return podcast
}

const getPodcastByFeedUrl = async (feedUrl: string, includeRelations?: boolean) => {
const podcastId = await getPodcastIdByFeedUrl(feedUrl)
const repository = getRepository(Podcast)
const podcast = await repository.findOne(
{
id: podcastId,
isPublic: true
},
{
relations: includeRelations ? ['authors', 'categories', 'feedUrls'] : []
}
)

if (!podcast) {
throw new createError.NotFound('Podcast not found')
}

return podcast
}

const findPodcastsByFeedUrls = async (urls: string[]) => {
const foundPodcastIds = [] as any
const notFoundFeedUrls = [] as any
@@ -410,6 +449,8 @@ export {
findPodcastsByFeedUrls,
getPodcast,
getPodcasts,
getPodcastByFeedUrl,
getPodcastByPodcastGuid,
getPodcastByPodcastIndexId,
getPodcastsFromSearchEngine,
getMetadata,