This folder contains the source code for the backend architecture.

The purpose of the backend is to crawl for usable videos, download the subtitles, and send all the information to the indexer (a web service). That's it.

Frontend(s)

There is a web frontend to the backend [index.php], which generates HTML that acts as a view/controller for the main crawling+indexing engine.

There is a command-line frontend to the backend [console.php], which shows the status and lets you add new queries to be processed.

Crawling + Downloading + Indexing

The YouTube crawler + processor + search indexer [worker.php] works by atomically dequeuing a work item (or 'query') and processing it. More than one worker can run in parallel.
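
As a rough illustration of the atomic dequeue described above, here is a minimal Python sketch. It is not the code in worker.php: the key name `work_queue`, the `InMemoryLists` stand-in, and `run_worker` are all hypothetical. The stand-in mimics the Redis list commands RPUSH and LPOP, so that no Redis server is needed to run it.

```python
import threading
from collections import defaultdict, deque

QUEUE_KEY = "work_queue"  # hypothetical key name, not taken from the project


class InMemoryLists:
    """In-memory stand-in for the two Redis list commands used below."""

    def __init__(self):
        self._lists = defaultdict(deque)
        self._lock = threading.Lock()

    def rpush(self, key, *values):
        # Append to the tail of the list and return its new length.
        with self._lock:
            self._lists[key].extend(values)
            return len(self._lists[key])

    def lpop(self, key):
        # Pop under a lock so two workers can never receive the same item,
        # mirroring the atomicity of a single Redis command.
        with self._lock:
            items = self._lists[key]
            return items.popleft() if items else None


def run_worker(client, handle):
    """Drain the queue, passing each query to `handle`; return the count."""
    processed = 0
    while True:
        query = client.lpop(QUEUE_KEY)
        if query is None:  # queue is empty, nothing left to do
            return processed
        handle(query)
        processed += 1
```

Because each pop is a single atomic step, several workers can call `run_worker` against the same queue and every query is handed to exactly one of them.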

Needs:

This operation needs Redis, which is used for:

- a local cache for downloaded subtitles (key example: cc_v=n_mTiDeQvWg&lang=en&name=English&fmt=srv1)
- a record of whether a video was already processed (key example: use_Yqv3ebAFluQ), to stop further processing
- atomic work orders and IPC:
  - work orders are kept in a queue
  - executor processes use an atomic process counter to decide whether to run
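
The three uses above could look roughly like the following Python sketch. It is an illustration under assumptions, not the project's PHP code. The key layout (`cc_` plus a query string, `use_` plus the video ID) is inferred only from the two key examples. The names `worker_count`, `MAX_WORKERS`, `InMemoryStore`, and every function here are hypothetical. The stand-in class imitates the Redis commands GET, SET, EXISTS, INCR, and DECR; on a real Redis server INCR and DECR are atomic, which is what makes the counter safe across processes.

```python
from urllib.parse import urlencode

MAX_WORKERS = 2               # hypothetical limit on parallel executors
COUNTER_KEY = "worker_count"  # hypothetical key name


class InMemoryStore:
    """Single-process stand-in for the few Redis string commands used below."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value

    def exists(self, key):
        return int(key in self._data)

    def incr(self, key):
        self._data[key] = int(self._data.get(key, 0)) + 1
        return self._data[key]

    def decr(self, key):
        self._data[key] = int(self._data.get(key, 0)) - 1
        return self._data[key]


def subtitle_key(video_id, lang="en", name="English", fmt="srv1"):
    # Layout guessed from the README's example key.
    return "cc_" + urlencode({"v": video_id, "lang": lang, "name": name, "fmt": fmt})


def processed_key(video_id):
    return "use_" + video_id


def get_subtitles(client, video_id, download):
    """Return cached subtitles, calling `download` only on a cache miss."""
    key = subtitle_key(video_id)
    cached = client.get(key)
    if cached is not None:
        return cached
    text = download(video_id)
    client.set(key, text)
    return text


def already_processed(client, video_id):
    return bool(client.exists(processed_key(video_id)))


def mark_processed(client, video_id):
    client.set(processed_key(video_id), "1")


def try_start_worker(client):
    """Claim an executor slot; give it back and refuse if the limit is hit."""
    if client.incr(COUNTER_KEY) > MAX_WORKERS:
        client.decr(COUNTER_KEY)
        return False
    return True
```

An executor that gets `False` from `try_start_worker` would simply exit. One that gets `True` would decrement the counter again when it finishes.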
