Download the files from all your courses in a breeze.
Run it again, and only the new files will be added;
all new files will go to the inbox folder 📥.
Uses the Scrapy library.
Disclaimer: optimized for Fenix's pages, but easily customizable with other XPaths.
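To get a feel for where those XPaths would be swapped, here is a minimal, hypothetical Scrapy spider (the class name, URL, and XPath below are illustrative assumptions, not the project's actual code):

```python
import scrapy


class CourseFilesSpider(scrapy.Spider):
    # Hypothetical sketch; the real spider is the one invoked as `scrapy crawl ist_spider`.
    name = "ist_spider_sketch"
    start_urls = ["https://example.org/course-page"]  # placeholder URL

    def parse(self, response):
        # Site-specific XPath (assumed): adapt this expression to point at
        # another site's file links.
        for href in response.xpath('//a[contains(@href, "downloadFile")]/@href').getall():
            yield response.follow(href, callback=self.save_file)

    def save_file(self, response):
        # Minimal persistence: write the response body to the working directory.
        filename = response.url.rsplit("/", 1)[-1] or "unnamed"
        with open(filename, "wb") as f:
            f.write(response.body)
```

Customizing for another site mostly means changing the XPath in `parse`, plus whatever login flow that site requires.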
- Prepare the virtual environment

  ```bash
  bash install.sh
  ```
- Create the file with your credentials

  ```bash
  touch istransferido/.env
  echo USERNAME="istxxxxx" >> istransferido/.env
  echo PASSWORD="your_password" >> istransferido/.env # have you heard about password managers?
  ```

  ⚠️ IMPORTANT ⚠️ If this is used in another repo, create a .gitignore file with the following content:

  ```gitignore
  *.env
  venv/
  ```
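For reference, Python code can pick those variables up with the python-dotenv package; this is just a sketch of one possible approach, not necessarily what the spider actually does:

```python
import os

from dotenv import load_dotenv  # pip install python-dotenv (assumed dependency)

# Load USERNAME/PASSWORD from the file created above into the environment.
load_dotenv("istransferido/.env")

USERNAME = os.getenv("USERNAME")
PASSWORD = os.getenv("PASSWORD")

if not USERNAME or not PASSWORD:
    raise SystemExit("Missing credentials: check istransferido/.env")
```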
- Choose which courses to download

  ```bash
  nano config.yaml # keep the base URL format as given (it's the main link from each course's page)
  ```
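In case it helps, reading such a config from Python is a few lines with PyYAML; the key name below is an assumption about the file's structure, so check config.yaml itself:

```python
import yaml  # pip install pyyaml

with open("config.yaml") as f:
    config = yaml.safe_load(f)

# Hypothetical key: a list of course base URLs for the spider to visit.
for url in config.get("courses", []):
    print("will crawl:", url)
```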
- Download ALL the files + organize the inbox

  ```bash
  # Run spider, RUN!
  cd istransferido/ && scrapy crawl ist_spider && cd ../ && bash filter-inbox.sh
  ```
If the command fails for some reason (configuration, conflicts, ...), remember to go back to the project's main directory before trying again; otherwise, it won't work.
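filter-inbox.sh itself is a bash script; purely to illustrate the idea (move anything downloaded since the last run into the 📥 folder), here is a rough Python equivalent, where the folder names and the marker file are assumptions:

```python
import shutil
from pathlib import Path

DOWNLOADS = Path("downloads")  # assumed: where the spider stores files
INBOX = Path("inbox")          # the 📥 folder for newly arrived files
STAMP = Path(".last_run")      # marker file remembering the previous run

cutoff = STAMP.stat().st_mtime if STAMP.exists() else 0.0
INBOX.mkdir(exist_ok=True)

if DOWNLOADS.exists():
    for path in DOWNLOADS.rglob("*"):
        # Anything written after the previous run counts as "new".
        if path.is_file() and path.stat().st_mtime > cutoff:
            shutil.move(str(path), str(INBOX / path.name))

STAMP.touch()  # record this run's timestamp for next time
```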
Feel free to change it or contribute!
Here are some unimplemented ideas, for me or for others :)
- Avoid re-downloading files even after they have been moved out of the folder (it uses bash for now because of the relatively small number of files)
- An alternative to keeping the credentials in the .env file
- Please do not change the download delay to something that would send too many requests to the servers
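For context, the delay is a standard Scrapy setting; a polite configuration looks something like this (the values are illustrative, not the project's actual ones):

```python
# settings.py (DOWNLOAD_DELAY and AUTOTHROTTLE_ENABLED are standard Scrapy settings)
DOWNLOAD_DELAY = 2           # seconds between requests to the same site
AUTOTHROTTLE_ENABLED = True  # back off automatically if the server slows down
```

Raising the request rate only shifts load onto the servers, so keep it gentle.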