A Selenium-based web scraper to automate job searching. The basic flow is as follows (a rough sketch of the flow appears after the list):
- Set up a Selenium bot to open whatever company page you wish to track.
- Use Beautiful Soup to extract information from the page.
- Filter down to the jobs you are interested in.
- Create a diff and send an email to notify recipients of new postings.
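Below is a minimal sketch of that flow, assuming a headless Chrome driver; the URL, CSS selector, and keyword filter are placeholders rather than the project's actual values.

```python
# Illustrative only: the URL, selector, and keyword below are placeholders.
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

COMPANY_URL = "https://example.com/careers"  # placeholder company page

options = Options()
options.add_argument("--headless")           # no visible browser window
driver = webdriver.Chrome(options=options)
driver.get(COMPANY_URL)                      # Selenium renders the page, JS included

soup = BeautifulSoup(driver.page_source, "html.parser")
driver.quit()

# Pull out posting titles and keep only the interesting ones.
titles = [tag.get_text(strip=True) for tag in soup.select(".job-title")]
interesting = [t for t in titles if "engineer" in t.lower()]
print(interesting)
```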
- `scraper.py` contains the Selenium driver, code to direct Selenium to the company page, and code to download the company's job postings with Beautiful Soup.
- `logger.py` instantiates the scraper, gets the job data, writes it to a `jobs.txt` file to track the diff, then emails recipients if there are new postings based on the file diff (a rough sketch of this step follows).
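Here is a sketch of that diff-and-notify step, assuming postings are stored one per line in `jobs.txt` and that Gmail is used over SMTP with the environment variables described in the setup below; the real `logger.py` may be organized differently.

```python
# Assumed structure: one posting per line in jobs.txt, Gmail SMTP for delivery.
import os
import smtplib
from email.message import EmailMessage

def diff_and_notify(new_jobs, path="jobs.txt"):
    # Load the postings saved on the previous run, if any.
    old_jobs = set()
    if os.path.exists(path):
        with open(path) as f:
            old_jobs = {line.strip() for line in f if line.strip()}

    added = sorted(set(new_jobs) - old_jobs)  # postings not seen last time

    # Persist the current snapshot for the next run.
    with open(path, "w") as f:
        f.write("\n".join(sorted(set(new_jobs))))

    if not added:
        return  # nothing new, no email

    msg = EmailMessage()
    msg["Subject"] = "New job postings"
    msg["From"] = os.environ["EMAIL"]
    msg["To"] = os.environ["RECIPIENT_EMAILS"]  # comma-delimited recipients
    msg.set_content("\n".join(added))

    with smtplib.SMTP_SSL("smtp.gmail.com", 465) as smtp:
        smtp.login(os.environ["EMAIL"], os.environ["PASSWORD"])
        smtp.send_message(msg)
```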
To get started, clone the repository and install the dependencies:

`git clone https://github.com/Aplank14/jobs.git`

`pip install -r requirements.txt`
Specify the following env vars. Recipients are comma delimited. When running locally, `DEV` should probably be set to `True`.
DEV='<True_or_False>'
EMAIL='<[email protected]>'
PASSWORD='<APP_PASSWORD_FOR_GMAIL>'
RECIPIENT_EMAILS='[email protected],[email protected]'
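For reference, a script might read these variables like this; this is a sketch, not necessarily how the repo parses them.

```python
# Sketch of reading the configuration; the repo's actual parsing may differ.
import os

DEV = os.environ.get("DEV", "True").lower() == "true"
EMAIL = os.environ["EMAIL"]
PASSWORD = os.environ["PASSWORD"]
RECIPIENTS = [addr.strip() for addr in os.environ["RECIPIENT_EMAILS"].split(",")]
```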
Running once is simple with:

`python logger.py`

To run without sending an email:

`python logger.py -n`
For development, set the environment variable `DEV` to `True`. Then you can test a single scraper function by name using the following:

`python logger.py -f discord`
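One plausible way these flags could be wired up is an `argparse` parser that dispatches to a scraper function by name; this is an assumption about the internals of `logger.py`, and `scrape_all` is a hypothetical helper.

```python
# Assumed CLI wiring; the real logger.py may differ. scrape_all is hypothetical.
import argparse
import scraper  # module with one scraping function per company

parser = argparse.ArgumentParser()
parser.add_argument("-n", action="store_true", help="do not send an email")
parser.add_argument("-f", metavar="NAME", help="run a single scraper function by name")
args = parser.parse_args()

if args.f:
    jobs = getattr(scraper, args.f)()  # e.g. `-f discord` would call scraper.discord()
else:
    jobs = scraper.scrape_all()        # hypothetical: run every scraper

if not args.n:
    print(f"{len(jobs)} postings found; the notification email would be sent here")
```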
To receive email alerts, you can set the logger to run periodically on a local machine or with GitHub Actions. For local automation on Linux, just use crontab; a sample entry is shown below.
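For example, a crontab entry along these lines would run the logger hourly; the paths and interpreter are placeholders to adjust for your setup.

```
# m h dom mon dow  command
0 * * * * cd /path/to/jobs && python3 logger.py >> /path/to/jobs/cron.log 2>&1
```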
On Windows, you can set up a scheduled task with Task Scheduler to run the logger automatically:
- Create a new task in Task Scheduler
- Set the trigger to whatever frequency you wish
- Action should be "Start a program"
- Program: `python.exe`
- Arguments: `C:\Users\path\to\jobs\logger.py`
- Start in: `C:\Users\path\to\jobs\`
It is even easier to run periodically using GitHub Actions! All you have to do is fork this repository, create an environment called `bot`, then set the environment variables described above.