Merge pull request #134 from thebigG/job_id
- Provider names are now used as a prefix for job IDs.
thebigG authored Feb 16, 2021
2 parents e509ef4 + 728849f commit 446e9e0
Showing 3 changed files with 9 additions and 1 deletion.
2 changes: 1 addition & 1 deletion jobfunnel/__init__.py
@@ -1,3 +1,3 @@
 """JobFunnel base package init, we keep module version here.
 """
-__version__ = '3.0.1'
+__version__ = '3.0.2'
3 changes: 3 additions & 0 deletions jobfunnel/backend/scrapers/base.py
@@ -343,6 +343,9 @@ def scrape_job(self, job_soup: BeautifulSoup, delay: float,
         if job and not invalid_job:
             try:
                 job.validate()
+                # Prefix the id with the scraper name to avoid key conflicts
+                new_key_id = job.provider + '_' + job.key_id
+                job.key_id = new_key_id
             except Exception as err:
                 # Bad job scrapes can't take down execution!
                 # NOTE: desc too short etc, usually indicates that the job
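To illustrate why the prefix helps, here is a minimal sketch, not JobFunnel's actual code. The `Job` dataclass and the `prefix_key_id` helper are hypothetical stand-ins that model only the two attributes the diff touches (`provider` and `key_id`), and the provider names and raw id are made-up sample values. It assumes jobs end up in a dict keyed by `key_id`, so two scrapers that return the same raw id would otherwise overwrite each other.

```python
from dataclasses import dataclass


@dataclass
class Job:
    """Hypothetical stand-in for a scraped job (only the two fields used here)."""
    provider: str
    key_id: str


def prefix_key_id(job: Job) -> Job:
    """Prefix the id with the provider name, mirroring the change in the diff."""
    job.key_id = job.provider + '_' + job.key_id
    return job


# Two providers that happen to issue the same raw id (sample values).
jobs = [Job('indeed', '12345'), Job('monster', '12345')]

# Keyed by the raw id, the second job silently replaces the first.
unprefixed = {job.key_id: job for job in jobs}

# Keyed by the prefixed id, both jobs survive.
prefixed = {prefix_key_id(job).key_id: job for job in jobs}

print(len(unprefixed))   # 1
print(sorted(prefixed))  # ['indeed_12345', 'monster_12345']
```

One consequence of this approach is that ids stored before the change no longer match the new prefixed ids, which may matter when comparing against previously saved results.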
5 changes: 5 additions & 0 deletions readme.md
@@ -97,6 +97,11 @@ Open the master CSV file and update the per-job `status`:
 ```
 funnel inline -h
 ```
+
+# CAPTCHA
+JobFunnel does not solve CAPTCHAs. If you receive an
+`Unable to extract jobs from initial search result page:\` error while
+scraping, open that URL in your browser and solve the CAPTCHA manually.

 <!-- links -->
 [requirements]:requirements.txt