
Provider names are used as a prefix for job ids now. #134

Merged (3 commits into PaulMcInnis:master on Feb 16, 2021)

Conversation

thebigG
Collaborator

@thebigG commented Feb 16, 2021

Hi everyone,

hope you are all doing well.

Description

Job ids are now created in the format PROVIDER_ID. This should address #123, and it partially addresses the issues discussed in #133. Hopefully #132 can move forward after this change.

Hopefully the changes make sense.
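The new key format can be sketched roughly like this (the function name, the upper-casing of the provider, and the underscore separator are my own illustration of "PROVIDER_ID", not necessarily JobFunnel's actual implementation):

```python
def make_job_id(provider: str, posting_id: str) -> str:
    """Namespace a raw posting id by its provider so that the same
    posting id from two different providers no longer collides."""
    return f"{provider.upper()}_{posting_id}"

# Two providers can now report the same raw id without a key conflict:
indeed_key = make_job_id("indeed", "abc123")    # "INDEED_abc123"
monster_key = make_job_id("monster", "abc123")  # "MONSTER_abc123"
assert indeed_key != monster_key
```

Prefixing rather than suffixing keeps all of one provider's jobs adjacent when keys are sorted, which is convenient when scanning the master CSV.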

Context of change

Please add options that are relevant and mark any boxes that apply.

  • Software (software that runs on the PC)
  • Library (library that runs on the PC)
  • Tool (tool that assists coding development)
  • Other

Type of change

Please mark any boxes that apply.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

  • Ran pytest locally

Checklist:

Please mark any boxes that have been completed.

  • I have performed a self-review of my own code.
  • I have commented my code, particularly in hard-to-understand areas.
  • I have made corresponding changes to the documentation.
  • My changes generate no new warnings.
  • I have added tests that prove my fix is effective or that my feature works.
  • New and existing unit tests pass locally with my changes.
  • Any dependent changes have been merged and published in downstream modules.

Owner

@PaulMcInnis left a comment


Let's bump the minor revision, since this will require handling by the end user if they wish to continue a search across this update.

Additionally, have we confirmed that this resolves the large city issue? I can check on my end as well.

@thebigG
Collaborator Author

thebigG commented Feb 16, 2021

#123 is a bit confusing, sorry about that. It looks like the original error was a CAPTCHA error, but later in the conversation the user did hit the key conflict this PR solves. So this PR solves half of #123, if you will. The other half is the CAPTCHA issue, which is unavoidable: CAPTCHA is something users will encounter no matter what, for a myriad of reasons. We should probably document these CAPTCHA issues in the README, because at the end of the day it's an easy problem to work around; one can open the browser and solve the CAPTCHA for Indeed/Monster/etc. manually.

@PaulMcInnis
Owner

PaulMcInnis commented Feb 16, 2021 via email

@thebigG
Collaborator Author

thebigG commented Feb 16, 2021

Right, maybe we can add a README section for that issue, and then this PR can close it for now.

Just to confirm: do you want to modify the current README, or add a new document to JobFunnel explaining how to handle CAPTCHA? I was thinking of adding a very brief section called CAPTCHA to the current README.

Now that I think about it, I remember you mentioning in the past that you want to keep the README as short as possible. With that said, how about adding all of this CAPTCHA documentation to the wiki?

In the wiki, we can even write a little tutorial on how to solve the CAPTCHA for JobFunnel.

@PaulMcInnis
Owner

PaulMcInnis commented Feb 16, 2021 via email

@PaulMcInnis
Owner

We may want to ensure we return the failed scraping URL in the error message, if we don't already.

@thebigG
Collaborator Author

thebigG commented Feb 16, 2021

We may want to ensure we return the failed scraping URL in the error message, if we don't already.

It looks like we do for Monster and Indeed, which are the ones supported at the moment:

    if not num_res:
        raise ValueError(
            "Unable to identify number of pages of results for query: {}"
            " Please ensure linked page contains results, you may have"
            " provided a city for which there are no results within this"
            " province or state.".format(search_url)
        )

@PaulMcInnis
Owner

Ok, looks good. When we release the next version we should note this change. Thanks @thebigG!

@thebigG thebigG merged commit 446e9e0 into PaulMcInnis:master Feb 16, 2021
@thebigG thebigG mentioned this pull request Mar 20, 2021