Identify scrapy-zyte-api usage via custom user-agent #130
It is probably best to pass the user agent to AsyncClient
as a parameter instead, once python-zyte-api supports that.
@Gallaecio is it OK to change it to the following?
def __init__(self, *,
             api_key=None,
             api_url=API_URL,
             n_conn=15,
             retrying: Optional[AsyncRetrying] = None,
             custom_user_agent=None
             ):
Sorry, accidentally edited your comment instead of answering 🤦 Answer: I think so, yes. Maybe even call it just |
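To make the proposal above concrete, here is a minimal sketch of a client that accepts the user agent as a keyword argument and falls back to its own default. `SketchClient` and `DEFAULT_USER_AGENT` are hypothetical stand-ins, not the real `AsyncClient` from python-zyte-api, and the parameter name `custom_user_agent` is simply the one used in the signature quoted above.

```python
from typing import Dict, Optional

# Hypothetical default; a real client would build this from its own version.
DEFAULT_USER_AGENT = "python-zyte-api/0.0.0"


class SketchClient:
    """Illustrative stand-in for a client class; not the real AsyncClient."""

    def __init__(
        self,
        *,
        api_key: Optional[str] = None,
        n_conn: int = 15,
        custom_user_agent: Optional[str] = None,
    ):
        self.api_key = api_key
        self.n_conn = n_conn
        # Use the caller's identifier if given, otherwise the library default.
        self.user_agent = custom_user_agent or DEFAULT_USER_AGENT

    def request_headers(self) -> Dict[str, str]:
        # Headers that would accompany every outgoing request.
        return {"User-Agent": self.user_agent}


# A plugin such as scrapy-zyte-api could then identify itself like this:
client = SketchClient(custom_user_agent="scrapy-zyte-api/0.0.0")
print(client.request_headers())
```

With this shape, callers that do not pass the argument keep the existing behaviour, so the change would be backward compatible.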
Aha, OK with this. @Gallaecio, how do you think we can send the user agent from zyte-crawlers? I was thinking about setting the user agent for python-zyte-api here, https://github.com/scrapy-plugins/scrapy-zyte-api/blob/main/scrapy_zyte_api/providers.py#L81, via meta (since only spiders using zyte-crawlers go through that code), but it is probably better to send it directly from the zyte-crawlers repo, through settings or something else. Any ideas?
@Gallaecio could you take a look?
@Gallaecio @kmike could you take a look?
scrapy_zyte_api/utils.py
Outdated
@@ -1,6 +1,10 @@
from importlib.metadata import version
TIL
I'm not sure we can use it, though: https://docs.python.org/3/library/importlib.metadata.html says it's 3.8+, while scrapy-zyte-api declares Python 3.7 support.
> scrapy-zyte-api declares Python 3.7 support
That has an easy fix :)
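If dropping Python 3.7 were not an option, one alternative would be to fall back to the `importlib_metadata` backport from PyPI when the standard-library module is missing. The sketch below is only an illustration of that pattern; the helper name `build_user_agent` and the `"unknown"` fallback are assumptions, not code from this PR.

```python
try:
    # Part of the standard library on Python 3.8+.
    from importlib.metadata import PackageNotFoundError, version
except ImportError:
    # Python 3.7: needs the importlib_metadata backport installed from PyPI.
    from importlib_metadata import PackageNotFoundError, version


def build_user_agent(package: str, fallback_version: str = "unknown") -> str:
    """Return "<package>/<installed version>" for use as a User-Agent value."""
    try:
        package_version = version(package)
    except PackageNotFoundError:
        # The package is not installed (e.g. running from a source checkout).
        package_version = fallback_version
    return f"{package}/{package_version}"


print(build_user_agent("scrapy-zyte-api"))
```

The backport would have to be declared as a conditional dependency for Python versions below 3.8, which is the cost of keeping 3.7 support this way.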
@kmike @Gallaecio @wRAR @BurnzZ could you take a look at the PR with the latest changes from the discussion about the version?
Codecov Report
@@ Coverage Diff @@
## main #130 +/- ##
=======================================
Coverage 98.81% 98.82%
=======================================
Files 9 10 +1
Lines 673 678 +5
=======================================
+ Hits 665 670 +5
Misses 8 8
I think it was OK to leave setup.py the way it was and have bump2version configuration edit an extra file instead of a different file, but I am OK with the new approach as well. The only thing left is Python 3.7 support, I think.
Co-authored-by: Adrián Chaves <[email protected]>
Oh, actually, I did it the same way as in python-zyte-api, just for consistency.
@wRAR what do you think would be the best option to fix it:
Goals:
This is one of the options for setting a custom user agent for scrapy-zyte-api while keeping the one for python-zyte-api. However, it requires changes to client.py in python-zyte-api (https://github.com/zytedata/python-zyte-api/blob/main/zyte_api/aio/client.py#L60), like:
Some other options for sending the user agent from zyte-crawlers:
1. Use settings named after the package and apply them when creating the client (requires changes in AsyncClient).
2. Use cb_kwargs or meta, although they seem to be cleaned during request processing.
It would be good to discuss all of the above.
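As a rough illustration of option 1, the sketch below resolves the user agent from a settings mapping at client-creation time. The setting name `ZYTE_API_CUSTOM_USER_AGENT` and the helper `resolve_user_agent` are hypothetical, and a plain dict stands in for the Scrapy settings object.

```python
from typing import Mapping

# Hypothetical setting name, chosen only for this illustration.
USER_AGENT_SETTING = "ZYTE_API_CUSTOM_USER_AGENT"


def resolve_user_agent(settings: Mapping[str, str], default: str) -> str:
    """Pick the user agent from settings, falling back to the plugin default."""
    # An unset or empty setting means "use the default identifier".
    return settings.get(USER_AGENT_SETTING) or default


# A downstream project (e.g. zyte-crawlers) would set the value in its settings:
project_settings = {USER_AGENT_SETTING: "zyte-crawlers/0.0.0"}
print(resolve_user_agent(project_settings, default="scrapy-zyte-api/0.0.0"))
print(resolve_user_agent({}, default="scrapy-zyte-api/0.0.0"))
```

The resolved string would then be handed to the client when it is constructed, which is why this option depends on the client accepting such a parameter.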