Add special "session" option to RequestOptions & EnqueueLinksOptions or session/cookie sharing for new requests #1699
Replies: 3 comments 2 replies
-
Thanks a lot. I definitely agree that SessionPool needs to become more flexible and customizable, especially in assigning sessions to specific requests. But we should think this through more broadly, since there has been no major update since the original implementation. Here are some other ideas: #1573
-
Your idea has a potential problem: what happens if, after you enqueue a request with the session, the session is used in some other context and is retired before the new request gets processed?
-
To add to this: the plan for next year is to have a user pool. You would specify how many user sessions you want to generate and add some constraints around them (e.g. fingerprint options). The pool would first generate all the virtual user sessions together with their fingerprints, and it would then be used instead. I guess with this setup, the
-
Hi, great framework, but I've found some of the ideas behind its architecture not obvious (or this is just my personal misunderstanding).
So, is it possible (and how) to use only one session (with its cookies, selected User-Agent, userData, etc.), which has been initialized and updated after the previous response, for all the following requests I enqueue in request handlers during the crawl?
As far as I understand, BasicCrawler creates a new session (or picks a random one) for each new request from the RequestQueue. This is not acceptable behaviour when the next requests depend on preconditions (auth and other monitoring & tracking cookies, the selected User-Agent, all the other headers, "fingerprints" previously selected and bound to the session, etc.).
So I'd like to ask for this important feature: the ability to reuse the current session in new requests.
IMO the handiest way to do this is to provide a `session` option in the `RequestOptions` & `EnqueueLinksOptions` arguments. For example, we would explicitly use the session received from the previous response to enqueue new links to fetch (with `ctx.enqueueLinks` or `crawler.addRequest`).

Thanks!
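To make the request concrete, here is a rough, self-contained sketch of the behaviour I have in mind. It is a toy model, not Crawlee code: `ToySessionPool`, `ToyCrawler`, and the `session` field on `RequestOptions` are all hypothetical names made up for illustration, and the `session` option does not exist in the library today. The sketch also assumes one possible answer to the retirement question raised in this thread: if the pinned session is already retired when the request is processed, the pool falls back to a fresh session.

```typescript
// Toy model only: none of these types are Crawlee's real classes.
interface Session {
  id: string;
  cookies: Record<string, string>;
  userAgent: string;
  retired: boolean;
}

interface RequestOptions {
  url: string;
  // Hypothetical option proposed in this thread; not part of Crawlee.
  session?: Session;
}

class ToySessionPool {
  private counter = 0;

  // Creates a fresh session, as a pool would for an unpinned request.
  create(): Session {
    this.counter += 1;
    return {
      id: `session_${this.counter}`,
      cookies: {},
      userAgent: `toy-agent/${this.counter}`,
      retired: false,
    };
  }

  // Returns the pinned session if it is still usable, otherwise a new one.
  resolve(request: RequestOptions): Session {
    if (request.session && !request.session.retired) {
      return request.session;
    }
    return this.create();
  }
}

interface ToyContext {
  request: RequestOptions;
  session: Session;
  crawler: ToyCrawler;
}

class ToyCrawler {
  readonly pool = new ToySessionPool();
  readonly queue: RequestOptions[] = [];
  readonly log: { url: string; sessionId: string }[] = [];

  addRequest(request: RequestOptions): void {
    this.queue.push(request);
  }

  // Processes the queue in FIFO order and records which session served each URL.
  run(handler: (ctx: ToyContext) => void): void {
    let request = this.queue.shift();
    while (request !== undefined) {
      const session = this.pool.resolve(request);
      this.log.push({ url: request.url, sessionId: session.id });
      handler({ request, session, crawler: this });
      request = this.queue.shift();
    }
  }
}

const crawler = new ToyCrawler();
crawler.addRequest({ url: 'https://example.com/login' });

crawler.run((ctx) => {
  if (ctx.request.url.endsWith('/login')) {
    // Pretend the login response set an auth cookie on this session.
    ctx.session.cookies['auth'] = 'token';
    // Pin the follow-up request to the same session.
    ctx.crawler.addRequest({ url: 'https://example.com/account', session: ctx.session });
  }
});
```

In this sketch both the login request and the follow-up request are recorded in `crawler.log` with the same session id, which is the behaviour I'm asking for. Falling back to a new session when the pinned one is retired is only one option; failing the request or retrying the login would be equally valid design choices.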