Scrapy-palywright cannot start working if the reactor is already installed #131

alosultan · 2022-10-11T08:18:06Z

Python 3.9.13
Daphne 4.0.0
Django 4.1.2
Channels 4.0.0
Scrapy 2.7.0
scrapy-playwright 0.0.22

My settings:

DOWNLOAD_HANDLERS = {
    "http": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
    "https": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
}

TWISTED_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"

My Scrapy app is running under another app (django-channels) that runs a reactor twisted.internet.asyncioreactor.AsyncioSelectorReactor in the process. Therefore, to run spiders by my custom Django management command, I use CrawlerRunner so as not to install a reactor that is already installed.

class Command(BaseCommand):
    help = 'Runs the specified spider'

    def add_arguments(self, parser):
        parser.add_argument('spider', type=str, help="The spider name that to be located, instanced, and crawled.")

    def handle(self, *args, **options):
        from twisted.internet import reactor
        configure_logging()

        runner = CrawlerRunner(settings=get_project_settings())
        d = runner.crawl(options['spider'])
        d.addBoth(lambda _: reactor.stop())
        reactor.run()

But in this case, Scrapy-palywright cannot start working. There is no line in logs like:

... [scrapy-playwright] INFO: Starting download handler

In order for Scrapy-palywright to start working properly, I have to:

Remove the already installed reactor:

if sys.modules.get("twisted.internet.reactor", False):
    del sys.modules["twisted.internet.reactor"]

Use CrawlerProcess which installs the appropriate reactor.

process = CrawlerProcess(settings=get_project_settings())
process.crawl(options['spider'])
process.start()

Is there any idea to continue using the already installed reactor?

The text was updated successfully, but these errors were encountered:

elacuesta · 2022-10-17T21:42:15Z

Hi, could you provide a minimal, reproducible example? I'm able to run a spider using the CrawlerRunner as described in the Scrapy docs:

import scrapy
from scrapy.crawler import CrawlerRunner
from scrapy.utils.log import configure_logging
from twisted.internet.asyncioreactor import install as install_asyncio_reactor


class TestSpider(scrapy.Spider):
    name = "example"
    custom_settings = {
        "DOWNLOAD_HANDLERS": {
            "https": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
        },
    }

    def start_requests(self):
        yield scrapy.Request(url="https://example.org", meta={"playwright": True})

    def parse(self, response):
        yield {"url": response.url}


if __name__ == "__main__":
    install_asyncio_reactor()
    from twisted.internet import reactor

    configure_logging({"LOG_FORMAT": "%(levelname)s: %(message)s"})
    runner = CrawlerRunner()
    d = runner.crawl(TestSpider)
    d.addBoth(lambda _: reactor.stop())
    reactor.run()  # the script will block here until the crawling is finished

$ python examples/reactor.py
INFO: Overridden settings:
{}
2022-10-17 18:36:37 [scrapy.extensions.telnet] INFO: Telnet Password: c1f8e1c8505cbd6f
2022-10-17 18:36:37 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
 'scrapy.extensions.telnet.TelnetConsole',
 'scrapy.extensions.memusage.MemoryUsage',
 'scrapy.extensions.logstats.LogStats']
2022-10-17 18:36:37 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
 'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
 'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
 'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
 'scrapy.downloadermiddlewares.retry.RetryMiddleware',
 'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
 'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
 'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
 'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
 'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
 'scrapy.downloadermiddlewares.stats.DownloaderStats']
2022-10-17 18:36:37 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
 'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
 'scrapy.spidermiddlewares.referer.RefererMiddleware',
 'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
 'scrapy.spidermiddlewares.depth.DepthMiddleware']
2022-10-17 18:36:37 [scrapy.middleware] INFO: Enabled item pipelines:
[]
2022-10-17 18:36:37 [scrapy.core.engine] INFO: Spider opened
2022-10-17 18:36:38 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2022-10-17 18:36:38 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023
2022-10-17 18:36:38 [scrapy-playwright] INFO: Starting download handler
2022-10-17 18:36:43 [scrapy-playwright] INFO: Launching browser chromium
2022-10-17 18:36:43 [scrapy-playwright] INFO: Browser chromium launched
2022-10-17 18:36:43 [scrapy-playwright] DEBUG: Browser context started: 'default' (persistent=False)
2022-10-17 18:36:43 [scrapy-playwright] DEBUG: [Context=default] New page created, page count is 1 (1 for all contexts)
2022-10-17 18:36:43 [scrapy-playwright] DEBUG: [Context=default] Request: <GET https://example.org/> (resource type: document, referrer: None)
2022-10-17 18:36:44 [scrapy-playwright] DEBUG: [Context=default] Response: <200 https://example.org/> (referrer: None)
2022-10-17 18:36:44 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://example.org> (referer: None) ['playwright']
2022-10-17 18:36:44 [scrapy.core.scraper] DEBUG: Scraped from <200 https://example.org/>
{'url': 'https://example.org/'}
2022-10-17 18:36:44 [scrapy.core.engine] INFO: Closing spider (finished)
2022-10-17 18:36:44 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 211,
 'downloader/request_count': 1,
 'downloader/request_method_count/GET': 1,
 'downloader/response_bytes': 1600,
 'downloader/response_count': 1,
 'downloader/response_status_count/200': 1,
 'elapsed_time_seconds': 6.194073,
 'finish_reason': 'finished',
 'finish_time': datetime.datetime(2022, 10, 17, 21, 36, 44, 277270),
 'item_scraped_count': 1,
 'log_count/DEBUG': 6,
 'log_count/INFO': 13,
 'memusage/max': 58142720,
 'memusage/startup': 58142720,
 'playwright/context_count': 1,
 'playwright/context_count/max_concurrent': 1,
 'playwright/context_count/non-persistent': 1,
 'playwright/page_count': 1,
 'playwright/page_count/closed': 1,
 'playwright/page_count/max_concurrent': 1,
 'playwright/request_count': 1,
 'playwright/request_count/method/GET': 1,
 'playwright/request_count/navigation': 1,
 'playwright/request_count/resource_type/document': 1,
 'playwright/response_count': 1,
 'playwright/response_count/method/GET': 1,
 'playwright/response_count/resource_type/document': 1,
 'response_received_count': 1,
 'scheduler/dequeued': 1,
 'scheduler/dequeued/memory': 1,
 'scheduler/enqueued': 1,
 'scheduler/enqueued/memory': 1,
 'start_time': datetime.datetime(2022, 10, 17, 21, 36, 38, 83197)}
2022-10-17 18:36:44 [scrapy.core.engine] INFO: Spider closed (finished)
2022-10-17 18:36:44 [scrapy-playwright] INFO: Closing download handler
2022-10-17 18:36:44 [scrapy-playwright] DEBUG: Browser context closed: 'default' (persistent=False)
2022-10-17 18:36:44 [scrapy-playwright] INFO: Closing browser

alosultan · 2022-10-31T14:48:52Z

I created a Django project "channels-scrapy" with two applications:

Django app myapp, which is responsible for launching Scrapy spiders via Django custom command crawl:

myapp.management.commands.crawl.py

from scrapy.crawler import CrawlerRunner
from scrapy.utils.log import configure_logging
from scrapy.utils.project import get_project_settings
from django.core.management.base import BaseCommand


class Command(BaseCommand):
    help = 'Runs the specified spider'

    def add_arguments(self, parser):
        parser.add_argument('spider', type=str, help="The spider name that to be located, instanced, and crawled.")

    def handle(self, *args, **options):
        # An asyncio Twisted reactor has already installed (AsyncioSelectorReactor object)
        from twisted.internet import reactor

        configure_logging()
        runner = CrawlerRunner(settings=get_project_settings())
        d = runner.crawl(options['spider'])
        d.addBoth(lambda _: reactor.stop())
        reactor.run()   # the script will block here until the crawling is finished

Scrapy app scrapy_app, which contains spiders. For this example, there is only one spider (TestSpider):

scrapy_app.spiders.py

import scrapy


class TestSpider(scrapy.Spider):
    name = "example"
    # If you comment these settings, then no problem will appear.
    custom_settings = {
        "DOWNLOAD_HANDLERS": {
            "https": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
        },
    }

    def start_requests(self):
        yield scrapy.Request(url="https://example.org", meta={"playwright": True})

    def parse(self, response, **kwargs):
        yield {"url": response.url}

scrapy_app.settings.py

BOT_NAME = 'scrapy_app'

SPIDER_MODULES = ['scrapy_app.spiders']
NEWSPIDER_MODULE = 'scrapy_app.spiders'

ROBOTSTXT_OBEY = True

# No need to this setting. The reactor will be already installed from outside.
# TWISTED_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"

The INSTALLED_APPS list includes "daphne" app as mentioned in the channels documentation and 'myapp'.

INSTALLED_APPS = [
    "daphne",
    "django.contrib.admin",
    "django.contrib.auth",
    "django.contrib.contenttypes",
    "django.contrib.sessions",
    "django.contrib.messages",
    "django.contrib.staticfiles",
    "myapp",
]

In general Django "channels-scrapy" project looks like this:

./channels-scrapy
├── config
│   ├── __init__.py
│   ├── asgi.py
│   ├── settings.py
│   ├── urls.py
│   └── wsgi.py
├── manage.py
├── myapp
│   ├── __init__.py
│   ├── apps.py
│   ├── management
│   │   ├── __init__.py
│   │   └── commands
│   │       ├── __init__.py
│   │       └── crawl.py
│   ├── migrations
│   │   └── __init__.py
│   └── views.py
├── scrapy.cfg
├── scrapy_app
   ├── __init__.py
   ├── items.py
   ├── middlewares.py
   ├── pipelines.py
   ├── settings.py
   └── spiders.py

Now when I run the spider example:

python manage.py crawl example

the application freezes and does not continue to work (note the line [asyncio] DEBUG: Using selector: KqueueSelector):

2022-10-31 14:33:10 [scrapy.crawler] INFO: Overridden settings:
{'BOT_NAME': 'scrapy_app',
 'NEWSPIDER_MODULE': 'scrapy_app.spiders',
 'ROBOTSTXT_OBEY': True,
 'SPIDER_MODULES': ['scrapy_app.spiders']}
2022-10-31 14:33:10 [scrapy.extensions.telnet] INFO: Telnet Password: 394a0b2b4debf964
2022-10-31 14:33:10 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
 'scrapy.extensions.telnet.TelnetConsole',
 'scrapy.extensions.memusage.MemoryUsage',
 'scrapy.extensions.logstats.LogStats']
2022-10-31 14:33:10 [asyncio] DEBUG: Using selector: KqueueSelector <-------- it's strange here
2022-10-31 14:33:10 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.robotstxt.RobotsTxtMiddleware',
 'scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
 'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
 'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
 'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
 'scrapy.downloadermiddlewares.retry.RetryMiddleware',
 'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
 'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
 'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
 'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
 'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
 'scrapy.downloadermiddlewares.stats.DownloaderStats']
2022-10-31 14:33:10 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
 'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
 'scrapy.spidermiddlewares.referer.RefererMiddleware',
 'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
 'scrapy.spidermiddlewares.depth.DepthMiddleware']
2022-10-31 14:33:10 [scrapy.middleware] INFO: Enabled item pipelines:
[]
2022-10-31 14:33:10 [scrapy.core.engine] INFO: Spider opened
2022-10-31 14:33:10 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2022-10-31 14:33:10 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023

But if you turn off playwright, then everything will work fine and the line [asyncio] DEBUG: Using selector: KqueueSelector will disappear:

2022-10-31 14:45:15 [scrapy.crawler] INFO: Overridden settings:
{'BOT_NAME': 'scrapy_app',
 'NEWSPIDER_MODULE': 'scrapy_app.spiders',
 'ROBOTSTXT_OBEY': True,
 'SPIDER_MODULES': ['scrapy_app.spiders']}
2022-10-31 14:45:15 [scrapy.extensions.telnet] INFO: Telnet Password: 0cb868371d556578
2022-10-31 14:45:15 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
 'scrapy.extensions.telnet.TelnetConsole',
 'scrapy.extensions.memusage.MemoryUsage',
 'scrapy.extensions.logstats.LogStats']
2022-10-31 14:45:15 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.robotstxt.RobotsTxtMiddleware',
 'scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
 'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
 'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
 'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
 'scrapy.downloadermiddlewares.retry.RetryMiddleware',
 'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
 'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
 'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
 'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
 'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
 'scrapy.downloadermiddlewares.stats.DownloaderStats']
2022-10-31 14:45:15 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
 'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
 'scrapy.spidermiddlewares.referer.RefererMiddleware',
 'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
 'scrapy.spidermiddlewares.depth.DepthMiddleware']
2022-10-31 14:45:15 [scrapy.middleware] INFO: Enabled item pipelines:
[]
2022-10-31 14:45:15 [scrapy.core.engine] INFO: Spider opened
2022-10-31 14:45:15 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2022-10-31 14:45:15 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023
2022-10-31 14:45:16 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://example.org/robots.txt> (referer: None)
2022-10-31 14:45:16 [protego] DEBUG: Rule at line 12 without any user agent to enforce it on.
.........
2022-10-31 14:45:16 [protego] DEBUG: Rule at line 43 without any user agent to enforce it on.
2022-10-31 14:45:16 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://example.org> (referer: None)
2022-10-31 14:45:16 [scrapy.core.scraper] DEBUG: Scraped from <200 https://example.org>
{'url': 'https://example.org'}
2022-10-31 14:45:16 [scrapy.core.engine] INFO: Closing spider (finished)
2022-10-31 14:45:16 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 432,
 'downloader/request_count': 2,
 'downloader/request_method_count/GET': 2,
 'downloader/response_bytes': 2034,
 'downloader/response_count': 2,
 'downloader/response_status_count/200': 1,
 'downloader/response_status_count/404': 1,
 'elapsed_time_seconds': 0.965107,
 'finish_reason': 'finished',
 'finish_time': datetime.datetime(2022, 10, 31, 14, 45, 16, 728119),
 'httpcompression/response_bytes': 2512,
 'httpcompression/response_count': 2,
 'item_scraped_count': 1,
 'log_count/DEBUG': 17,
 'log_count/INFO': 10,
 'log_count/WARNING': 1,
 'memusage/max': 69152768,
 'memusage/startup': 69152768,
 'response_received_count': 2,
 'robotstxt/request_count': 1,
 'robotstxt/response_count': 1,
 'robotstxt/response_status_count/404': 1,
 'scheduler/dequeued': 1,
 'scheduler/dequeued/memory': 1,
 'scheduler/enqueued': 1,
 'scheduler/enqueued/memory': 1,
 'start_time': datetime.datetime(2022, 10, 31, 14, 45, 15, 763012)}
2022-10-31 14:45:16 [scrapy.core.engine] INFO: Spider closed (finished)

Gallaecio · 2022-10-31T15:10:59Z

No need to this setting.

TWISTED_REACTOR is still needed, I think. Scrapy checks if the installed reactor matches the setting and complaints otherwise.

alosultan · 2022-10-31T16:25:32Z

Yes, (unlike CrawlerProcess that install and verify the reactor) the CrawlerRunner only checks if the installed reactor matches the settingTWISTED_REACTOR. So we can uncomment this setting just to check the installed reactor. But that still won't solve the problem.

elacuesta · 2022-10-31T16:27:47Z

Are you filtering some logs out? I see some DEBUG messages in your post, but Scrapy also logs the reactor (and event loop, if present) at the beginning of the crawl, like:

2022-10-31 13:16:39 [scrapy.crawler] INFO: Overridden settings:
{'EDITOR': 'nano',
 'SPIDER_LOADER_WARN_ONLY': True,
 'TWISTED_REACTOR': 'twisted.internet.asyncioreactor.AsyncioSelectorReactor'}
2022-10-31 13:16:39 [asyncio] DEBUG: Using selector: EpollSelector
2022-10-31 13:16:39 [scrapy.utils.log] DEBUG: Using reactor: twisted.internet.asyncioreactor.AsyncioSelectorReactor
2022-10-31 13:16:39 [scrapy.utils.log] DEBUG: Using asyncio event loop: asyncio.unix_events._UnixSelectorEventLoop

or

2022-10-31 13:17:19 [scrapy.crawler] INFO: Overridden settings:
{'DUPEFILTER_CLASS': 'scrapy.dupefilters.BaseDupeFilter',
 'EDITOR': 'nano',
 'LOGSTATS_INTERVAL': 0}
2022-10-31 13:17:19 [scrapy.utils.log] DEBUG: Using reactor: twisted.internet.epollreactor.EPollReactor

alosultan · 2022-10-31T16:38:00Z

'TWISTED_REACTOR': 'twisted.internet.asyncioreactor.AsyncioSelectorReactor'

Scrapy logs the reactor if the setting TWISTED_REACTOR is given.

I only filtered [py.warnings]

2022-10-31 16:22:51 [py.warnings] WARNING: /Users/alosultan/Development/Python/Django/channels-scrapy/venv/.envs/lib/python3.9/site-packages/scrapy/utils/request.py:231: ScrapyDeprecationWarning: '2.6' is a deprecated value for the 'REQUEST_FINGERPRINTER_IMPLEMENTATION' setting.

It is also the default value. In other words, it is normal to get this warning if you have not defined a value for the 'REQUEST_FINGERPRINTER_IMPLEMENTATION' setting. This is so for backward compatibility reasons, but it will change in a future version of Scrapy.

See the documentation of the 'REQUEST_FINGERPRINTER_IMPLEMENTATION' setting for information on how to handle this deprecation.
  return cls(crawler)

alosultan · 2022-10-31T16:41:05Z

2022-10-31 14:33:10 [asyncio] DEBUG: Using selector: KqueueSelector <-------- it's strange here

What do you think about this DEBUG message?

if I disable playwright, then this message will disappear:

elacuesta · 2022-10-31T16:41:43Z

Scrapy logs the reactor if the setting TWISTED_REACTOR is given.

That's from the "Overriden settings" line, not the one from scrapy.utils.log which shows the actual reactor being used (https://github.com/scrapy/scrapy/blob/2.7.0/scrapy/utils/log.py#L157).

    def handle(self, *args, **options):
        # An asyncio Twisted reactor has already installed (AsyncioSelectorReactor object)
        from twisted.internet import reactor

I don't understand where this is installed. I'm not that familiar with channels, but I suppose it might give you a running asyncio loop. The Twisted reactor works on top of that, are you sure it's also being installed?

alosultan · 2022-10-31T16:52:52Z

I don't understand where this is installed. I'm not that familiar with channels, but I suppose it might give you a running asyncio loop. The Twisted reactor works on top of that, are you sure it's also being installed?

I checked as follows:

    def handle(self, *args, **options):
        current_reactor = sys.modules.get("twisted.internet.reactor", None)
        print(isinstance(current_reactor, asyncioreactor.AsyncioSelectorReactor))   # True
        print(current_reactor.running)  # False

        # An asyncio Twisted reactor has already installed (AsyncioSelectorReactor object)
        from twisted.internet import reactor

        configure_logging()
        runner = CrawlerRunner(settings=get_project_settings())
        d = runner.crawl(options['spider'])
        d.addBoth(lambda _: reactor.stop())
        reactor.run()   # the script will block here until the crawling is finished

elacuesta · 2022-10-31T17:01:38Z

How is it being installed? Where in the code is there something like the following?

from twisted.internet.asyncioreactor import install
install()

alosultan · 2022-10-31T17:14:08Z

How is it being installed? Where in the code is there something like the following?
from twisted.internet.asyncioreactor import install
install()

It being installed in the daphne.server.py module, which is imported in daphne.apps.py (Django app configuration module).

daphne.server.py

# This has to be done first as Twisted is import-order-sensitive with reactors
import asyncio  # isort:skip
import os  # isort:skip
import sys  # isort:skip
import warnings  # isort:skip
from concurrent.futures import ThreadPoolExecutor  # isort:skip
from twisted.internet import asyncioreactor  # isort:skip


twisted_loop = asyncio.new_event_loop()
if "ASGI_THREADS" in os.environ:
    twisted_loop.set_default_executor(
        ThreadPoolExecutor(max_workers=int(os.environ["ASGI_THREADS"]))
    )

current_reactor = sys.modules.get("twisted.internet.reactor", None)
if current_reactor is not None:
    if not isinstance(current_reactor, asyncioreactor.AsyncioSelectorReactor):
        warnings.warn(
            "Something has already installed a non-asyncio Twisted reactor. Attempting to uninstall it; "
            + "you can fix this warning by importing daphne.server early in your codebase or "
            + "finding the package that imports Twisted and importing it later on.",
            UserWarning,
        )
        del sys.modules["twisted.internet.reactor"]
        asyncioreactor.install(twisted_loop)
else:
    asyncioreactor.install(twisted_loop)

elacuesta added the needs more info label Oct 17, 2022

jeroenvermunt mentioned this issue Dec 6, 2022

scrapy.exceptions.NotSupported: Unsupported URL scheme 'https': The installed reactor (twisted.internet.epollreactor.EPollReactor) does not match the requested one (twisted.internet.asyncioreactor.AsyncioSelectorReactor) #145

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Scrapy-palywright cannot start working if the reactor is already installed #131

Scrapy-palywright cannot start working if the reactor is already installed #131

alosultan commented Oct 11, 2022 •

edited

Loading

elacuesta commented Oct 17, 2022

alosultan commented Oct 31, 2022 •

edited

Loading

Gallaecio commented Oct 31, 2022

alosultan commented Oct 31, 2022

elacuesta commented Oct 31, 2022

alosultan commented Oct 31, 2022 •

edited

Loading

alosultan commented Oct 31, 2022 •

edited

Loading

elacuesta commented Oct 31, 2022

alosultan commented Oct 31, 2022 •

edited

Loading

elacuesta commented Oct 31, 2022

alosultan commented Oct 31, 2022 •

edited

Loading

Scrapy-palywright cannot start working if the reactor is already installed #131

Scrapy-palywright cannot start working if the reactor is already installed #131

Comments

alosultan commented Oct 11, 2022 • edited Loading

elacuesta commented Oct 17, 2022

alosultan commented Oct 31, 2022 • edited Loading

Gallaecio commented Oct 31, 2022

alosultan commented Oct 31, 2022

elacuesta commented Oct 31, 2022

alosultan commented Oct 31, 2022 • edited Loading

alosultan commented Oct 31, 2022 • edited Loading

elacuesta commented Oct 31, 2022

alosultan commented Oct 31, 2022 • edited Loading

elacuesta commented Oct 31, 2022

alosultan commented Oct 31, 2022 • edited Loading

alosultan commented Oct 11, 2022 •

edited

Loading

alosultan commented Oct 31, 2022 •

edited

Loading

alosultan commented Oct 31, 2022 •

edited

Loading

alosultan commented Oct 31, 2022 •

edited

Loading

alosultan commented Oct 31, 2022 •

edited

Loading

alosultan commented Oct 31, 2022 •

edited

Loading