
scrapy-playwright cannot start working if the reactor is already installed #131

Open · alosultan opened this issue Oct 11, 2022 · 11 comments

alosultan commented Oct 11, 2022

Python 3.9.13
Daphne 4.0.0
Django 4.1.2
Channels 4.0.0
Scrapy 2.7.0
scrapy-playwright 0.0.22

My settings:

DOWNLOAD_HANDLERS = {
    "http": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
    "https": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
}

TWISTED_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"

My Scrapy app runs inside another app (django-channels) that installs the twisted.internet.asyncioreactor.AsyncioSelectorReactor reactor in the same process. Therefore, to run spiders from my custom Django management command, I use CrawlerRunner so as not to try to install a reactor that is already installed.

class Command(BaseCommand):
    help = 'Runs the specified spider'

    def add_arguments(self, parser):
        parser.add_argument('spider', type=str, help="The name of the spider to be located, instantiated, and crawled.")

    def handle(self, *args, **options):
        from twisted.internet import reactor
        configure_logging()

        runner = CrawlerRunner(settings=get_project_settings())
        d = runner.crawl(options['spider'])
        d.addBoth(lambda _: reactor.stop())
        reactor.run()

But in this case, scrapy-playwright cannot start working. There is no line in the logs like:

... [scrapy-playwright] INFO: Starting download handler

In order for scrapy-playwright to start working properly, I have to:

  1. Remove the already installed reactor:

if sys.modules.get("twisted.internet.reactor", False):
    del sys.modules["twisted.internet.reactor"]

  2. Use CrawlerProcess, which installs the appropriate reactor:

process = CrawlerProcess(settings=get_project_settings())
process.crawl(options['spider'])
process.start()
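
Putting the two steps together, the handle() method becomes something like this (assuming sys, CrawlerProcess and get_project_settings are imported at the top of the module):

    def handle(self, *args, **options):
        # Hack: drop the reactor that was already installed, so that
        # CrawlerProcess is free to install the one it wants.
        if sys.modules.get("twisted.internet.reactor", False):
            del sys.modules["twisted.internet.reactor"]

        process = CrawlerProcess(settings=get_project_settings())
        process.crawl(options['spider'])
        process.start()   # blocks until the crawling is finished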

Is there any way to keep using the already installed reactor?

@elacuesta (Member)

Hi, could you provide a minimal, reproducible example? I'm able to run a spider using the CrawlerRunner as described in the Scrapy docs:

import scrapy
from scrapy.crawler import CrawlerRunner
from scrapy.utils.log import configure_logging
from twisted.internet.asyncioreactor import install as install_asyncio_reactor


class TestSpider(scrapy.Spider):
    name = "example"
    custom_settings = {
        "DOWNLOAD_HANDLERS": {
            "https": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
        },
    }

    def start_requests(self):
        yield scrapy.Request(url="https://example.org", meta={"playwright": True})

    def parse(self, response):
        yield {"url": response.url}


if __name__ == "__main__":
    install_asyncio_reactor()
    from twisted.internet import reactor

    configure_logging({"LOG_FORMAT": "%(levelname)s: %(message)s"})
    runner = CrawlerRunner()
    d = runner.crawl(TestSpider)
    d.addBoth(lambda _: reactor.stop())
    reactor.run()  # the script will block here until the crawling is finished
$ python examples/reactor.py
INFO: Overridden settings:
{}
2022-10-17 18:36:37 [scrapy.extensions.telnet] INFO: Telnet Password: c1f8e1c8505cbd6f
2022-10-17 18:36:37 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
 'scrapy.extensions.telnet.TelnetConsole',
 'scrapy.extensions.memusage.MemoryUsage',
 'scrapy.extensions.logstats.LogStats']
2022-10-17 18:36:37 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
 'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
 'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
 'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
 'scrapy.downloadermiddlewares.retry.RetryMiddleware',
 'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
 'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
 'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
 'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
 'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
 'scrapy.downloadermiddlewares.stats.DownloaderStats']
2022-10-17 18:36:37 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
 'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
 'scrapy.spidermiddlewares.referer.RefererMiddleware',
 'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
 'scrapy.spidermiddlewares.depth.DepthMiddleware']
2022-10-17 18:36:37 [scrapy.middleware] INFO: Enabled item pipelines:
[]
2022-10-17 18:36:37 [scrapy.core.engine] INFO: Spider opened
2022-10-17 18:36:38 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2022-10-17 18:36:38 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023
2022-10-17 18:36:38 [scrapy-playwright] INFO: Starting download handler
2022-10-17 18:36:43 [scrapy-playwright] INFO: Launching browser chromium
2022-10-17 18:36:43 [scrapy-playwright] INFO: Browser chromium launched
2022-10-17 18:36:43 [scrapy-playwright] DEBUG: Browser context started: 'default' (persistent=False)
2022-10-17 18:36:43 [scrapy-playwright] DEBUG: [Context=default] New page created, page count is 1 (1 for all contexts)
2022-10-17 18:36:43 [scrapy-playwright] DEBUG: [Context=default] Request: <GET https://example.org/> (resource type: document, referrer: None)
2022-10-17 18:36:44 [scrapy-playwright] DEBUG: [Context=default] Response: <200 https://example.org/> (referrer: None)
2022-10-17 18:36:44 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://example.org> (referer: None) ['playwright']
2022-10-17 18:36:44 [scrapy.core.scraper] DEBUG: Scraped from <200 https://example.org/>
{'url': 'https://example.org/'}
2022-10-17 18:36:44 [scrapy.core.engine] INFO: Closing spider (finished)
2022-10-17 18:36:44 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 211,
 'downloader/request_count': 1,
 'downloader/request_method_count/GET': 1,
 'downloader/response_bytes': 1600,
 'downloader/response_count': 1,
 'downloader/response_status_count/200': 1,
 'elapsed_time_seconds': 6.194073,
 'finish_reason': 'finished',
 'finish_time': datetime.datetime(2022, 10, 17, 21, 36, 44, 277270),
 'item_scraped_count': 1,
 'log_count/DEBUG': 6,
 'log_count/INFO': 13,
 'memusage/max': 58142720,
 'memusage/startup': 58142720,
 'playwright/context_count': 1,
 'playwright/context_count/max_concurrent': 1,
 'playwright/context_count/non-persistent': 1,
 'playwright/page_count': 1,
 'playwright/page_count/closed': 1,
 'playwright/page_count/max_concurrent': 1,
 'playwright/request_count': 1,
 'playwright/request_count/method/GET': 1,
 'playwright/request_count/navigation': 1,
 'playwright/request_count/resource_type/document': 1,
 'playwright/response_count': 1,
 'playwright/response_count/method/GET': 1,
 'playwright/response_count/resource_type/document': 1,
 'response_received_count': 1,
 'scheduler/dequeued': 1,
 'scheduler/dequeued/memory': 1,
 'scheduler/enqueued': 1,
 'scheduler/enqueued/memory': 1,
 'start_time': datetime.datetime(2022, 10, 17, 21, 36, 38, 83197)}
2022-10-17 18:36:44 [scrapy.core.engine] INFO: Spider closed (finished)
2022-10-17 18:36:44 [scrapy-playwright] INFO: Closing download handler
2022-10-17 18:36:44 [scrapy-playwright] DEBUG: Browser context closed: 'default' (persistent=False)
2022-10-17 18:36:44 [scrapy-playwright] INFO: Closing browser

alosultan (Author) commented Oct 31, 2022

I created a Django project "channels-scrapy" with two applications:

  1. Django app myapp, which is responsible for launching Scrapy spiders via the custom Django management command crawl:

myapp.management.commands.crawl.py

from scrapy.crawler import CrawlerRunner
from scrapy.utils.log import configure_logging
from scrapy.utils.project import get_project_settings
from django.core.management.base import BaseCommand


class Command(BaseCommand):
    help = 'Runs the specified spider'

    def add_arguments(self, parser):
        parser.add_argument('spider', type=str, help="The name of the spider to be located, instantiated, and crawled.")

    def handle(self, *args, **options):
        # An asyncio Twisted reactor has already been installed (AsyncioSelectorReactor)
        from twisted.internet import reactor

        configure_logging()
        runner = CrawlerRunner(settings=get_project_settings())
        d = runner.crawl(options['spider'])
        d.addBoth(lambda _: reactor.stop())
        reactor.run()   # the script will block here until the crawling is finished
  2. Scrapy app scrapy_app, which contains the spiders. For this example, there is only one spider (TestSpider):

scrapy_app.spiders.py

import scrapy


class TestSpider(scrapy.Spider):
    name = "example"
    # If you comment out these settings, the problem does not appear.
    custom_settings = {
        "DOWNLOAD_HANDLERS": {
            "https": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
        },
    }

    def start_requests(self):
        yield scrapy.Request(url="https://example.org", meta={"playwright": True})

    def parse(self, response, **kwargs):
        yield {"url": response.url}

scrapy_app.settings.py

BOT_NAME = 'scrapy_app'

SPIDER_MODULES = ['scrapy_app.spiders']
NEWSPIDER_MODULE = 'scrapy_app.spiders'

ROBOTSTXT_OBEY = True

# No need for this setting. The reactor will already be installed from outside.
# TWISTED_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"

The INSTALLED_APPS list includes "daphne" app as mentioned in the channels documentation and 'myapp'.

INSTALLED_APPS = [
    "daphne",
    "django.contrib.admin",
    "django.contrib.auth",
    "django.contrib.contenttypes",
    "django.contrib.sessions",
    "django.contrib.messages",
    "django.contrib.staticfiles",
    "myapp",
]

Overall, the Django "channels-scrapy" project looks like this:

./channels-scrapy
├── config
│   ├── __init__.py
│   ├── asgi.py
│   ├── settings.py
│   ├── urls.py
│   └── wsgi.py
├── manage.py
├── myapp
│   ├── __init__.py
│   ├── apps.py
│   ├── management
│   │   ├── __init__.py
│   │   └── commands
│   │       ├── __init__.py
│   │       └── crawl.py
│   ├── migrations
│   │   └── __init__.py
│   └── views.py
├── scrapy.cfg
└── scrapy_app
    ├── __init__.py
    ├── items.py
    ├── middlewares.py
    ├── pipelines.py
    ├── settings.py
    └── spiders.py

Now when I run the spider example:

python manage.py crawl example

the application freezes and makes no further progress (note the line [asyncio] DEBUG: Using selector: KqueueSelector):

2022-10-31 14:33:10 [scrapy.crawler] INFO: Overridden settings:
{'BOT_NAME': 'scrapy_app',
 'NEWSPIDER_MODULE': 'scrapy_app.spiders',
 'ROBOTSTXT_OBEY': True,
 'SPIDER_MODULES': ['scrapy_app.spiders']}
2022-10-31 14:33:10 [scrapy.extensions.telnet] INFO: Telnet Password: 394a0b2b4debf964
2022-10-31 14:33:10 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
 'scrapy.extensions.telnet.TelnetConsole',
 'scrapy.extensions.memusage.MemoryUsage',
 'scrapy.extensions.logstats.LogStats']
2022-10-31 14:33:10 [asyncio] DEBUG: Using selector: KqueueSelector <-------- it's strange here
2022-10-31 14:33:10 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.robotstxt.RobotsTxtMiddleware',
 'scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
 'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
 'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
 'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
 'scrapy.downloadermiddlewares.retry.RetryMiddleware',
 'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
 'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
 'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
 'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
 'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
 'scrapy.downloadermiddlewares.stats.DownloaderStats']
2022-10-31 14:33:10 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
 'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
 'scrapy.spidermiddlewares.referer.RefererMiddleware',
 'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
 'scrapy.spidermiddlewares.depth.DepthMiddleware']
2022-10-31 14:33:10 [scrapy.middleware] INFO: Enabled item pipelines:
[]
2022-10-31 14:33:10 [scrapy.core.engine] INFO: Spider opened
2022-10-31 14:33:10 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2022-10-31 14:33:10 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023

But if I turn off playwright, everything works fine and the line [asyncio] DEBUG: Using selector: KqueueSelector disappears:

2022-10-31 14:45:15 [scrapy.crawler] INFO: Overridden settings:
{'BOT_NAME': 'scrapy_app',
 'NEWSPIDER_MODULE': 'scrapy_app.spiders',
 'ROBOTSTXT_OBEY': True,
 'SPIDER_MODULES': ['scrapy_app.spiders']}
2022-10-31 14:45:15 [scrapy.extensions.telnet] INFO: Telnet Password: 0cb868371d556578
2022-10-31 14:45:15 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
 'scrapy.extensions.telnet.TelnetConsole',
 'scrapy.extensions.memusage.MemoryUsage',
 'scrapy.extensions.logstats.LogStats']
2022-10-31 14:45:15 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.robotstxt.RobotsTxtMiddleware',
 'scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
 'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
 'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
 'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
 'scrapy.downloadermiddlewares.retry.RetryMiddleware',
 'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
 'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
 'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
 'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
 'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
 'scrapy.downloadermiddlewares.stats.DownloaderStats']
2022-10-31 14:45:15 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
 'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
 'scrapy.spidermiddlewares.referer.RefererMiddleware',
 'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
 'scrapy.spidermiddlewares.depth.DepthMiddleware']
2022-10-31 14:45:15 [scrapy.middleware] INFO: Enabled item pipelines:
[]
2022-10-31 14:45:15 [scrapy.core.engine] INFO: Spider opened
2022-10-31 14:45:15 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2022-10-31 14:45:15 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023
2022-10-31 14:45:16 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://example.org/robots.txt> (referer: None)
2022-10-31 14:45:16 [protego] DEBUG: Rule at line 12 without any user agent to enforce it on.
.........
2022-10-31 14:45:16 [protego] DEBUG: Rule at line 43 without any user agent to enforce it on.
2022-10-31 14:45:16 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://example.org> (referer: None)
2022-10-31 14:45:16 [scrapy.core.scraper] DEBUG: Scraped from <200 https://example.org>
{'url': 'https://example.org'}
2022-10-31 14:45:16 [scrapy.core.engine] INFO: Closing spider (finished)
2022-10-31 14:45:16 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 432,
 'downloader/request_count': 2,
 'downloader/request_method_count/GET': 2,
 'downloader/response_bytes': 2034,
 'downloader/response_count': 2,
 'downloader/response_status_count/200': 1,
 'downloader/response_status_count/404': 1,
 'elapsed_time_seconds': 0.965107,
 'finish_reason': 'finished',
 'finish_time': datetime.datetime(2022, 10, 31, 14, 45, 16, 728119),
 'httpcompression/response_bytes': 2512,
 'httpcompression/response_count': 2,
 'item_scraped_count': 1,
 'log_count/DEBUG': 17,
 'log_count/INFO': 10,
 'log_count/WARNING': 1,
 'memusage/max': 69152768,
 'memusage/startup': 69152768,
 'response_received_count': 2,
 'robotstxt/request_count': 1,
 'robotstxt/response_count': 1,
 'robotstxt/response_status_count/404': 1,
 'scheduler/dequeued': 1,
 'scheduler/dequeued/memory': 1,
 'scheduler/enqueued': 1,
 'scheduler/enqueued/memory': 1,
 'start_time': datetime.datetime(2022, 10, 31, 14, 45, 15, 763012)}
2022-10-31 14:45:16 [scrapy.core.engine] INFO: Spider closed (finished)

@Gallaecio (Contributor)

No need for this setting.

TWISTED_REACTOR is still needed, I think. Scrapy checks whether the installed reactor matches the setting and complains otherwise.

@alosultan (Author)

Yes. Unlike CrawlerProcess, which installs and verifies the reactor, CrawlerRunner only checks that the installed reactor matches the TWISTED_REACTOR setting. So we can uncomment this setting just to verify the installed reactor, but that still won't solve the problem.
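
If it helps, that check can also be run by hand: it boils down to scrapy.utils.reactor.verify_installed_reactor (a minimal sketch against Scrapy 2.7; the function raises if the installed reactor does not match the requested path):

from scrapy.utils.reactor import verify_installed_reactor

# Passes silently here because daphne has already installed the asyncio
# reactor; with a different reactor installed it raises an Exception.
verify_installed_reactor("twisted.internet.asyncioreactor.AsyncioSelectorReactor")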

@elacuesta (Member)

Are you filtering some logs out? I see some DEBUG messages in your post, but Scrapy also logs the reactor (and event loop, if present) at the beginning of the crawl, like:

2022-10-31 13:16:39 [scrapy.crawler] INFO: Overridden settings:
{'EDITOR': 'nano',
 'SPIDER_LOADER_WARN_ONLY': True,
 'TWISTED_REACTOR': 'twisted.internet.asyncioreactor.AsyncioSelectorReactor'}
2022-10-31 13:16:39 [asyncio] DEBUG: Using selector: EpollSelector
2022-10-31 13:16:39 [scrapy.utils.log] DEBUG: Using reactor: twisted.internet.asyncioreactor.AsyncioSelectorReactor
2022-10-31 13:16:39 [scrapy.utils.log] DEBUG: Using asyncio event loop: asyncio.unix_events._UnixSelectorEventLoop

or

2022-10-31 13:17:19 [scrapy.crawler] INFO: Overridden settings:
{'DUPEFILTER_CLASS': 'scrapy.dupefilters.BaseDupeFilter',
 'EDITOR': 'nano',
 'LOGSTATS_INTERVAL': 0}
2022-10-31 13:17:19 [scrapy.utils.log] DEBUG: Using reactor: twisted.internet.epollreactor.EPollReactor

alosultan (Author) commented Oct 31, 2022

'TWISTED_REACTOR': 'twisted.internet.asyncioreactor.AsyncioSelectorReactor'

Scrapy logs the reactor if the setting TWISTED_REACTOR is given.

I only filtered out the [py.warnings] messages, like:

2022-10-31 16:22:51 [py.warnings] WARNING: /Users/alosultan/Development/Python/Django/channels-scrapy/venv/.envs/lib/python3.9/site-packages/scrapy/utils/request.py:231: ScrapyDeprecationWarning: '2.6' is a deprecated value for the 'REQUEST_FINGERPRINTER_IMPLEMENTATION' setting.

It is also the default value. In other words, it is normal to get this warning if you have not defined a value for the 'REQUEST_FINGERPRINTER_IMPLEMENTATION' setting. This is so for backward compatibility reasons, but it will change in a future version of Scrapy.

See the documentation of the 'REQUEST_FINGERPRINTER_IMPLEMENTATION' setting for information on how to handle this deprecation.
  return cls(crawler)

alosultan (Author) commented Oct 31, 2022

2022-10-31 14:33:10 [asyncio] DEBUG: Using selector: KqueueSelector <-------- it's strange here

What do you think about this DEBUG message?

If I disable playwright, this message disappears.

@elacuesta (Member)

Scrapy logs the reactor if the setting TWISTED_REACTOR is given.

That's from the "Overridden settings" line, not the one from scrapy.utils.log, which shows the actual reactor being used (https://github.com/scrapy/scrapy/blob/2.7.0/scrapy/utils/log.py#L157).

    def handle(self, *args, **options):
        # An asyncio Twisted reactor has already been installed (AsyncioSelectorReactor)
        from twisted.internet import reactor

I don't understand where this is installed. I'm not that familiar with channels, but I suppose it might give you a running asyncio loop. The Twisted reactor works on top of that; are you sure it's also being installed?
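
For example, something like the following should distinguish the two cases (a quick sketch):

import asyncio
import sys

from twisted.internet import asyncioreactor

# Is a Twisted reactor installed, and is it the asyncio one?
installed = sys.modules.get("twisted.internet.reactor")
print(installed is not None)
print(isinstance(installed, asyncioreactor.AsyncioSelectorReactor))

# Is an asyncio event loop currently running in this thread?
try:
    print(asyncio.get_running_loop())
except RuntimeError:
    print("no running asyncio loop")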

alosultan (Author) commented Oct 31, 2022

I don't understand where this is installed. I'm not that familiar with channels, but I suppose it might give you a running asyncio loop. The Twisted reactor works on top of that; are you sure it's also being installed?

I checked as follows:

    def handle(self, *args, **options):
        import sys
        from twisted.internet import asyncioreactor

        current_reactor = sys.modules.get("twisted.internet.reactor", None)
        print(isinstance(current_reactor, asyncioreactor.AsyncioSelectorReactor))   # True
        print(current_reactor.running)  # False

        # An asyncio Twisted reactor has already been installed (AsyncioSelectorReactor)
        from twisted.internet import reactor

        configure_logging()
        runner = CrawlerRunner(settings=get_project_settings())
        d = runner.crawl(options['spider'])
        d.addBoth(lambda _: reactor.stop())
        reactor.run()   # the script will block here until the crawling is finished

@elacuesta (Member)

How is it being installed? Where in the code is there something like the following?

from twisted.internet.asyncioreactor import install
install()

alosultan (Author) commented Oct 31, 2022

How is it being installed? Where in the code is there something like the following?

from twisted.internet.asyncioreactor import install
install()

It is installed in the daphne.server.py module, which is imported in daphne.apps.py (the Django app configuration module).

daphne.server.py

# This has to be done first as Twisted is import-order-sensitive with reactors
import asyncio  # isort:skip
import os  # isort:skip
import sys  # isort:skip
import warnings  # isort:skip
from concurrent.futures import ThreadPoolExecutor  # isort:skip
from twisted.internet import asyncioreactor  # isort:skip


twisted_loop = asyncio.new_event_loop()
if "ASGI_THREADS" in os.environ:
    twisted_loop.set_default_executor(
        ThreadPoolExecutor(max_workers=int(os.environ["ASGI_THREADS"]))
    )

current_reactor = sys.modules.get("twisted.internet.reactor", None)
if current_reactor is not None:
    if not isinstance(current_reactor, asyncioreactor.AsyncioSelectorReactor):
        warnings.warn(
            "Something has already installed a non-asyncio Twisted reactor. Attempting to uninstall it; "
            + "you can fix this warning by importing daphne.server early in your codebase or "
            + "finding the package that imports Twisted and importing it later on.",
            UserWarning,
        )
        del sys.modules["twisted.internet.reactor"]
        asyncioreactor.install(twisted_loop)
else:
    asyncioreactor.install(twisted_loop)
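
Note that daphne installs the reactor on top of a brand-new event loop (twisted_loop = asyncio.new_event_loop()), which is not necessarily the loop the rest of the process is using. If it's useful for debugging, this shows which loop the installed reactor wraps (a sketch; _asyncioEventloop is a private Twisted attribute):

import asyncio
import sys

reactor = sys.modules["twisted.internet.reactor"]
# The loop the reactor was created with vs. this thread's current loop.
print(reactor._asyncioEventloop)
print(asyncio.get_event_loop())
print(reactor._asyncioEventloop is asyncio.get_event_loop())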
