Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Decouple from Django/Flask #37

Closed
jvanasco opened this issue Feb 12, 2014 · 28 comments
Closed

Decouple from Django/Flask #37

jvanasco opened this issue Feb 12, 2014 · 28 comments

Comments

@jvanasco
Copy link

this looks like an interesting improvement to celery , but the django/flask requirement is annoyingly limiting.

since the only thing you rely on from these packages is their cache, have you considered using a standalone caching library like Dogpile ( https://pypi.python.org/pypi/dogpile.cache ) ?

@winhamwr
Copy link
Contributor

Hello!

the django/flask requirement is annoyingly limiting.

I certainly agree. I would much prefer a solution that works easily with Django and Flask, but that doesn't require them. I would absolutely like to support a general python caching option, if one exists.

Looking at Dogpile, it appears to be a specialized caching library. Even so, if Dogpile is what you're currently using and you'd like to use Jobtastic, then I'd be very happy to accept a pull request that either allows the necessary hooks to add support, or adds it as a specific option.

Thanks
-Wes

@winhamwr
Copy link
Contributor

It looks like at least one other person shares your interest via #36. How do you feel about the Beaker library? I'm still interested in Dogpile, if that's your preferred alternative, but Beaker might kill two birds with one stone.

@jvanasco
Copy link
Author

Beaker is bad and in maintenance mode, with a recommendation to update.

Dogpile is basically Beaker 2.0, without cookie support. It was written by one of the two core authors.

One of the roadblocks in trying to write a patch ties into issues #8 and #10 -- it looks like you're just piggybacking onto the cache namespaces of Django and Flask, and expecting the application to have set everything up. I was having a hard time wrapping my head around a good way to write a patch.

That gets really messy when thinking about how to support other frameworks. I think it would be cleaner if you had something like a 'CachingApi' object that would get instantiated on setup, then just proxy requests through that. If someone is using Django/Flask, it might behave in a similar way. But if someone is using Pyramid,Bottle,CherryPy,Tornado,etc/etc/etc, the library could set up a dogpile cache and send requests into it.

@winhamwr
Copy link
Contributor

Dogpile is basically Beaker 2.0, without cookie support. It was written by one of the two core authors.

Ah, cool. Thanks for the insight. I have an (apologetically) Django-centric knowledge of python web development, since Django and Flask have met all of my needs.

it looks like you're just piggybacking onto the cache namespaces of Django and Flask

Yup. The current behavior is absolutely non-ideal in that regard. It's nice from the perspective that if your Django or Flask app is already working with Celery, you have no additional configuration, though.

I think it would be cleaner if you had something like a 'CachingApi' object that would get instantiated on setup

Ultimately, I think that is the best idea. By default, I do think we should follow the advice of #10 and piggy-back off of the CELERY_RESULT_BACKEND, but we should allow users to pass jobtastic configurations to the Celery app. Something like:

# your_app.py
from celery import Celery

app = Celery()
app.config_from_object('celeryconfig')

and then

# celeryconfig.py

CELERY_ENABLE_UTC = True
CELERY_TIMEZONE = 'Europe/London'
# Available options mirror the CELERY_RESULT_BACKEND option
JOBTASTIC_CACHE = 'cache+memcached://127.0.0.1:11211/'

Then, inside JobtasticTask, maybe we piggyback on the Celery app object to store a jobtastic_cache object using celery.app.app_or_default and their backend loading. Something like:

from celery.backends import get_backend_by_url

app = celery.app.app_or_default()
if not hasattr(app, 'jobtastic_result_backend'):
    # Initialize the cache object using Celery's code
    jobtastic_cache = get_backend_by_url(app.conf.JOBTASTIC_CACHE)
    app.jobtastic_cache = jobtastic_cache

Later, when we need to use what is now the cache object, like in this case:

app.jobtastic_cache.set('foo', 'bar')

Thoughts?

@jvanasco
Copy link
Author

i like the idea of custom configurations via the jobtastic cache. i didn't like the idea at first, as we use postgres for the result backends and I'd rather use memcached/redis for this... but the custom config solves that.

@winhamwr
Copy link
Contributor

we use postgres for the result backends and I'd rather use memcached/redis for this

Yup. I bet that is a common situation. #35 is another similar problem this would help. I think configuration like this might even make #2 easier.

The earliest time I would have to start working on this would be next week, but that's not very likely. I am very excited about this idea, though and thanks for walking through it with me. Pull requests are certainly welcome :)

-Wes

@rhunwicks
Copy link
Contributor

@winhamwr I need to get JobTastic to use a separate cache from the Django default one because we want to switch our default cache from shared redis to local redis for performance reasons. This seems like the best approach to fixing it. Are you still interested in a pull request?

@rhunwicks
Copy link
Contributor

One problem with using the Celery Result Backend(s) as the cache is that the Result Backend doesn't expose the expiry/timeout as a parameter to set() - it is set for the Backend instance as a whole and read from self.expires, and you can't use the underlying client very easily because of the different syntax that the Memcache and Redis clients use.

@winhamwr
Copy link
Contributor

Are you still interested in a pull request?

Very much so, yes.

Does the jobtastic_cache API I proposed seem like it would meet your use case?

@rhunwicks
Copy link
Contributor

Yes, perfectly - if I can work out how to use the Celery Backends with a custom expiry on each set() command.

@rhunwicks
Copy link
Contributor

After examining the Celery Backends code, I have concluded that:

  1. We can't use the Celery Backend without changing the Celery to allow the expiry to be passed to set() as a parameter
  2. Even if we could get that change accepted upstream, it would make JobTastic dependent on very recent Celery, and would probably inconvenience a lot of existing users.
  3. Therefore reusing the Celery backends, however attractive, is not going to work.

I think our next best alternative is to create a jobtastic.backends.BaseBackend that defines an API suitable for use by JobTastic (get/set/delete/lock (as a context manager)) and some specific implementations (Django Cache or Flask with Redis or Memcache) and a way to set the correct one using the Celery app.conf.

That way, we can deliver the existing functionality with a better API to allow users to provide a custom cache class if necessary.

@rhunwicks
Copy link
Contributor

rhunwicks commented Jul 12, 2016

@winhamwr given that we can't use the Celery backends directly because we can't control the timeout on the set(), I have create a new cache module. It contains get_cache which attempts to find a cache from the Celery app.conf - using a JOBTASTIC_CACHE setting if it exists but falling back to the Django default cache if Django is installed, or the default Celery backend if Werkzeug is installed and the Celery default backend is using Memcached or Redis. This is roughly the same as the existing functionality (except that Werkzeug + Redis should work).

Because the JOBTASTIC_CACHE setting can be a subclass of jobtastic.cache.BaseCache it should be easy for users to add new cache providers.

There is also a jobtastic.cache.WrappedCache which should be capable of wrapping any cache provider that supports get/set (with timeout)/delete and either an atomic add or a distributed lock.

As part of this I have incorporated the locking mechanism described in #6.

I also think it would help with #59, although if Task-level cache busting is required then the solution there is preferable and could also be incorporated.

My initial code is at kimetrica@8fe5133 - please can you take a look and give some feedback. I'll look at tests, etc. once I know you are happy with the general approach.

@winhamwr
Copy link
Contributor

@rhunwicks your approach sounds perfect. A great combination of backwards-compatibility and forward flexibility. I also like the movement of the locking logic to the cache module. 👍 from me

@rhunwicks
Copy link
Contributor

@winhamwr should all tox tests pass? Even running in @master I get errors on all the py27-celery31 test runs, regardless of the Django version.

@winhamwr
Copy link
Contributor

should all tox tests pass?

Only the combinations specified in .travis.yml should pass, but I think that should include py27-celery31.

At this point, there's no reason to keep supporting all of the super-old versions of Django and celery and I'm sorry that you've had to deal with that. I'll see if I can throw together a PR to drop support for py2.6 and Django before 1.8 (the current LTS).

@winhamwr
Copy link
Contributor

Looks like there are some non-obvious problems with the build. I'll have to take another run at fixing it tomorrow. I have #62 going as a PR to drop support for all of the old versions.

@rhunwicks
Copy link
Contributor

I'd prefer to leave support for Django 1.6 / Python 2.7 for the time being - and I have coded this issue such that both 1.6 and 1.7+ should work without any deprecation warning.

@winhamwr
Copy link
Contributor

I'd prefer to leave support for Django 1.6 / Python 2.7 for the time being

That's enough to convince me. Is dropping support for Celery earlier than 3.1 acceptable?

@rhunwicks
Copy link
Contributor

It is to me.

On Thu, Jul 14, 2016 at 9:36 PM, Wes Winham [email protected]
wrote:

I'd prefer to leave support for Django 1.6 / Python 2.7 for the time being

That's enough to convince me. Is dropping support for Celery earlier than
3.1 acceptable?


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#37 (comment),
or mute the thread
https://github.com/notifications/unsubscribe/ABQav_FvUqLPuJ73W_xwtCOH_5DAUq4fks5qVo_ZgaJpZM4BhL2Y
.

@winhamwr
Copy link
Contributor

If you merge master in to your branch, you should now have a passing tox run. Sorry about that.

@rhunwicks
Copy link
Contributor

@winhamwr I've merged master into my branch and all tox runs pass.

Do you have any ideas on how to write tests to check that the cache config is being picked up?

I have tried to write tests with CELERY_ALWAYS_EAGER=False so that I can inspect the cache to see that the default cache is empty, and the cache specified by JOBTASTIC_CACHE contains the thundering herd key, but doing so leaves stuff in the cache and makes the other tests fail.

Because of the way tox runs I don't seem to be able to drop into pdb during a test run. Can you give me a pointer on how to run the tests outside of tox.

@rhunwicks
Copy link
Contributor

As an example, here is a new file jobtastic/tests/test_settings.py that should check that the Django default cache is used by default, and that the JOBTASTIC_CACHE setting can be used to override the cache used.

from __future__ import print_function
from unittest import skipIf
from celery import Celery, states
import django
from django.conf import settings
from django.core.cache import caches
from django.test import TestCase, override_settings
from .test_broker_fallbacks import ParrotTask


@skipIf(django.VERSION < (1, 7),
        "Django < 1.7 doesn't support django.core.cache.caches")
class DjangoSettingsTestCase(TestCase):

    def setUp(self):
        self.task = ParrotTask
        self.kwargs = {'result': 42}
        self.cache_key = self.task._get_cache_key(**self.kwargs)

    def test_sanity(self):
        # The task actually runs
        with self.settings(CELERY_ALWAYS_EAGER=True):
            async_task = self.task.delay(result=1)
        self.assertEqual(async_task.status, states.SUCCESS)
        self.assertEqual(async_task.result, 1)

    @override_settings(CELERY_ALWAYS_EAGER=False)
    def test_default_django_cache(self):
        app = Celery()
        app.config_from_object(settings)
        ParrotTask.bind(app)
        app.finalize()
        async_task = self.task.delay(**self.kwargs)
        self.assertEqual(async_task.status, states.PENDING)
        self.assertTrue('lock:%s' % self.cache_key in caches['default'])

    @override_settings(CELERY_ALWAYS_EAGER=False,
                       JOBTASTIC_CACHE='shared',
                       CACHES = {'default': {'BACKEND': 'django.core.cache.backends.locmem.LocMemCache',
                                             'LOCATION': 'default'},
                                 'shared': {'BACKEND': 'django.core.cache.backends.locmem.LocMemCache',
                                             'LOCATION': 'shared'}})
    def test_custom_django_cache(self):
        app = Celery()
        app.config_from_object(settings)
        ParrotTask.bind(app)
        app.finalize()
        async_task = self.task.delay(**self.kwargs)
        self.assertEqual(async_task.status, states.PENDING)
        self.assertTrue('lock:%s' % self.cache_key in caches['shared'])
        self.assertTrue('lock:%s' % self.cache_key not in caches['default'])

If I run this test file against current master, I am expecting the test for the JOBTASTIC_CACHE setting to fail (because that code isn't in master), but everything else should work.

If I run this code manually, it works:

source .tox/py35-django1.8.X-celery3.1.X/bin/activate
cd test_projects/django/
DJANGO_SETTINGS_MODULE=testproj.settings nosetests ../../jobtastic/tests  --verbosity=3

...

FAIL: test_custom_django_cache (jobtastic.tests.test_settings.DjangoSettingsTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/roger/git/jobtastic/.tox/py27-django1.8.X-celery3.1.X/local/lib/python2.7/site-packages/django/test/utils.py", line 196, in inner
    return test_func(*args, **kwargs)
  File "/home/roger/git/jobtastic/jobtastic/tests/test_settings.py", line 50, in test_custom_django_cache
    self.assertTrue('lock:%s' % self.cache_key in caches['shared'])
AssertionError: False is not true

...

Ran 23 tests in 3.997s

FAILED (failures=1)

But if I run it using tox it all fails:

tox -e py27-django1.8.X-celery3.1.XGLOB sdist-make: /home/roger/git/jobtastic/setup.py
py27-django1.8.X-celery3.1.X inst-nodeps: /home/roger/git/jobtastic/.tox/dist/jobtastic-1.0.0a1.zip
py27-django1.8.X-celery3.1.X installed: amqp==1.4.9,anyjson==0.3.3,billiard==3.3.0.23,celery==3.1.23,Django==1.8.13,django-nose==1.4.3,django-picklefield==0.3.1,funcsigs==1.0.2,jobtastic==1.0.0a1,kombu==3.0.35,linecache2==1.0.0,mock==2.0.0,nose==1.3.7,pbr==1.10.0,psutil==3.4.2,pytz==2016.6.1,six==1.10.0,traceback2==1.4.0,unittest2==1.1.0
py27-django1.8.X-celery3.1.X runtests: PYTHONHASHSEED='869036002'
py27-django1.8.X-celery3.1.X runtests: commands[0] | /home/roger/git/jobtastic/.tox/py27-django1.8.X-celery3.1.X/bin/python setup.py test
running test
nosetests /home/roger/git/jobtastic/test_projects/django/testproj/../../../jobtastic/tests  --verbosity=1
Creating test database for alias 'default'...
......F...FFFFFF........
======================================================================
FAIL: test_custom_django_cache (jobtastic.tests.test_settings.DjangoSettingsTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/roger/git/jobtastic/.tox/py27-django1.8.X-celery3.1.X/local/lib/python2.7/site-packages/django/test/utils.py", line 196, in inner
    return test_func(*args, **kwargs)
  File "/home/roger/git/jobtastic/jobtastic/tests/test_settings.py", line 50, in test_custom_django_cache
    self.assertTrue('lock:%s' % self.cache_key in caches['shared'])
AssertionError: False is not true
-------------------- >> begin captured logging << --------------------
root: INFO: Current status: PENDING
root: INFO: Setting herd-avoidance cache for task: jobtastic.tests.test_broker_fallbacks.ParrotTask:a1d0c6e83f027327d8461063f4ac58a6
--------------------- >> end captured logging << ---------------------

======================================================================
FAIL: test_delay_or_eager_bad_channel (jobtastic.tests.test_broker_fallbacks.BrokenBrokerTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/roger/git/jobtastic/jobtastic/tests/test_broker_fallbacks.py", line 148, in test_delay_or_eager_bad_channel
    self.assertEqual(async_task.status, states.SUCCESS)
AssertionError: 'PENDING' != 'SUCCESS'
-------------------- >> begin captured logging << --------------------
root: INFO: Current status: PENDING
root: INFO: Setting herd-avoidance cache for task: jobtastic.tests.test_broker_fallbacks.ParrotTask:02e74f10e0327ad868d138f2b4fdd6f0
--------------------- >> end captured logging << ---------------------

======================================================================
FAIL: test_delay_or_eager_bad_connection (jobtastic.tests.test_broker_fallbacks.BrokenBrokerTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/roger/git/jobtastic/jobtastic/tests/test_broker_fallbacks.py", line 142, in test_delay_or_eager_bad_connection
    self.assertEqual(async_task.status, states.SUCCESS)
AssertionError: 'PENDING' != 'SUCCESS'
-------------------- >> begin captured logging << --------------------
root: INFO: Current status: PENDING
root: INFO: Setting herd-avoidance cache for task: jobtastic.tests.test_broker_fallbacks.ParrotTask:02e74f10e0327ad868d138f2b4fdd6f0
--------------------- >> end captured logging << ---------------------

======================================================================
FAIL: test_delay_or_fail_bad_channel (jobtastic.tests.test_broker_fallbacks.BrokenBrokerTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/roger/git/jobtastic/.tox/py27-django1.8.X-celery3.1.X/local/lib/python2.7/site-packages/mock/mock.py", line 1305, in patched
    return func(*args, **keywargs)
  File "/home/roger/git/jobtastic/jobtastic/tests/test_broker_fallbacks.py", line 125, in test_delay_or_fail_bad_channel
    self.assertEqual(async_task.status, states.FAILURE)
AssertionError: 'PENDING' != 'FAILURE'
-------------------- >> begin captured logging << --------------------
root: INFO: Current status: PENDING
root: INFO: Setting herd-avoidance cache for task: jobtastic.tests.test_broker_fallbacks.ParrotTask:c4ca4238a0b923820dcc509a6f75849b
--------------------- >> end captured logging << ---------------------

======================================================================
FAIL: test_delay_or_fail_bad_connection (jobtastic.tests.test_broker_fallbacks.BrokenBrokerTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/roger/git/jobtastic/.tox/py27-django1.8.X-celery3.1.X/local/lib/python2.7/site-packages/mock/mock.py", line 1305, in patched
    return func(*args, **keywargs)
  File "/home/roger/git/jobtastic/jobtastic/tests/test_broker_fallbacks.py", line 119, in test_delay_or_fail_bad_connection
    self.assertEqual(async_task.status, states.FAILURE)
AssertionError: 'PENDING' != 'FAILURE'
-------------------- >> begin captured logging << --------------------
root: INFO: Current status: PENDING
root: INFO: Setting herd-avoidance cache for task: jobtastic.tests.test_broker_fallbacks.ParrotTask:c4ca4238a0b923820dcc509a6f75849b
--------------------- >> end captured logging << ---------------------

======================================================================
FAIL: test_delay_or_run_bad_channel (jobtastic.tests.test_broker_fallbacks.BrokenBrokerTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/roger/git/jobtastic/jobtastic/tests/test_broker_fallbacks.py", line 136, in test_delay_or_run_bad_channel
    self.assertTrue(was_fallback)
AssertionError: False is not true
-------------------- >> begin captured logging << --------------------
root: INFO: Current status: PENDING
root: INFO: Setting herd-avoidance cache for task: jobtastic.tests.test_broker_fallbacks.ParrotTask:02e74f10e0327ad868d138f2b4fdd6f0
--------------------- >> end captured logging << ---------------------

======================================================================
FAIL: test_delay_or_run_bad_connection (jobtastic.tests.test_broker_fallbacks.BrokenBrokerTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/roger/git/jobtastic/jobtastic/tests/test_broker_fallbacks.py", line 130, in test_delay_or_run_bad_connection
    self.assertTrue(was_fallback)
AssertionError: False is not true
-------------------- >> begin captured logging << --------------------
root: INFO: Current status: PENDING
root: INFO: Setting herd-avoidance cache for task: jobtastic.tests.test_broker_fallbacks.ParrotTask:02e74f10e0327ad868d138f2b4fdd6f0
--------------------- >> end captured logging << ---------------------

----------------------------------------------------------------------
Ran 24 tests in 0.160s

FAILED (failures=7)
Destroying test database for alias 'default'...

@winhamwr
Copy link
Contributor

doing so leaves stuff in the cache and makes the other tests fail.

Both of these options are kind of hacky, but probably worth doing:

  • We could write a setUp/tearDown methods to explicitly clear the cache
  • We could also use a per-test custom cache prefix

@winhamwr
Copy link
Contributor

Another hacky way to change the settings might be to define different testproj.settings_foo modules (e.g. settings_with_custom_django_cache.py), then modify the testrunner to switch between different DJANGO_SETTINGS_MODULE's. This would be the most hacky of the 3 options, though, and I hope we can avoid it.

@rhunwicks
Copy link
Contributor

What I can't understand is why it works when I run the tests by activating the virtualenv manually and then running the nosetests command - but doesn't work when I run tox -e py27-django1.8.X-celery3.1.X.

What's different about running through tox?

@winhamwr
Copy link
Contributor

What's different about running through tox?

Maybe it's some of the other requirements that tox installs/loads? Perhaps in test.txt?

rhunwicks added a commit to kimetrica/jobtastic that referenced this issue Jul 21, 2016
@rhunwicks
Copy link
Contributor

It was to do with state being remembered in ParrotTask which was imported from another module. I switched to defining the task in setUp() and it all works. I have created a PR,

caffodian added a commit that referenced this issue Sep 1, 2016
@winhamwr
Copy link
Contributor

winhamwr commented Sep 1, 2016

Thanks so much @rhunwicks! We just merged in #63 and cut a Jobtastic 1.0 release.

@winhamwr winhamwr closed this as completed Sep 1, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants