Django: celery vs. channels vs. view

Suppose I have a button on my page that, when clicked, triggers an ajax request to my api endpoint, which then fetches data from a third-party site. Let's say this task takes about 2-5 seconds, with a timeout after 5 seconds. What's the ideal way to do this:

  • celery task.delay() on the api endpoint, returning a URL to poll at x-second intervals for the result
  • just do it in the view

All the tutorials I've seen suggest the celery path, but this seems like a lot of machinery / overhead for a simple query with minimal processing. Is there a generally accepted threshold (seconds to complete, etc.) for choosing one over the other?

Then there is django channels, which seems perfect for this. But at first glance, the distinction between channel workers and celery tasks seems blurred. Can I replace celery with channel workers and just use those for the above task? Will channels handle my longer-running tasks? What would be the advantages / disadvantages of channels (alongside celery, or as a celery substitute)?

Finally, which of the 3 (celery / channels / view) would be the recommended approach for the given example scenario?

+3




1 answer


I'm not a channels specialist, but here we go.

Channels is an abstraction on top of WSGI (via a new protocol, ASGI) that lets you communicate over "abstract" channels. Sometimes you will be doing HTTP, sometimes websockets, sometimes other things; the idea is that you can model just about any communication pattern.

Celery is built in a similar way: it uses a message bus (or a more complex broker mechanism, depending on how you run it) to send work to a worker machine, which can optionally send results back.

Now what do you choose?

In the view

I would avoid this unless you have a view specifically designed for the purpose. You will need to make sure your stack can handle long-lived connections (the heroku router will complain if a request takes longer than 30 seconds, for example), or you might want to implement some kind of long-polling interface.

With celery

You will need to do all the plumbing to get the information back to the client.

Getting a task's result back requires a result backend, and you will need to pass the task ID around.
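Configuring a result backend is a settings concern; a minimal sketch, assuming the common Django integration pattern where Celery reads `CELERY_`-prefixed settings (the Redis URLs below are placeholder values, not from the answer):

```python
# settings.py (illustrative values; any supported broker/backend works)
CELERY_BROKER_URL = "redis://localhost:6379/0"      # message bus for dispatching tasks
CELERY_RESULT_BACKEND = "redis://localhost:6379/1"  # where task states/results are stored
```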

You will need to implement a view that can query celery to figure out whether the task is pending, in progress, completed successfully, etc.

Eg.

# kick off the task somewhere

def create_task(request, *args, **kwargs):
    # .delay() returns an AsyncResult; send its id back to the client
    task_id = some_task.delay(param).id
    return Response({'task_id': task_id})
urls.py

url(r'^tasks/(?P<task_id>[\w-]+)/$', task_progress_view, name='task-progress'),
views.py

def task_progress_view(request, task_id):
    # get fancier here, this is just an example
    return Response({'state': some_task.AsyncResult(task_id).state})
This is a very basic example, but it should give you a starting point.

With channels

You need to set up the bus and wire up essentially the same views as with celery; the difference is that you will have to write the piece of code that actually fetches the data, with retries, timeouts, etc.

What to choose

Celery will take care of the worker part; you will have to take care of status updates and informing your client. Channels would be a neat way to handle that back and forth, but you may not need it.

I would think about what else you need to do. Most applications need to run something asynchronously at some point, because the business logic often demands it. If you plan on using websockets etc. but you don't want to break your django app into services, I would just bite the bullet and run both.

If you don't need more than one communication protocol, just use celery and poll from your views.

+1

