Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KILL-TASKS and nested tasks #32

Open
i-perminov opened this issue May 10, 2017 · 7 comments
Open

KILL-TASKS and nested tasks #32

i-perminov opened this issue May 10, 2017 · 7 comments

Comments

@i-perminov
Copy link

When KILL-TASKS is used with nested pmap calls, it sometimes kills unrelated tasks. It is not unexpected, because KILL-TASKS destroys threads, but in some cases it is inconvenient.
What are the reasons for using DESTROY-THREAD in KILL-TASKS? It probably would be possible to use INTERRUPT-THREAD to abort a task without killing its thread.
Here is an example that show the problem:
The following code does not signal any errors

(handler-case 
    (let ((*task-category* :my-tasks))
      (pmapcar (lambda (i) 
                 (if (= i 0)
                     (progn
                       (sleep 0.1)
                       (kill-tasks :my-tasks))
                     (sleep 10))) :parts 2 '(0 1)))
  (task-killed-error ()
    nil))

but if it is wrapped with another pmapcar, it signals TASK-KILLED-ERROR

(pmapcar (lambda (x) 
           (handler-case 
               (let ((*task-category* :my-tasks))
                 (pmapcar (lambda (i) 
                            (if (= i 0)
                                (progn
                                  (sleep 0.1)
                                  (kill-tasks :my-tasks))
                                (sleep 10))) :parts 2 '(0 1)))
             (task-killed-error ()
               nil)))
         '(0))
@lmj
Copy link
Owner

lmj commented May 10, 2017

Interesting, all tests pass after replacing in kill.lisp

(destroy-thread (thread worker))

with

(bt:interrupt-thread
 (thread worker)
 (lambda ()
   (invoke-restart 'transfer-error (make-condition 'task-killed-error))))

For each lisp implementation I would have to check that interrupt-thread respects unwind-protect whenever destroy-thread does. I expect that's true, and if so, this looks like a definite improvement with no drawbacks that I can see. My concern, though, would be that some people may be depending upon the destroy-thread behavior.

I realize your example is just for demonstration purposes, but using pmapcar inside a worker thread is wasteful because it ties up that worker thread while it waits for other worker threads. One improvement I've been contemplating is making pmap et al aware of when they are being called inside a worker, as functions defined with defpun do.

@i-perminov
Copy link
Author

i-perminov commented May 10, 2017

I do not think pmapcar inside a worker wastes a thread, your nice work stealing logic seems to prevent the problem.

@lmj
Copy link
Owner

lmj commented May 10, 2017

Ha, you're right: there it is in kernel-util.lisp: steal-until-receive-result. I completely forgot that I fixed that. There's even lparallel-test::cognate-steal-test. Thanks for the info!

@i-perminov
Copy link
Author

kill-tasks only kills active tasks, is there a good way to cancel queued tasks? For example, if a pmap produces more tasks than there are available workers, kill-tasks is not sufficient to abort the pmap cleanly. Is there a way to do anything better than

(defun abortable-pmapcar (func &rest rest)
  (let ((*task-category* (cons nil nil))
        (aborted nil))
    (unwind-protect-case ()
        (apply #'pmapcar (lambda (&rest args)
                           (if aborted
                               (invoke-transfer-error (make-condition 'task-killed-error))
                               (apply func args)))
               rest)
      (:abort
       (setf aborted t)
       (kill-tasks *task-category*)))))

@lmj
Copy link
Owner

lmj commented May 11, 2017

It's come up before, and I have a couple private branches which have defmacro with-tasks-flushed-on-abort (&key kill &body body). The first method of doing this is more or less what you have above, wrapping submitted tasks with some logic.

The second method flushes the internal lockless queues (called spin queues in lparallel), which also lets us write defun flush-tasks (task-category &key kill). However since the moment-to-moment state of spin queues can be quite volatile, flushing tasks this way could unexpectedly end up reordering the non-flushed tasks. (Enqueue and dequeue are the only operations you have; you can't lock the queue, replace its contents, then unlock it. So you have to remove the tasks one by one, then enqueue the unflushed tasks, during which time the queue may have changed. Tasks aren't lost, just possibly reordered.)

@i-perminov
Copy link
Author

Thanks for the explanation. The second method depends a lot on internals of lparallel, so I will use the first one for now.
Did you consider adding an extension point to allow users to put arbitrary wrappers around tasks? E.g. each time a task lambda is created it is passed to a function specified in a variable (that users can set), and the function result is used instead of the task lambda. A version of with-kill-on-about that also cancels queued tasks would look like

(defmacro with-kill-on-abort (&body body)
  `(let ((aborted nil)
         (*wrap-task-func* (lambda (thunk)
                             (lambda ()
                               (if aborted
                                   (invoke-transfer-error (make-condition 'task-killed-error))
                                   (funcall thunk)))))
         ...)
     (unwind-protect-case ()
         (progn ,@body)
       (:abort
        (setf aborted t)
        ...))))

I am aware of 3 use cases where such an extension point would be useful: the one discussed above, passing values of special variables from a caller of pmap to tasks, and pmap/wait (we discussed it in issue 18).

@lmj
Copy link
Owner

lmj commented May 12, 2017

Yes, successively wrapping lambdas around tasks is the natural way to do this. There is already one such lambda wrapping in lparallel, namely the rebinding of the handlers established with task-handler-bind. In my kill-tasks-on-abort branch I do something similar to what you describe, but I hadn't considered adding the dynamic variable to the API. I think it's a good idea, thanks! Like the :context hook in make-kernel it would be an "advanced use" feature -- the user is screwed if the parameter isn't funcalled. Though I suppose the hook could be used to ignore any submitted tasks as well, or (deepening the level of trickery) execute something else.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants