ForkIO and coroutines in Haskell

I'm trying to understand Coroutines, but I don't quite understand their purpose given the existence of threads with forkIO. What use cases require using coroutines on streams?

+3




2 answers


It is a bit unclear from your question whether you are asking about a specific Haskell coroutine implementation (if so, please add a link) or about the general concept.

Using forkIO plus some inter-thread communication is one way to implement coroutines. The advantage is that this way you can take advantage of multiple processors/cores, but in my opinion there are several disadvantages:

  • Concurrency is explicit and lives in IO, so all your computations must be done in the IO monad.
  • You must explicitly implement cross-thread communication.
  • You need to take care of starting threads and, more importantly, of shutting them down, and of preventing starvation and deadlocks.
  • The architecture is (obviously) multi-threaded. In some cases this can be a disadvantage: for example, you might want your computation to be pure, deterministic and single-threaded, but still use the concept of coroutines.

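To make the first two bullets concrete, here is a hedged sketch (all the names are mine) of a yield-style factorial rebuilt on forkIO plus a Chan: the worker "yields" by writing to the channel, and everything is forced into IO.

```haskell
import Control.Concurrent

-- A forkIO-based analogue of a yielding coroutine: the worker "yields"
-- intermediate values by writing Left to a channel and delivers its
-- final result as Right.  All of it is necessarily tied to IO.
factorialThread :: Integer -> Chan (Either Integer Integer) -> IO ()
factorialThread n0 chan = loop 1 n0
  where
    loop r 0 = writeChan chan (Right r)   -- final result
    loop r n = do
      writeChan chan (Left n)             -- "yield" the loop counter
      loop (r * n) (n - 1)

-- The consumer end: report each yielded value, return the final result.
runFactorialThreaded :: Integer -> IO Integer
runFactorialThreaded n = do
  chan <- newChan
  _ <- forkIO (factorialThread n chan)
  let consume = do
        msg <- readChan chan
        case msg of
          Left i  -> print i >> consume
          Right r -> return r
  consume
```

Note also that there is no back-pressure here: unlike the Request variant further down, the worker can race arbitrarily far ahead of the consumer unless you add an explicit reply channel.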
I will assume the rest of your question is about the Coroutine implementation from the monad-coroutine package (the one imported below).

Let me give you a small example. Suppose we want to compute large factorials, but since the computation can take too long, we want it to pause after every loop iteration so that we can give some feedback to the user. Moreover, we want it to report how many iterations are left:

import Control.Monad
import Control.Monad.Coroutine
import Control.Monad.Coroutine.SuspensionFunctors
import Control.Parallel
import Data.Functor.Identity

-- A helper function, a monadic version of 'pogoStick':

-- | Runs a suspendable 'Coroutine' to its completion.
pogoStickM :: Monad m => (s (Coroutine s m x) -> m (Coroutine s m x))
                      -> Coroutine s m x -> m x
pogoStickM spring c = resume c >>= either (pogoStickM spring <=< spring) return


factorial1 :: (Monad m) => Integer -> Coroutine (Yield Integer) m Integer
factorial1 = loop 1
  where
    loop r 0 = return r
    loop r n = do
                  let r' = r * n
                  r' `par` yield n
                  (r' `pseq` loop r') (n - 1)


run1 :: IO ()
run1 = pogoStickM (\(Yield i c) -> print i >> return c) (factorial1 20) >>= print


Now suppose we realize that yielding after every single iteration is too inefficient. Instead, we want the caller to decide how many iterations we compute before pausing again. To do this, we simply replace the Yield functor with Request:

factorial2 :: (Monad m) => Integer
                        -> Coroutine (Request Integer Integer) m Integer
factorial2 n = loop 1 n n
  where
    loop r t 0 = return r
    loop r t n | t >= n       = r' `par` request n >>= rec
               | otherwise    = rec t
      where
        rec t' = (r' `pseq` loop r') t' (n - 1)
        r' = r * n

run2 :: IO ()
run2 = pogoStickM (\(Request i c) -> print i >> return (c (i - 5)))
                  (factorial2 30)
       >>= print


While our runners run1 and run2 are based on IO, the factorial computations themselves are pure and work in any monad (including Identity).
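To illustrate the purity claim, here is a hedged, self-contained miniature of a Yield-style coroutine (it mimics the monad-coroutine names, but it is my own simplified re-creation, not the package's actual code), together with a pure driver in the Identity monad:

```haskell
import Data.Functor.Identity

-- A computation that either finishes with an r, or suspends inside the
-- functor s, carrying the rest of the coroutine.
newtype Coroutine s m r =
  Coroutine { resume :: m (Either (s (Coroutine s m r)) r) }

data Yield x y = Yield x y

instance Functor (Yield x) where
  fmap f (Yield x y) = Yield x (f y)

instance (Functor s, Monad m) => Functor (Coroutine s m) where
  fmap f (Coroutine m) = Coroutine (fmap go m)
    where
      go (Right r)  = Right (f r)
      go (Left sus) = Left (fmap (fmap f) sus)

instance (Functor s, Monad m) => Applicative (Coroutine s m) where
  pure = Coroutine . return . Right
  cf <*> cx = cf >>= \f -> fmap f cx

instance (Functor s, Monad m) => Monad (Coroutine s m) where
  Coroutine m >>= f = Coroutine $ do
    step <- m
    case step of
      Right r  -> resume (f r)
      Left sus -> return (Left (fmap (>>= f) sus))

yield :: Monad m => x -> Coroutine (Yield x) m ()
yield x = Coroutine (return (Left (Yield x (return ()))))

-- Run to completion purely, collecting every yielded value.
runPure :: Coroutine (Yield x) Identity r -> ([x], r)
runPure c = case runIdentity (resume c) of
  Right r           -> ([], r)
  Left (Yield x c') -> let (xs, r) = runPure c' in (x : xs, r)

factorialY :: Monad m => Integer -> Coroutine (Yield Integer) m Integer
factorialY = loop 1
  where
    loop r 0 = return r
    loop r n = yield n >> loop (r * n) (n - 1)
```

For example, runPure (factorialY 5) evaluates to ([5,4,3,2,1], 120) with no IO involved.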

Nevertheless, thanks to Haskell's parallelism, we perform the pure computation in parallel with the reporting code: before yielding from the coroutine, we create a spark with par that computes the multiplication step.

And, perhaps most importantly, the types ensure that the coroutines cannot misbehave. There is no way for them to deadlock: every yield or request is paired with a matching response (and if the caller decides not to resume the coroutine, it is simply garbage collected; no thread is left blocked).

+5




There are no use cases that require them: anything you can do with coroutines, you can do with forkIO plus a communication channel. In fact, I believe Go (a language where, as in Haskell, concurrency is very cheap) avoids coroutines altogether and does everything with lightweight concurrent threads ("goroutines").

However, forkIO is sometimes overkill. Sometimes you don't need concurrency at all: you just want to decompose a problem into conceptually separate streams of instructions that yield to one another at explicitly defined points.



Consider the problem of reading from one file and writing to another. Instead of writing one monolithic nested loop, a more reusable solution is to compose a file-reading coroutine with a file-writing coroutine. If you later decide to print the file to the screen instead, you don't need to change the reading coroutine at all; you only compose it differently. But note that this has nothing to do with concurrency: it is about separation of concerns and reuse.
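A hedged sketch of that decomposition (the Stream type and all names here are mine, not a real library's API): one source coroutine reads lines, and two different consumers are composed with it.

```haskell
-- A minimal stream/coroutine type: it either halts, or emits a value
-- together with the action that produces the rest of the stream.
data Stream m a = Halt | Emit a (m (Stream m a))

-- Source coroutine: the lines of a file, emitted one at a time.
fileLines :: FilePath -> IO (Stream IO String)
fileLines path = fromList . lines <$> readFile path
  where
    fromList []       = Halt
    fromList (x : xs) = Emit x (return (fromList xs))

-- Generic driver: feed every emitted value to a consumer action.
forEach :: Monad m => Stream m a -> (a -> m ()) -> m ()
forEach Halt _          = return ()
forEach (Emit x next) f = f x >> next >>= \s -> forEach s f

-- Two different compositions of the SAME reading coroutine:
copyLines :: FilePath -> FilePath -> IO ()
copyLines src dst = do
  writeFile dst ""                            -- truncate the target
  s <- fileLines src
  forEach s (\l -> appendFile dst (l ++ "\n"))

printLines :: FilePath -> IO ()
printLines src = fileLines src >>= \s -> forEach s putStrLn
```

Switching from copying to printing touches only the consumer; fileLines is reused unchanged.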

+4

