Haskell: How to timeout a function that runs an ex

2019-03-18 05:21发布

问题:

I call an external program inside a function. Now i would like to timeout this function and not just the external program. But after the function times out, the external program is still running on my computer (i'm using debian) until it finishes its computation, after that its thread still remains in the process table as a subthread of my main program until the main program terminates.

Here are two minimal examples which illustrates what i would like to do. The first uses unsafePerformIO, the second is completely in the IO monad. I don't really depend on the unsafePerformIO but would like to keep it if possible. The described problem occures with and without it.

With unsafePerformIO

module Main where

import System.Timeout
import Criterion.Measurement
import System.IO.Unsafe
import System.Process

main = do
    x <- time $ timeoutP (1 * 1000000) $ mytest 2
    y <- getLine
    putStrLn $ show x ++ y

timeoutP :: Int -> a -> IO (Maybe a)
timeoutP t fun = timeout t $ return $! fun

mytest :: Int -> String
mytest n =
  let
    x = runOnExternalProgram $ n * 1000
  in
    x ++ ". Indeed."

runOnExternalProgram :: Int -> String
runOnExternalProgram n = unsafePerformIO $ do
    -- convert the input to a parameter of the external program
    let x = show $ n + 12
    -- run the external program
    -- (here i use "sleep" to indicate a slow computation)
    answer <- readProcess "sleep" [x] ""
    -- convert the output as needed
    let verboseAnswer = "External program answered: " ++ answer
    return verboseAnswer

Without unsafePerformIO

module Main where

import System.Timeout
import Criterion.Measurement
import System.IO.Unsafe
import System.Process

main = do
    x <- time $ timeout (1 * 1000000) $ mytest 2
    y <- getLine
    putStrLn $ show x ++ y

mytest :: Int -> IO String
mytest n = do
    x <- runOnExternalProgram $ n * 1000
    return $ x ++ ". Indeed."

runOnExternalProgram :: Int -> IO String
runOnExternalProgram n = do
    -- convert the input to a parameter for the external program:
    let x = show $ n + 12
    -- run the external program
    -- (here i use "sleep" to indicate a slow computation):
    answer <- readProcess "sleep" [x] ""
    -- convert the output as needed:
    let verboseAnswer = "External program answered: " ++ answer
    return verboseAnswer

Maybe bracket can be of help here, but i don't really know how.

Edit: I adopted John L's answer. Now i am using the following:

import Control.Concurrent
import Control.Exception
import System.Exit
import System.IO
import System.IO.Error
import System.Posix.Signals
import System.Process
import System.Process.Internals

safeCreateProcess :: String -> [String] -> StdStream -> StdStream -> StdStream
                  -> ( ( Maybe Handle
                       , Maybe Handle
                       , Maybe Handle
                       , ProcessHandle
                       ) -> IO a )
                  -> IO a
safeCreateProcess prog args streamIn streamOut streamErr fun = bracket
    ( do
        h <- createProcess (proc prog args) 
                 { std_in  = streamIn
                 , std_out = streamOut
                 , std_err = streamErr
                 , create_group = True }
        return h
    )
-- "interruptProcessGroupOf" is in the new System.Process. Since some
-- programs return funny exit codes i implemented a "terminateProcessGroupOf".
--    (\(_, _, _, ph) -> interruptProcessGroupOf ph >> waitForProcess ph)
    (\(_, _, _, ph) -> terminateProcessGroup ph >> waitForProcess ph)
    fun
{-# NOINLINE safeCreateProcess #-}

safeReadProcess :: String -> [String] -> String -> IO String
safeReadProcess prog args str =
    safeCreateProcess prog args CreatePipe CreatePipe Inherit
      (\(Just inh, Just outh, _, ph) -> do
        hPutStr inh str
        hClose inh
        -- fork a thread to consume output
        output <- hGetContents outh
        outMVar <- newEmptyMVar
        forkIO $ evaluate (length output) >> putMVar outMVar ()
        -- wait on output
        takeMVar outMVar
        hClose outh
        return output
-- The following would be great, if some programs did not return funny
-- exit codes!
--            ex <- waitForProcess ph
--            case ex of
--                ExitSuccess -> return output
--                ExitFailure r ->
--                    fail ("spawned process " ++ prog ++ " exit: " ++ show r)
      )

terminateProcessGroup :: ProcessHandle -> IO ()
terminateProcessGroup ph = do
    let (ProcessHandle pmvar) = ph
    ph_ <- readMVar pmvar
    case ph_ of
        OpenHandle pid -> do  -- pid is a POSIX pid
            signalProcessGroup 15 pid
        otherwise -> return ()

This solves my problem. It kills all child processes of the spawned process and that at the right time.

Kind regards.

回答1:

Edit: it is possible to get the pid of the spawned process. You can do so with code like the following:

-- highly non-portable, and liable to change between versions
import System.Process.Internals

-- from the finalizer of the bracketed function
-- `ph` is a ProcessHandle as returned by createProcess
  (\(_,_,_,ph) -> do
    let (ProcessHandle pmvar) = ph
    ph_ <- takeMVar pmvar
    case ph_ of
      OpenHandle pid -> do  -- pid is a POSIX pid
        ... -- do stuff
        putMVar pmvar ph_

If you kill the process, instead of putting the open ph_ into the mvar you should create an appropriate ClosedHandle and put that back instead. It's important that this code executes masked (bracket will do this for you).

Now that you have a POSIX id you can use system calls or shell out to kill as necessary. Just be careful that your Haskell executable isn't in the same process group if you go that route.

/end edit

This behavior seems sort of sensible. The documentation for timeout claims that it doesn't work at all for non-Haskell code, and indeed I don't see any way that it could generically. What's happening is that readProcess spawns a new process, but then is timed out while waiting for output from that process. It seems that readProcess doesn't terminate the spawned process when it's aborted abnormally. This could be a bug in readProcess, or it could be by design.

As a workaround, I think you'll need to implement some of this yourself. timeout works by raising an async exception in a spawned thread. If you wrap your runOnExternalProgram in an exception handler, you'll get the behavior you want.

The key function here is the new runOnExternalProgram, which is a combination of your original function and readProcess. It would be better (more modular, more reusable, more maintainable) to make a new readProcess that kills the spawned process when an exception is raised, but I'll leave that as an exercise.

module Main where

import System.Timeout
import Criterion.Measurement
import System.IO.Unsafe
import System.Process
import Control.Exception
import System.IO
import System.IO.Error
import GHC.IO.Exception
import System.Exit
import Control.Concurrent.MVar
import Control.Concurrent

main = do
    x <- time $ timeoutP (1 * 1000000) $ mytest 2
    y <- getLine
    putStrLn $ show x ++ y

timeoutP :: Int -> IO a -> IO (Maybe a)
timeoutP t fun = timeout t $ fun

mytest :: Int -> IO String
mytest n = do
  x <- runOnExternalProgram $ n * 1000
  return $ x ++ ". Indeed."

runOnExternalProgram :: Int -> IO String
runOnExternalProgram n = 
    -- convert the input to a parameter of the external program
    let x = show $ n + 12
    in bracketOnError
        (createProcess (proc "sleep" [x]){std_in = CreatePipe
                                         ,std_out = CreatePipe
                                         ,std_err = Inherit})
        (\(Just inh, Just outh, _, pid) -> terminateProcess pid >> waitForProcess pid)

        (\(Just inh, Just outh, _, pid) -> do
          -- fork a thread to consume output
          output <- hGetContents outh
          outMVar <- newEmptyMVar
          forkIO $ evaluate (length output) >> putMVar outMVar ()

          -- no input in this case
          hClose inh

          -- wait on output
          takeMVar outMVar
          hClose outh

          -- wait for process
          ex <- waitForProcess pid

          case ex of
            ExitSuccess -> do
              -- convert the output as needed
              let verboseAnswer = "External program answered: " ++ output
              return verboseAnswer
            ExitFailure r ->
              ioError (mkIOError OtherError ("spawned process exit: " ++ show r) Nothing Nothing) )