Constructing RequestBodyStream from Lazy ByteStrin

2019-07-25 08:33发布

问题:

I am trying to adapt this AWS S3 upload code to handle Lazy ByteString where length is already known (so that it is not forced to be read in its entirety in memory - it comes over the network where length is sent beforehand). It seems I have to define a GivesPopper function over Lazy ByteString to convert it to RequestBodyStream. Because of the convoluted way GivesPopper is defined, I am not sure how to write it for Lazy ByteString. Will appreciate pointers on how to write it. Here is how it is written for reading from the file:

let file ="test"
-- streams large file content, without buffering more than 10k in memory
let streamer sink = withFile file ReadMode $ \h -> sink $ S.hGet h 10240

streamer in the code above is of type GivesPopper () if I understand it correctly. Given a Lazy ByteString with known length len, what would be a good way to write GivesPopper function over it? We can read one chunk at a time.

回答1:

Is this what you're looking for?

import qualified Data.ByteString as S
import qualified Data.ByteString.Lazy as L
import System.IO

file = "test"
-- original streamer for feeding a sink from a file
streamer :: (IO S.ByteString -> IO r) -> IO r
streamer sink = withFile file ReadMode $ \h -> sink $ S.hGet h 10240

-- feed a lazy ByteString to sink    
lstreamer :: L.ByteString -> (IO S.ByteString -> IO r) -> IO r
lstreamer lbs sink = sink (return (L.toStrict lbs))

lstreamer type checks but probably doesn't do exactly what you want it to do. It simply returns the same data every time the sink calls it. On the other hand S.hGet h ... will eventually return the empty string.

Here is a solution which uses an IORef to keep track of if we should start returning the empty string:

import Data.IORef

mklstream :: L.ByteString -> (IO S.ByteString -> IO r) -> IO r
mklstream lbs sink = do
  ref <- newIORef False
  let fetch :: IO S.ByteString
      fetch = do sent <- readIORef ref
                 writeIORef ref True
                 if sent
                   then return S.empty
                   else return (L.toStrict lbs)
  sink fetch

Here fetch is the action which gets the next chunk. The first time you call it you will get the original lazy Bytestring (strict-ified). Subsequent calls will always return the empty string.

Update

Here's how to give out a small amount at a time:

mklstream :: L.ByteString -> (IO S.ByteString -> IO r) -> IO r
mklstream lbs sink = do
  ref <- newIORef (L.toChunks lbs)
  let fetch :: IO S.ByteString
      fetch = do chunks <- readIORef ref
                 case chunks of
                   [] -> return S.empty
                   (c:cs) -> do writeIORef ref cs
                                return c
  sink fetch