Is there an in memory stream that blocks like a fi

2019-02-12 15:19发布

问题:

I'm using a library that requires I provide an object that implements this interface:

public interface IConsole {
    TextWriter StandardInput { get; }
    TextReader StandardOutput { get; }
    TextReader StandardError { get; }
}

The object's readers then get used by the library with:

IConsole console = new MyConsole();
int readBytes = console.StandardOutput.Read(buffer, 0, buffer.Length);

Normally the class implementing IConsole has the StandardOutput stream as coming from an external process. In that case the console.StandardOutput.Read calls work by blocking until there is some data written to the StandardOutput stream.

What I'm trying to do is create a test IConsole implementation that uses MemoryStreams and echo's whatever appears on the StandardInput back onto the StandardInput. I tried:

MemoryStream echoOutStream = new MemoryStream();
StandardOutput = new StreamReader(echoOutStream);

But the problem with that is the console.StandardOutput.Read will return 0 rather than block until there is some data. Is there anyway I can get a MemoryStream to block if there is no data available or is there a different in memory stream I could use?

回答1:

Inspired by your answer, here's my multi-thread, multi-write version:

public class EchoStream : MemoryStream
{
    private readonly ManualResetEvent _DataReady = new ManualResetEvent(false);
    private readonly ConcurrentQueue<byte[]> _Buffers = new ConcurrentQueue<byte[]>();

    public bool DataAvailable{get { return !_Buffers.IsEmpty; }}

    public override void Write(byte[] buffer, int offset, int count)
    {
        _Buffers.Enqueue(buffer);
        _DataReady.Set();
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        _DataReady.WaitOne();

        byte[] lBuffer;

        if (!_Buffers.TryDequeue(out lBuffer))
        {
            _DataReady.Reset();
            return -1;
        }

        if (!DataAvailable)
            _DataReady.Reset();

        Array.Copy(lBuffer, buffer, lBuffer.Length);
        return lBuffer.Length;
    }
}

With your version you should Read the Stream upon Write, without any consecutively write be possible. My version buffers any written buffer in a ConcurrentQueue (it's fairly simple to change it to a simple Queue and lock it)



回答2:

In the end I found an easy way to do it by inheriting from MemoryStream and taking over the Read and Write methods.

public class EchoStream : MemoryStream {

    private ManualResetEvent m_dataReady = new ManualResetEvent(false);
    private byte[] m_buffer;
    private int m_offset;
    private int m_count;

    public override void Write(byte[] buffer, int offset, int count) {
        m_buffer = buffer;
        m_offset = offset;
        m_count = count;
        m_dataReady.Set();
    }

    public override int Read(byte[] buffer, int offset, int count) {
        if (m_buffer == null) {
            // Block until the stream has some more data.
            m_dataReady.Reset();
            m_dataReady.WaitOne();    
        }

        Buffer.BlockCopy(m_buffer, m_offset, buffer, offset, (count < m_count) ? count : m_count);
        m_buffer = null;
        return (count < m_count) ? count : m_count;
    }
}


回答3:

I'm going to add one more refined version of EchoStream. This is a combination of the other two versions, plus some suggestions from the comments.

UPDATE - I have tested this EchoStream with over 50 terrabytes of data run through it for days on end. The test had it sitting between a network stream and the ZStandard compression stream. The async has also been tested, which brought a rare hanging condition to the surface. It appears the built in System.IO.Stream does not expect one to call both ReadAsync and WriteAsync on the same stream at the same time, which can cause it to hang if there isn't any data available because both calls utilize the same internal variables. Therefore I had to override those functions, which resolved the hanging issue.

This version the following features.

1) This was written from scratch using the System.IO.Stream base class instead of MemoryStream.

2) The constructor can set a max queue depth and if this level is reached then stream writes will block until a Read is performed which drops the queue depth back below the max level (no limit=0, default=10).

3) When reading/writing data, the buffer offset and count are now honored. Also, you can call Read with a smaller buffer than Write without throwing an exception or losing data. BlockCopy is used in a loop to fill in the bytes until count is satisfied.

4) There is a public property called AlwaysCopyBuffer, which makes a copy of the buffer in the Write function. Setting this to true will safely allow the byte buffer to be reused after calling Write.

5) There is a public property called ReadTimeout/WriteTimeout, which controls how long the Read/Write function will block before it returns 0 (default=Infinite, -1).

6) The BlockingQueue class is used, which combines the ConcurrentQueue and AutoResetEvent classes. There is a rare condition in the ConcurrentQueue class where you will find that after data has been Enqueued, that it is not available immediately when AutoResetEvent allows a thread through in the Read( ). This happens about once every 500GB of data that passes through it. The cure is to Sleep and check for the data again. Sometimes a Sleep(0) works, but in extreme cases where the CPU usage was high, it could be as high as Sleep(1000) before the data shows up. The ConcurrentQueue handles all of this without any issues.

7) This has been tested to be thread safe for simultaneous async reads and writes.

using System;
using System.IO;
using System.Threading.Tasks;
using System.Threading;
using System.Collections.Concurrent;

public class EchoStream : Stream
{
    public override bool CanTimeout { get; } = true;
    public override int ReadTimeout { get; set; } = Timeout.Infinite;
    public override int WriteTimeout { get; set; } = Timeout.Infinite;
    public override bool CanRead { get; } = true;
    public override bool CanSeek { get; } = false;
    public override bool CanWrite { get; } = true;

    public bool CopyBufferOnWrite { get; set; } = false;

    private readonly object _lock = new object();

    // Default underlying mechanism for BlockingCollection is ConcurrentQueue<T>, which is what we want
    private readonly BlockingCollection<byte[]> _Buffers;
    private int _maxQueueDepth = 10;

    private byte[] m_buffer = null;
    private int m_offset = 0;
    private int m_count = 0;

    private bool m_Closed = false;
    public override void Close()
    {
        m_Closed = true;

        // release any waiting writes
        _Buffers.CompleteAdding();
    }

    public bool DataAvailable
    {
        get
        {
            return _Buffers.Count > 0;
        }
    }

    private long _Length = 0L;
    public override long Length
    {
        get
        {
            return _Length;
        }
    }

    private long _Position = 0L;
    public override long Position
    {
        get
        {
            return _Position;
        }
        set
        {
            throw new NotImplementedException();
        }
    }

    public EchoStream() : this(10)
    {
    }

    public EchoStream(int maxQueueDepth)
    {
        _maxQueueDepth = maxQueueDepth;
        _Buffers = new BlockingCollection<byte[]>(_maxQueueDepth);
    }

    // we override the xxxxAsync functions because the default base class shares state between ReadAsync and WriteAsync, which causes a hang if both are called at once
    public new Task WriteAsync(byte[] buffer, int offset, int count)
    {
        return Task.Run(() => Write(buffer, offset, count));
    }

    // we override the xxxxAsync functions because the default base class shares state between ReadAsync and WriteAsync, which causes a hang if both are called at once
    public new Task<int> ReadAsync(byte[] buffer, int offset, int count)
    {
        return Task.Run(() =>
        {
            return Read(buffer, offset, count);
        });
    }

    public override void Write(byte[] buffer, int offset, int count)
    {
        if (m_Closed || buffer.Length - offset < count || count <= 0)
            return;

        byte[] newBuffer;
        if (!CopyBufferOnWrite && offset == 0 && count == buffer.Length)
            newBuffer = buffer;
        else
        {
            newBuffer = new byte[count];
            System.Buffer.BlockCopy(buffer, offset, newBuffer, 0, count);
        }
        if (!_Buffers.TryAdd(newBuffer, WriteTimeout))
            throw new TimeoutException("EchoStream Write() Timeout");

        _Length += count;
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        if (count == 0)
            return 0;
        lock (_lock)
        {
            if (m_count == 0 && _Buffers.Count == 0)
            {
                if (m_Closed)
                    return -1;

                if (_Buffers.TryTake(out m_buffer, ReadTimeout))
                {
                    m_offset = 0;
                    m_count = m_buffer.Length;
                }
                else
                    return m_Closed ? -1 : 0;
            }

            int returnBytes = 0;
            while (count > 0)
            {
                if (m_count == 0)
                {
                    if (_Buffers.TryTake(out m_buffer, 0))
                    {
                        m_offset = 0;
                        m_count = m_buffer.Length;
                    }
                    else
                        break;
                }

                var bytesToCopy = (count < m_count) ? count : m_count;
                System.Buffer.BlockCopy(m_buffer, m_offset, buffer, offset, bytesToCopy);
                m_offset += bytesToCopy;
                m_count -= bytesToCopy;
                offset += bytesToCopy;
                count -= bytesToCopy;

                returnBytes += bytesToCopy;
            }

            _Position += returnBytes;

            return returnBytes;
        }
    }

    public override void Flush()
    {
    }

    public override long Seek(long offset, SeekOrigin origin)
    {
        throw new NotImplementedException();
    }

    public override void SetLength(long value)
    {
        throw new NotImplementedException();
    }
}