I'm using boost 1.50 with VS2010, reading using a Windows file HANDLE (which seems to be relatively uncommon compared to asio use with sockets).
Problem
The handle_read callback gets to line 8 and returns the first bit with all of line 1 appended; further callbacks cycle through from line 2 again, ad nauseam:
- open a short text file (below)
- get expected handle_read callbacks with correct content for lines 1 through 7
- the next callback has a longer-than-expected bytes-read length parameter
- though not using length, getline extracts a correspondingly longer line from the asio stream buffer
- extracted content switches mid-line to repeat the first line from the input file
- further handle_read callbacks recycle lines 2 through 7, then the "long hybrid" line problem happens
- ad nauseam
Input
LINE 1 abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789
LINE 2 abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789
...3--E similarly...
LINE F abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789
Output
Here are the first 15 lines of output (it continues forever):
line #1, length 70, getline() [69] 'LINE 1 abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'
line #2, length 70, getline() [69] 'LINE 2 abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'
...line #3 through #6 are fine too...
line #7, length 70, getline() [69] 'LINE 7 abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'
line #8, length 92, getline() [91] 'LINE 8 abcdefghijklmnoLINE 1 abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'
line #9, length 70, getline() [69] 'LINE 2 abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'
...line #10 through #13 are fine...
line #14, length 70, getline() [69] 'LINE 7 abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'
line #15, length 92, getline() [91] 'LINE 8 abcdefghijklmnoLINE 1 abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'
...
Please note how output lines #8 and #15 are a mix of input LINE 8 and LINE 1.
The code
#include "stdafx.h"
#include <cassert>
#include <iostream>
#include <string>
#include <boost/asio.hpp>
#include <boost/bind.hpp>
#include <Windows.h>
#include <WinBase.h>

class AsyncReader
{
public:
    AsyncReader(boost::asio::io_service& io_service, HANDLE handle)
      : io_service_(io_service),
        input_buffer(/*size*/ 8192),
        input_handle(io_service, handle)
    {
        start_read();
    }

    void start_read()
    {
        boost::asio::async_read_until(input_handle, input_buffer, '\n',
            boost::bind(&AsyncReader::handle_read, this,
                boost::asio::placeholders::error,
                boost::asio::placeholders::bytes_transferred));
    }

    void handle_read(const boost::system::error_code& error, std::size_t length);
    // void handle_write(const boost::system::error_code& error);

private:
    boost::asio::io_service& io_service_;
    boost::asio::streambuf input_buffer;
    boost::asio::windows::stream_handle input_handle;
};

void AsyncReader::handle_read(const boost::system::error_code& error, std::size_t length)
{
    if (!error)
    {
        static int count = 0;
        ++count;

        // method 1: (same problem)
        // const char* pStart = boost::asio::buffer_cast<const char*>(input_buffer.data());
        // std::string s(pStart, length);
        // input_buffer.consume(length);

        // method 2:
        std::istream is(&input_buffer);
        std::string s;
        assert(std::getline(is, s));

        std::cout << "line #" << count << ", length " << length << ", getline() [" << s.size() << "] '" << s << "'\n";

        start_read();
    }
    else if (error == boost::asio::error::not_found)
        std::cerr << "Did not receive ending character!\n";
    else
        std::cerr << "Misc error during read!\n";
}

int _tmain(int argc, _TCHAR* argv[])
{
    boost::asio::io_service io_service;

    HANDLE handle = ::CreateFile(TEXT("c:/temp/input.txt"),
                                 GENERIC_READ,
                                 0, // share mode
                                 NULL, // security attribute: NULL = default
                                 OPEN_EXISTING, // creation disposition
                                 FILE_FLAG_OVERLAPPED,
                                 NULL // template file
                                 );

    AsyncReader obj(io_service, handle);

    io_service.run();

    std::cout << "Normal termination\n";
    getchar();
    return 0;
}
My thoughts
- It might be something in the CreateFile options - it didn't work at all until I switched to FILE_FLAG_OVERLAPPED - not sure if there are other requirements that don't even manifest as errors...?
- I've tried input_buffer.commit and even .consume - not sure if there's something like that I'm supposed to do, even though all the example code I could find (for sockets) suggests getline takes care of that...
- Exasperation / I miss Linux....
One option is to fseek() the file to the next position before the user's ReadHandler is called. Then async_read_some() can be implemented as async_read_at(ftell()).
The AsyncReader can use ReadUntilHandle instead of the stream_handle.
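The idea of turning a stream-style read into a positioned read can be sketched without Boost or Windows. This is only an illustration under assumptions: it is synchronous, it reads from an in-memory string, and memory_file and positioned_stream are hypothetical names, not part of Asio and not the ReadUntilHandle class mentioned above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical random-access device: every read names its own offset,
// in the spirit of async_read_at.
struct memory_file
{
    std::string data;

    std::size_t read_some_at(std::size_t offset, char* out, std::size_t size) const
    {
        assert(out != nullptr);
        if (offset >= data.size())
            return 0;  // end of file
        std::size_t n = std::min(size, data.size() - offset);
        data.copy(out, n, offset);
        return n;
    }
};

// Stream-style wrapper: read_some() reads at the remembered position and
// then advances it, so consecutive calls walk forward through the file.
template <typename RandomAccessDevice>
class positioned_stream
{
public:
    explicit positioned_stream(RandomAccessDevice& device)
      : device_(device), position_(0) {}

    std::size_t read_some(char* out, std::size_t size)
    {
        std::size_t n = device_.read_some_at(position_, out, size);
        position_ += n;  // the "seek" happens before the caller sees the data
        return n;
    }

    std::size_t tell() const { return position_; }

private:
    RandomAccessDevice& device_;
    std::size_t position_;
};
```

With a memory_file holding "abcdefghij" and a 4-byte buffer, three calls to read_some() return 4, 4 and 2 bytes ("abcd", "efgh", "ij"), and a fourth call returns 0. An asynchronous version would have to advance the position when the read completes rather than when it is started.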
A stream_handle will always read at offset zero. I think it's meant for socket handles and is useless for regular files.
Calling async_read_until() gets 512 bytes if the streambuf doesn't already contain a newline. The first call reads a bit more than 7 lines. When seven lines are extracted, the remaining characters ("LINE 8 abcdefghijklmno") don't have a newline, and (the same) 512 bytes are appended.
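The arithmetic behind the hybrid line can be checked with a small simulation that uses neither Boost nor Windows. It is a sketch under assumptions: every read returns the first 512 bytes of the file (offset 0), and make_input, read_some_at_zero and read_lines are hypothetical helpers, not Asio functions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Builds the 15-line input from the question: "LINE 1" .. "LINE F",
// each line 69 characters plus '\n' (70 bytes).
std::string make_input()
{
    const std::string tail =
        "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
    std::string file;
    for (char id : std::string("123456789ABCDEF"))
        file += std::string("LINE ") + id + " " + tail + "\n";
    return file;
}

// Hypothetical stand-in for a stream read on a file handle: it always
// starts at offset 0 and returns at most `chunk` bytes.
std::string read_some_at_zero(const std::string& file, std::size_t chunk)
{
    return file.substr(0, std::min(chunk, file.size()));
}

// Mimics read-until-newline plus getline: the buffer is refilled only when
// it holds no '\n', then one line is extracted. Returns `count` lines.
std::vector<std::string> read_lines(const std::string& file, std::size_t count)
{
    // Without a newline in the first chunk the loop below would never end.
    assert(read_some_at_zero(file, 512).find('\n') != std::string::npos);

    std::string buffer;
    std::vector<std::string> lines;
    while (lines.size() < count)
    {
        std::string::size_type pos = buffer.find('\n');
        if (pos == std::string::npos)
        {
            buffer += read_some_at_zero(file, 512);  // the same bytes again
            continue;
        }
        lines.push_back(buffer.substr(0, pos));
        buffer.erase(0, pos + 1);
    }
    return lines;
}
```

Seven 70-byte lines take 490 bytes, so the first 512-byte read leaves the 22 characters "LINE 8 abcdefghijklmno" behind. read_lines(make_input(), 9) therefore yields seven intact lines, a 91-character eighth line made of that fragment followed by all of LINE 1, and then LINE 2 again, which matches the output in the question.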
To solve the problem, I'd suggest using a random_access_handle. You have to track the file position manually and replace async_read_until with async_read_at.
This mailing list post describes the same problem. While CreateFile with FILE_FLAG_OVERLAPPED allows for asynchronous I/O, it does not establish it as a stream in the context of Boost.Asio. For streams, Boost.Asio implements read_some as read_some_at with the offset always being 0. This is the source of the problem: according to the ReadFile() documentation, for files that support byte offsets, the offset at which to start reading must be specified in the OVERLAPPED structure.
Adapting to Type Requirements
Boost.Asio is written very generically, often requiring arguments to meet a certain type requirement rather than be a specific type. Therefore, it is often possible to adapt either the I/O object or its service to obtain the desired behavior. First, one must identify what the adapted interface needs to support. In this case, async_read_until accepts any type fulfilling the type requirements of AsyncReadStream. AsyncReadStream's requirements are fairly basic, requiring a void async_read_some(MutableBufferSequence, ReadHandler) member function.
As the offset value will need to be tracked throughout the composed async_read_until operation, a simple type meeting the requirements of ReadHandler can be introduced that will wrap an application's ReadHandler, and update the offset accordingly.
The asio_handler_invoke hook will be found through ADL to support invoking user handlers in the proper context. This is critical for thread safety when a composed operation is being invoked within a strand. For more details on composed operations and strands, see this answer.
A class template, basic_adapted_stream, can adapt boost::asio::windows::random_access_handle to meet the type requirements of AsyncReadStream.
Alternatively, boost::asio::windows::basic_stream_handle can be provided a custom type meeting the requirements of StreamHandleService types, and implement async_read_some in terms of async_read_some_at.
I have opted for simplicity in the example code, but the same service will be used by multiple I/O objects. Thus, the offset_stream_handle_service would need to manage an offset per handler to function properly when multiple I/O objects use the service.
To use the adapted types, modify the AsyncReader::input_handle member variable to be either a basic_adapted_stream<boost::asio::windows::random_access_handle> (adapted I/O object) or boost::asio::windows::basic_stream_handle<offset_stream_handle_service> (adapted service).
Example
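As a small illustration of the handler-wrapping idea on its own, here is a Boost-free sketch. It is not the handler from the adapted types: offset_tracking_handler and track_offset are hypothetical names, std::error_code stands in for boost::system::error_code, and the asio_handler_invoke hook is left out entirely.

```cpp
#include <cstddef>
#include <system_error>
#include <utility>

// Wraps a completion handler. Before forwarding the result, it advances a
// shared offset by the number of bytes the read produced.
template <typename Handler>
class offset_tracking_handler
{
public:
    offset_tracking_handler(std::size_t& offset, Handler handler)
      : offset_(offset), handler_(std::move(handler)) {}

    void operator()(const std::error_code& error, std::size_t bytes_transferred)
    {
        offset_ += bytes_transferred;        // next read starts after these bytes
        handler_(error, bytes_transferred);  // then hand over to the wrapped handler
    }

private:
    std::size_t& offset_;  // owned by the adapted stream, must outlive the handler
    Handler handler_;
};

// Convenience function so the handler type can be deduced.
template <typename Handler>
offset_tracking_handler<Handler> track_offset(std::size_t& offset, Handler handler)
{
    return offset_tracking_handler<Handler>(offset, std::move(handler));
}
```

Invoking the wrapped handler twice, with 512 and then 22 bytes transferred, leaves the offset at 534 and passes both byte counts through to the inner handler unchanged.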
Here is the complete example based on the original code, only modifying the type of AsyncReader::input_handle:
Which produces the following output when using the input from the original question:
My input file did not have a \n character at the end of LINE F. Thus, AsyncReader::handle_read() gets invoked with an error of boost::asio::error::eof, and input_buffer's contents contain LINE F. After modifying the final else case to print more information:
I get the following output: