Convert Haskell ByteStrings to C++ std::string

Posted 2019-07-17 19:47

Question:

I want to convert strict ByteStrings from Haskell into C++'s std::string so I can pass them to a C++ library via the FFI. As a ByteString may contain NUL characters, converting to a CString as an intermediate step is not viable. What is the right approach here?

current solution

Thanks for the answers so far. I hoped for a canonical solution for that task, but maybe it does not exist yet :)

The C++ library documentation for the std::string constructors says the following:

string ( const char * s, size_t n );

Content is initialized to a copy of the string formed by the first n characters in the array of characters pointed by s.

Therefore one can write a function that copies the bytes from the ByteString exactly once to construct a std::string:

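import Data.ByteString (ByteString)
import Data.ByteString.Unsafe (unsafeUseAsCStringLen)
import Foreign (Ptr)
import Foreign.C (CString, CSize)
data CCString  -- opaque tag for a C++ std::string (omit if already defined elsewhere)
-- The length is passed as CSize so that it matches the size_t parameter of the constructor.
foreign import ccall unsafe "toCCString_" toCCString_ :: CString -> CSize -> IO (Ptr CCString)
toCCString :: ByteString -> IO (Ptr CCString)
toCCString bs =
    unsafeUseAsCStringLen bs $ \(cstring, len) ->
    toCCString_ cstring (fromIntegral len)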

The C++ code accompanying toCCString_ would then look just as Neil and Alan pointed out.
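
For illustration, the C++ side might look roughly like the sketch below. It is not taken from the thread: the name toCCString_ comes from the foreign import, while freeCCString is a hypothetical companion for releasing the string, and the length parameter is assumed to be a size_t (the Haskell-side type has to match whatever is declared here).

```cpp
#include <cstddef>
#include <string>

// C linkage so the Haskell FFI can find the symbols by their plain names.
extern "C" {

// Copies exactly n bytes starting at s into a heap-allocated std::string.
// Embedded NUL bytes survive because the length is given explicitly.
// The n == 0 guard avoids handing a possibly null pointer to the constructor.
std::string *toCCString_(const char *s, std::size_t n) {
    return n == 0 ? new std::string() : new std::string(s, n);
}

// Hypothetical companion: releases a string returned by toCCString_.
void freeCCString(std::string *p) {
    delete p;
}

}
```

Because the string is allocated with new, the Haskell side has to arrange for it to be freed at some point, for example by importing freeCCString as well. An exception thrown by new must not propagate into Haskell, so a production version would probably catch it and return a null pointer instead.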

Answer 1:

The documentation is great!

type CString = Ptr CChar

A C string is a reference to an array of C characters terminated by NUL.

type CStringLen = (Ptr CChar, Int)

A string with explicit length information in bytes instead of a terminating NUL (allowing NUL characters in the middle of the string).

If you use a CStringLen, you should have no problems. (In fact, I recommend this because interfacing C++ and Haskell is a nightmare.)

NUL characters in the middle of char buffers are only problematic when you don't know how long the data contained therein should be (and thus have to traverse it looking for a NUL, hoping that it marks the intended end of the data).



Answer 2:

Does your ByteString (with its nulls) actually represent a text string? If not, then std::vector<char> would be more appropriate.

That being said, the internal representation of std::string does not depend on null termination, so you can have a std::string with null characters in it. Use the constructor with the prototype string(const char * s, size_t n). Just don't rely on .c_str() to interface with anything that expects a null-terminated C string, because such code will stop at the first embedded null.
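
A small sketch of both points, using hypothetical helper names (from_bytes, c_api_length, to_byte_vector) that are not part of any library:

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Builds a std::string from raw bytes with an explicit length,
// so embedded null characters are kept.
std::string from_bytes(const char *data, std::size_t len) {
    return std::string(data, len);
}

// What a null-terminated C API would see when handed s.c_str():
// it stops counting at the first embedded null.
std::size_t c_api_length(const std::string &s) {
    return std::strlen(s.c_str());
}

// Alternative container for data that is not really text.
std::vector<char> to_byte_vector(const char *data, std::size_t len) {
    return std::vector<char>(data, data + len);
}
```

For the five bytes 'a', 'b', '\0', 'c', 'd', from_bytes yields a string whose size() is 5, while c_api_length reports 2. When passing such a string on to another API, pair data() with size() rather than relying on c_str() alone.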



Answer 3:

C++ strings can contain null characters. Assuming you have something like this:

char s1[] = "string\0containing\0nulls";

then you can convert it to a std::string:

std::string s2(s1, length_of_s1);

The problem is how to get length_of_s1: obviously you can't use strlen or similar functions, but presumably your strings maintain a length indicator you can use.
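
When the data happens to live in a char array whose size is known at compile time, the array size itself can serve as that length indicator. The helper below, from_literal, is a hypothetical name used only for this sketch. It assumes the array holds a string literal, so the terminating null added by the compiler is subtracted; for an array without a terminator this would drop the last byte.

```cpp
#include <cstddef>
#include <string>

// Converts a string literal (or a char array initialized from one) to a
// std::string, keeping embedded nulls. N includes the terminating null that
// the compiler appends to a literal, hence N - 1.
template <std::size_t N>
std::string from_literal(const char (&s)[N]) {
    return std::string(s, N - 1);
}
```

For data arriving from Haskell the length would instead come from the ByteString itself, as a CStringLen pair does.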