How to get printed output from ctypes C functions

Posted 2019-04-28 06:14

Question:

Introduction

Suppose I have this C code:

#include <stdio.h>

// Of course, these functions are simplified for the purposes of this question.
// The actual functions are more complex and may receive additional arguments.

void printout() {
    puts("Hello");
}
void printhere(FILE* f) {
    fputs("Hello\n", f);
}

That I'm compiling as a shared object (DLL): gcc -Wall -std=c99 -fPIC -shared example.c -o example.so

And then I'm importing it into Python 3.x running inside Jupyter or IPython notebook:

import ctypes
example = ctypes.cdll.LoadLibrary('./example.so')

printout = example.printout
printout.argtypes = ()
printout.restype = None

printhere = example.printhere
printhere.argtypes = (ctypes.c_void_p,)  # Should have been FILE* instead; note the trailing comma, argtypes must be a sequence
printhere.restype = None

Question

How can I execute both printout() and printhere() C functions (through ctypes) and get the output printed inside the Jupyter/IPython notebook?

If possible, I want to avoid writing more C code. I would prefer a pure-Python solution.

I also would prefer to avoid writing to a temporary file. Writing to a pipe/socket might be reasonable, though.

The expected state, the current state

If I type the following code in one Notebook cell:

print("Hi")           # Python-style print
printout()            # C-style print
printhere(something)  # C-style print
print("Bye")          # Python-style print

I want to get this output:

Hi
Hello
Hello
Bye

But, instead, I only get the Python-style output results inside the notebook. The C-style output gets printed to the terminal that started the notebook process.

Research

As far as I know, inside a Jupyter/IPython notebook, sys.stdout is not a wrapper around any file:

import sys

sys.stdout

# Output in command-line Python/IPython shell:
<_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>
# Output in IPython Notebook:
<IPython.kernel.zmq.iostream.OutStream at 0x7f39c6930438>
# Output in Jupyter:
<ipykernel.iostream.OutStream at 0x7f6dc8f2de80>

sys.stdout.fileno()

# Output in command-line Python/IPython shell:
1
# Output in Jupyter and IPython notebook:
UnsupportedOperation: IOStream has no fileno.
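
The check above can be wrapped in a small helper that reports whether a stream is backed by a real file descriptor. This is only a sketch: the name real_fd is made up for illustration, and io.StringIO stands in for the notebook's OutStream because it also raises UnsupportedOperation on fileno().

```python
import io


def real_fd(stream):
    """Return the stream's OS-level file descriptor, or None if it has none."""
    try:
        return stream.fileno()
    except (AttributeError, io.UnsupportedOperation):
        # In-memory streams either lack fileno() or refuse to answer.
        return None


# An in-memory stream has no descriptor, much like the notebook's sys.stdout.
print(real_fd(io.StringIO()))
```

A None result means C code cannot be pointed at that stream directly, so file descriptor 1 has to be redirected instead.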

Related questions and links:

  • Python ctypes: Python file object <-> C FILE *
  • Python 3 replacement for PyFile_AsFile
  • Using fopen, fwrite and fclose through ctypes
  • Python ctypes DLL stdout
  • Python: StringIO for Popen - Workaround for the lack of fileno() in StringIO, but only applies to subprocess.Popen.

The following two links use similar solutions that involve creating a temporary file. However, care must be taken when implementing such a solution to make sure that both Python-style output and C-style output get printed in the correct order.

  • How do I prevent a C shared library to print on stdout in python?
  • Redirecting all kinds of stdout in Python

Is it possible to avoid a temporary file?

I tried finding a solution using C open_memstream() and assigning the returned FILE* to stdout, but it did not work because stdout cannot be assigned.

Then I tried getting the fileno() of the stream returned by open_memstream(), but I can't because it has no file descriptor.

Then I looked at freopen(), but its API requires passing a filename.

Then I looked at Python's standard library and found tempfile.SpooledTemporaryFile(), which is a temporary file-like object in memory. However, it gets written to the disk as soon as fileno() is called.

So far, I couldn't find any memory-only solution. Most likely, we will need to use a temporary file anyway. (Which is not a big deal, but just some extra overhead and extra cleanup that I'd prefer to avoid.)

It may be possible to use os.pipe(), but that seems difficult to do without forking.
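
For reference, a pipe can be drained by a background thread instead of a forked process. The following is a minimal sketch under a few assumptions: capture_fd is a hypothetical name, os.write(1, ...) stands in for a C function writing to stdout, and real C code would still need fflush() before the block ends because C stdio keeps its own buffer. It relies on ctypes.CDLL releasing the GIL during foreign calls, so the reader thread can empty the pipe while the C function is still writing.

```python
import os
import sys
import threading
from contextlib import contextmanager


@contextmanager
def capture_fd(fd=1):
    """Collect bytes written to `fd` through a pipe, without a temporary file."""
    chunks = []
    read_end, write_end = os.pipe()
    saved_fd = os.dup(fd)  # keep the original target so it can be restored

    def drain():
        # Read until every write end of the pipe has been closed (EOF).
        while True:
            data = os.read(read_end, 4096)
            if not data:
                break
            chunks.append(data)

    reader = threading.Thread(target=drain)
    sys.stdout.flush()
    os.dup2(write_end, fd)  # fd now refers to the pipe's write end
    os.close(write_end)     # fd is the only remaining write end
    reader.start()
    try:
        yield chunks
    finally:
        os.dup2(saved_fd, fd)  # drops the last write end, so the reader sees EOF
        os.close(saved_fd)
        reader.join()
        os.close(read_end)


with capture_fd(1) as chunks:
    os.write(1, b"Hello\n")  # stand-in for C-level output

# The list is complete only after the with-block has exited.
print(b"".join(chunks).decode(), end="")
```

Unlike the temporary-file approach, this sketch captures only descriptor-level writes; interleaving with Python-level print() calls would still require replacing sys.stdout as well.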

Answer 1:

I've finally developed a solution. It requires wrapping the entire cell inside a context manager (or wrapping only the C code). It also uses a temporary file, since I couldn't find any solution without using one.

The full notebook is available as a GitHub Gist: https://gist.github.com/denilsonsa/9c8f5c44bf2038fd000f


Part 1: Preparing the C library in Python

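import ctypes
import ctypes.util  # find_library lives in this submodule, which "import ctypes" alone does not load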

# use_errno parameter is optional, because I'm not checking errno anyway.
libc = ctypes.CDLL(ctypes.util.find_library('c'), use_errno=True)

class FILE(ctypes.Structure):
    pass

FILE_p = ctypes.POINTER(FILE)

# Alternatively, we can just use:
# FILE_p = ctypes.c_void_p

# These variables, defined inside the C library, are readonly.
cstdin = FILE_p.in_dll(libc, 'stdin')
cstdout = FILE_p.in_dll(libc, 'stdout')
cstderr = FILE_p.in_dll(libc, 'stderr')

# C function to disable buffering.
csetbuf = libc.setbuf
csetbuf.argtypes = (FILE_p, ctypes.c_char_p)
csetbuf.restype = None

# C function to flush the C library buffer.
cfflush = libc.fflush
cfflush.argtypes = (FILE_p,)
cfflush.restype = ctypes.c_int
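
Looking up stdout with in_dll() depends on the C library exporting a variable with exactly that name. glibc does; as far as I know, macOS exposes it as __stdoutp instead. If only flushing is needed, a more portable option is fflush(NULL), which standard C defines as flushing every open output stream. A minimal sketch, where flush_all_c_streams is a hypothetical helper name:

```python
import ctypes
import ctypes.util


def flush_all_c_streams():
    """Call fflush(NULL) to flush every open C output stream."""
    # find_library may return None; on POSIX, CDLL(None) then falls back to
    # the symbols already loaded into the current process.
    libc = ctypes.CDLL(ctypes.util.find_library('c'))
    libc.fflush.argtypes = (ctypes.c_void_p,)
    libc.fflush.restype = ctypes.c_int
    return libc.fflush(None)  # None is passed as NULL; 0 means success


flush_all_c_streams()
```

This does not replace setbuf(), which still needs a concrete FILE* to disable buffering on a specific stream.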

Part 2: Building our own context manager to capture stdout

import io
import os
import sys
import tempfile
from contextlib import contextmanager

@contextmanager
def capture_c_stdout(encoding='utf8'):
    # Flush pending output first, so it is not captured out of order.
    sys.stdout.flush()
    cfflush(cstdout)

    # We need to use an actual file because we need the file descriptor number.
    with tempfile.TemporaryFile(buffering=0) as temp:
        # Saving a copy of the original stdout.
        prev_sys_stdout = sys.stdout
        prev_stdout_fd = os.dup(1)
        os.close(1)

        # Duplicating the temporary file fd into the stdout fd.
        # In other words, replacing the stdout.
        os.dup2(temp.fileno(), 1)

        # Replacing sys.stdout for Python code.
        #
        # IPython Notebook version of sys.stdout is actually an
        # in-memory OutStream, so it does not have a file descriptor.
        # We need to replace sys.stdout so that interleaved Python
        # and C output gets captured in the correct order.
        #
        # We enable line_buffering to force a flush after each line.
        # And write_through to force all data to be passed through the
        # wrapper directly into the binary temporary file.
        temp_wrapper = io.TextIOWrapper(
            temp, encoding=encoding, line_buffering=True, write_through=True)
        sys.stdout = temp_wrapper

        # Disabling buffering of C stdout.
        csetbuf(cstdout, None)

        yield

        # Must flush to clear the C library buffer.
        cfflush(cstdout)

        # Restoring stdout.
        os.dup2(prev_stdout_fd, 1)
        os.close(prev_stdout_fd)
        sys.stdout = prev_sys_stdout

        # Printing the captured output.
        temp_wrapper.seek(0)
        print(temp_wrapper.read(), end='')

Part Fun: Using it!

libfoo = ctypes.CDLL('./foo.so')

printout = libfoo.printout
printout.argtypes = ()
printout.restype = None

printhere = libfoo.printhere
printhere.argtypes = (FILE_p,)
printhere.restype = None


print('Python Before capturing')
printout()  # Not captured, goes to the terminal

with capture_c_stdout():
    print('Python First')
    printout()
    print('Python Second')
    printhere(cstdout)
    print('Python Third')

print('Python After capturing')
printout()  # Not captured, goes to the terminal

Output:

Python Before capturing
Python First
C printout puts
Python Second
C printhere fputs
Python Third
Python After capturing

Credits and further work

This solution is the fruit of reading all the links listed in the question, plus a lot of trial and error.

This solution only redirects stdout; it could be interesting to redirect both stdout and stderr. For now, I'm leaving this as an exercise to the reader. ;)

Also, there is no exception handling in this solution (at least not yet).
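
One possible way to address both points is to put the restore step in a try/finally block and to parameterize the descriptor number. This is a minimal sketch rather than a drop-in replacement for capture_c_stdout: redirect_fd and demo are hypothetical names, os.write() stands in for C-level output, and the sys.stdout replacement and C buffer flushing from Part 2 are left out for brevity.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def redirect_fd(fd, target):
    """Point `fd` at the open file `target`, restoring it even on error."""
    saved = os.dup(fd)
    os.dup2(target.fileno(), fd)
    try:
        yield
    finally:
        # Runs on normal exit and when the body raises.
        os.dup2(saved, fd)
        os.close(saved)


def demo():
    """Capture descriptors 1 and 2 separately, surviving an exception."""
    with tempfile.TemporaryFile() as out, tempfile.TemporaryFile() as err:
        try:
            with redirect_fd(1, out), redirect_fd(2, err):
                os.write(1, b"to stdout\n")
                os.write(2, b"to stderr\n")
                raise RuntimeError("simulated failure")
        except RuntimeError:
            pass  # both descriptors have already been restored here
        out.seek(0)
        err.seek(0)
        return out.read(), err.read()


captured_out, captured_err = demo()
```

Passing the same file object for both descriptors would instead merge stdout and stderr into one stream, in the order the writes happened.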