The problem is that, right now, I have to use the POSIX C getline function to get a line from the file, then convert it to a Python Unicode object with PyUnicode_DecodeUTF8, and only then cache it with my caching policy algorithm. This process loses about 23% of performance compared to the C implementation behind Python's builtin for line in file.
If I remove the PyUnicode_DecodeUTF8 call from my code, my implementation using the POSIX C getline becomes 5% faster than the builtin for line in file C implementation. So, if I could make Python give me a Python Unicode string object directly, instead of having to call the POSIX C getline function first and only then convert its result to a Python Unicode object, my code's performance would improve by almost 20% (out of the 23% lost). It would still not be 100% equivalent to for line in file performance, because I do a little extra work caching the lines, but that overhead is minimal.
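For reference, the hot loop of my current approach looks roughly like this (a minimal sketch, assuming a POSIX system; the helper name read_one_line and the deque-based cache are illustrative, and the real code also handles the caching policy and error reporting):

#include <Python.h>
#include <cstdio>
#include <cstdlib>
#include <deque>

// Minimal sketch of the current approach: POSIX getline() followed by an
// explicit PyUnicode_DecodeUTF8() call. The extra decode step is what costs
// about 23% compared to CPython's `for line in file`.
static bool read_one_line( FILE* cfilestream, std::deque<PyObject*>& linecache )
{
    char* rawline = NULL;
    size_t buffersize = 0;

    ssize_t charsread = getline( &rawline, &buffersize, cfilestream );
    if( charsread < 0 ) {
        free( rawline );
        return false; // EOF or read error
    }
    PyObject* pythonline = PyUnicode_DecodeUTF8( rawline, charsread, "replace" );
    free( rawline );

    if( pythonline == NULL ) {
        return false; // decoding failed, a Python exception is set
    }
    linecache.push_back( pythonline );
    return true;
}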
For example, I would like to take the _textiowrapper_readline() function and use it in my code like this:
#include <Python.h>
#include <textio.c.h> // hypothetical header exposing CPython's textio.c internals:
                      // _textiowrapper_readline(), CHECK_ATTACHED(),
                      // PyUnicode_READY(), etc.

typedef struct
{
    PyObject_HEAD
    /* ... plus the textio.c struct fields used below
       (snapshot, telling, seekable, ...) */
}
PyMymoduleExtendingPython;

static PyObject*
PyMymoduleExtendingPython_iternext(PyMymoduleExtendingPython* self, PyObject* args)
{
    PyObject *line;
    CHECK_ATTACHED(self);
    line = _textiowrapper_readline(self, -1); // <- function from `textio.c`

    if (line == NULL || PyUnicode_READY(line) == -1)
        return NULL;

    if (PyUnicode_GET_LENGTH(line) == 0) {
        /* Reached EOF or would have blocked */
        Py_DECREF(line);
        Py_CLEAR(self->snapshot);
        self->telling = self->seekable;
        return NULL;
    }
    return line;
}

// create my module (type and module definitions abbreviated)
static PyTypeObject PyMymoduleExtendingPythonType = {
    PyVarObject_HEAD_INIT(NULL, 0)
    "mymodule_extending_python_api.FastFile", /* tp_name */
    sizeof(PyMymoduleExtendingPython),        /* tp_basicsize */
    /* remaining slots left as defaults */
};

static PyModuleDef mymoduledef = {
    PyModuleDef_HEAD_INIT, "mymodule_extending_python_api", NULL, -1, NULL
};

PyMODINIT_FUNC PyInit_mymodule_extending_python_api(void)
{
    PyObject* mymodule = PyModule_Create( &mymoduledef );
    PyMymoduleExtendingPythonType.tp_iternext =
            (iternextfunc) PyMymoduleExtendingPython_iternext;

    if( PyType_Ready( &PyMymoduleExtendingPythonType ) < 0 ) {
        return NULL;
    }
    Py_INCREF( &PyMymoduleExtendingPythonType );
    PyModule_AddObject( mymodule, "FastFile",
            (PyObject*) &PyMymoduleExtendingPythonType );
    return mymodule;
}
How could I include the textio implementation from CPython and reuse its code in my own Python C extension?
As presented in my last question, How to improve Python C Extensions file line reading?, the Python builtin methods for reading lines are faster than my own implementation written with standard C or C++ functions for getting lines from a file.
In that answer, it was suggested that I reimplement the Python algorithm by reading 8KB chunks and calling PyUnicode_DecodeUTF8 only once per chunk, instead of calling PyUnicode_DecodeUTF8 on every line I read.
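A rough sketch of that suggestion, just to illustrate the idea (not the answer's exact code; it assumes UTF-8 input and leaves out carrying an incomplete trailing line, or a multi-byte sequence cut at the block boundary, over to the next chunk):

#include <Python.h>
#include <cstdio>

// Sketch of the suggested approach: read a raw 8KB block and decode it with a
// single PyUnicode_DecodeUTF8() call, instead of decoding line by line.
// A real implementation must keep the bytes of the incomplete last line (and
// of any UTF-8 sequence split at the block boundary) for the next call.
static PyObject* decode_next_chunk( FILE* cfilestream )
{
    char rawbuffer[8192];

    size_t bytesread = fread( rawbuffer, 1, sizeof(rawbuffer), cfilestream );
    if( bytesread == 0 ) {
        return NULL; // EOF or read error
    }
    return PyUnicode_DecodeUTF8( rawbuffer, bytesread, "replace" );
}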
However, instead of rewriting all the line-reading code CPython has already written, I could just call its own "getline" function, _textiowrapper_readline(), to get each line directly as a Python Unicode object, and then cache/use it as I already do with the lines I get from the POSIX C getline function (and pass to PyUnicode_DecodeUTF8() to decode into Python Unicode objects).
I did not manage to directly reuse the C API (extension-internal) functions, but I did manage to use the Python C API to import the io module, which holds a reference to the global builtin function open as io.open().
#include <Python.h>
#include <deque>
#include <iostream>

struct FastFile
{
    bool hasfinished;
    const char* filepath;
    long long int linecount;

    std::deque<PyObject*> linecache;
    PyObject* iomodule;
    PyObject* openfile;
    PyObject* fileiterator;

    FastFile(const char* filepath) :
            hasfinished(false), filepath(filepath), linecount(0),
            iomodule(NULL), openfile(NULL), fileiterator(NULL)
    {
        iomodule = PyImport_ImportModule( "io" );
        if( iomodule == NULL ) {
            std::cerr << "ERROR: FastFile failed to import the io module ('"
                    << filepath << "')!" << std::endl;
            PyErr_Print();
            return;
        }
        PyObject* openfunction = PyObject_GetAttrString( iomodule, "open" );
        if( openfunction == NULL ) {
            std::cerr << "ERROR: FastFile failed to get the io module open function ('"
                    << filepath << "')!" << std::endl;
            PyErr_Print();
            return;
        }
        // io.open( filepath, mode="r", buffering=-1, encoding="UTF8", errors="replace" )
        openfile = PyObject_CallFunction( openfunction, "ssiss",
                filepath, "r", -1, "UTF8", "replace" );

        Py_DECREF( openfunction );
        if( openfile == NULL ) {
            std::cerr << "ERROR: FastFile failed to open the file ('"
                    << filepath << "')!" << std::endl;
            PyErr_Print();
            return;
        }
        PyObject* iterfunction = PyObject_GetAttrString( openfile, "__iter__" );
        if( iterfunction == NULL ) {
            std::cerr << "ERROR: FastFile failed to get the file __iter__ method ('"
                    << filepath << "')!" << std::endl;
            PyErr_Print();
            return;
        }
        PyObject* openfileresult = PyObject_CallObject( iterfunction, NULL );
        Py_DECREF( iterfunction );
        if( openfileresult == NULL ) {
            std::cerr << "ERROR: FastFile failed to get the file iterator object ('"
                    << filepath << "')!" << std::endl;
            PyErr_Print();
            return;
        }
        fileiterator = PyObject_GetAttrString( openfile, "__next__" );
        Py_DECREF( openfileresult );
        if( fileiterator == NULL ) {
            std::cerr << "ERROR: FastFile failed to get the file __next__ method ('"
                    << filepath << "')!" << std::endl;
            PyErr_Print();
            return;
        }
    }

    ~FastFile() {
        this->close();
        Py_XDECREF( iomodule );
        Py_XDECREF( openfile );
        Py_XDECREF( fileiterator );

        for( PyObject* pyobject : linecache ) {
            Py_DECREF( pyobject );
        }
    }

    void close() {
        if( openfile == NULL ) { return; }

        PyObject* closefunction = PyObject_GetAttrString( openfile, "close" );
        if( closefunction == NULL ) {
            std::cerr << "ERROR: FastFile failed to get the file close method for ('"
                    << filepath << "')!" << std::endl;
            PyErr_Print();
            return;
        }
        PyObject* closefileresult = PyObject_CallObject( closefunction, NULL );
        Py_DECREF( closefunction );

        if( closefileresult == NULL ) {
            std::cerr << "ERROR: FastFile failed to close the open file ('"
                    << filepath << "')!" << std::endl;
            PyErr_Print();
            return;
        }
        Py_DECREF( closefileresult );
    }

    bool _getline() {
        // Avoid StopIteration being raised multiple times because
        // _getline() keeps being called after the file is exhausted
        if( hasfinished ) { return false; }

        PyObject* readline = PyObject_CallObject( fileiterator, NULL );
        if( readline != NULL ) {
            linecount += 1;
            linecache.push_back( readline );
            return true;
        }
        // PyErr_Print();
        PyErr_Clear();
        hasfinished = true;
        return false;
    }
};
When this is compiled with the Visual Studio compiler, it has the following performance, reported with this code (the printed values are time ratios, despite the % sign in the format string):
print( 'fastfile_time %.2f%%, python_time %.2f%%' % (
fastfile_time/python_time, python_time/fastfile_time ), flush=True )
$ python3 fastfileperformance.py
Python timedifference 0:00:00.985254
FastFile timedifference 0:00:01.084283
fastfile_time 1.10%, python_time 0.91% = 0.09%
$ python3 fastfileperformance.py
Python timedifference 0:00:00.979861
FastFile timedifference 0:00:01.073879
fastfile_time 1.10%, python_time 0.91% = 0.09%
$ python3 fastfileperformance.py
Python timedifference 0:00:00.990369
FastFile timedifference 0:00:01.086416
fastfile_time 1.10%, python_time 0.91% = 0.09%
$ python3 fastfileperformance.py
Python timedifference 0:00:00.975223
FastFile timedifference 0:00:01.077857
fastfile_time 1.11%, python_time 0.90% = 0.10%
$ python3 fastfileperformance.py
Python timedifference 0:00:00.988327
FastFile timedifference 0:00:01.085866
fastfile_time 1.10%, python_time 0.91% = 0.09%
$ python3 fastfileperformance.py
Python timedifference 0:00:00.971848
FastFile timedifference 0:00:01.087894
fastfile_time 1.12%, python_time 0.89% = 0.11%
$ python3 fastfileperformance.py
Python timedifference 0:00:00.968116
FastFile timedifference 0:00:01.079976
fastfile_time 1.12%, python_time 0.90% = 0.10%
$ python3 fastfileperformance.py
Python timedifference 0:00:00.980856
FastFile timedifference 0:00:01.068325
fastfile_time 1.09%, python_time 0.92% = 0.08%
But when it is compiled with g++, it gets this performance:
$ /bin/python3.6 fastfileperformance.py
Python timedifference 0:00:00.703964
FastFile timedifference 0:00:00.813478
fastfile_time 1.16%, python_time 0.87% = 0.13%
$ /bin/python3.6 fastfileperformance.py
Python timedifference 0:00:00.703432
FastFile timedifference 0:00:00.809531
fastfile_time 1.15%, python_time 0.87% = 0.13%
$ /bin/python3.6 fastfileperformance.py
Python timedifference 0:00:00.705319
FastFile timedifference 0:00:00.814130
fastfile_time 1.15%, python_time 0.87% = 0.13%
$ /bin/python3.6 fastfileperformance.py
Python timedifference 0:00:00.711852
FastFile timedifference 0:00:00.837132
fastfile_time 1.18%, python_time 0.85% = 0.15%
$ /bin/python3.6 fastfileperformance.py
Python timedifference 0:00:00.695033
FastFile timedifference 0:00:00.800901
fastfile_time 1.15%, python_time 0.87% = 0.13%
$ /bin/python3.6 fastfileperformance.py
Python timedifference 0:00:00.694661
FastFile timedifference 0:00:00.796754
fastfile_time 1.15%, python_time 0.87% = 0.13%
$ /bin/python3.6 fastfileperformance.py
Python timedifference 0:00:00.699377
FastFile timedifference 0:00:00.816715
fastfile_time 1.17%, python_time 0.86% = 0.14%
$ /bin/python3.6 fastfileperformance.py
Python timedifference 0:00:00.699229
FastFile timedifference 0:00:00.818774
fastfile_time 1.17%, python_time 0.85% = 0.15%