Wrapping a LAPACKE function using Cython

I'm trying to wrap the LAPACK function dgtsv (a solver for tridiagonal systems of equations) using Cython.

I came across this previous answer, but since dgtsv is not one of the LAPACK functions that are wrapped in scipy.linalg I don't think I can use this particular approach. Instead I've been trying to follow this example.

Here's the contents of my lapacke.pxd file:

ctypedef int lapack_int

cdef extern from "lapacke.h" nogil:

    int LAPACK_ROW_MAJOR
    int LAPACK_COL_MAJOR

    lapack_int LAPACKE_dgtsv(int matrix_order,
                             lapack_int n,
                             lapack_int nrhs,
                             double * dl,
                             double * d,
                             double * du,
                             double * b,
                             lapack_int ldb)

...here's my thin Cython wrapper in _solvers.pyx:

#!python

cimport cython
from lapacke cimport *

cpdef TDMA_lapacke(double[::1] DL, double[::1] D, double[::1] DU,
                   double[:, ::1] B):

    cdef:
        lapack_int n = D.shape[0]
        lapack_int nrhs = B.shape[1]
        lapack_int ldb = B.shape[0]
        double * dl = &DL[0]
        double * d = &D[0]
        double * du = &DU[0]
        double * b = &B[0, 0]
        lapack_int info

    info = LAPACKE_dgtsv(LAPACK_ROW_MAJOR, n, nrhs, dl, d, du, b, ldb)

    return info

...and here's a Python wrapper and test script:

import numpy as np
from scipy import sparse
from cymodules import _solvers


def trisolve_lapacke(dl, d, du, b, inplace=False):

    if (dl.shape[0] != du.shape[0] or dl.shape[0] != d.shape[0] - 1
            or b.shape != d.shape):
        raise ValueError('Invalid diagonal shapes')

    if b.ndim == 1:
        # b is (LDB, NRHS)
        b = b[:, None]

    # be sure to force a copy of d and b if we're not solving in place
    if not inplace:
        d = d.copy()
        b = b.copy()

    # this may also force copies if arrays are improperly typed/noncontiguous
    dl, d, du, b = (np.ascontiguousarray(v, dtype=np.float64)
                    for v in (dl, d, du, b))

    # b will now be modified in place to contain the solution
    info = _solvers.TDMA_lapacke(dl, d, du, b)
    print info

    return b.ravel()


def test_trisolve(n=20000):

    dl = np.random.randn(n - 1)
    d = np.random.randn(n)
    du = np.random.randn(n - 1)

    M = sparse.diags((dl, d, du), (-1, 0, 1), format='csc')
    x = np.random.randn(n)
    b = M.dot(x)

    x_hat = trisolve_lapacke(dl, d, du, b)

    print "||x - x_hat|| = ", np.linalg.norm(x - x_hat)

Unfortunately, test_trisolve just segfaults on the call to _solvers.TDMA_lapacke. I'm pretty sure my setup.py is correct - ldd _solvers.so shows that _solvers.so is being linked to the correct shared libraries at runtime.

I'm not really sure how to proceed from here - any ideas?

A brief update:

for smaller values of n I tend not to get segfaults immediately, but I do get nonsense results (||x - x_hat|| ought to be very close to 0):

In [28]: test_trisolve2.test_trisolve(10)
0
||x - x_hat|| =  6.23202576396

In [29]: test_trisolve2.test_trisolve(10)
-7
||x - x_hat|| =  3.88623414288

In [30]: test_trisolve2.test_trisolve(10)
0
||x - x_hat|| =  2.60190676562

In [31]: test_trisolve2.test_trisolve(10)
0
||x - x_hat|| =  3.86631743386

In [32]: test_trisolve2.test_trisolve(10)
Segmentation fault

Usually LAPACKE_dgtsv returns with code 0 (which should indicate success), but occasionally I get -7, which means that argument 7 (b) had an illegal value. What's happening is that only the first value of b is actually being modified in place. If I keep on calling test_trisolve I will eventually hit a segfault even when n is small.

回答1:

OK, I figured it out eventually - it seems I've misunderstood what row- and column-major refer to in this case.

Since C-contiguous arrays follow row-major order, I assumed that I ought to specify LAPACK_ROW_MAJOR as the first argument to LAPACKE_dgtsv.

In fact, if I change

info = LAPACKE_dgtsv(LAPACK_ROW_MAJOR, ...)

info = LAPACKE_dgtsv(LAPACK_COL_MAJOR, ...)

then my function works:

test_trisolve2.test_trisolve()
0
||x - x_hat|| =  6.67064747632e-12

This seems pretty counter-intuitive to me - can anyone explain why this is the case?

回答2:

Although rather old the question seems still to be relevant. The observed behavior is the result of a misinterpretation of parameter LDB:

Fortran arrays are col major and the leading dimension of the array B corresponds to N. Therefore LDB >= max(1,N).
With row major LDB corresponds to NRHS and therefore the condition LDB >= max(1,NRHS) must be met.

Comment # b is (LDB, NRHS) is not correct since b has the dimension (LDB,N) and LDB should be 1 in this case.

Switching from LAPACK_ROW_MAJOR to LAPACK_COL_MAJOR fixes the issue as long as NRHS is equal to 1. The memory layout of a col major (N,1) is the same as row major (1,N). It will fail, however, if NRHS is greater than 1.