I've asked this question before with no replies. I'm asking it again, much more simplified this time.
I have a dll called by Python ctypes, with a callback function. The callback works correctly all the way through (I can see it in operation if I step through the program in Visual Studio), but on exit Visual Studio throws an "access violation" exception. BUT if I remove the call to the callback from the dll, it exits normally without an access violation.
Is there something else I must do to exit from a dll with a callback? I have researched this for hours and I haven't found anything online that solves this.
Here's the ctypes code. I omitted the dll code to keep this short (it's written in NASM) but if it's needed I can post it, too.
def SimpleTestFunction_asm(X):
    Input_Length_Array = []
    Input_Length_Array.append(len(X)*8)
    CA_X = (ctypes.c_double * len(X))(*X)
    length_array_out = (ctypes.c_double * len(Input_Length_Array))(*Input_Length_Array)
    hDLL = ctypes.WinDLL("C:/Test_Projects/SimpleTestFunction/SimpleTestFunction.dll")
    CallName = hDLL.Main_Entry_fn
    CallName.argtypes = [ctypes.POINTER(ctypes.c_double),ctypes.POINTER(ctypes.c_double),ctypes.POINTER(ctypes.c_longlong)]
    CallName.restype = ctypes.POINTER(ctypes.c_int64)
    #__________
    #The callback function
    LibraryCB = ctypes.WINFUNCTYPE(ctypes.c_double, ctypes.c_double)
    def LibraryCall(ax):
        bx = math.ceil(ax)
        return (bx)
    lib_call = LibraryCB(LibraryCall)
    lib_call = ctypes.cast(lib_call,ctypes.POINTER(ctypes.c_longlong))
    #__________
    ret_ptr = CallName(CA_X,length_array_out,lib_call)
I would really REALLY appreciate any ideas on how to solve this. I hope this simplified post will help.
Thanks very much.
I made some minor changes to your code to make it actually run (imports), added a print to see the addresses of the objects passed and the return value, and created an equivalent C DLL to check that the pointers pass correctly and the callback works.
Python:
Test.DLL source:
Output:
You can see the pointers are the same and the callback return value and function return value are correct.
It is likely that your NASM code isn't implementing the calling convention correctly, or is corrupting the stack when accessing the arrays. I just did the minimum to make your Python code work. I did think it odd that length_array_out is always a length-1 double array with a value 8 times the length of the input array X. How does the NASM code know how long the arrays are?
You could be more type-correct and declare the following instead of casting the callback to a long long *:
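A sketch of what such a declaration could look like, assuming the callback signature from the question (one double in, one double out). The idea is to put the callback type itself into argtypes so that no cast is needed. The name main_entry_argtypes and the CFUNCTYPE fallback are assumptions added here so the snippet runs without the DLL and outside Windows:

```python
import ctypes
import math

# WINFUNCTYPE exists only on Windows; the fallback to CFUNCTYPE is an
# assumption made here so the sketch can run elsewhere (on x64 Windows the
# two calling conventions are the same).
FUNCTYPE = getattr(ctypes, "WINFUNCTYPE", ctypes.CFUNCTYPE)

# Callback prototype from the question: double f(double)
LibraryCB = FUNCTYPE(ctypes.c_double, ctypes.c_double)

def LibraryCall(ax):
    return math.ceil(ax)

# Keep this object referenced for as long as the DLL may call it.
lib_call = LibraryCB(LibraryCall)

# Hypothetical argtypes list: the third entry is the callback type itself
# rather than POINTER(c_longlong), so lib_call can be passed without a cast.
main_entry_argtypes = [
    ctypes.POINTER(ctypes.c_double),
    ctypes.POINTER(ctypes.c_double),
    LibraryCB,
]

# With the real DLL loaded this would be used as:
#   CallName.argtypes = main_entry_argtypes
#   ret_ptr = CallName(CA_X, length_array_out, lib_call)
```

With this form ctypes checks that the third argument really is a LibraryCB instance, which a cast to a long long pointer would hide.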
@Mark Tolonen, thank you very much for your detailed analysis. I'm posting this as an answer because the formatting of the code won't come out correctly in a comment -- but I chose your answer as the best answer.
I suspected that stack alignment may be the problem, and you eliminated ctypes as the source, so I focused on the stack. Here's what I did to make it work.
In the NASM code, I push rbp and rdi on entry, then restore them on exit. Here, before the call, I set the stack state back by popping rbp and rdi from the stack. Then I subtract 32 bytes (not 40) from rsp. When the call is finished, I restore the stack state:
For an external function call (like to a C library function), I have to subtract 40 bytes, but for this callback I need only 32 bytes. Before your answer I had tried that with 40 bytes and it didn't work. I guess the reason is that it's not calling an external library; it's a callback to the ctypes code that's calling the dll in the first place.
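A possible explanation for 32 versus 40, offered as a sketch rather than a diagnosis because the NASM source isn't shown here: the Windows x64 calling convention asks the caller to reserve 32 bytes of shadow space and to have rsp 16-byte aligned at the call instruction. Which of the two amounts satisfies both conditions depends on how many 8-byte values are still pushed at that point, not on whether the target is a library function or a callback. The helper bytes_to_reserve and its parameter are hypothetical names used only for this illustration:

```python
SHADOW_SPACE = 32  # bytes the caller reserves for the callee on Windows x64
ALIGNMENT = 16     # rsp must be a multiple of 16 at the call instruction

def bytes_to_reserve(outstanding_pushes):
    """Smallest amount >= 32 to subtract from rsp so the call site is aligned.

    outstanding_pushes: number of 8-byte pushes made since function entry
    that have not been popped yet (hypothetical parameter).
    """
    # On entry rsp is 8 bytes off alignment because of the return address.
    used = 8 + 8 * outstanding_pushes
    reserve = SHADOW_SPACE
    while (used + reserve) % ALIGNMENT != 0:
        reserve += 8
    return reserve

for pushes in range(4):
    print(pushes, bytes_to_reserve(pushes))
```

Under these assumptions an even number of outstanding pushes calls for 40 bytes and an odd number for 32, so a change in what is on the stack before the call would change which value works.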
One other thing. The call sends a floating-point value (xmm0) and returns an integer value, but the integer value is returned in the xmm0 register, not rax. Setting the prototype in ctypes to an integer return doesn't do it. It has to stay like this:
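Presumably this refers to the prototype from the question, with c_double as both the return type and the argument type. A small sketch, runnable without the DLL, of why the value comes back in xmm0: math.ceil returns a Python int, but ctypes converts it to the declared C return type, a double, and on Windows x64 floating-point results are returned in xmm0. The CFUNCTYPE fallback is an assumption added so the snippet also runs outside Windows:

```python
import ctypes
import math

# Fallback for non-Windows platforms (assumption for portability only).
FUNCTYPE = getattr(ctypes, "WINFUNCTYPE", ctypes.CFUNCTYPE)

# Return type first, then argument types: double f(double)
LibraryCB = FUNCTYPE(ctypes.c_double, ctypes.c_double)

@LibraryCB
def lib_call(ax):
    # math.ceil gives a Python int; ctypes converts it to a C double
    # because that is the declared return type of the prototype.
    return math.ceil(ax)

result = lib_call(2.3)
print(type(result).__name__, result)
```

Declaring an integer return type instead would make ctypes hand the result back in rax, so the NASM side would have to read rax rather than xmm0.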
Thanks again for your reply. You showed me where to look.
P.S. length_array_out passes the length of the input array to NASM. If I pass more than one array, length_array_out will be longer with one qword for each length; currently I convert the qword to integer on entry.