Slow communication using shared memory between user mode and kernel mode

Published 2019-08-19 10:08

Question:

I am running a thread in the Windows kernel that communicates with an application over shared memory. Everything works fine except that the communication is slow due to a Sleep loop. I have been investigating spin locks, mutexes, and the Interlocked functions but can't really figure this one out. I have also considered Windows events but don't know how they perform. Please advise on a faster solution that keeps the communication over shared memory, possibly using Windows events.

KERNEL CODE

typedef struct _SHARED_MEMORY
{
    BOOLEAN mutex;
    CHAR data[BUFFER_SIZE];
} SHARED_MEMORY, *PSHARED_MEMORY;

ZwCreateSection(...)
ZwMapViewOfSection(...)

while (TRUE) {
    if (((PSHARED_MEMORY)SharedSection)->mutex == TRUE) {
      //... do work...
      ((PSHARED_MEMORY)SharedSection)->mutex = FALSE;
    }
    KeDelayExecutionThread(KernelMode, FALSE, &PollingInterval);
}

APPLICATION CODE

OpenFileMapping(...)
MapViewOfFile(...)

...

RtlCopyMemory(&SM->data, WriteData, Size);
SM->mutex = TRUE;

while (SM->mutex != FALSE) {
    Sleep(1); // Slow and removing it will cause an infinite loop
}

RtlCopyMemory(ReadData, &SM->data, Size);

UPDATE 1: Currently this is the fastest solution I have come up with:

while(InterlockedCompareExchange(&SM->mutex, FALSE, FALSE));

However, I find it odd that you need to do an exchange, and that there is no function that only does the compare (a plain read).

Answer 1:

You don't want to use InterlockedCompareExchange. It burns the CPU, saturates core resources that might be needed by another thread sharing that physical core, and can saturate inter-core buses.

You do need to do two things:

1) Write an InterlockedGet function and use it.

2) Prevent the loop from burning CPU resources and from taking the mother of all mispredicted branches when it finally gets unblocked.

For 1, this is known to work on all compilers that support InterlockedCompareExchange, at least last time I checked:

__inline static int InterlockedGet(int *val)
{
    // The volatile cast forces a fresh load from memory on every call,
    // preventing the compiler from caching the value in a register.
    return *((volatile int *)val);
}

For 2, put this as the body of the wait loop:

__asm
{
    rep nop
}

For x86 CPUs, this is specified to solve the resource saturation and branch prediction problems.

Putting it together:

while ((*(volatile int *) &SM->mutex) != FALSE) {
    __asm
    {
        rep nop
    }
}

Change int as needed if it's not appropriate.