How to record state of xbox/gamepad controller in

Posted 2019-08-10 04:13

Question:

I need to know at a specific time the value of all buttons of an xbox controller. The reason being that I'm building a training set for a neural network, and I'm trying to simultaneously take a snapshot of the screen and take a "snapshot" of the controller state. Note that I was able to successfully do this for a keyboard version of this project, but the xbox controller is giving me difficulty.

What I've tried is creating a dictionary of buttons and values, and updating the dictionary every time I receive an event from the controller. Then I would save the image and dictionary as an instance of training data. However, the inputs end up not at all synced with the images. I'm thinking that the issue might be related to threading or subprocesses in one of the packages used to read the controller, but I'm not skilled enough to know how to fix it.

Below is my code.

from inputs import get_gamepad
import time
import cv2
import numpy as np
from mss.windows import MSS as mss
from PIL import ImageGrab  # needed by screen_record() below

#Only track relevant inputs
gp_state = {#'ABS_HAT0X' : 0, #-1 to 1
             #'ABS_HAT0Y' : 0, #-1 to 1
             #'ABS_RX' : 0, #-32768 to 32767
             #'ABS_RY' : 0, #-32768 to 32767
             'ABS_RZ' : 0, #0 to 255
             'ABS_X' : 0, #-32768 to 32767
             'ABS_Y' : 0, #-32768 to 32767
             #'ABS_Z' : 0, #0 to 255
             'BTN_EAST' : 0,
             'BTN_NORTH' : 0,
             #'BTN_SELECT' : 0,
             'BTN_SOUTH' : 0,
             #'BTN_START' : 0,
             #'BTN_THUMBL' : 0,
             #'BTN_THUMBR' : 0,
             'BTN_TL' : 0,
             'BTN_TR' : 0,
             'BTN_WEST' : 0,
             #'SYN_REPORT' : 0,
             }

dead_zone = 7500

def screen_record(): 
    last_time = time.time()
    while(True):
        # 800x600 windowed mode
        printscreen =  np.array(ImageGrab.grab(bbox=(0,40,800,640)))
        last_time = time.time()
        cv2.imshow('window',cv2.cvtColor(printscreen, cv2.COLOR_BGR2RGB))
        if cv2.waitKey(25) & 0xFF == ord('q'):
            cv2.destroyAllWindows()
            break

def process_img(image):
    original_image = image
    processed_img = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    contrast = 1
    brightness = 0
    out = cv2.addWeighted(processed_img, contrast, processed_img, 0, brightness)
    return out

def main():

    #Give myself time to switch windows
    #Screen should be in top left
    for _ in range(4):
        time.sleep(1)

    controller_input = np.zeros(5)
    training_data = []
    training_files = 0

    with mss() as sct:
        while True:
            #Get screen and display
            bbox = (150,240,650,490)
            screen =  np.array(sct.grab(bbox))
            new_screen = process_img(screen)
            cv2.imshow('window', new_screen)
            new_screen = cv2.resize(new_screen, (100,50))

            #Map events to dictionary
            events = get_gamepad()
            for event in events:
                gp_state[event.code] = event.state

            #Set to zero if in dead zone
            if abs(gp_state['ABS_X']) < dead_zone:
                gp_state['ABS_X'] = 0 

            if abs(gp_state['ABS_Y']) < dead_zone:
                gp_state['ABS_Y'] = 0 

            #Set values to be between 0 and 1.
            controller_input[0] = (gp_state['ABS_X'] + 32768) / (32767 + 32768)
            controller_input[1] = gp_state['ABS_RZ'] / 255
            controller_input[2] = gp_state['BTN_SOUTH']
            controller_input[3] = gp_state['BTN_EAST']
            controller_input[4] = gp_state['BTN_TR']


            record = gp_state['BTN_NORTH'] #Record while holding y button
            if record:
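                # Copy controller_input: the same array is overwritten on every loop
                # pass, so storing it directly would make all samples share one value.
                training_data.append(np.array([new_screen, controller_input.copy()], dtype=object))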
                print(controller_input)
                time.sleep(1)

            if len(training_data) % 500 == 0 and record:
                filename = f"training_data/rlb_XBOXtrain_{time.time()}.npy"
                np.save(filename, training_data)
                training_files += 1
                print(f"Trained {training_files} files!")
                training_data = []


            if cv2.waitKey(25) & 0xFF == ord('q'):
                cv2.destroyAllWindows()
                break

main()

I feel like I am making this way harder than it needs to be. But is there an easier way to just get the state of the controller at a certain point in time?

Note that I've found some solutions that work for Linux, but I am running in Windows 10. Here is an example of a Linux solution: https://github.com/FRC4564/Xbox

Answer 1:

The TensorKart project has already solved that problem: https://github.com/kevinhughes27/TensorKart/blob/master/utils.py
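As far as I can tell, the linked file reads the controller on a background thread and keeps the latest values around so the main loop can sample them at any moment. That matters here because get_gamepad() blocks until an event arrives, so calling it inside the capture loop stalls the loop and applies events late. Below is a minimal sketch of that pattern, not the TensorKart code itself. The class name GamepadState and its methods are made up for illustration, and the event source is passed in so the sketch can be tried without a controller.

```python
import threading


class GamepadState:
    """Holds the latest value of every gamepad code, updated by a background thread.

    Hypothetical helper for illustration; not part of the `inputs` package.
    """

    def __init__(self, read_events, initial=None):
        # read_events: a blocking callable that returns an iterable of events
        # exposing .code and .state (inputs.get_gamepad is assumed to fit).
        self._read_events = read_events
        self._state = dict(initial or {})
        self._lock = threading.Lock()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()
        return self

    def _run(self):
        # Runs forever on the daemon thread; it stops if read_events raises.
        while True:
            for event in self._read_events():
                with self._lock:
                    self._state[event.code] = event.state

    def snapshot(self):
        """Return a copy of the current state without blocking on new events."""
        with self._lock:
            return dict(self._state)
```

With the question's code this would be used as gp = GamepadState(get_gamepad, gp_state).start() before the loop, and state = gp.snapshot() right after sct.grab(bbox), replacing the blocking get_gamepad() call inside the loop.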



Answer 2:

"I feel like I am making this way harder than it needs to be."

No, this is actually hard. It's hard because you don't just need to know what the gamepad state is at a particular time; you also want to know which gamepad state was used to draw a particular frame. The time at which the gamepad state was sampled will always be earlier than the time the frame was drawn, and the gap may grow due to latency added by the app itself. The added latency might be constant for the whole app or it might vary between different parts of the app. It's not something you can easily account for.

Your Python script is recording gamepad inputs as soon as they are received, so I'd expect it to always run at least a frame or two ahead of the screen captures.
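If the offset turned out to be roughly constant, one crude workaround would be to timestamp every controller snapshot and label each frame with the snapshot taken a fixed delay before the capture. The sketch below shows only that bookkeeping. StateHistory, label_for_frame and assumed_latency are hypothetical names, and the latency value is a guess that would have to be measured for the specific game; it does nothing for latency that varies.

```python
import bisect


class StateHistory:
    """Timestamped log of controller snapshots that can be queried by time."""

    def __init__(self):
        self._times = []
        self._states = []

    def record(self, timestamp, state):
        # Timestamps are expected in non-decreasing order (e.g. time.perf_counter()).
        self._times.append(timestamp)
        self._states.append(dict(state))

    def at(self, timestamp):
        """Return the latest snapshot recorded at or before `timestamp`, or None."""
        i = bisect.bisect_right(self._times, timestamp)
        return self._states[i - 1] if i else None


def label_for_frame(history, frame_time, assumed_latency):
    # assumed_latency: guessed input-to-display delay in seconds (an assumption).
    return history.at(frame_time - assumed_latency)
```

For example, with snapshots recorded at 0.00 s, 0.05 s and 0.10 s and an assumed latency of 0.05 s, a frame captured at 0.12 s is labeled with the snapshot from 0.05 s.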

"I'm thinking that the issue might be related to threading or subprocesses in one of the packages used to read the controller, but I'm not skilled enough to know how to fix it."

It's probably just latency added by the gamepad input code in the app you're measuring and not something that can be fixed. Most apps don't make any attempt to respond to gamepad inputs as soon as they're received and instead handle them all at once during the per-frame update step. On average, that adds latency equal to half the frame period (about 8 ms at 60 fps).
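To put a number on that: if an input arrives at a uniformly random moment within a frame and is only read at the next per-frame update, it waits half a frame period on average. The short sketch below computes that figure and checks it with a simulation. The function names are made up for illustration, and the uniform-arrival model is an assumption.

```python
import random


def mean_polling_latency_ms(fps):
    """Expected wait in ms for an input that is only read once per frame."""
    return 1000.0 / fps / 2.0


def simulate_polling_latency_ms(fps, samples=100_000, seed=0):
    """Monte Carlo estimate of the same quantity under uniform arrival times."""
    rng = random.Random(seed)
    period_ms = 1000.0 / fps
    # An input arriving t ms into a frame waits (period_ms - t) until the next poll.
    waits = [period_ms - rng.uniform(0.0, period_ms) for _ in range(samples)]
    return sum(waits) / samples
```

At 60 fps the frame period is about 16.7 ms, so the expected added delay is about 8.3 ms; at 30 fps it doubles to about 16.7 ms. This covers only the polling step, not any further buffering in the game or the display.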

How to fix this? I think measuring gamepad state from another application is going to be difficult due to the latency issues. If you can, it would be best to instrument the app to record the gamepad state during its main loop; that way you know you are recording what was actually used. On Windows it should be possible to do this by providing your own version of the XInput DLL that records the current state whenever XInputGetState is called.
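A replacement DLL would have to be native code, so it is not sketched here. For reference, though, this is roughly what XInputGetState hands back, read from Python with ctypes. It is a sketch under assumptions: xinput1_4.dll is the XInput version that ships with Windows 8 and later, the helper names (load_xinput, read_pad, decode_buttons) are made up, and polling from a separate process like this still sees the state earlier than the game does, so it does not remove the latency problem described above.

```python
import ctypes


class XINPUT_GAMEPAD(ctypes.Structure):
    # Field layout follows the XINPUT_GAMEPAD struct from the Windows SDK.
    _fields_ = [
        ("wButtons", ctypes.c_uint16),
        ("bLeftTrigger", ctypes.c_uint8),    # 0 to 255
        ("bRightTrigger", ctypes.c_uint8),   # 0 to 255
        ("sThumbLX", ctypes.c_int16),        # -32768 to 32767
        ("sThumbLY", ctypes.c_int16),
        ("sThumbRX", ctypes.c_int16),
        ("sThumbRY", ctypes.c_int16),
    ]


class XINPUT_STATE(ctypes.Structure):
    _fields_ = [("dwPacketNumber", ctypes.c_uint32), ("Gamepad", XINPUT_GAMEPAD)]


# Bit masks for wButtons (XINPUT_GAMEPAD_A and friends in the SDK headers).
BUTTON_MASKS = {"A": 0x1000, "B": 0x2000, "X": 0x4000, "Y": 0x8000,
                "LB": 0x0100, "RB": 0x0200}


def decode_buttons(w_buttons):
    """Turn the wButtons bit field into a dict of 0/1 values."""
    return {name: int(bool(w_buttons & mask)) for name, mask in BUTTON_MASKS.items()}


def load_xinput(dll_name="xinput1_4.dll"):
    """Windows only: load the XInput DLL once and reuse the handle."""
    return ctypes.WinDLL(dll_name)


def read_pad(xinput, user_index=0):
    """Return the current XINPUT_STATE, or None if no controller is connected."""
    state = XINPUT_STATE()
    result = xinput.XInputGetState(user_index, ctypes.byref(state))
    return state if result == 0 else None  # 0 is ERROR_SUCCESS
```

Judging by the value ranges in the question's comments, sThumbLX and bRightTrigger appear to correspond to ABS_X and ABS_RZ. Because the call returns immediately with the current state, read_pad(xinput) can be placed directly after the screen grab.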