I'm building an automated game bot in Python on OS X 10.8.2, and while researching Python GUI automation I discovered autopy. The mouse manipulation API is great, but it seems that the screen capture methods rely on deprecated OpenGL methods...
Are there any efficient ways of getting the color value of a pixel in OS X? The only way I can think of right now is to use os.system("screencapture foo.png"), but that process seems to have unneeded overhead as I'll be polling very quickly.
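Roughly, the approach I have now looks like the sketch below (assuming Pillow is installed to read the pixel back from the saved file; the coordinates and filename are placeholders):

```python
import os
from PIL import Image

# Naive polling approach: shell out to screencapture, write a PNG to disk,
# then read a single pixel back with Pillow. Every poll pays for process
# creation, PNG compression, and a disk round-trip.
os.system("screencapture -x foo.png")   # -x: don't play the camera sound
r, g, b = Image.open("foo.png").convert("RGB").getpixel((100, 100))
print(r, g, b)
```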
A small improvement, but using the TIFF compression option for screencapture is a bit quicker. This does have a lot of overhead, as you say (the subprocess creation, writing/reading from disc, compressing/decompressing).
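A minimal timing sketch of the comparison (the output paths are placeholders, and -x only suppresses the capture sound):

```python
import subprocess
import time

# Compare the default PNG output against TIFF output from the built-in
# `screencapture` utility; TIFF avoids PNG's heavier compression and is
# reported above to be a bit quicker to write.
for fmt in ("png", "tiff"):
    path = "/tmp/screen.%s" % fmt
    start = time.time()
    subprocess.call(["screencapture", "-x", "-t", fmt, path])
    print("%s: %.3f seconds" % (fmt, time.time() - start))
```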
Instead, you could use PyObjC to capture the screen using CGWindowListCreateImage. I found it took about 70ms (~14fps) to capture a 1680x1050 pixel screen and have the values accessible in memory.
A few random notes:
- Importing the Quartz.CoreGraphics module is the slowest part, about 1 second. The same is true for importing most of the PyObjC modules. Unlikely to matter in this case, but for short-lived processes you might be better off writing the tool in ObjC.
- Most of the time seems to be spent in the CGDataProviderCopyData call - I wonder if there's a way to access the data directly, since we don't need to modify it?
- The ScreenPixel.pixel function is pretty quick, but accessing large numbers of pixels is still slow (since 0.01ms * 1650*1050 is about 17 seconds) - if you need to access lots of pixels, it's probably quicker to struct.unpack_from them all in one go.
Here's the code:
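A minimal sketch of the approach described above, assuming the standard PyObjC Quartz bindings and a BGRA pixel layout (the class and method names match the notes; the bytes-per-row handling is an extra precaution for row padding):

```python
import struct
import Quartz.CoreGraphics as CG


class ScreenPixel(object):
    """Capture the screen with CoreGraphics and read pixel values from it."""

    def capture(self, region=None):
        """Capture the given CGRect (the whole screen by default)."""
        if region is None:
            region = CG.CGRectInfinite

        # Grab the current screen contents as a CGImage
        image = CG.CGWindowListCreateImage(
            region,
            CG.kCGWindowListOptionOnScreenOnly,
            CG.kCGNullWindowID,
            CG.kCGWindowImageDefault)

        # Copy the raw BGRA bytes out of the image's data provider --
        # this copy is where most of the time goes
        prov = CG.CGImageGetDataProvider(image)
        self._data = CG.CGDataProviderCopyData(prov)

        self.width = CG.CGImageGetWidth(image)
        self.height = CG.CGImageGetHeight(image)
        # Rows may be padded, so keep the real stride for offset maths
        self.bytes_per_row = CG.CGImageGetBytesPerRow(image)

    def pixel(self, x, y):
        """Return (r, g, b, a) for the pixel at screen coordinates (x, y)."""
        # Four unsigned bytes per pixel, stored as BGRA
        offset = self.bytes_per_row * int(y) + 4 * int(x)
        b, g, r, a = struct.unpack_from("BBBB", self._data, offset=offset)
        return (r, g, b, a)


if __name__ == "__main__":
    sp = ScreenPixel()
    sp.capture()
    print(sp.pixel(0, 0))
```

Passing a region such as CG.CGRectMake(0, 0, 100, 100) to capture() limits the grab to that block, which should be quicker than capturing the full screen.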
I came across this post while searching for a way to get screenshots on Mac OS X for real-time processing. I tried ImageGrab from PIL as suggested in some other posts but couldn't get the data fast enough (only about 0.5 fps).
The answer https://stackoverflow.com/a/13024603/3322123 in this post to use PyObjC saved my day! Thanks @dbr!
However, my task requires getting all pixel values rather than just a single pixel, and, following up on the third note by @dbr, I added a new method to this class to get a full image, in case anyone else needs it.
The image data is returned as a numpy array with dimensions (height, width, 3), which can be used directly for post-processing in numpy, OpenCV, etc. Getting individual pixel values from it also becomes trivial using numpy indexing.
I tested the code with a 1600 x 1000 screenshot - getting the data with capture() took ~30 ms, and converting it to a numpy array with getimage() takes only ~50 ms on my MacBook. So now I have >10 fps, and even faster for smaller regions.
Note that I throw away the "alpha" channel from the four BGRA channels.
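Roughly, such a getimage() method could look like the sketch below. It assumes the ScreenPixel class from the earlier sketch (including its stored bytes_per_row) and that numpy can view the raw CFData buffer directly:

```python
import numpy as np


def getimage(self):
    """Return the captured screen as a (height, width, 3) numpy array
    in BGR order, dropping the alpha channel. Call capture() first."""
    buf = np.frombuffer(self._data, dtype=np.uint8)
    # Rows can be padded, so reshape using the real stride, then trim
    img = buf.reshape(self.height, self.bytes_per_row // 4, 4)
    return img[:, :self.width, :3]   # keep B, G, R; drop A


# One hypothetical way of attaching the method to the class above
ScreenPixel.getimage = getimage
```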
This was all so very helpful I had to come back to comment; however, I don't have the reputation. I do, however, have sample code combining the answers above for a lightning-quick screen capture and save, thanks to @dbr and @qqg!
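A possible combination of the two answers, sketched under the assumption that the ScreenPixel class and getimage() method from the sketches above are available, with Pillow used for saving:

```python
import numpy as np
from PIL import Image

# Capture the screen in memory, convert to a numpy array, and save it as PNG.
sp = ScreenPixel()
sp.capture()                                  # raw BGRA bytes
bgr = sp.getimage()                           # (height, width, 3) array, BGR
rgb = np.ascontiguousarray(bgr[:, :, ::-1])   # reorder BGR -> RGB for Pillow
Image.fromarray(rgb).save("screen.png")
```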