I am using OpenCV 2.2 on the iPhone to detect faces. I'm using iOS 4's AVCaptureSession to get access to the camera stream, as seen in the code that follows.
My challenge is that the video frames come in as CVBufferRef (pointers to CVImageBuffer) objects, and they arrive in landscape orientation, 480px wide by 300px high. This is fine if you are holding the phone sideways, but when the phone is held upright I want to rotate these frames 90 degrees clockwise so that OpenCV can find the faces correctly.
I could convert the CVBufferRef to a CGImage, then to a UIImage, and then rotate, as this person is doing: Rotate CGImage taken from video frame
However, that wastes a lot of CPU. I'm looking for a faster way to rotate the incoming images, ideally using the GPU for this processing if possible.
Any ideas?
Ian
Code Sample:
- (void)startCameraCapture {
    // Start up the face detector
    faceDetector = [[FaceDetector alloc] initWithCascade:@"haarcascade_frontalface_alt2" withFileExtension:@"xml"];

    // Create the AVCapture Session
    session = [[AVCaptureSession alloc] init];

    // create a preview layer to show the output from the camera
    AVCaptureVideoPreviewLayer *previewLayer = [AVCaptureVideoPreviewLayer layerWithSession:session];
    previewLayer.frame = previewView.frame;
    previewLayer.videoGravity = AVLayerVideoGravityResizeAspectFill;
    [previewView.layer addSublayer:previewLayer];

    // Get the default camera device
    AVCaptureDevice *camera = [AVCaptureDevice defaultDeviceWithMediaType:AVMediaTypeVideo];

    // Create a AVCaptureInput with the camera device
    NSError *error = nil;
    AVCaptureInput *cameraInput = [[AVCaptureDeviceInput alloc] initWithDevice:camera error:&error];
    if (cameraInput == nil) {
        NSLog(@"Failed to create camera capture input: %@", error);
        // Without an input there is nothing to capture, so stop here
        return;
    }

    // Set the output
    AVCaptureVideoDataOutput *videoOutput = [[AVCaptureVideoDataOutput alloc] init];
    videoOutput.alwaysDiscardsLateVideoFrames = YES;

    // create a queue besides the main thread queue to run the capture on
    dispatch_queue_t captureQueue = dispatch_queue_create("captureQueue", NULL);

    // setup our delegate
    [videoOutput setSampleBufferDelegate:self queue:captureQueue];

    // release the queue. I still don't entirely understand why we're releasing it here,
    // but the code examples I've found indicate this is the right thing. Hmm...
    dispatch_release(captureQueue);

    // configure the pixel format
    videoOutput.videoSettings = [NSDictionary dictionaryWithObjectsAndKeys:
                                 [NSNumber numberWithUnsignedInt:kCVPixelFormatType_32BGRA],
                                 (id)kCVPixelBufferPixelFormatTypeKey,
                                 nil];

    // and the size of the frames we want
    // try AVCaptureSessionPresetLow if this is too slow...
    [session setSessionPreset:AVCaptureSessionPresetMedium];

    // If you wish to cap the frame rate to a known value, such as 10 fps, set
    // minFrameDuration.
    videoOutput.minFrameDuration = CMTimeMake(1, 10);

    // Add the input and output
    [session addInput:cameraInput];
    [session addOutput:videoOutput];

    // Start the session
    [session startRunning];
}

- (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection {
    // only run if we're not already processing an image
    if (!faceDetector.imageNeedsProcessing) {
        // Get CVImage from sample buffer
        CVImageBufferRef cvImage = CMSampleBufferGetImageBuffer(sampleBuffer);

        // Send the CVImage to the FaceDetector for later processing
        [faceDetector setImageFromCVPixelBufferRef:cvImage];

        // Trigger the image processing on the main thread
        [self performSelectorOnMainThread:@selector(processImage) withObject:nil waitUntilDone:NO];
    }
}

If you rotate in 90-degree steps, you can just do it in memory. Here is example code that simply copies the data to a new pixel buffer. Doing a brute-force rotation should be straightforward.
You can then use AVAssetWriterInputPixelBufferAdaptor if you are writing this back out to an AVAssetWriterInput.
The above is not optimized. You may want to look for a more efficient copy algorithm. A good place to start is with In-place Matrix Transpose. You would also want to use a pixel buffer pool rather than creating a new buffer each time.
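One common way to make a brute-force rotation friendlier to the cache is to walk the image in small square tiles, so that the scattered writes stay within a few cache lines at a time. The following is only a sketch under assumptions, not code from this answer: the function name `rotate90cw_tiled` and the tile size of 16 are made up for illustration, pixels are assumed to be 32 bits each (such as BGRA), and strides are given in pixels (bytes per row divided by 4, assuming that divides evenly). Whether it actually beats the naive loop on a given device needs to be measured.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { TILE = 16 }; /* tile edge in pixels; an assumed starting value to tune */

/* Hypothetical helper: rotate a 32-bit-per-pixel image 90 degrees clockwise,
 * one TILE x TILE block at a time. src is src_w x src_h; dst must hold
 * src_h x src_w pixels. Strides are measured in pixels, not bytes. */
void rotate90cw_tiled(const uint32_t *src, size_t src_w, size_t src_h,
                      size_t src_stride_px,
                      uint32_t *dst, size_t dst_stride_px)
{
    assert(src != NULL && dst != NULL);
    for (size_t ty = 0; ty < src_h; ty += TILE) {
        size_t y_end = (ty + TILE < src_h) ? ty + TILE : src_h;
        for (size_t tx = 0; tx < src_w; tx += TILE) {
            size_t x_end = (tx + TILE < src_w) ? tx + TILE : src_w;
            for (size_t y = ty; y < y_end; y++) {
                for (size_t x = tx; x < x_end; x++) {
                    /* Source (x, y) lands at row x, column (src_h - 1 - y). */
                    dst[x * dst_stride_px + (src_h - 1 - y)] =
                        src[y * src_stride_px + x];
                }
            }
        }
    }
}
```

The destination buffer is a natural candidate for the pixel buffer pool mentioned above, since its size stays the same from frame to frame.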
Edit: You could use the GPU to do this, but it sounds like a lot of data being pushed around. In CVPixelBufferRef there is the key kCVPixelBufferOpenGLCompatibilityKey. I assume you could create an OpenGL-compatible image from the CVImageBufferRef (which is just a pixel buffer ref) and push it through a shader. Again, overkill IMO. You may want to see whether BLAS or LAPACK has 'out of place' transpose methods. If they do, you can be assured they are highly optimized.
90 CW where new_width = width ... This will get you a portrait oriented image.
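For reference, a brute-force 90-degree clockwise rotation of a 32-bit (for example BGRA) frame can be sketched in plain C as follows. This is a minimal illustration, not the code that originally accompanied this answer: the function name `rotate90cw_bgra` is hypothetical, and it assumes 4 bytes per pixel with row strides in bytes that may include padding (as reported by CVPixelBufferGetBytesPerRow). The destination is src_h pixels wide and src_w pixels high.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: rotate a 4-byte-per-pixel image 90 degrees clockwise.
 * src is src_w x src_h pixels with src_stride bytes per row.
 * dst must be src_h x src_w pixels with dst_stride bytes per row. */
void rotate90cw_bgra(const uint8_t *src, size_t src_w, size_t src_h,
                     size_t src_stride,
                     uint8_t *dst, size_t dst_stride)
{
    assert(src != NULL && dst != NULL);
    for (size_t y = 0; y < src_h; y++) {
        const uint8_t *src_row = src + y * src_stride;
        for (size_t x = 0; x < src_w; x++) {
            /* Source pixel (x, y) lands at row x, column (src_h - 1 - y). */
            uint8_t *d = dst + x * dst_stride + (src_h - 1 - y) * 4;
            memcpy(d, src_row + x * 4, 4);
        }
    }
}
```

With a CVPixelBufferRef you would lock the base address of both buffers before calling something like this and unlock them afterwards.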
I know this is quite an old question, but I've been solving a similar problem recently and maybe someone will find my solution useful.
I needed to extract raw image data from an image buffer in the YCbCr format delivered by the iPhone camera (obtained from [AVCaptureVideoDataOutput.availableVideoCVPixelFormatTypes firstObject]), dropping information such as headers, metadata, etc., to pass it on to further processing.
Also, I needed to extract only a small area in the center of the captured video frame, so some cropping was needed.
My conditions allowed capturing video only in landscape orientation (either one), but when the device is held in landscape-left orientation the image is delivered upside down, so I needed to flip it along both axes. When the image is flipped, my idea was to copy the data from the source image buffer in reverse order and reverse the bytes in each row, which flips the image along both axes. That idea really works, and since I needed to copy the data from the source buffer anyway, there seems to be little performance penalty whether reading from the start or the end. (Of course, bigger image = longer processing, but I deal with really small numbers.)
I'd like to know what others think about this solution, and of course I'd welcome hints on how to improve the code:
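The code that accompanied this answer is not reproduced here. As a rough sketch of the idea described above (copy a crop out of one plane, reading rows bottom-up and reversing each row when a flip is needed), something like the following could work. The function name `copy_crop_flipped` and its parameters are hypothetical. The `elem_size` parameter is an addition of this sketch: it keeps multi-byte elements intact, because reversing raw bytes in an interleaved CbCr plane would swap Cb and Cr.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy a crop_w x crop_h region whose top-left corner is
 * (crop_x, crop_y) out of one image plane into a tightly packed dst buffer,
 * optionally flipping it along both axes (a 180-degree rotation).
 * elem_size is bytes per element: 1 for a luma plane, 2 for an interleaved
 * CbCr plane, 4 for BGRA. */
void copy_crop_flipped(const uint8_t *src, size_t src_stride, size_t elem_size,
                       size_t crop_x, size_t crop_y,
                       size_t crop_w, size_t crop_h,
                       int flip, uint8_t *dst)
{
    assert(src != NULL && dst != NULL && elem_size > 0);
    for (size_t y = 0; y < crop_h; y++) {
        /* When flipping, read rows from the bottom of the crop upward. */
        size_t sy = flip ? (crop_h - 1 - y) : y;
        const uint8_t *src_row =
            src + (crop_y + sy) * src_stride + crop_x * elem_size;
        uint8_t *dst_row = dst + y * crop_w * elem_size;
        if (!flip) {
            memcpy(dst_row, src_row, crop_w * elem_size);
            continue;
        }
        /* Reverse element order within the row; bytes inside each element
         * keep their original order. */
        for (size_t x = 0; x < crop_w; x++) {
            memcpy(dst_row + x * elem_size,
                   src_row + (crop_w - 1 - x) * elem_size,
                   elem_size);
        }
    }
}
```

For a bi-planar YCbCr buffer this would be called once per plane, with the crop coordinates halved for the chroma plane when the format is subsampled in both directions.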
Maybe it's easier to just set the video orientation you want on the capture connection (AVCaptureConnection has a videoOrientation property):
This way you don't need to do that rotation gimmick at all.
vImage is a pretty fast way to do it. It requires iOS 5, though. The call says ARGB, but it works for the BGRA you get from the buffer.
This also has the advantage that you can cut out a part of the buffer and rotate that. See my answer here