I'm recording video and audio using AVCaptureVideoDataOutput and AVCaptureAudioDataOutput, and in the captureOutput:didOutputSampleBuffer:fromConnection: delegate method I want to draw text onto each individual sample buffer I receive from the video connection. The text changes on nearly every frame (it's a stopwatch label), and I want it recorded on top of the captured video data.
Here's what I've been able to come up with so far:
//1.
CVPixelBufferRef pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
//2.
UIImage *textImage = [self createTextImage];
CIImage *maskImage = [CIImage imageWithCGImage:textImage.CGImage];
//3.
CVPixelBufferLockBaseAddress(pixelBuffer, 0);
CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
NSDictionary *options = [NSDictionary dictionaryWithObject:(__bridge id)colorSpace forKey:kCIImageColorSpace];
CIImage *inputImage = [CIImage imageWithCVPixelBuffer:pixelBuffer options:options];
//4.
CIFilter *filter = [CIFilter filterWithName:@"CIBlendWithMask"];
[filter setValue:inputImage forKey:@"inputImage"];
[filter setValue:maskImage forKey:@"inputMaskImage"];
CIImage *outputImage = [filter outputImage];
CVPixelBufferUnlockBaseAddress(pixelBuffer, 0);
//5.
[self.renderContext render:outputImage toCVPixelBuffer:pixelBuffer bounds:[outputImage extent] colorSpace:CGColorSpaceCreateDeviceRGB()];
//6.
[self.pixelBufferAdaptor appendPixelBuffer:pixelBuffer withPresentationTime:timestamp];
- Here I grab the pixel buffer, easy as pie.
- I use Core Graphics to write text to a blank UIImage (that's what createTextImage does). I was able to verify that this step works; I saved an image with text drawn to it to my photos.
- I create a CIImage from the pixel buffer.
- I make a CIFilter for CIBlendWithMask, setting the input image as the one created from the original pixel buffer and the input mask as the CIImage made from the image with text drawn on it.
- Finally, I render the filter output image to the pixelBuffer. The CIContext was created beforehand with [CIContext contextWithOptions:nil];.
- After all that, I append the pixel buffer to my pixelBufferAdaptor with the appropriate timestamp.
The video that's saved at the end of recording has no visible changes, i.e. no mask image has been drawn onto the pixel buffers.
Does anyone have any idea where I'm going wrong here? I've been stuck on this for days, so any help would be greatly appreciated.
EDIT:
- (UIImage *)createTextImage {
    UIGraphicsBeginImageContextWithOptions(CGSizeMake(self.view.bounds.size.width, self.view.bounds.size.height), NO, 1.0);
    NSMutableAttributedString *timeStamp = [[NSMutableAttributedString alloc] initWithString:self.timeLabel.text attributes:@{NSForegroundColorAttributeName: self.timeLabel.textColor, NSFontAttributeName: self.timeLabel.font}];
    NSMutableAttributedString *countDownString = [[NSMutableAttributedString alloc] initWithString:self.cDownLabel.text attributes:@{NSForegroundColorAttributeName: self.cDownLabel.textColor, NSFontAttributeName: self.cDownLabel.font}];
    [timeStamp drawAtPoint:self.timeLabel.center];
    [countDownString drawAtPoint:self.view.center];
    UIImage *blank = UIGraphicsGetImageFromCurrentImageContext();
    UIGraphicsEndImageContext();
    return blank;
}
Swift version of Bannings's answer.
You can also use CoreGraphics and CoreText to draw directly on top of the existing CVPixelBufferRef if it's RGBA (or on a copy if it's YUV). I have some sample code in this answer: https://stackoverflow.com/a/46524831/48125
Do you want something like the below?
Instead of using CIBlendWithMask, you should use CISourceOverCompositing, try this:
I asked Apple DTS about this same issue, as all the approaches I had were running really slowly or doing odd things, and they sent me this:
https://developer.apple.com/documentation/avfoundation/avasynchronousciimagefilteringrequest?language=objc
Which got me to a working solution really quickly! You can bypass the CVPixelBuffer altogether using CIFilters, which IMHO are much easier to work with. So if you don't actually NEED to use CVPixelBuffer, this approach will quickly become your new friend.
A combination of CIFilter(s) to composite the source image and the image with the text I generated for each frame did the trick.
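For anyone wondering what that might look like, here is a minimal sketch, not the code DTS sent. It assumes you already have the recorded AVAsset and a block that returns a CIImage of the text for a given time; the TextImageProvider type and the MakeTextOverlayComposition function are made-up names for illustration. Note that this handler-based API works on an asset during playback or export, not on live capture buffers.

```objc
#import <AVFoundation/AVFoundation.h>
#import <CoreImage/CoreImage.h>

// Hypothetical block type: returns the text overlay for a given composition
// time (may return nil when there is nothing to draw). Not part of any Apple API.
typedef CIImage *(^TextImageProvider)(CMTime time);

// Hypothetical helper: builds a video composition that draws the overlay
// on top of every frame of `asset`.
static AVMutableVideoComposition *MakeTextOverlayComposition(AVAsset *asset,
                                                             TextImageProvider textImageForTime) {
    return [AVMutableVideoComposition videoCompositionWithAsset:asset
                                   applyingCIFiltersWithHandler:^(AVAsynchronousCIImageFilteringRequest *request) {
        CIImage *source = request.sourceImage;
        CIImage *text = textImageForTime(request.compositionTime);
        if (text == nil) {
            // Nothing to draw for this frame: pass the source through unchanged.
            [request finishWithImage:source context:nil];
            return;
        }

        // Source-over compositing: the text (with a transparent background)
        // is the foreground, the video frame is the background.
        CIFilter *composite = [CIFilter filterWithName:@"CISourceOverCompositing"];
        [composite setValue:text forKey:kCIInputImageKey];
        [composite setValue:source forKey:kCIInputBackgroundImageKey];

        // Crop to the frame so an oversized overlay cannot grow the output extent.
        CIImage *output = [composite.outputImage imageByCroppingToRect:source.extent];
        [request finishWithImage:(output ?: source) context:nil];
    }];
}
```

The resulting composition can then be assigned to the videoComposition property of an AVPlayerItem or an AVAssetExportSession.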
I hope this helps someone else!
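For the live-capture path in the question, the CISourceOverCompositing suggestion above could be sketched roughly as follows. This is not the code from the original answer; CompositeOverlayOntoPixelBuffer is a made-up name, and the sketch assumes the text image has a transparent background and has already been scaled to the pixel buffer's dimensions (createTextImage sizes it to the view's bounds, which may differ).

```objc
#import <CoreImage/CoreImage.h>
#import <CoreVideo/CoreVideo.h>

// Hypothetical helper: composites `overlay` over the current contents of
// `pixelBuffer` and renders the result back into the same buffer.
static void CompositeOverlayOntoPixelBuffer(CIImage *overlay,
                                            CVPixelBufferRef pixelBuffer,
                                            CIContext *context) {
    CIImage *background = [CIImage imageWithCVPixelBuffer:pixelBuffer];

    // Unlike CIBlendWithMask, source-over only needs a foreground and a
    // background image; the overlay's own alpha decides what shows through.
    CIFilter *filter = [CIFilter filterWithName:@"CISourceOverCompositing"];
    [filter setValue:overlay forKey:kCIInputImageKey];
    [filter setValue:background forKey:kCIInputBackgroundImageKey];

    CIImage *output = filter.outputImage;
    if (output == nil) {
        return; // Leave the buffer untouched if the filter produced nothing.
    }

    // Create the color space once and release it, rather than leaking a new
    // one on every frame.
    CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
    [context render:output
    toCVPixelBuffer:pixelBuffer
             bounds:background.extent
         colorSpace:colorSpace];
    CGColorSpaceRelease(colorSpace);
}
```

In the delegate method this would be called with the CIImage made from createTextImage, just before appending the pixel buffer to the adaptor. If rendering in place produces artifacts, rendering into a separate buffer from a pixel buffer pool is a possible alternative.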