I'm trying ARKit 1.5 with image recognition and, as we can read in the code of the sample project from Apple:

"Image anchors are not tracked after initial detection, so create an animation that limits the duration for which the plane visualization appears."
An ARImageAnchor doesn't have a center: vector_float3 like ARPlaneAnchor has, and I cannot find a way to track the detected image anchors.
I would like to achieve something like in this video, that is, to have a fixed image, button, label, or whatever staying on top of the detected image, and I don't understand how I can achieve this.
Here is the code that handles the image detection result:
// MARK: - ARSCNViewDelegate (Image detection results)
/// - Tag: ARImageAnchor-Visualizing
func renderer(_ renderer: SCNSceneRenderer, didAdd node: SCNNode, for anchor: ARAnchor) {
    guard let imageAnchor = anchor as? ARImageAnchor else { return }
    let referenceImage = imageAnchor.referenceImage
    updateQueue.async {
        // Create a plane to visualize the initial position of the detected image.
        let plane = SCNPlane(width: referenceImage.physicalSize.width,
                             height: referenceImage.physicalSize.height)
        plane.materials.first?.diffuse.contents = UIColor.blue.withAlphaComponent(0.20)
        self.planeNode = SCNNode(geometry: plane)
        self.planeNode?.opacity = 1

        /*
         `SCNPlane` is vertically oriented in its local coordinate space, but
         `ARImageAnchor` assumes the image is horizontal in its local space, so
         rotate the plane to match.
         */
        self.planeNode?.eulerAngles.x = -.pi / 2

        /*
         Image anchors are not tracked after initial detection, so create an
         animation that limits the duration for which the plane visualization appears.
         */

        // Add the plane visualization to the scene.
        if let planeNode = self.planeNode {
            node.addChildNode(planeNode)
        }

        if let imageName = referenceImage.name {
            plane.materials = [SCNMaterial()]
            plane.materials[0].diffuse.contents = UIImage(named: imageName)
        }
    }

    DispatchQueue.main.async {
        let imageName = referenceImage.name ?? ""
        self.statusViewController.cancelAllScheduledMessages()
        self.statusViewController.showMessage("Detected image “\(imageName)”")
    }
}
You’re already most of the way there — your code places a plane atop the detected image, so clearly you have something going on there that successfully sets the center position of the plane to that of the image anchor. Perhaps your first step should be to better understand the code you have...
ARPlaneAnchor has a center (and extent) because planes can effectively grow after ARKit initially detects them. When you first get a plane anchor, its transform tells you the position and orientation of some small patch of flat horizontal (or vertical) surface. That alone is enough for you to place some virtual content in the middle of that small patch of surface.
Over time, ARKit figures out where more of the same flat surface is, so the plane anchor’s extent gets larger. But you might initially detect, say, one end of a table and then recognize more of the far end; that means the flat surface isn’t centered around the first patch detected. Rather than change the transform of the anchor, ARKit tells you the new center (which is relative to that transform).
An ARImageAnchor doesn’t grow: either ARKit detects the whole image at once or it doesn’t detect the image at all. So when you detect an image, the anchor’s transform tells you the position and orientation of the center of the image. (And if you want to know the size/extent, you can get that from the physicalSize of the detected reference image, like the sample code does.)
So, to place some SceneKit content at the position of an ARImageAnchor (or any other ARAnchor subclass), you can:

1. Simply add it as a child node of the SCNNode ARKit creates for you in that delegate method. If you don’t do something to change them, its position and orientation will match that of the node that owns it. (This is what the Apple sample code you’re quoting does.)

2. Place it in world space (that is, as a child of the scene’s rootNode), using the anchor’s transform to get position or orientation or both. (You can extract the translation, that is, the relative position, from a transform matrix: grab the first three elements of the last column; e.g. transform.columns.3 is a float4 vector whose xyz elements are your position and whose w element is 1.) There’s a sketch of this approach below.
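For the second option, a minimal sketch (my own, not the sample’s) might look like this, assuming you have an ARSCNView and using a small sphere as placeholder content:

import ARKit
import SceneKit

// Place a marker node in world space at an anchor's position, using the
// translation component of the anchor's transform.
func placeMarker(for anchor: ARAnchor, in sceneView: ARSCNView) {
    let markerNode = SCNNode(geometry: SCNSphere(radius: 0.01)) // placeholder content
    // The translation is the xyz of the last column of the 4x4 transform matrix.
    let translation = anchor.transform.columns.3
    markerNode.position = SCNVector3(translation.x, translation.y, translation.z)
    // Children of the root node live in world space.
    sceneView.scene.rootNode.addChildNode(markerNode)
}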
The demo video you linked to isn’t putting things in 3D space, though: it’s putting 2D UI elements on the screen, whose positions track the 3D camera-relative movement of anchors in world space.
You can easily get that kind of effect (to a first approximation) by using ARSKView (ARKit + SpriteKit) instead of ARSCNView (ARKit + SceneKit). That lets you associate 2D sprites with 3D positions in world space, and then ARSKView automatically moves and scales them so that they appear to stay attached to those 3D positions. It’s a common 3D graphics trick called “billboarding”, where the 2D sprite is always kept upright and facing the camera, but moved around and scaled to match 3D perspective.
If that’s the effect you’re looking for, there’s an App(le sample code) for that, too. The Using Vision in Real Time with ARKit example is mostly about other topics, but it does show how to use ARSKView to display labels associated with ARAnchor positions. (And as you’ve seen above, placing content to match an anchor position is the same no matter which ARAnchor subclass you’re using.) Here’s the key bit in their code:
func view(_ view: ARSKView, didAdd node: SKNode, for anchor: ARAnchor) {
    // ... irrelevant bits omitted ...
    let label = TemplateLabelNode(text: labelText)
    node.addChild(label)
}
That is, just implement the ARSKView didAdd delegate method, and add whatever SpriteKit node you want as a child of the one ARKit provides.
However, the demo video does more than just sprite billboarding: the labels it associates with paintings not only stay fixed in 2D orientation, they stay fixed in 2D size (that is, they don’t scale to simulate perspective like a billboarded sprite does). What’s more, they seem to be UIKit controls, with the full set of inherited interactive behaviors that entails, not just 2D images the likes of which are easy to do with SpriteKit.
Apple’s APIs don’t provide a direct way to do this “out of the box”, but it’s not a stretch to imagine some ways one could put API pieces together to get this kind of result. Here are a couple of avenues to explore:
If you don’t need UIKit controls, you can probably do it all in SpriteKit, using constraints to match the position of the “billboarded” nodes ARSKView provides but not their scale. That’d probably look something like this (untested, caveat emptor):
func view(_ view: ARSKView, didAdd node: SKNode, for anchor: ARAnchor) {
    let label = MyLabelNode(text: labelText) // or however you make your label
    view.scene?.addChild(label) // note: `scene` is optional on SKView
    // Constrain the label to zero distance from the ARSKView-provided,
    // anchor-following node.
    let zeroDistanceToAnchor = SKConstraint.distance(SKRange(constantValue: 0), to: node)
    label.constraints = [zeroDistanceToAnchor]
}
If you want UIKit elements, make the ARSKView a child view of your view controller (not the root view), and make those UIKit elements other child views. Then, in your SpriteKit scene’s update method, go through your ARAnchor-following nodes, convert their positions from SpriteKit scene coordinates to UIKit view coordinates, and set the positions of your UIKit elements accordingly (see the sketch below). (The demo appears to be using popovers, so those you wouldn’t be managing as child views... you’d probably be updating the sourceRect for each popover.) That’s a lot more involved, so the details are beyond the scope of this already long answer.
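Still, a rough, untested sketch of that conversion step might look like this; OverlayScene, anchorNode, and uiLabel are hypothetical names for your scene subclass, one anchor-following node, and the UIKit label being moved:

import SpriteKit
import UIKit

class OverlayScene: SKScene {
    // Hypothetical references, assumed to be wired up elsewhere.
    // anchorNode is assumed to be a direct child of this scene.
    weak var anchorNode: SKNode?
    weak var uiLabel: UILabel?

    override func update(_ currentTime: TimeInterval) {
        guard let node = anchorNode, let label = uiLabel, let skView = view else { return }
        // SpriteKit scene coordinates -> ARSKView coordinates...
        let viewPoint = skView.convert(node.position, from: self)
        // ...-> the label's superview, since the ARSKView isn't the root view here.
        label.center = label.superview?.convert(viewPoint, from: skView) ?? viewPoint
    }
}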
A final note... hopefully this long-winded answer has been helpful with the key issues of your question (understanding anchor positions and placing 3D or 2D content that follows them as the camera moves).
But to clarify and give a warning about some of the key words early in your question:
When ARKit says it doesn’t track images after detection, that means it doesn’t know when/if the image moves (relative to the world around it). ARKit reports an image’s position only once, so that position doesn’t even benefit from how ARKit continues to improve its estimates of the world around you and your position in it. For example, if an image is on a wall, the reported position/orientation of the image might not line up with a vertical plane detection result on the wall (especially over time, as the plane estimate improves).
Update: In iOS 12, you can enable "live" tracking of detected images. But there are limits on how many you can track at once, so the rest of this advice may still apply.
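If you target iOS 12, a minimal sketch of opting in might look like this, assuming your reference images live in an asset catalog group named "AR Resources":

import ARKit

// Run a session that tracks detected images live (iOS 12+).
func runImageTracking(on sceneView: ARSCNView) {
    guard let referenceImages = ARReferenceImage.referenceImages(
        inGroupNamed: "AR Resources", bundle: nil) else { return }
    let configuration = ARImageTrackingConfiguration()
    configuration.trackingImages = referenceImages
    // ARKit caps how many images it will track simultaneously.
    configuration.maximumNumberOfTrackedImages = 1
    sceneView.session.run(configuration)
}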
This doesn’t mean that you can’t place content that appears to “track” that static-in-world-space position, in the sense of moving around on the screen to follow it as your camera moves.
But it does mean your user experience may suffer if you try to do things that rely on having a high-precision, real-time estimate of the image’s position. So don’t, say, try to put a virtual frame around your painting, or replace the painting with an animated version of itself. But having a text label with an arrow pointing to roughly where the image is in space is great.
To start: I have not done exactly what you are trying to do, so this is only where I would start if I were going to implement it…
Based on the code you have listed, you are using Apple's sample code for Recognizing Images in an AR Experience. That project is set up to use SceneKit, which does not have labels, buttons, or images. This means you need to use SpriteKit, which has nodes that can display labels and images.
This means your first step is to create a brand new project and select the Augmented Reality template with the content technology set to SpriteKit.
You can look at resetTracking() from Apple's image recognition sample for how to set up the image recognition part. You will also need to manually add the AR Resources folder to your asset catalog to hold all the reference images.
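Here's roughly what that setup looks like, adapted from Apple's sample; this version takes the ARSKView as a parameter rather than reading a property:

import ARKit

// Load the reference images from the asset catalog and run a
// world-tracking session that detects them.
func resetTracking(in sceneView: ARSKView) {
    guard let referenceImages = ARReferenceImage.referenceImages(
        inGroupNamed: "AR Resources", bundle: nil) else {
        fatalError("Missing expected asset catalog resources.")
    }
    let configuration = ARWorldTrackingConfiguration()
    configuration.detectionImages = referenceImages
    sceneView.session.run(configuration, options: [.resetTracking, .removeExistingAnchors])
}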
For placement of items, you would use the SpriteKit version of renderer(_ renderer: SCNSceneRenderer, didAdd node: SCNNode, for anchor: ARAnchor), which is view(_ view: ARSKView, didAdd node: SKNode, for anchor: ARAnchor).
With Apple's sample code for image recognition, new SceneKit objects are added at the center of the reference image, and provided the image doesn't get moved, the virtual object stays in the center (more or less).
We also know that we can get the height and width of the reference image (as seen in the code you posted).
Odds are we can get the reference image's dimensions with SpriteKit as well, and that newly placed SKNodes will end up in the center of the detected image the same way SCNNodes do. This means we should be able to create a new SKNode (an SKSpriteNode for images or an SKLabelNode for labels) and offset its position by one half the reference image's height to get the top center of the image. Once the node is placed it should appear to stick to the poster (ARKit isn't perfect, so there will be a little movement that happens).
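A speculative, untested sketch of that idea follows; the point offset is a guess, since mapping the reference image's physical height (in meters) to SpriteKit points would take some experimentation:

import ARKit
import SpriteKit

func view(_ view: ARSKView, didAdd node: SKNode, for anchor: ARAnchor) {
    guard let imageAnchor = anchor as? ARImageAnchor else { return }
    // Use the reference image's name as placeholder label text.
    let label = SKLabelNode(text: imageAnchor.referenceImage.name ?? "Detected image")
    // Hypothetical offset: nudge the label toward the top of the image.
    label.position = CGPoint(x: 0, y: 60)
    node.addChild(label)
}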