I’m developing an augmented reality iPhone app.
What it should basically do is display pictures assigned to geographical locations when you look at those locations through the camera. Each such picture can be understood as a billboard which has a geographical position and a heading (the angle between its plane and the north axis).
The goal is to make these billboards display more or less as if they were physical objects: larger when you are close to them, smaller when you are farther away, and shown in proper perspective when you don’t stand directly in front of them.
I think I have achieved that goal, more or less. By measuring the initial heading from the iPhone to a picture I can determine the rotation angle of the picture as viewed by the camera (so that it appears in proper perspective).
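For illustration, the rotation step boils down to something like the sketch below (the function name is mine, and I assume both angles are in degrees, measured clockwise from north):

double BillboardRotationAngle_deg(double bearingToBillboard_deg, double billboardHeading_deg)
{
    // How much the billboard plane is turned away from the viewing direction.
    double rotation = billboardHeading_deg - bearingToBillboard_deg;

    // Normalise to (-180, 180] so the sign tells which way to rotate.
    while (rotation > 180.0) rotation -= 360.0;
    while (rotation <= -180.0) rotation += 360.0;

    return rotation;
}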
However, when it comes to scaling them based on the distance from the phone, I think my approach was flawed. I assumed a maximum view distance of, let’s say, 200 m. Then a billboard 100 m from the phone is displayed at 50% of its original size. That’s it: a linear scaling based on the maximum distance.
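In code the idea was roughly this (a sketch of the description above; the names are made up):

const double MAX_VIEW_DISTANCE_M = 200.0;

double LinearScaleFactor(double distanceFromBillboard_m)
{
    // 0 m away -> full size, MAX_VIEW_DISTANCE_M away -> scaled down to nothing.
    double factor = 1.0 - (distanceFromBillboard_m / MAX_VIEW_DISTANCE_M);
    return factor > 0.0 ? factor : 0.0;
}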
What I missed with this approach is the size of the billboards (understood as physical objects). The way they display on the screen depends only on their size in pixels, which means the resolution of the display decides how you perceive them. So I assume that if you take two phones with the same screen dimensions but different resolutions, the same pictures will appear at different sizes on them. Am I right?
So finally, my question is: how should I approach scaling the pictures so that they look right in the AR view?
I think I should take some camera parameters into consideration. When a 10x10 cm object is right in front of the camera it may cover the whole screen, but move it a few metres away and it becomes a minor detail. So how should scaling work? If I decide to assign physical dimensions to my virtual billboards, how do I scale them based on their distance from the camera?
Am I right that I should assign physical dimensions in metres to each picture (no matter what its size in pixels is) and display it based on those dimensions and some camera-dependent scaling factor?
Could you please help me on that? Any clues will be helpful. Thank you!
I think I managed to solve my problem. Let me explain how I did it, as it might be of use to others. If you find this approach wrong, I will be grateful for your feedback or any further clues.
I decided to assign physical dimensions in metres to my virtual billboards. This discussion helped me find out the parameters of the iPhone 4 camera: the focal length and the dimensions of the CCD sensor. What's more, these values also helped me calculate a proper FOV for my AR app (see Calculating a camera's angle of view).
This website helped me calculate the size in millimetres of a physical object's image as produced on a CCD sensor. So if my billboards have a width and height in metres, and I know their distance from the camera as well as the focal length of the camera, I can calculate their size on the sensor:
(Focal Length * Object Dimension) / Lens-to-object distance = Image Size (on the sensor)
double CalculatePhysicalObjectImageDimensionOnCCD(double cameraFocalLength_mm, double physicalObjectDimension_m, double distanceFromPhysicalObject_m)
{
    // Returns the size of the object's image on the sensor, in millimetres.
    double physicalObjectDimension_mm = physicalObjectDimension_m * 1000;
    double distanceFromPhysicalObject_mm = distanceFromPhysicalObject_m * 1000;

    return (cameraFocalLength_mm * physicalObjectDimension_mm) / distanceFromPhysicalObject_mm;
}
I have little knowledge of this matter, so I’m not sure whether the approach I took next is OK, but I simply decided to calculate how much larger the iPhone screen is than the CCD sensor. A simple operation gives me a sensor-to-screen size ratio. Because the width-to-height ratios of the sensor and the screen seem to differ, I calculated the ratio in a somewhat crude way:
double GetCCDToScreenSizeRatio(double sensorWidth, double sensorHeight, double screenWidth, double screenHeight)
{
    // Ratio of the geometric mean of the screen dimensions to that of the sensor dimensions.
    return sqrt(screenWidth * screenHeight) / sqrt(sensorWidth * sensorHeight);
}
The ratio I get can then be treated as a multiplier. First I calculate the dimension of my virtual billboard on the sensor, then I multiply it by the ratio. This way I get the actual size of the billboard in pixels. So when I call the function below, providing just the width of my billboard and the distance from it, it returns the width in pixels of the billboard as viewed on the screen. The same call with the billboard's height gives the other dimension.
const double CCD_DIM_LONGER_IPHONE4 = 4.592; //mm
const double CCD_DIM_SHORTER_IPHONE4 = 3.450; //mm
const double FOCAL_LENGTH_IPHONE4 = 4.28; //mm
double CalculatePhysicalObjectImageDimensionOnScreen_iPhone4(double physicalObjectDimension_m, double distanceFromPhysicalObject_m)
{
    double screenWidth = [UIScreen mainScreen].bounds.size.width;
    double screenHeight = [UIScreen mainScreen].bounds.size.height;

    return CalculatePhysicalObjectImageDimensionOnScreen(FOCAL_LENGTH_IPHONE4, physicalObjectDimension_m, distanceFromPhysicalObject_m, CCD_DIM_LONGER_IPHONE4, CCD_DIM_SHORTER_IPHONE4, screenWidth, screenHeight);
}

double CalculatePhysicalObjectImageDimensionOnScreen(double cameraFocalLength_mm, double physicalObjectDimension_m, double distanceFromPhysicalObject_m, double ccdSensorWidth, double ccdSensorHeight, double screenWidth, double screenHeight)
{
    double ccdToScreenSizeRatio = GetCCDToScreenSizeRatio(ccdSensorWidth, ccdSensorHeight, screenWidth, screenHeight);
    double dimensionOnCcd = CalculatePhysicalObjectImageDimensionOnCCD(cameraFocalLength_mm, physicalObjectDimension_m, distanceFromPhysicalObject_m);

    return dimensionOnCcd * ccdToScreenSizeRatio;
}
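For example, for a billboard that I assume to be 3 m wide, 2 m high and 50 m away (values made up for illustration), the on-screen dimensions come out like this:

double billboardWidth_m = 3.0;   // assumed physical width
double billboardHeight_m = 2.0;  // assumed physical height
double distance_m = 50.0;        // assumed distance from the camera

double widthOnScreen = CalculatePhysicalObjectImageDimensionOnScreen_iPhone4(billboardWidth_m, distance_m);
double heightOnScreen = CalculatePhysicalObjectImageDimensionOnScreen_iPhone4(billboardHeight_m, distance_m);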
It seems to work great compared to my previous, naive approach of linear scaling. I also noticed, by the way, that it is really important to know the FOV of your camera when registering virtual objects on an AR view. Here’s how to calculate the FOV based on the CCD sensor dimensions and the focal length.
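It is the standard relation FOV = 2 * atan(sensorDimension / (2 * focalLength)); as a small helper (the function name is mine):

#include <math.h>

// Angle of view for one axis of the sensor, in degrees.
double CalculateFieldOfView_deg(double ccdSensorDimension_mm, double cameraFocalLength_mm)
{
    return 2.0 * atan(ccdSensorDimension_mm / (2.0 * cameraFocalLength_mm)) * 180.0 / M_PI;
}

// With the iPhone 4 constants above this gives roughly 56 degrees for the longer sensor side:
// CalculateFieldOfView_deg(CCD_DIM_LONGER_IPHONE4, FOCAL_LENGTH_IPHONE4);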
It’s so difficult to find these values anywhere! I wonder why they are not accessible programmatically (at least my research showed that they are not). It seems necessary to hard-code them and then check which device model the app is running on to decide which of the values to use in the calculations above :-/
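A minimal sketch of how that device check could look, assuming the hard-coded iPhone 4 constants above (the model-identifier prefix and the fallback behaviour are my assumptions):

#import <Foundation/Foundation.h>
#include <stdlib.h>
#include <sys/sysctl.h>

// Returns the raw device model identifier, e.g. "iPhone3,1" for an iPhone 4.
NSString *DeviceModelIdentifier(void)
{
    size_t size = 0;
    sysctlbyname("hw.machine", NULL, &size, NULL, 0);
    char *machine = malloc(size);
    sysctlbyname("hw.machine", machine, &size, NULL, 0);
    NSString *identifier = [NSString stringWithCString:machine encoding:NSUTF8StringEncoding];
    free(machine);
    return identifier;
}

// Picks the camera constants for the current device; only the iPhone 4 is known here.
void GetCameraConstants(double *focalLength_mm, double *ccdDimLonger_mm, double *ccdDimShorter_mm)
{
    NSString *model = DeviceModelIdentifier();

    if ([model hasPrefix:@"iPhone3,"]) { // iPhone 4 variants
        *focalLength_mm = FOCAL_LENGTH_IPHONE4;
        *ccdDimLonger_mm = CCD_DIM_LONGER_IPHONE4;
        *ccdDimShorter_mm = CCD_DIM_SHORTER_IPHONE4;
    } else {
        // Other models would need their own measured constants; falling back to the
        // iPhone 4 values here only so the calculation still returns something.
        *focalLength_mm = FOCAL_LENGTH_IPHONE4;
        *ccdDimLonger_mm = CCD_DIM_LONGER_IPHONE4;
        *ccdDimShorter_mm = CCD_DIM_SHORTER_IPHONE4;
    }
}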